Amazon SageMaker lets users build, train, and deploy their own machine learning models, and is designed to make machine learning simpler and easier to manage. AWS customers have used it to run Jupyter notebooks, manage distributed training, and deploy models to SageMaker hosting for inference, integrating machine learning into their applications.
Now, AWS has announced an easier way to manage production ML models. Amazon SageMaker supports Auto Scaling, which automatically adjusts the number of instances behind an endpoint according to a policy you define.
Previously, you had to specify the instance type and the number of instances for each endpoint to match the scale your inference traffic needed. If inference volume changed, you had to change the number of instances, the instance type, or both, behind each endpoint to match that shift.
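As a rough sketch of that manual workflow (the model, endpoint, and configuration names below are hypothetical), resizing meant creating a new endpoint configuration with the desired instance count and pointing the endpoint at it via boto3. The AWS calls are shown commented out, since they require credentials and an existing model:

```python
# Sketch of the old manual resizing workflow (all names are hypothetical).
# We build the endpoint configuration locally; the boto3 calls that would
# apply it are commented out because they need AWS credentials.

def make_endpoint_config(model_name, instance_type, instance_count):
    """Build an endpoint configuration with the desired capacity."""
    return {
        "EndpointConfigName": f"{model_name}-config-{instance_count}x",
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InstanceType": instance_type,
            "InitialInstanceCount": instance_count,
        }],
    }

# Scale the (hypothetical) endpoint up to 4 instances.
config = make_endpoint_config("my-model", "ml.m4.xlarge", 4)

# import boto3
# sm = boto3.client("sagemaker")
# sm.create_endpoint_config(**config)
# sm.update_endpoint(EndpointName="my-endpoint",
#                    EndpointConfigName=config["EndpointConfigName"])
```

Every change in traffic meant repeating these steps by hand or with custom tooling.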
Now, Amazon SageMaker makes this process easier. Instead of monitoring inference volume and changing the endpoints yourself, you only need to create a scaling policy for AWS Auto Scaling to apply. Auto Scaling then adjusts the number of instances up or down in response to actual workloads, driven by Amazon CloudWatch metrics and the target values defined in the policy. This lowers the cost of adjusting capacity while keeping performance steady.
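As a minimal sketch of what such a policy looks like (the endpoint name, variant name, and target value here are assumptions for illustration), you register the endpoint variant with Application Auto Scaling and attach a target-tracking policy keyed to the `SageMakerVariantInvocationsPerInstance` CloudWatch metric. The boto3 calls are commented out since they require AWS credentials:

```python
# Sketch of a target-tracking scaling policy for a SageMaker endpoint variant.
# Endpoint/variant names and the target value are hypothetical examples.

def make_scaling_policy(endpoint_name, variant_name, target_invocations):
    """Build a target-tracking policy that keeps invocations-per-instance
    near the given CloudWatch target value."""
    return {
        "PolicyName": f"{endpoint_name}-target-tracking",
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_invocations,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance",
            },
        },
    }

# Aim for roughly 750 invocations per minute per instance (example value).
policy = make_scaling_policy("my-endpoint", "AllTraffic", 750.0)

# import boto3
# aas = boto3.client("application-autoscaling")
# aas.register_scalable_target(
#     ServiceNamespace="sagemaker",
#     ResourceId=policy["ResourceId"],
#     ScalableDimension=policy["ScalableDimension"],
#     MinCapacity=1,
#     MaxCapacity=4,
# )
# aas.put_scaling_policy(**policy)
```

Once the policy is in place, Auto Scaling adds instances when the metric rises above the target and removes them when traffic subsides, within the minimum and maximum capacity you registered.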
Pay-as-you-go pricing still applies for the compute power used with Amazon SageMaker, meaning you do not pay for unused capacity during inactive periods. More information is available in the Amazon SageMaker documentation.
Auto Scaling for Amazon SageMaker is now available in the US East (N. Virginia and Ohio), EU (Ireland), and US West (Oregon) regions.
If you would like to learn how to apply AWS services to your business, kindly contact us here at PolarSeven.