Amazon Web Services (AWS) has expanded the capabilities of its Amazon SageMaker machine learning toolkit to address a number of challenges that enterprises confront when trying to operationalise machine learning, from model organisation, training, and optimisation to monitoring the performance of models in production.
Launched at Amazon’s re:Invent conference in 2017, SageMaker aims to make machine learning adoption simpler for customers by bringing together a hosted environment for Jupyter notebooks with built-in model management, automated spin-up of training environments with data pulled from Amazon S3, and HTTPS endpoints for hosting models on EC2 instances.
As CEO Andy Jassy presents it, AWS—like rivals Google Cloud and Microsoft Azure—wants to become the leading, full-service environment for data scientists, data engineers, and non-specialist developers to run all of their machine learning workloads.
For AWS this means a triple-layered stack of services, starting with the basic building blocks used by experienced technical practitioners who want to be able to tweak every part of their modelling process, whether with TensorFlow, PyTorch, MXNet, or another machine learning framework.
In the middle sits SageMaker, which promises to simplify key elements of the process; the stack is topped off with off-the-shelf cognitive services such as Amazon Translate and Transcribe, plus image and voice recognition capabilities.
Introducing SageMaker Studio
Now Amazon is expanding this sandbox with what it calls SageMaker Studio, finally giving customers a fully integrated development environment (IDE) for machine learning. Studio stores and collects all of the source code, notebooks, documentation, data sets, and project folders needed to run and manage machine learning models at enterprise scale, and includes collaboration capabilities.
Many of these capabilities can already be found within Microsoft’s Azure Machine Learning platform and Google Cloud’s AI Hub, while data science “workbench” offerings are also provided by the likes of Domino Data Lab and Dataiku.
SageMaker Experiments and Model Monitor
Among the new capabilities AWS has announced, let’s start with notebooks. AWS wants to enable one-click provisioning of compute when spinning up a Jupyter notebook, and to automate the tricky process of transferring content between notebooks.
Next on the list of announcements was SageMaker Experiments, a new feature which allows developers to view and manage all of the different iterations of their models.
It does this by collecting key metrics like input parameters, configuration, and output results so that users can compare and contrast the performance of multiple models, both new models and older experiments.
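The tracking idea — record each run’s parameters and metrics so runs can be ranked later — can be sketched in a few lines of plain Python. This is an illustration of the concept only, not the SageMaker Experiments API; the class and trial names are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Trial:
    """One training run: its input parameters and output metrics."""
    name: str
    params: dict
    metrics: dict = field(default_factory=dict)


class ExperimentTracker:
    """Collects trials so multiple model iterations can be compared."""

    def __init__(self):
        self.trials = []

    def log_trial(self, name, params, metrics):
        self.trials.append(Trial(name, params, metrics))

    def best(self, metric):
        """Return the trial with the highest value for the given metric."""
        return max(self.trials, key=lambda t: t.metrics[metric])


# Hypothetical runs of the same model with different configurations.
tracker = ExperimentTracker()
tracker.log_trial("xgb-depth-3", {"max_depth": 3}, {"accuracy": 0.91})
tracker.log_trial("xgb-depth-6", {"max_depth": 6}, {"accuracy": 0.94})
print(tracker.best("accuracy").name)  # xgb-depth-6
```

A managed service adds storage, lineage, and UI on top, but the core comparison — new experiments against old — reduces to exactly this kind of lookup.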
Amazon has also added a native debugging tool that lets users debug and profile models during training, a process that has traditionally proved opaque. The debugger flags when a model is deviating from its accuracy and performance indicators and offers remediation advice.
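A toy version of that alerting logic — compare recent training loss against an earlier window and flag a stalled run — might look like the following. The function name, window size, and threshold are all invented for illustration; they are not the actual SageMaker Debugger rules.

```python
def check_training_health(losses, window=3, min_improvement=1e-3):
    """Flag a training run whose loss has stopped improving.

    Compares the mean loss over the last `window` steps against the
    preceding `window` steps; a stalled run gets a remediation hint.
    """
    if len(losses) < 2 * window:
        return None  # not enough history to judge yet
    recent = sum(losses[-window:]) / window
    previous = sum(losses[-2 * window:-window]) / window
    if previous - recent < min_improvement:
        return "loss plateaued: consider lowering the learning rate"
    return None


# A healthy run keeps improving; a plateaued one triggers the rule.
print(check_training_health([1.0, 0.8, 0.6, 0.5, 0.4, 0.3]))  # None
print(check_training_health([1.0, 0.8, 0.6, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]))
```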
Lastly Amazon also announced SageMaker Model Monitor, which helps customers better detect “concept drift,” where the data being used by a model in production starts to deviate from that used to train the model.
With SageMaker Model Monitor, AWS customers are alerted when the data appears to be deviating from a baseline they configure by feeding a sample of their data to SageMaker. Model Monitor then inspects data and prediction quality on a set schedule, even emitting per-feature metrics to Amazon CloudWatch.
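The underlying idea — capture per-feature statistics from a sample of the training data, then compare live traffic against that baseline — can be illustrated without any AWS services. The z-score threshold and the feature names below are invented for the example; Model Monitor’s own statistics and constraints are richer.

```python
import statistics


def build_baseline(training_rows):
    """Record mean and standard deviation per feature from training data."""
    features = training_rows[0].keys()
    return {f: (statistics.mean(r[f] for r in training_rows),
                statistics.stdev(r[f] for r in training_rows))
            for f in features}


def detect_drift(baseline, live_rows, z_threshold=3.0):
    """Return the features whose live mean has drifted past the threshold."""
    drifted = []
    for feature, (mean, stdev) in baseline.items():
        live_mean = statistics.mean(r[feature] for r in live_rows)
        if stdev > 0 and abs(live_mean - mean) / stdev > z_threshold:
            drifted.append(feature)
    return drifted


# Training sample vs. live traffic: the age distribution has shifted.
train = [{"age": a, "income": i}
         for a, i in [(25, 40000), (30, 50000), (35, 60000), (40, 70000)]]
live = [{"age": 70, "income": 52000}, {"age": 75, "income": 48000}]

baseline = build_baseline(train)
print(detect_drift(baseline, live))  # ['age']
```

Scheduling this comparison and routing the per-feature results to a metrics service is what turns the sketch into the managed feature.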
As Nick McQuire, vice president of enterprise research at CCS Insight, said: “Customers are now doubling down on tackling data drift, black box AI, and requiring more tools to help them track model behaviour in production. AWS has had to finally bring these areas into focus but in my view, they are a bit late to the party.
"Model explainability, bias detection, and performance monitoring have been glaring omissions in its strategy this year against Microsoft and Google in particular."
SageMaker Autopilot for automated machine learning
Amazon also announced some changes to its automated machine learning, or AutoML, offering (not to be confused with Google Cloud’s own AutoML product), which automates the selection, training, and optimisation of machine learning models within SageMaker for classification and linear regression problems.
Jassy said that customers have asked for greater visibility into these models, and AWS has responded with SageMaker Autopilot.
The rough end-to-end workflow with SageMaker Autopilot: customers provide a CSV file, or a link to the S3 location of the data they want to build the model on, and SageMaker then trains up to 50 different models on that data, gives customers access to each of these as a notebook, and presents them in the form of a leaderboard within SageMaker Studio.
The entire process, from data cleaning and pre-processing to algorithm choice to instance and cluster size selection, is handled automatically.
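Stripped of the managed infrastructure, the search-and-rank pattern behind this workflow is straightforward. The sketch below uses a stub scoring function so it runs end to end; in the real service each candidate would be a full training job on the customer’s data, and the names and formula here are invented for the example.

```python
def run_autopilot_style_search(candidates, train_and_score, max_candidates=50):
    """Try each candidate configuration and return a score-sorted leaderboard."""
    leaderboard = []
    for name, config in candidates[:max_candidates]:
        score = train_and_score(config)  # stand-in for a real training job
        leaderboard.append({"candidate": name, "config": config, "score": score})
    return sorted(leaderboard, key=lambda c: c["score"], reverse=True)


def stub_scorer(config):
    """Deterministic stand-in for validation accuracy of a trained model."""
    d = config["depth"]
    return 0.8 + 0.06 * d - 0.011 * d ** 2


# Four hypothetical candidate pipelines differing in one hyperparameter.
candidates = [(f"tree-depth-{d}", {"depth": d}) for d in (1, 2, 3, 4)]

for entry in run_autopilot_style_search(candidates, stub_scorer):
    print(entry["candidate"], round(entry["score"], 3))
```

The leaderboard surfaced in SageMaker Studio plays the role of this sorted list, with each entry backed by an editable notebook rather than a dict.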
“So when you open the notebook the recipe of that model is there, from the algorithm to the parameters, so you can evolve it if you want,” Jassy said during his re:Invent keynote today.
In theory this allows companies to level up their models as they go with AWS, starting with classification and regression algorithms, but giving them the ability to track, measure, and customise these as they accumulate more data and grow the data science and engineering skills in their business.
SageMaker Studio is available immediately in the AWS US East (Ohio) region, while SageMaker Experiments and SageMaker Model Monitor are available immediately to all SageMaker customers.