SageMaker JumpStart

Important

To use new features with an existing notebook instance or Studio app, restart the notebook instance or app to get the latest updates.

You can use SageMaker JumpStart to learn about SageMaker features and capabilities through curated 1-click solutions, example notebooks, and pretrained models that you can deploy. You can also fine-tune many of the models before deploying them.

To access JumpStart, you must first launch SageMaker Studio. JumpStart features are not available in SageMaker notebook instances, and you can't access them through SageMaker APIs or the AWS CLI.

Open JumpStart by using the JumpStart launcher in the Get Started section or by choosing the JumpStart icon in the left sidebar.

In the file and resource browser (the left pane), you can find JumpStart options. From here you can choose to browse JumpStart for solutions, models, notebooks, and other resources, or you can view your currently launched solutions, endpoints, and training jobs.

To see what JumpStart has to offer, choose the JumpStart icon, and then choose Browse JumpStart. JumpStart opens in a new tab in the main work area. Here you can browse 1-click solutions, models, example notebooks, blogs, and video tutorials.

Important

Amazon SageMaker JumpStart makes certain content available from third-party sources. This content may be subject to separate license terms. You are responsible for reviewing and complying with any applicable license terms and making sure they are acceptable for your use case before downloading or using the content.

Using JumpStart

At the top of the JumpStart page, you can use the search bar to look for topics of interest, or you can browse each of the categories that follow the search panel:

  • Solutions – Launch end-to-end machine learning solutions that tie SageMaker to other AWS services with one click.

  • Text models – Deploy and fine-tune pretrained transformers for various natural language processing use cases.

  • Vision models – Deploy and fine-tune pretrained models for image classification and object detection with one click.

  • SageMaker algorithms – Train and deploy SageMaker built-in algorithms for various problem types with these example notebooks.

  • Example notebooks – Run example notebooks that use SageMaker features like spot instance training and experiments over a large variety of model types and use cases.

  • Blogs – Read Amazon-hosted deep dives and solution articles from machine learning experts.

  • Video tutorials – Watch Amazon-hosted video tutorials from machine learning experts covering SageMaker features and machine learning use cases.

Solutions

When you choose a solution, JumpStart shows a description of the solution and a Launch button. When you choose Launch, JumpStart creates all of the resources necessary to run the solution, including training and model hosting instances. After JumpStart launches the solution, JumpStart shows an Open Notebook button. You can choose the button to use the provided notebooks and explore the solution's features. As artifacts are generated during launch or after running the provided notebooks, they are listed in the Generated Artifacts table. You can delete individual artifacts with the trash icon. You can delete all of the solution's resources by choosing Delete solution resources.

Models

Models are available for quick deployment directly from JumpStart. You can also fine-tune some of these models. When you browse the models, you can scroll past the deploy and fine-tune sections to the Description section. There you can learn more about the model, including what you can do with it, the kinds of inputs and outputs it expects, and the kind of data you need if you want to use transfer learning to fine-tune it.

The following tables list the models currently offered in JumpStart, sorted by model type and task. To view a different set of models, choose the tab for that task.

Text Models

Text Classification Models

Model Fine-tunable Source
BERT Base Cased Yes Tensorflow Hub
BERT Base MEDLINE/PubMed Yes Tensorflow Hub
BERT Base Multilingual Cased Yes Tensorflow Hub
BERT Base Uncased Yes Tensorflow Hub
BERT Base Wikipedia and BooksCorpus Yes Tensorflow Hub
BERT Large Cased Yes Tensorflow Hub
BERT Large Cased Whole Word Masking Yes Tensorflow Hub
BERT Large Uncased Whole Word Masking Yes Tensorflow Hub
ELECTRA-Base++ Yes Tensorflow Hub
ELECTRA-Small++ Yes Tensorflow Hub
Text Generation Models

Model Fine-tunable Source
DistilGPT 2 No Hugging Face
GPT 2 No Hugging Face
GPT 2 Large No Hugging Face
GPT 2 Medium No Hugging Face
OpenAI GPT No Hugging Face
Extractive Question Answering Models

Model Fine-tunable Source
BERT Base Cased Yes PyTorch Hub
BERT Base Multilingual Cased Yes PyTorch Hub
BERT Base Multilingual Uncased Yes PyTorch Hub
BERT Base Uncased Yes PyTorch Hub
BERT Large Cased Yes PyTorch Hub
BERT Large Cased Whole Word Masking Yes PyTorch Hub
BERT Large Cased Whole Word Masking SQuAD Yes PyTorch Hub
BERT Large Uncased Yes PyTorch Hub
BERT Large Uncased Whole Word Masking Yes PyTorch Hub
BERT Large Uncased Whole Word Masking SQuAD Yes PyTorch Hub
DistilBERT Base Cased Yes PyTorch Hub
DistilBERT Base Multilingual Cased Yes PyTorch Hub
DistilBERT Base Uncased Yes PyTorch Hub
DistilRoBERTa Base Yes PyTorch Hub
RoBERTa Base Yes PyTorch Hub
RoBERTa Base OpenAI Yes PyTorch Hub
RoBERTa Large Yes PyTorch Hub
RoBERTa Large OpenAI Yes PyTorch Hub
Sentence Pair Classification Models

Model Fine-tunable Source
BERT Base Cased Yes Hugging Face
BERT Base Cased Yes Tensorflow Hub
BERT Base MEDLINE/PubMed Yes Tensorflow Hub
BERT Base Multilingual Cased Yes Hugging Face
BERT Base Multilingual Cased Yes Tensorflow Hub
BERT Base Multilingual Uncased Yes Hugging Face
BERT Base Uncased Yes Hugging Face
BERT Base Uncased Yes Tensorflow Hub
BERT Base Wikipedia and BooksCorpus Yes Tensorflow Hub
BERT Large Cased Yes Hugging Face
BERT Large Cased Whole Word Masking Yes Hugging Face
BERT Large Cased Whole Word Masking Yes Tensorflow Hub
BERT Large Uncased Yes Hugging Face
BERT Large Uncased Yes Tensorflow Hub
BERT Large Uncased Whole Word Masking Yes Hugging Face
BERT Large Uncased Whole Word Masking Yes Tensorflow Hub
DistilBERT Base Cased Yes Hugging Face
DistilBERT Base Multilingual Cased Yes Hugging Face
DistilBERT Base Uncased Yes Hugging Face
DistilRoBERTa Base Yes Hugging Face
ELECTRA-Base++ Yes Tensorflow Hub
ELECTRA-Small++ Yes Tensorflow Hub
RoBERTa Base Yes Hugging Face
RoBERTa Base OpenAI Yes Hugging Face
RoBERTa Large Yes Hugging Face
RoBERTa Large OpenAI Yes Hugging Face
XLM CLM English-German Yes Hugging Face
XLM MLM 15 XNLI Languages Yes Hugging Face
XLM MLM English-German Yes Hugging Face
XLM MLM English-Romanian Yes Hugging Face
XLM MLM TLM 15 XNLI Languages Yes Hugging Face

Vision Models

Image Classification Models

Model Fine-tunable Source
AlexNet Yes PyTorch Hub
BiT-M R101x1 Yes Tensorflow Hub
BiT-M R101x1 ImageNet-21k Yes Tensorflow Hub
BiT-M R101x3 Yes Tensorflow Hub
BiT-M R101x3 ImageNet-21k Yes Tensorflow Hub
BiT-M R50x1 Yes Tensorflow Hub
BiT-M R50x1 ImageNet-21k Yes Tensorflow Hub
BiT-M R50x3 Yes Tensorflow Hub
BiT-M R50x3 ImageNet-21k Yes Tensorflow Hub
BiT-S R101x1 Yes Tensorflow Hub
BiT-S R101x3 Yes Tensorflow Hub
BiT-S R50x1 Yes Tensorflow Hub
BiT-S R50x3 Yes Tensorflow Hub
DenseNet 121 Yes PyTorch Hub
DenseNet 161 Yes PyTorch Hub
DenseNet 169 Yes PyTorch Hub
DenseNet 201 Yes PyTorch Hub
EfficientNet B0 Yes Tensorflow Hub
EfficientNet B0 Lite Yes Tensorflow Hub
EfficientNet B1 Yes Tensorflow Hub
EfficientNet B1 Lite Yes Tensorflow Hub
EfficientNet B2 Yes Tensorflow Hub
EfficientNet B2 Lite Yes Tensorflow Hub
EfficientNet B3 Yes Tensorflow Hub
EfficientNet B3 Lite Yes Tensorflow Hub
EfficientNet B4 Yes Tensorflow Hub
EfficientNet B4 Lite Yes Tensorflow Hub
EfficientNet B5 Yes Tensorflow Hub
EfficientNet B6 Yes Tensorflow Hub
EfficientNet B7 Yes Tensorflow Hub
GoogLeNet Yes PyTorch Hub
Inception ResNet V2 Yes Tensorflow Hub
Inception V1 Yes Tensorflow Hub
Inception V2 Yes Tensorflow Hub
Inception V3 Yes Tensorflow Hub
Inception V3 Preview Yes Tensorflow Hub
MobileNet V1 0.25 128 Yes Tensorflow Hub
MobileNet V1 0.25 160 Yes Tensorflow Hub
MobileNet V1 0.25 192 Yes Tensorflow Hub
MobileNet V1 0.25 224 Yes Tensorflow Hub
MobileNet V1 0.50 128 Yes Tensorflow Hub
MobileNet V1 0.50 160 Yes Tensorflow Hub
MobileNet V1 0.50 192 Yes Tensorflow Hub
MobileNet V1 0.50 224 Yes Tensorflow Hub
MobileNet V1 0.75 128 Yes Tensorflow Hub
MobileNet V1 0.75 160 Yes Tensorflow Hub
MobileNet V1 0.75 192 Yes Tensorflow Hub
MobileNet V1 0.75 224 Yes Tensorflow Hub
MobileNet V1 1.00 128 Yes Tensorflow Hub
MobileNet V1 1.00 160 Yes Tensorflow Hub
MobileNet V1 1.00 192 Yes Tensorflow Hub
MobileNet V1 1.00 224 Yes Tensorflow Hub
MobileNet V2 Yes Tensorflow Hub
MobileNet V2 Yes PyTorch Hub
MobileNet V2 0.35 224 Yes Tensorflow Hub
MobileNet V2 0.50 224 Yes Tensorflow Hub
MobileNet V2 0.75 224 Yes Tensorflow Hub
MobileNet V2 1.00 224 Yes Tensorflow Hub
MobileNet V2 1.30 224 Yes Tensorflow Hub
MobileNet V2 1.40 224 Yes Tensorflow Hub
ResNet 101 Yes PyTorch Hub
ResNet 152 Yes PyTorch Hub
ResNet 18 Yes PyTorch Hub
ResNet 34 Yes PyTorch Hub
ResNet 50 Yes Tensorflow Hub
ResNet 50 Yes PyTorch Hub
ResNet V1 101 Yes Tensorflow Hub
ResNet V1 152 Yes Tensorflow Hub
ResNet V1 50 Yes Tensorflow Hub
ResNet V2 101 Yes Tensorflow Hub
ResNet V2 152 Yes Tensorflow Hub
ResNet V2 50 Yes Tensorflow Hub
ResNeXt 101 Yes PyTorch Hub
ResNeXt 50 Yes PyTorch Hub
ShuffleNet V2 Yes PyTorch Hub
SqueezeNet 0 Yes PyTorch Hub
SqueezeNet 1 Yes PyTorch Hub
VGG 11 Yes PyTorch Hub
VGG 11-BN Yes PyTorch Hub
VGG 13 Yes PyTorch Hub
VGG 13-BN Yes PyTorch Hub
VGG 16 Yes PyTorch Hub
VGG 16-BN Yes PyTorch Hub
VGG 19 Yes PyTorch Hub
VGG 19-BN Yes PyTorch Hub
Wide ResNet 101 Yes PyTorch Hub
Wide ResNet 50 Yes PyTorch Hub
Image Embedding Models

Model Fine-tunable Source
BiT-M R101x1 Feature Vector No Tensorflow Hub
BiT-M R101x3 ImageNet-21k Feature Vector No Tensorflow Hub
BiT-M R50x1 Feature Vector No Tensorflow Hub
BiT-M R50x3 ImageNet-21k Feature Vector No Tensorflow Hub
BiT-S R101x1 Feature Vector No Tensorflow Hub
BiT-S R101x3 Feature Vector No Tensorflow Hub
BiT-S R50x1 Feature Vector No Tensorflow Hub
BiT-S R50x3 Feature Vector No Tensorflow Hub
EfficientNet B0 Feature Vector No Tensorflow Hub
EfficientNet B0 Lite Feature Vector No Tensorflow Hub
EfficientNet B1 Feature Vector No Tensorflow Hub
EfficientNet B1 Lite Feature Vector No Tensorflow Hub
EfficientNet B2 Feature Vector No Tensorflow Hub
EfficientNet B2 Lite Feature Vector No Tensorflow Hub
EfficientNet B3 Feature Vector No Tensorflow Hub
EfficientNet B3 Lite Feature Vector No Tensorflow Hub
EfficientNet B4 Lite Feature Vector No Tensorflow Hub
EfficientNet B6 Feature Vector No Tensorflow Hub
Inception V1 Feature Vector No Tensorflow Hub
Inception V2 Feature Vector No Tensorflow Hub
Inception V3 Feature Vector No Tensorflow Hub
Inception V3 Preview Feature Vector No Tensorflow Hub
MobileNet V1 0.25 128 Feature Vector No Tensorflow Hub
MobileNet V1 0.25 160 Feature Vector No Tensorflow Hub
MobileNet V1 0.25 192 Feature Vector No Tensorflow Hub
MobileNet V1 0.25 224 Feature Vector No Tensorflow Hub
MobileNet V1 0.50 128 Feature Vector No Tensorflow Hub
MobileNet V1 0.50 160 Feature Vector No Tensorflow Hub
MobileNet V1 0.50 192 Feature Vector No Tensorflow Hub
MobileNet V1 0.50 224 Feature Vector No Tensorflow Hub
MobileNet V1 0.75 128 Feature Vector No Tensorflow Hub
MobileNet V1 0.75 160 Feature Vector No Tensorflow Hub
MobileNet V1 0.75 192 Feature Vector No Tensorflow Hub
MobileNet V1 0.75 224 Feature Vector No Tensorflow Hub
MobileNet V1 1.00 128 Feature Vector No Tensorflow Hub
MobileNet V1 1.00 160 Feature Vector No Tensorflow Hub
MobileNet V1 1.00 192 Feature Vector No Tensorflow Hub
MobileNet V1 1.00 224 Feature Vector No Tensorflow Hub
MobileNet V2 0.35 224 Feature Vector No Tensorflow Hub
MobileNet V2 0.50 224 Feature Vector No Tensorflow Hub
MobileNet V2 0.75 224 Feature Vector No Tensorflow Hub
MobileNet V2 1.00 224 Feature Vector No Tensorflow Hub
MobileNet V2 1.30 224 Feature Vector No Tensorflow Hub
MobileNet V2 1.40 224 Feature Vector No Tensorflow Hub
MobileNet V2 Feature Vector No Tensorflow Hub
ResNet 50 Feature Vector No Tensorflow Hub
ResNet V1 101 Feature Vector No Tensorflow Hub
ResNet V1 152 Feature Vector No Tensorflow Hub
ResNet V1 50 Feature Vector No Tensorflow Hub
ResNet V2 101 Feature Vector No Tensorflow Hub
ResNet V2 152 Feature Vector No Tensorflow Hub
ResNet V2 50 Feature Vector No Tensorflow Hub
Object Detection Models

Model Fine-tunable Source
CenterNet 1024x1024 No Tensorflow Hub
CenterNet 1024x1024 Keypoints No Tensorflow Hub
CenterNet 512x512 No Tensorflow Hub
CenterNet 512x512 Keypoints No Tensorflow Hub
CenterNet ResNet-v1-101 No Tensorflow Hub
CenterNet ResNet-v1-50 No Tensorflow Hub
CenterNet ResNet-v1-50 Keypoints No Tensorflow Hub
CenterNet ResNet-v2-50 No Tensorflow Hub
CenterNet ResNet-v2-50 Keypoints No Tensorflow Hub
Faster R-CNN Resnet V2 1024x1024 No Tensorflow Hub
Faster R-CNN Resnet V2 640x640 No Tensorflow Hub
Faster R-CNN Resnet-101 V1 1024x1024 No Tensorflow Hub
Faster R-CNN Resnet-101 V1 640x640 No Tensorflow Hub
Faster R-CNN Resnet-101 V1 800x1333 No Tensorflow Hub
Faster R-CNN Resnet-152 V1 1024x1024 No Tensorflow Hub
Faster R-CNN Resnet-152 V1 800x1333 No Tensorflow Hub
Faster R-CNN Resnet-152 V1 640x640 No Tensorflow Hub
Faster R-CNN Resnet-50 V1 1024x1024 No Tensorflow Hub
Faster R-CNN Resnet-50 V1 640x640 No Tensorflow Hub
Faster R-CNN Resnet-50 V1 800x1333 No Tensorflow Hub
Faster RCNN ResNet 101 V1d No GluonCV
Faster RCNN ResNet 50 V1b No GluonCV
FRCNN MobileNet V3 large 320 FPN No PyTorch Hub
FRCNN MobileNet V3 large FPN No PyTorch Hub
FRCNN ResNet 50 FPN No PyTorch Hub
Retinanet SSD Resnet-101 1024x1024 No Tensorflow Hub
Retinanet SSD Resnet-101 640x640 No Tensorflow Hub
Retinanet SSD Resnet-152 1024x1024 No Tensorflow Hub
Retinanet SSD Resnet-152 640x640 No Tensorflow Hub
Retinanet SSD Resnet-50 1024x1024 No Tensorflow Hub
Retinanet SSD Resnet-50 640x640 No Tensorflow Hub
SSD No PyTorch Hub
SSD 512 ResNet 50 V1 Yes GluonCV
SSD EfficientDet D0 No Tensorflow Hub
SSD EfficientDet D1 No Tensorflow Hub
SSD EfficientDet D2 No Tensorflow Hub
SSD EfficientDet D3 No Tensorflow Hub
SSD EfficientDet D4 No Tensorflow Hub
SSD EfficientDet D5 No Tensorflow Hub
SSD MobileNet 1.0 Yes GluonCV
SSD Mobilenet V1 640x640 No Tensorflow Hub
SSD Mobilenet V2 No Tensorflow Hub
SSD Mobilenet V2 320x320 No Tensorflow Hub
SSD Mobilenet V2 640x640 No Tensorflow Hub
SSD ResNet 50 V1 Yes GluonCV
SSD VGG 16 Atrous 300 Yes GluonCV
SSD VGG 16 Atrous 512 Yes GluonCV
YOLO V3 DarkNet 53 No GluonCV
YOLO V3 MobileNet 1.0 No GluonCV

Deploy a model

When you deploy a model from JumpStart, SageMaker hosts the model and deploys an endpoint that you can use for inference. JumpStart also provides an example notebook that you can use to access the model after it's deployed.
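
Although JumpStart itself is only available through Studio, the endpoint it creates is a standard SageMaker endpoint that you can call with the AWS SDK. The following is a minimal sketch using boto3; the endpoint name and payload format are assumptions, so check the model's Description section and the provided example notebook for the input format the model actually expects.

    # Minimal sketch: invoke a JumpStart-deployed endpoint with boto3.
    # The endpoint name and content type below are assumptions; check the
    # example notebook for the format this particular model expects.
    import boto3

    runtime = boto3.client("sagemaker-runtime")

    response = runtime.invoke_endpoint(
        EndpointName="jumpstart-example-endpoint",  # hypothetical endpoint name
        ContentType="application/x-text",           # many text models accept raw text
        Body="SageMaker JumpStart deploys pretrained models in a few clicks.",
    )
    print(response["Body"].read().decode("utf-8"))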

Model Deployment Configuration

After you choose a model, the Deploy Model pane opens. Choose Deployment Configuration to configure your model deployment.

The default Machine Type for deploying a model depends on the model. The machine type is the hardware that hosts your model endpoint for inference. In the following example, the ml.m5.large instance is the default for this particular BERT model.

You can also change the Endpoint Name.

Fine-Tune a Model

Fine-tuning trains a pretrained model on a new dataset without training from scratch. This process, also known as transfer learning, can produce accurate models with smaller datasets and less training time.
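
Conceptually, fine-tuning keeps the pretrained network's weights and trains a new task-specific layer (and optionally the base) on your data. The following is a sketch of the idea, using a TensorFlow Hub feature-vector model like those listed above; the Hub handle and the three-class head are illustrative assumptions, not what JumpStart runs internally.

    # Sketch of transfer learning: reuse a pretrained feature extractor and
    # train only a small classification head on the new dataset.
    import tensorflow as tf
    import tensorflow_hub as hub

    base = hub.KerasLayer(
        "https://tfhub.dev/google/imagenet/mobilenet_v2_100_224/feature_vector/4",
        trainable=False,  # freeze the pretrained weights
    )

    model = tf.keras.Sequential([
        tf.keras.layers.InputLayer(input_shape=(224, 224, 3)),
        base,                                            # pretrained extractor
        tf.keras.layers.Dense(3, activation="softmax"),  # new head for 3 hypothetical classes
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")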

Fine-Tuning Data Source

When you fine-tune a model, you can use the default dataset or choose your own data, which is located in an S3 bucket.

To browse the buckets available to you, choose Find S3 bucket. These buckets are limited by the permissions used to set up your Studio account. You can also specify an S3 URI by choosing Enter S3 bucket location.

Tip

To find out how to format the data in your bucket, choose Learn more. Also, the description section for the model has detailed information about inputs and outputs. 

For text models:

  • The bucket must have a data.csv file.

  • The first column must be the class label as an integer. For example: 1, 2, 3, 4, n.

  • The second column must be a string that contains the corresponding text, matching the type and language that the model expects (a sketch that builds a file in this layout follows this list).
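
To make the layout concrete, the following sketch writes a data.csv in the format described above and uploads it to S3. The bucket name, labels, and sentences are hypothetical placeholders; substitute your own.

    # Sketch: build a data.csv (integer label, then text) and upload it.
    # Bucket name, labels, and sentences are hypothetical placeholders.
    import csv
    import boto3

    rows = [
        (1, "The movie was wonderful."),  # column 1: integer class label
        (2, "The plot made no sense."),   # column 2: the text itself
    ]

    with open("data.csv", "w", newline="") as f:
        csv.writer(f).writerows(rows)

    boto3.client("s3").upload_file("data.csv", "my-training-bucket", "text-data/data.csv")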

For vision models:

  • The bucket must have one subdirectory for each class.

  • Each subdirectory should contain the images that belong to that class, in .jpg format (a sketch that uploads a dataset in this layout follows this list).
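
The following sketch uploads a local directory that follows this layout to S3. The directory, class names, and bucket are hypothetical placeholders.

    # Sketch: upload a class-per-subdirectory image dataset to S3.
    # Hypothetical local layout:
    #   training-data/
    #       cats/   cat001.jpg, cat002.jpg, ...
    #       dogs/   dog001.jpg, dog002.jpg, ...
    import os
    import boto3

    s3 = boto3.client("s3")
    bucket = "my-training-bucket"  # hypothetical bucket name

    for dirpath, _, filenames in os.walk("training-data"):
        for name in filenames:
            if name.lower().endswith(".jpg"):
                local_path = os.path.join(dirpath, name)
                key = local_path.replace(os.sep, "/")  # keep the class subdirectory in the key
                s3.upload_file(local_path, bucket, key)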

Note

The S3 bucket must be in the same AWS Region where you're running SageMaker Studio because SageMaker doesn't allow cross-region requests.
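
If you're unsure where a bucket lives, you can compare its Region with your session's Region before pointing JumpStart at it. A small sketch, assuming your default credentials; the bucket name is a placeholder:

    # Sketch: confirm the bucket is in the same Region as your session.
    import boto3

    bucket = "my-training-bucket"  # hypothetical bucket name
    session_region = boto3.session.Session().region_name

    # get_bucket_location returns None for buckets in us-east-1.
    location = boto3.client("s3").get_bucket_location(Bucket=bucket)["LocationConstraint"]
    bucket_region = location or "us-east-1"

    print(bucket_region == session_region)  # must be True for fine-tuning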

Fine-Tuning Deployment Configuration

The p3 instance family is recommended for fine-tuning a model because it is the fastest for deep learning training. The following chart shows the number of GPUs in each instance type. Other options are available, including p2 and g4 instance types.

Instance type GPUs
p3.2xlarge 1
p3.8xlarge 4
p3.16xlarge 8
p3dn.24xlarge 8

Hyperparameters

You can customize the hyperparameters of the training job that is used to fine-tune the model.

If you use the default dataset for text models without changing the hyperparameters, you get a nearly identical model as a result. For vision models, the default dataset is different from the dataset used to train the pretrained models, so your model is different as a result.

You have the following hyperparameter options:

  • Epochs – One epoch is one full pass through the entire dataset. The dataset is split into batches, and working through all of the batches completes one epoch. Multiple epochs are run until the accuracy of the model reaches an acceptable level, or in other words, until the error rate drops below an acceptable level.

  • Learning rate – How much the model's weights change with each update. As the model is refined, its internal weights are nudged and error rates are checked to see whether the model improves. A typical learning rate is 0.1 or 0.01; 0.01 is a much smaller adjustment and could cause the training to take a long time to converge, whereas 0.1 is much larger and can cause the training to overshoot. The learning rate is one of the primary hyperparameters that you might adjust when training your model. Note that for text models, a much smaller learning rate (5e-5 for BERT) can result in a more accurate model.

  • Batch size – The number of records from the dataset that are selected for each training step and sent to the GPUs available for training. In an image example, you might send 32 images per GPU, so 32 would be your batch size. If you choose an instance type with more than one GPU, the batch is divided across the GPUs. Suggested batch size varies depending on the data and the model that you are using; for example, how you optimize for image data differs from how you handle language data. The instance type chart in the deployment configuration section shows the number of GPUs per instance type. Start with a standard recommended batch size (for example, 32 for a vision model), and then multiply it by the number of GPUs in the instance type that you selected. For example, if you're using a p3.8xlarge, this would be 32 (batch size) × 4 (GPUs), for a total batch size of 128 adjusted for the number of GPUs (see the sketch after this list). For a text model like BERT, try starting with a batch size of 64, and then reduce it as needed.
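
The batch size arithmetic above is easy to check. The following sketch scales a per-GPU batch size by the GPU counts from the instance table in the deployment configuration section:

    # Sketch: scale a per-GPU batch size by the number of GPUs per instance.
    # GPU counts come from the instance table earlier in this topic.
    GPUS_PER_INSTANCE = {
        "p3.2xlarge": 1,
        "p3.8xlarge": 4,
        "p3.16xlarge": 8,
        "p3dn.24xlarge": 8,
    }

    def effective_batch_size(per_gpu_batch: int, instance_type: str) -> int:
        return per_gpu_batch * GPUS_PER_INSTANCE[instance_type]

    print(effective_batch_size(32, "p3.8xlarge"))  # 32 * 4 GPUs = 128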

Training Output

When the fine-tuning process is complete, JumpStart provides information about the model: the parent model, training job name, training job Amazon Resource Name (ARN), training time, and output path. The output path is where you can find your new model in an S3 bucket. The folder structure uses the model name that you provided, and the model file is in an /output subfolder. It's always named model.tar.gz.

Example: s3://bucket/model-name/output/model.tar.gz
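
To retrieve the fine-tuned model, download the artifact from the output path and unpack it. A sketch, assuming the example path above; the bucket and model-name segments are placeholders from that example:

    # Sketch: download and unpack the fine-tuned model artifact.
    # "bucket" and "model-name" are the placeholders from the example path.
    import tarfile
    import boto3

    boto3.client("s3").download_file(
        "bucket", "model-name/output/model.tar.gz", "model.tar.gz"
    )

    with tarfile.open("model.tar.gz") as tar:
        tar.extractall(path="model")  # model files land in ./model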