Text Classification - TensorFlow - Amazon SageMaker

Text Classification - TensorFlow

The Amazon SageMaker Text Classification - TensorFlow algorithm is a supervised learning algorithm that supports transfer learning with many pretrained models from the TensorFlow Hub. Use transfer learning to fine-tune one of the available pretrained models on your own dataset, even if a large amount of text data is not available. The text classification algorithm takes a text string as input and outputs a probability for each of the class labels. Training datasets must be in CSV format. This page includes information about Amazon EC2 instance recommendations and sample notebooks for Text Classification - TensorFlow.

Amazon EC2 instance recommendation for the Text Classification - TensorFlow algorithm

The Text Classification - TensorFlow algorithm supports all CPU and GPU instances for training, including:

  • ml.p2.xlarge

  • ml.p2.16xlarge

  • ml.p3.2xlarge

  • ml.p3.16xlarge

  • ml.g4dn.xlarge

  • ml.g4dn.16.xlarge

  • ml.g5.xlarge

  • ml.g5.48xlarge

We recommend GPU instances with more memory for training with large batch sizes. Both CPU (such as M5) and GPU (P2, P3, G4dn, or G5) instances can be used for inference. For a comprehensive list of SageMaker training and inference instances across AWS Regions, see Amazon SageMaker Pricing.

Text Classification - TensorFlow sample notebooks

For more information about how to use the SageMaker Text Classification - TensorFlow algorithm for transfer learning on a custom dataset, see the Introduction to JumpStart - Text Classification notebook.

For instructions how to create and access Jupyter notebook instances that you can use to run the example in SageMaker, see Amazon SageMaker Notebook Instances. After you have created a notebook instance and opened it, select the SageMaker Examples tab to see a list of all the SageMaker samples. To open a notebook, choose its Use tab and choose Create copy.