Amazon SageMaker JumpStart Industry: Financial
Use SageMaker JumpStart Industry: Financial solutions, models, and example notebooks to learn about SageMaker features and capabilities through curated one-step solutions and example notebooks for industry-focused machine learning (ML) problems. The notebooks also walk through how to use the SageMaker JumpStart Industry Python SDK to enhance industry text data and fine-tune pretrained models.
Topics
- Amazon SageMaker JumpStart Industry Python SDK
- Amazon SageMaker JumpStart Industry: Financial Solution
- Amazon SageMaker JumpStart Industry: Financial Models
- Amazon SageMaker JumpStart Industry: Financial Example Notebooks
- Amazon SageMaker JumpStart Industry: Financial Blog Posts
- Amazon SageMaker JumpStart Industry: Financial Related Research
- Amazon SageMaker JumpStart Industry: Financial Additional Resources
Amazon SageMaker JumpStart Industry Python SDK
SageMaker JumpStart provides processing tools for curating industry datasets and fine-tuning pretrained models through its client library, the SageMaker JumpStart Industry Python SDK. For detailed API documentation of the SDK, and to learn more about processing and enhancing industry text datasets to improve the performance of state-of-the-art models on SageMaker JumpStart, see the SageMaker JumpStart Industry Python SDK open source documentation.
Amazon SageMaker JumpStart Industry: Financial Solution
SageMaker JumpStart Industry: Financial provides the following solution notebooks:
- Corporate Credit Rating Prediction – This SageMaker JumpStart Industry: Financial solution provides a template for a text-enhanced corporate credit rating model. It shows how to combine a model based on numeric features (in this case, Altman's five financial ratios) with texts from SEC filings to improve the prediction of credit ratings. In addition to the five Altman ratios, you can add more variables as needed or set custom variables. This solution notebook shows how the SageMaker JumpStart Industry Python SDK helps compute natural language processing (NLP) scores for texts from SEC filings. The solution also demonstrates how to train a model on the enhanced dataset, deploy the model to a SageMaker endpoint for production, and receive improved predictions in real time.
- Graph-Based Credit Scoring – Credit ratings are traditionally generated using models that use financial statement data and market data, which is tabular only (numeric and categorical). This solution constructs a network of firms using SEC filings.
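As a rough illustration of the numeric side of the Corporate Credit Rating Prediction solution, the following sketch computes the five Altman ratios (working capital, retained earnings, EBIT, and sales over total assets, plus market value of equity over total liabilities) and appends text-derived scores to form one feature row. It is not the solution notebook's code: the `Financials` container, the feature names, and the `text_scores` input are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Financials:
    """Hypothetical container for the statement items the Altman ratios need."""
    working_capital: float
    retained_earnings: float
    ebit: float
    market_value_equity: float
    sales: float
    total_assets: float
    total_liabilities: float


def altman_ratios(fin: Financials) -> Dict[str, float]:
    """Return the five Altman ratios as a feature dictionary."""
    if fin.total_assets <= 0 or fin.total_liabilities <= 0:
        raise ValueError("total_assets and total_liabilities must be positive")
    return {
        "wc_to_ta": fin.working_capital / fin.total_assets,
        "re_to_ta": fin.retained_earnings / fin.total_assets,
        "ebit_to_ta": fin.ebit / fin.total_assets,
        "mve_to_tl": fin.market_value_equity / fin.total_liabilities,
        "sales_to_ta": fin.sales / fin.total_assets,
    }


def build_feature_row(fin: Financials, text_scores: Dict[str, float]) -> Dict[str, float]:
    """Combine the numeric ratios with text-derived scores into one feature row."""
    row = altman_ratios(fin)
    # Prefix text features so they cannot collide with the ratio names.
    row.update({"text_" + name: score for name, score in text_scores.items()})
    return row
```

A row built this way can be passed to any tabular classifier. Extra or custom variables fit the same pattern as additional dictionary keys.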
Note
The solution notebooks are for demonstration purposes only. They should not be relied on as financial or investment advice.
You can find these financial services solutions through the SageMaker JumpStart page in Studio Classic.
Important
As of November 30, 2023, the previous Amazon SageMaker Studio experience is now named Amazon SageMaker Studio Classic. The following section is specific to using the Studio Classic application. For information about using the updated Studio experience, see Amazon SageMaker Studio.
Note
The SageMaker JumpStart Industry: Financial solutions, model cards, and example
notebooks are hosted and runnable only through SageMaker Studio Classic. Log in to the SageMaker console
Amazon SageMaker JumpStart Industry: Financial Models
SageMaker JumpStart Industry: Financial provides the following pretrained Robustly Optimized BERT approach (RoBERTa) models:
- Financial Text Embedding (RoBERTa-SEC-Base)
- RoBERTa-SEC-WIKI-Base
- RoBERTa-SEC-Large
- RoBERTa-SEC-WIKI-Large
The RoBERTa-SEC-Base and RoBERTa-SEC-Large models are text embedding models based on GluonNLP's RoBERTa model.
You can find these models in SageMaker JumpStart by navigating to the Text Models node, choosing Explore All Text Models, and then filtering for the ML Task Text Embedding. You can access any corresponding notebooks after selecting the model of your choice. The paired notebooks will walk you through how the pretrained models can be fine-tuned for specific classification tasks on multimodal datasets, which are enhanced by the SageMaker JumpStart Industry Python SDK.
Note
The model notebooks are for demonstration purposes only. They should not be relied on as financial or investment advice.
The following screenshot shows the pretrained model cards provided through the SageMaker JumpStart page on Studio Classic.
Amazon SageMaker JumpStart Industry: Financial Example Notebooks
SageMaker JumpStart Industry: Financial provides the following example notebooks to demonstrate solutions to industry-focused ML problems:
- Financial TabText Data Construction – This example introduces how to use the SageMaker JumpStart Industry Python SDK to process SEC filings, for example by summarizing text and scoring texts based on NLP score types and their corresponding word lists. To preview the content of this notebook, see Simple Construction of a Multimodal Dataset from SEC Filings and NLP Scores.
- Multimodal ML on TabText Data – This example shows how to merge different types of datasets into a single dataframe called TabText and perform multimodal ML. To preview the content of this notebook, see Machine Learning on a TabText Dataframe – An Example Based on the Paycheck Protection Program.
- Multi-category ML on SEC filings data – This example shows how to train an AutoGluon NLP model over the multimodal (TabText) datasets curated from SEC filings for a multiclass classification task. To preview the content of this notebook, see Classify SEC 10K/Q Filings to Industry Codes Based on the MDNA Text Column.
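To make the idea of scoring texts against word lists concrete, here is a minimal sketch that scores a passage as the fraction of its tokens found in a word list. This is an assumption about what such a score could look like, not the SageMaker JumpStart Industry Python SDK's implementation or API. The function name `word_list_score`, the tokenization rule, and the `RISK_WORDS` list are all hypothetical.

```python
import re
from typing import Iterable


def word_list_score(text: str, word_list: Iterable[str]) -> float:
    """Fraction of tokens in `text` that appear in `word_list` (case-insensitive)."""
    vocabulary = {word.lower() for word in word_list}
    # Simple tokenization: runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in vocabulary)
    return hits / len(tokens)


# Hypothetical word list for a "risk" score type.
RISK_WORDS = ["risk", "uncertainty", "litigation", "default"]

sample = "Litigation risk and market uncertainty may affect results."
print(word_list_score(sample, RISK_WORDS))
```

Each score type would use its own word list, so a single filing yields one numeric column per score type.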
Note
The example notebooks are for demonstration purposes only. They should not be relied on as financial or investment advice.
To preview the content of the example notebooks, see Tutorials – Finance
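The TabText idea from the Multimodal ML on TabText Data example, merging tabular and text data into one table keyed on a shared identifier, can be sketched in plain Python as follows. This is only an illustration under assumed names: `build_tabtext`, the `ticker` key, and the `revenue` and `mdna` columns are hypothetical, and the example notebooks use dataframes rather than lists of dictionaries.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def build_tabtext(tabular_rows: List[Row], text_rows: List[Row], key: str = "ticker") -> List[Row]:
    """Inner-join tabular rows with text rows on `key`, one merged row per match."""
    text_by_key = {row[key]: row for row in text_rows}
    merged: List[Row] = []
    for row in tabular_rows:
        match = text_by_key.get(row[key])
        if match is None:
            continue  # Drop rows that have no text counterpart (inner join).
        text_columns = {name: value for name, value in match.items() if name != key}
        merged.append({**row, **text_columns})
    return merged


# Hypothetical inputs: numeric features and a text column for each firm.
tabular = [
    {"ticker": "AAA", "revenue": 120.0},
    {"ticker": "BBB", "revenue": 75.5},
]
texts = [
    {"ticker": "AAA", "mdna": "Revenue grew on strong demand."},
]

tabtext = build_tabtext(tabular, texts)
```

Each merged row then carries both numeric columns and text columns, which is the shape a multimodal model expects.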
Amazon SageMaker JumpStart Industry: Financial Blog Posts
For in-depth applications of SageMaker JumpStart Industry: Financial solutions, models, examples, and the SDK, see the following blog posts:
- Use pre-trained financial language models for transfer learning in Amazon SageMaker JumpStart
- Use SEC text for ratings classification using multimodal ML in Amazon SageMaker JumpStart
- Create a dashboard with SEC text for financial NLP in Amazon SageMaker JumpStart
- Domain-adaptation Fine-tuning of Foundation Models in Amazon SageMaker JumpStart on Financial data
Amazon SageMaker JumpStart Industry: Financial Related Research
For research related to SageMaker JumpStart Industry: Financial solutions, see the following papers:
Amazon SageMaker JumpStart Industry: Financial Additional Resources
For additional documentation and tutorials, see the following resources: