Import a customized model to Amazon Bedrock
Custom Model Import is in preview release for Amazon Bedrock and is subject to change.
You can create a custom model in Amazon Bedrock by using the Custom Model Import feature to import foundation models that you have customized in other environments, such as Amazon SageMaker. For example, you might have a model that you created in Amazon SageMaker that has proprietary model weights. You can import that model into Amazon Bedrock and then use Amazon Bedrock features to make inference calls to it.
You can use a model that you import with On-Demand throughput. Use the InvokeModel or InvokeModelWithResponseStream operations to make inference calls to the model. For more information, see Submit a single prompt with the InvokeModel API operations.
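For example, a minimal boto3 sketch such as the following calls InvokeModel against an imported model. The model ARN is a placeholder, and the request body schema depends on the model that you imported; this sketch assumes a Llama-style prompt format.

```python
# Sketch: invoking an imported model with the InvokeModel operation.
# The model ARN is a placeholder, and the body schema depends on your
# model (a Llama-style prompt format is assumed here).
import json

import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.invoke_model(
    modelId="arn:aws:bedrock:us-east-1:111122223333:imported-model/abcd1234",  # placeholder
    body=json.dumps({"prompt": "What is Amazon Bedrock?", "max_gen_len": 512}),
    contentType="application/json",
    accept="application/json",
)
print(json.loads(response["body"].read()))
```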
Note
For the preview release, Custom Model Import is available in the US East (N. Virginia) and US West (Oregon) AWS Regions only. You can't use Custom Model Import with the following Amazon Bedrock features.
- Amazon Bedrock Agents
- Amazon Bedrock Knowledge Bases
- Amazon Bedrock Guardrails
- Batch inference
- AWS CloudFormation
Before you can use Custom Model Import, you must first request a quota increase for the Imported models per account quota. For more information, see Requesting a quota increase.
With Custom Model Import, you can create a custom model that supports the following patterns.
- Fine-tuned or Continued Pre-training model — You can customize the model weights using proprietary data, but retain the configuration of the base model.
- Adaptation — You can customize the model to your domain for use cases where the model doesn't generalize well. Domain adaptation modifies a model to generalize for a target domain and deal with discrepancies across domains, such as the financial industry wanting to create a model that generalizes well on pricing. Another example is language adaptation. For example, you could customize a model to generate responses in Portuguese or Tamil. Most often, this involves changes to the vocabulary of the model that you are using.
- Pretrained from scratch — In addition to customizing the weights and vocabulary of the model, you can also change model configuration parameters such as the number of attention heads, hidden layers, or context length.
Supported architectures
The model that you import must use one of the following architectures.
- Mistral — A decoder-only Transformer-based architecture with Sliding Window Attention (SWA) and options for Grouped Query Attention (GQA). For more information, see Mistral in the Hugging Face documentation.
- Flan — An enhanced version of the T5 architecture, an encoder-decoder Transformer-based model. For more information, see Flan T5 in the Hugging Face documentation.
- Llama 2 and Llama 3 — An improved version of Llama with Grouped Query Attention (GQA). For more information, see Llama 2 and Llama 3 in the Hugging Face documentation.
Import source
You import a model into Amazon Bedrock by creating a model import job in the Amazon Bedrock console. In the job, you specify the Amazon S3 URI for the source of the model files. Alternatively, if you created the model in Amazon SageMaker, you can specify the SageMaker model. During the import job, Amazon Bedrock automatically detects your model's architecture.
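You can also create the import job programmatically. The following is a minimal sketch that uses the CreateModelImportJob operation through boto3; the job name, model name, role ARN, and S3 URI are placeholders that you would replace with your own values.

```python
# Sketch: creating a model import job with boto3. All names and ARNs
# below are placeholders.
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

job = bedrock.create_model_import_job(
    jobName="my-import-job",                # placeholder job name
    importedModelName="my-imported-model",  # placeholder model name
    roleArn="arn:aws:iam::111122223333:role/MyModelImportRole",  # service role for model import
    modelDataSource={
        "s3DataSource": {
            # S3 prefix that contains the Hugging Face-format model files
            "s3Uri": "s3://amzn-s3-demo-bucket/model-files/"
        }
    },
)
print(job["jobArn"])
```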
If you import from an Amazon S3 bucket, you need to supply the model files in the Hugging Face weights format. You can create the files by using the Hugging Face Transformers library. To create model files for a Llama model, see convert_llama_weights_to_hf.py.
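As a sketch of what the Hugging Face weights format looks like in practice, saving a model with the Transformers library and safe serialization produces the files listed below. The local paths are placeholders.

```python
# Sketch: producing Hugging Face-format model files (.safetensors weights,
# config.json, and tokenizer files) with the Transformers library.
# Both paths are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("/path/to/customized-model")
tokenizer = AutoTokenizer.from_pretrained("/path/to/customized-model")

# safe_serialization=True writes the weights in Safetensors format.
model.save_pretrained("/path/to/output", safe_serialization=True)
tokenizer.save_pretrained("/path/to/output")
```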
To import the model from Amazon S3, you minimally need the following files that the Hugging Face Transformers library creates.
- .safetensors — the model weights in Safetensors format. Safetensors is a format created by Hugging Face that stores model weights as tensors. You must store the tensors for your model in a file with the extension .safetensors. For more information, see Safetensors. For information about converting model weights to Safetensors format, see Convert weights to safetensors.
Note
Currently, Amazon Bedrock only supports model weights with FP32, FP16, and BF16 precision. Amazon Bedrock rejects model weights that you supply with any other precision. Internally, Amazon Bedrock converts FP32 models to BF16 precision. Amazon Bedrock doesn't support the import of quantized models. To check the precision of your weights before you upload them, see the sketch after this list.
- config.json — For examples, see LlamaConfig and MistralConfig.
- tokenizer_config.json — For an example, see LlamaTokenizer.
- tokenizer.json
- tokenizer.model
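A minimal sketch that checks the precision of each tensor, assuming a local model.safetensors file (the path is a placeholder):

```python
# Sketch: check that every tensor in a Safetensors file uses a precision
# that Amazon Bedrock accepts. The file path is a placeholder.
import torch
from safetensors import safe_open

SUPPORTED_DTYPES = {torch.float32, torch.float16, torch.bfloat16}

with safe_open("/path/to/output/model.safetensors", framework="pt") as f:
    for name in f.keys():
        dtype = f.get_tensor(name).dtype
        if dtype not in SUPPORTED_DTYPES:
            print(f"Unsupported precision {dtype} for tensor {name}")
```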
Importing a model
The following procedure shows you how to create a custom model by importing a model that you have already customized. The model import job can take several minutes. During the import job, Amazon Bedrock validates that the model uses a compatible model architecture.
To submit a model import job, carry out the following steps.
- Request a quota increase for the Imported models per account quota. For more information, see Requesting a quota increase.
- If you are importing your model files from Amazon S3, convert the model to the Hugging Face weights format.
  - If your model is a Mistral AI model, use convert_mistral_weights_to_hf.py.
  - If your model is a Llama model, use convert_llama_weights_to_hf.py.
- Upload the model files to an Amazon S3 bucket in your AWS account. For more information, see Upload an object to your bucket.
- Sign in to the AWS Management Console using an IAM role with Amazon Bedrock permissions, and open the Amazon Bedrock console at https://console.aws.amazon.com/bedrock/.
- Choose Imported models under Foundation models from the left navigation pane.
- On the Models tab, choose Import model to open the Import model page.
- In the Model details section, do the following:
  - In Model name, enter a name for the model.
  - (Optional) To associate tags with the model, expand the Tags section and select Add new tag.
- In the Import job name section, do the following:
  - In Job name, enter a name for the model import job.
  - (Optional) To associate tags with the custom model, expand the Tags section and select Add new tag.
- In Model import settings, select the import options that you want to use.
  - If you are importing your model files from an Amazon S3 bucket, choose Amazon S3 bucket and enter the Amazon S3 location in S3 location. Optionally, you can choose Browse S3 to choose the file location.
  - If you are importing your model from Amazon SageMaker, choose Amazon SageMaker model and then choose the SageMaker model that you want to import in SageMaker models.
- In the Service access section, select one of the following:
  - Create and use a new service role – Enter a name for the service role.
  - Use an existing service role – Select a service role from the drop-down list. To see the permissions that your existing service role needs, choose View permission details.
  For more information on setting up a service role with the appropriate permissions, see Create a service role for model import.
- Choose Import.
- On the Custom models page, choose Imported.
- In the Jobs section, check the status of the import job. The model name that you chose identifies the model import job. The job is complete when the value of Status is Complete.
- Get the model ID for your model by doing the following:
  - On the Imported models page, choose the Models tab.
  - Copy the ARN for the model that you want to use from the ARN column.
- Use your model for inference calls. For more information, see Submit a single prompt with the InvokeModel API operations. You can use the model with On-Demand throughput.
You can also use your model in the Amazon Bedrock text playground.
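As a programmatic alternative to checking the console, a sketch like the following polls the import job and then lists the ARNs of your imported models through boto3. The job ARN is a placeholder, and the response fields come from the GetModelImportJob and ListImportedModels operations.

```python
# Sketch: checking import job status and finding the imported model's ARN.
# The job ARN is a placeholder.
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

job = bedrock.get_model_import_job(
    jobIdentifier="arn:aws:bedrock:us-east-1:111122223333:model-import-job/abcd1234"
)
print(job["status"])  # for example: InProgress, Completed, or Failed

# List imported models and their ARNs for use with InvokeModel.
for summary in bedrock.list_imported_models()["modelSummaries"]:
    print(summary["modelName"], summary["modelArn"])
```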