Getting started with model evaluations

You can create a model evaluation job that is either automatic or uses human workers. When you create a model evaluation job, you can define the model used, the inference parameters of the model, the type of task the model tries to perform, and the prompt data used in the job.

Model evaluation jobs support the following task types.

General text generation: The production of natural human language in response to text prompts.
Text summarization: The generation of a summary based on the provided text in your prompt.
Question and answering: The generation of a response to a question within your prompt.
Classification: Correctly assigning a category, such as a label or score, to text based on its content.
Custom: You define the metric, description, and a rating method.
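
As a rough illustration, the four things you define for a job (the model, its inference parameters, the task type, and the prompt data) can be pictured as a small record. The following Python sketch is not the Amazon Bedrock API. The names TaskType, EvaluationJobSpec, and describe, the enum string values, the model ID, and the S3 URI are all hypothetical placeholders chosen for this example.

```python
from dataclasses import dataclass, field
from enum import Enum


class TaskType(Enum):
    """The five task types listed above. The string values are hypothetical labels."""
    GENERAL_TEXT_GENERATION = "general_text_generation"
    TEXT_SUMMARIZATION = "text_summarization"
    QUESTION_AND_ANSWERING = "question_and_answering"
    CLASSIFICATION = "classification"
    CUSTOM = "custom"


@dataclass
class EvaluationJobSpec:
    """Hypothetical container for the settings a model evaluation job defines."""
    model_id: str                 # the model used in the job
    task_type: TaskType           # the type of task the model tries to perform
    prompt_dataset_uri: str       # where the prompt data lives
    inference_params: dict = field(default_factory=dict)  # inference parameters

    def describe(self) -> dict:
        """Return the spec as a plain dictionary, for logging or review."""
        return {
            "model": self.model_id,
            "inferenceParameters": dict(self.inference_params),
            "taskType": self.task_type.value,
            "promptDataset": self.prompt_dataset_uri,
        }


# Placeholder values only; substitute your own model and dataset location.
spec = EvaluationJobSpec(
    model_id="example-model-id",
    task_type=TaskType.TEXT_SUMMARIZATION,
    prompt_dataset_uri="s3://example-bucket/prompts.jsonl",
    inference_params={"temperature": 0.2},
)
print(spec.describe())
```

Writing the settings down this way before opening the console can help confirm that each of the four pieces has been decided. The actual field names used by the console or the API may differ.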

To create a model evaluation job, you must have access to Amazon Bedrock foundation models, which are the models that evaluation jobs support. To learn more about model access, see Model access.

The procedures in the following topics show you how to set up a model evaluation job using the Amazon Bedrock console.

To create a model evaluation job with the help of an AWS-managed team, choose Create AWS managed evaluation from the AWS Management Console. Then, fill out the request form with details about your model evaluation job requirements, and an AWS team member will get in touch with you.
