HumanEvaluationConfig

Specifies the custom metrics, how tasks will be rated, the flow definition ARN, and your custom prompt datasets. Model evaluation jobs that use human workers only support the use of custom prompt datasets. To learn more about custom prompt datasets and the required format, see Custom prompt datasets.

When you create custom metrics in HumanEvaluationCustomMetric, you must specify the metric's name. The list of names specified in the HumanEvaluationCustomMetric array must match the metricNames array of strings specified in EvaluationDatasetMetricConfig. For example, if in the HumanEvaluationCustomMetric array you specified the names "accuracy", "toxicity", and "readability" as custom metrics, then the metricNames array in EvaluationDatasetMetricConfig would need to be ["accuracy", "toxicity", "readability"].
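To illustrate the alignment, the following is a minimal sketch of the two arrays written as Python request syntax (as you might pass to an AWS SDK such as boto3). The task type, rating method, dataset name, and S3 URI are illustrative assumptions, not values taken from this page.

# Minimal sketch (Python, boto3-style request syntax). The ratingMethod,
# taskType, dataset name, and S3 URI below are illustrative assumptions.
custom_metrics = [
    {"name": "accuracy", "description": "Is the response factually correct?", "ratingMethod": "ThumbsUpDown"},
    {"name": "toxicity", "description": "Is the response harmful?", "ratingMethod": "ThumbsUpDown"},
    {"name": "readability", "description": "Is the response easy to read?", "ratingMethod": "ThumbsUpDown"},
]

dataset_metric_configs = [
    {
        "taskType": "Summarization",
        "dataset": {
            "name": "my-prompt-dataset",
            "datasetLocation": {"s3Uri": "s3://amzn-s3-demo-bucket/prompts.jsonl"},
        },
        # metricNames must list exactly the names declared in custom_metrics.
        "metricNames": ["accuracy", "toxicity", "readability"],
    }
]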

Contents

datasetMetricConfigs

Use to specify the metrics, task, and prompt dataset to be used in your model evaluation job.

Type: Array of EvaluationDatasetMetricConfig objects

Array Members: Minimum number of 1 item. Maximum number of 5 items.

Required: Yes

customMetrics

An array of HumanEvaluationCustomMetric objects. Each object contains the name of the metric, how the metric is to be evaluated, and an optional description.

Type: Array of HumanEvaluationCustomMetric objects

Array Members: Minimum number of 1 item. Maximum number of 10 items.

Required: No

humanWorkflowConfig

The parameters of the human workflow.

Type: HumanWorkflowConfig object

Required: No
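As a rough end-to-end sketch, the following shows how the three members above might be assembled into a HumanEvaluationConfig and supplied to the CreateEvaluationJob operation through boto3. The job name, role ARN, flow definition ARN, model identifier, and S3 URIs are placeholder assumptions, not values from this page.

import boto3

bedrock = boto3.client("bedrock")

# Minimal sketch: all ARNs, names, and S3 URIs are placeholders.
response = bedrock.create_evaluation_job(
    jobName="my-human-eval-job",
    roleArn="arn:aws:iam::111122223333:role/BedrockEvalRole",
    evaluationConfig={
        "human": {  # this object is the HumanEvaluationConfig
            "customMetrics": [
                {"name": "accuracy", "ratingMethod": "ThumbsUpDown"},
            ],
            "datasetMetricConfigs": [
                {
                    "taskType": "Summarization",
                    "dataset": {
                        "name": "my-prompt-dataset",
                        "datasetLocation": {"s3Uri": "s3://amzn-s3-demo-bucket/prompts.jsonl"},
                    },
                    # Must match the names declared in customMetrics.
                    "metricNames": ["accuracy"],
                }
            ],
            "humanWorkflowConfig": {
                "flowDefinitionArn": "arn:aws:sagemaker:us-east-1:111122223333:flow-definition/my-flow",
                "instructions": "Rate each response against the listed metrics.",
            },
        }
    },
    inferenceConfig={
        "models": [{"bedrockModel": {"modelIdentifier": "anthropic.claude-v2"}}]
    },
    outputDataConfig={"s3Uri": "s3://amzn-s3-demo-bucket/eval-output/"},
)
print(response["jobArn"])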

See Also

For more information about using this API in one of the language-specific AWS SDKs, see the following: