Class: Aws::Bedrock::Types::EvaluationDatasetMetricConfig
- Inherits: Struct
- Defined in: gems/aws-sdk-bedrock/lib/aws-sdk-bedrock/types.rb
Overview
Defines the prompt datasets, built-in metric names and custom metric names, and the task type.
Constant Summary
- SENSITIVE = [:metric_names]
Instance Attribute Summary
-
#dataset ⇒ Types::EvaluationDataset
Specifies the prompt dataset.
-
#metric_names ⇒ Array<String>
The names of the metrics you want to use for your evaluation job.
-
#task_type ⇒ String
The type of task you want to evaluate for your evaluation job.
Instance Attribute Details
#dataset ⇒ Types::EvaluationDataset
Specifies the prompt dataset.
# File 'gems/aws-sdk-bedrock/lib/aws-sdk-bedrock/types.rb', line 1446

class EvaluationDatasetMetricConfig < Struct.new(
  :task_type,
  :dataset,
  :metric_names)
  SENSITIVE = [:metric_names]
  include Aws::Structure
end
#metric_names ⇒ Array<String>
The names of the metrics you want to use for your evaluation job.

For knowledge base evaluation jobs that evaluate retrieval only, valid values are "Builtin.ContextRelevance" and "Builtin.ContextConverage".

For knowledge base evaluation jobs that evaluate retrieval with response generation, valid values are "Builtin.Correctness", "Builtin.Completeness", "Builtin.Helpfulness", "Builtin.LogicalCoherence", "Builtin.Faithfulness", "Builtin.Harmfulness", "Builtin.Stereotyping", and "Builtin.Refusal".

For automated model evaluation jobs, valid values are "Builtin.Accuracy", "Builtin.Robustness", and "Builtin.Toxicity". In model evaluation jobs that use an LLM as judge, you can specify "Builtin.Correctness", "Builtin.Completeness", "Builtin.Faithfulness", "Builtin.Helpfulness", "Builtin.Coherence", "Builtin.Relevance", "Builtin.FollowingInstructions", and "Builtin.ProfessionalStyleAndTone". You can also specify the following responsible AI related metrics, only for model evaluation jobs that use an LLM as judge: "Builtin.Harmfulness", "Builtin.Stereotyping", and "Builtin.Refusal".

For human-based model evaluation jobs, the list of strings must match the name parameter specified in HumanEvaluationCustomMetric.
# File 'gems/aws-sdk-bedrock/lib/aws-sdk-bedrock/types.rb', line 1446

class EvaluationDatasetMetricConfig < Struct.new(
  :task_type,
  :dataset,
  :metric_names)
  SENSITIVE = [:metric_names]
  include Aws::Structure
end
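As a minimal sketch (not from the SDK itself), the valid-value groupings above can be collected into constants and combined when building a metric_names array; the constant names below are illustrative, not part of the SDK:

```ruby
# Hypothetical helper constants mirroring the valid values documented above.
RETRIEVAL_ONLY_METRICS = [
  "Builtin.ContextRelevance", "Builtin.ContextConverage"
].freeze

LLM_AS_JUDGE_METRICS = [
  "Builtin.Correctness", "Builtin.Completeness", "Builtin.Faithfulness",
  "Builtin.Helpfulness", "Builtin.Coherence", "Builtin.Relevance",
  "Builtin.FollowingInstructions", "Builtin.ProfessionalStyleAndTone"
].freeze

# Responsible AI metrics, available only for LLM-as-judge model evaluation jobs.
RESPONSIBLE_AI_METRICS = [
  "Builtin.Harmfulness", "Builtin.Stereotyping", "Builtin.Refusal"
].freeze

# metric_names for an LLM-as-judge job that also checks responsible AI metrics.
metric_names = LLM_AS_JUDGE_METRICS + RESPONSIBLE_AI_METRICS
```

Keeping the groups separate makes it harder to accidentally request a responsible AI metric on a job type that does not support it.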
#task_type ⇒ String
The type of task you want to evaluate for your evaluation job. This applies only to model evaluation jobs and is ignored for knowledge base evaluation jobs.
# File 'gems/aws-sdk-bedrock/lib/aws-sdk-bedrock/types.rb', line 1446

class EvaluationDatasetMetricConfig < Struct.new(
  :task_type,
  :dataset,
  :metric_names)
  SENSITIVE = [:metric_names]
  include Aws::Structure
end