EvaluatorProps
- class aws_cdk.aws_bedrock_agentcore_alpha.EvaluatorProps(*, evaluator_config, evaluator_name, level, description=None)
Bases: object

(experimental) Properties for creating an Evaluator.
- Parameters:
  - evaluator_config (EvaluatorConfig) – (experimental) The configuration that defines how the evaluator assesses agent performance. Use EvaluatorConfig.llmAsAJudge() for model-based evaluation or EvaluatorConfig.codeBased() for Lambda-based evaluation.
  - evaluator_name (str) – (experimental) The name of the evaluator. Must be unique within your account. Valid characters are a-z, A-Z, 0-9, and _ (underscore). Must start with a letter and can be up to 48 characters long.
  - level (EvaluationLevel) – (experimental) The level at which the evaluator assesses agent performance. Determines the granularity of data the evaluator operates on: tool call, trace (single request-response), or session (full conversation).
  - description (Optional[str]) – (experimental) The description of the evaluator. Default: - No description
- Stability:
experimental
- ExampleMetadata:
infused
Example:
# Create a custom LLM-as-a-Judge evaluator
evaluator = agentcore.Evaluator(self, "MyEvaluator",
    evaluator_name="my_custom_evaluator",
    level=agentcore.EvaluationLevel.SESSION,
    evaluator_config=agentcore.EvaluatorConfig.llm_as_a_judge(
        instructions="Evaluate whether the agent response is helpful and accurate.",
        model_id="us.anthropic.claude-sonnet-4-6",
        rating_scale=agentcore.EvaluatorRatingScale.categorical([
            {"label": "Good", "definition": "The response is helpful and accurate."},
            {"label": "Bad", "definition": "The response is not helpful or contains errors."}
        ])
    )
)

# Use the custom evaluator in an online evaluation configuration
agentcore.OnlineEvaluationConfig(self, "MyEvaluation",
    online_evaluation_config_name="my_evaluation",
    evaluators=[
        agentcore.EvaluatorReference.builtin(agentcore.BuiltinEvaluator.HELPFULNESS),
        agentcore.EvaluatorReference.custom(evaluator)
    ],
    data_source=agentcore.DataSourceConfig.from_cloud_watch_logs(
        log_group_names=["/aws/bedrock-agentcore/my-agent"],
        service_names=["my-agent.default"]
    )
)
Attributes
- description
(experimental) The description of the evaluator.
- Default:
No description
- Stability:
experimental
- MaxLength:
200
- evaluator_config
(experimental) The configuration that defines how the evaluator assesses agent performance.
Use EvaluatorConfig.llmAsAJudge() for model-based evaluation or EvaluatorConfig.codeBased() for Lambda-based evaluation.
- Stability:
experimental
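The reference above mentions a Lambda-based alternative to LLM-as-a-Judge. A minimal sketch of that wiring might look like the following; note this is an assumption-laden illustration, not the module's documented usage — the Python method name code_based(), its argument shape, and the TRACE level name are inferred from the TypeScript-style names and the level description in this page, and should be checked against the alpha module before use.

```python
# Hypothetical sketch of a code-based (Lambda-backed) evaluator.
# Assumptions (verify against the alpha API): EvaluatorConfig.code_based()
# takes the Lambda function, and EvaluationLevel exposes TRACE.
import aws_cdk.aws_lambda as lambda_
import aws_cdk.aws_bedrock_agentcore_alpha as agentcore

# Lambda function containing the custom evaluation logic.
evaluator_fn = lambda_.Function(self, "EvaluatorFn",
    runtime=lambda_.Runtime.PYTHON_3_12,
    handler="index.handler",
    code=lambda_.Code.from_asset("lambda/evaluator")
)

code_evaluator = agentcore.Evaluator(self, "CodeEvaluator",
    evaluator_name="my_code_evaluator",
    level=agentcore.EvaluationLevel.TRACE,  # assess each request-response pair
    evaluator_config=agentcore.EvaluatorConfig.code_based(evaluator_fn)
)
```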
- evaluator_name
(experimental) The name of the evaluator.
Must be unique within your account. Valid characters are a-z, A-Z, 0-9, _ (underscore). Must start with a letter and can be up to 48 characters long.
- Stability:
experimental
- Pattern:
^[a-zA-Z][a-zA-Z0-9_]{0,47}$
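Because CDK synthesis fails on an invalid name, it can be useful to check candidate names against this pattern up front. A minimal stdlib sketch (the helper name is illustrative, not part of the module):

```python
import re

# Pattern from the docs: a letter, then up to 47 more letters,
# digits, or underscores (48 characters total).
NAME_PATTERN = re.compile(r"^[a-zA-Z][a-zA-Z0-9_]{0,47}$")

def is_valid_evaluator_name(name: str) -> bool:
    """Return True if `name` satisfies the evaluator_name constraints."""
    return NAME_PATTERN.fullmatch(name) is not None
```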
- level
(experimental) The level at which the evaluator assesses agent performance.
Determines what granularity of data the evaluator operates on: tool call, trace (single request-response), or session (full conversation).
- Stability:
experimental