EvaluatorProps
- class aws_cdk.aws_bedrock_agentcore_alpha.EvaluatorProps(*, evaluator_config, evaluator_name, level, description=None)
Bases: object

(experimental) Properties for creating an Evaluator.
- Parameters:
  - evaluator_config (EvaluatorConfig) – (experimental) The configuration that defines how the evaluator assesses agent performance. Use EvaluatorConfig.llmAsAJudge() for model-based evaluation or EvaluatorConfig.codeBased() for Lambda-based evaluation.
  - evaluator_name (str) – (experimental) The name of the evaluator. Must be unique within your account. Valid characters are a-z, A-Z, 0-9, and _ (underscore). Must start with a letter and can be up to 48 characters long.
  - level (EvaluationLevel) – (experimental) The level at which the evaluator assesses agent performance. Determines the granularity of data the evaluator operates on: tool call, trace (single request-response), or session (full conversation).
  - description (Optional[str]) – (experimental) The description of the evaluator. Default: - No description
- Stability:
experimental
- ExampleMetadata:
infused
Example:
# Create a custom LLM-as-a-Judge evaluator
evaluator = agentcore.Evaluator(self, "MyEvaluator",
    evaluator_name="my_custom_evaluator",
    level=agentcore.EvaluationLevel.SESSION,
    evaluator_config=agentcore.EvaluatorConfig.llm_as_a_judge(
        instructions="Evaluate whether the agent response is helpful and accurate.",
        model_id="us.anthropic.claude-sonnet-4-6",
        rating_scale=agentcore.EvaluatorRatingScale.categorical([
            {"label": "Good", "definition": "The response is helpful and accurate."},
            {"label": "Bad", "definition": "The response is not helpful or contains errors."}
        ])
    )
)

# Use the custom evaluator in an online evaluation configuration
agentcore.OnlineEvaluationConfig(self, "MyEvaluation",
    online_evaluation_config_name="my_evaluation",
    evaluators=[
        agentcore.EvaluatorReference.builtin(agentcore.BuiltinEvaluator.HELPFULNESS),
        agentcore.EvaluatorReference.custom(evaluator)
    ],
    data_source=agentcore.DataSourceConfig.from_cloud_watch_logs(
        log_group_names=["/aws/bedrock-agentcore/my-agent"],
        service_names=["my-agent.default"]
    )
)
Attributes
- description
(experimental) The description of the evaluator.
- Default:
No description
- Stability:
experimental
- MaxLength:
200
- evaluator_config
(experimental) The configuration that defines how the evaluator assesses agent performance.
Use EvaluatorConfig.llmAsAJudge() for model-based evaluation or EvaluatorConfig.codeBased() for Lambda-based evaluation.
- Stability:
experimental
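The reference above mentions a Lambda-based alternative to LLM-as-a-Judge. A minimal sketch of that wiring might look like the following; note this is an assumption-laden illustration, not the module's documented usage — the Python method name code_based(), its argument shape, and the TRACE level name are inferred from the TypeScript-style names and the level description in this page, and should be checked against the alpha module before use.

```python
# Hypothetical sketch of a code-based (Lambda-backed) evaluator.
# Assumptions (verify against the alpha API): EvaluatorConfig.code_based()
# takes the Lambda function, and EvaluationLevel exposes TRACE.
import aws_cdk.aws_lambda as lambda_
import aws_cdk.aws_bedrock_agentcore_alpha as agentcore

# Lambda function containing the custom evaluation logic.
evaluator_fn = lambda_.Function(self, "EvaluatorFn",
    runtime=lambda_.Runtime.PYTHON_3_12,
    handler="index.handler",
    code=lambda_.Code.from_asset("lambda/evaluator")
)

code_evaluator = agentcore.Evaluator(self, "CodeEvaluator",
    evaluator_name="my_code_evaluator",
    level=agentcore.EvaluationLevel.TRACE,  # assess each request-response pair
    evaluator_config=agentcore.EvaluatorConfig.code_based(evaluator_fn)
)
```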
- evaluator_name
(experimental) The name of the evaluator.
Must be unique within your account. Valid characters are a-z, A-Z, 0-9, _ (underscore). Must start with a letter and can be up to 48 characters long.
- Stability:
experimental
- Pattern:
^[a-zA-Z][a-zA-Z0-9_]{0,47}$
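Because CDK synthesis fails on an invalid name, it can be useful to check candidate names against this pattern up front. A minimal stdlib sketch (the helper name is illustrative, not part of the module):

```python
import re

# Pattern from the docs: a letter, then up to 47 more letters,
# digits, or underscores (48 characters total).
NAME_PATTERN = re.compile(r"^[a-zA-Z][a-zA-Z0-9_]{0,47}$")

def is_valid_evaluator_name(name: str) -> bool:
    """Return True if `name` satisfies the evaluator_name constraints."""
    return NAME_PATTERN.fullmatch(name) is not None
```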
- level
(experimental) The level at which the evaluator assesses agent performance.
Determines what granularity of data the evaluator operates on: tool call, trace (single request-response), or session (full conversation).
- Stability:
experimental