Interface ILlmAsAJudgeOptions
Options for configuring an LLM-as-a-Judge custom evaluator.
Namespace: Amazon.CDK.AWS.BedrockAgentCore
Assembly: Amazon.CDK.Lib.dll
Syntax (csharp)
public interface ILlmAsAJudgeOptions
Syntax (vb)
Public Interface ILlmAsAJudgeOptions
Remarks
Uses a foundation model to assess agent performance based on custom instructions and a rating scale.
Examples
// Assumes `using Amazon.CDK.AWS.BedrockAgentCore;` and that `this` is the enclosing construct (for example, a Stack).

// Create a custom LLM-as-a-Judge evaluator
var evaluator = new Evaluator(this, "MyEvaluator", new EvaluatorProps {
    EvaluatorName = "my_custom_evaluator",
    Level = EvaluationLevel.SESSION,
    EvaluatorConfig = EvaluatorConfig.LlmAsAJudge(new LlmAsAJudgeOptions {
        Instructions = "Evaluate whether the agent response is helpful and accurate.",
        ModelId = "us.anthropic.claude-sonnet-4-6",
        RatingScale = EvaluatorRatingScale.Categorical(new [] {
            new CategoricalRatingOption { Label = "Good", Definition = "The response is helpful and accurate." },
            new CategoricalRatingOption { Label = "Bad", Definition = "The response is not helpful or contains errors." }
        })
    })
});

// Use the custom evaluator in an online evaluation configuration
new OnlineEvaluationConfig(this, "MyEvaluation", new OnlineEvaluationConfigProps {
    OnlineEvaluationConfigName = "my_evaluation",
    Evaluators = new [] {
        EvaluatorSelector.Builtin(BuiltinEvaluator.HELPFULNESS),
        EvaluatorSelector.Custom(evaluator)
    },
    DataSource = DataSourceConfig.FromCloudWatchLogs(new CloudWatchLogsDataSourceConfig {
        LogGroupNames = new [] { "/aws/bedrock-agentcore/my-agent" },
        ServiceNames = new [] { "my-agent.default" }
    })
});
Synopsis
Properties
| Name | Description |
| --- | --- |
| AdditionalModelRequestFields | Additional model-specific request fields. |
| InferenceConfig | Optional inference configuration parameters that control model behavior during evaluation. |
| Instructions | The evaluation instructions that guide the language model in assessing agent performance. |
| ModelId | The identifier of the Amazon Bedrock model to use for evaluation. |
| RatingScale | The rating scale that defines how the evaluator should score agent performance. |
Properties
AdditionalModelRequestFields
Additional model-specific request fields.
IDictionary<string, object>? AdditionalModelRequestFields { get; }
Property Value
IDictionary<string, object>?
Remarks
Default: - No additional fields
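As a minimal sketch of how this property might be populated, the snippet below passes a dictionary alongside the required options. The `top_k` key and its value are illustrative assumptions, not taken from this page; which fields are accepted depends on the chosen model. The other option values are reused from the example above.

```csharp
using System.Collections.Generic;
using Amazon.CDK.AWS.BedrockAgentCore;

// Assumption: "top_k" is only an illustrative model-specific field.
// Check the documentation of the selected model for the fields it accepts.
var extraFields = new Dictionary<string, object>
{
    ["top_k"] = 50
};

var options = new LlmAsAJudgeOptions
{
    Instructions = "Evaluate whether the agent response is helpful and accurate.",
    ModelId = "us.anthropic.claude-sonnet-4-6",
    RatingScale = EvaluatorRatingScale.Categorical(new [] {
        new CategoricalRatingOption { Label = "Good", Definition = "The response is helpful and accurate." },
        new CategoricalRatingOption { Label = "Bad", Definition = "The response is not helpful or contains errors." }
    }),
    // Omit this property entirely to send no additional fields (the default).
    AdditionalModelRequestFields = extraFields
};
```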
InferenceConfig
Optional inference configuration parameters that control model behavior during evaluation.
IEvaluatorInferenceConfig? InferenceConfig { get; }
Property Value
IEvaluatorInferenceConfig?
Remarks
When not specified, the foundation model uses its own default values for maxTokens, temperature, and topP.
Default: - The foundation model's default inference parameters are used
See: https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/custom-evaluators.html
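The following sketch shows how explicit inference parameters might be supplied. It assumes a concrete `EvaluatorInferenceConfig` class with `MaxTokens`, `Temperature`, and `TopP` properties, inferred from the `IEvaluatorInferenceConfig` interface name and the parameter names in the remarks above; neither the class name nor the property names are confirmed by this page, and the numeric values are arbitrary.

```csharp
using Amazon.CDK.AWS.BedrockAgentCore;

var options = new LlmAsAJudgeOptions
{
    Instructions = "Evaluate whether the agent response is helpful and accurate.",
    ModelId = "us.anthropic.claude-sonnet-4-6",
    RatingScale = EvaluatorRatingScale.Categorical(new [] {
        new CategoricalRatingOption { Label = "Good", Definition = "The response is helpful and accurate." },
        new CategoricalRatingOption { Label = "Bad", Definition = "The response is not helpful or contains errors." }
    }),
    // Assumption: EvaluatorInferenceConfig and its property names are inferred,
    // not documented here. Verify them against the IEvaluatorInferenceConfig reference.
    InferenceConfig = new EvaluatorInferenceConfig
    {
        MaxTokens = 1024,   // arbitrary illustrative value
        Temperature = 0.0,  // arbitrary illustrative value
        TopP = 1.0          // arbitrary illustrative value
    }
};
```

Leaving `InferenceConfig` unset keeps the foundation model's own defaults, as described above.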
Instructions
The evaluation instructions that guide the language model in assessing agent performance.
string Instructions { get; }
Property Value
string
Remarks
These instructions define the evaluation criteria, context, and expected behavior.
Instructions must contain placeholders appropriate for the evaluation level
(e.g., {context}, {available_tools} for SESSION level).
Note: Evaluators using reference-input placeholders (e.g., {expected_tool_trajectory},
{assertions}, {expected_response}) are only compatible with on-demand evaluation,
not online evaluation.
See: https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/custom-evaluators.html
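Because reference-input placeholders restrict an evaluator to on-demand evaluation, it can help to check an instruction string before wiring the evaluator into an online evaluation configuration. The helper below is a hypothetical sketch, not part of the CDK library; its placeholder list contains only the three names mentioned in the remarks above and may be incomplete.

```csharp
using System.Linq;

// Hypothetical helper; not part of Amazon.CDK.AWS.BedrockAgentCore.
static class InstructionChecks
{
    // Reference-input placeholders named in the remarks; the full set may be larger.
    static readonly string[] ReferenceInputPlaceholders =
    {
        "{expected_tool_trajectory}",
        "{assertions}",
        "{expected_response}"
    };

    // Returns true when the instructions contain any known reference-input
    // placeholder, meaning the evaluator would only suit on-demand evaluation.
    public static bool UsesReferenceInputs(string instructions) =>
        ReferenceInputPlaceholders.Any(p => instructions.Contains(p));
}
```

For instance, `InstructionChecks.UsesReferenceInputs("Given {context}, rate the response.")` returns `false`, while a string containing `{expected_response}` returns `true`.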
ModelId
The identifier of the Amazon Bedrock model to use for evaluation.
string ModelId { get; }
Property Value
string
Remarks
Accepts standard model IDs (e.g., 'anthropic.claude-sonnet-4-6')
and cross-region inference profile IDs with region prefixes
(e.g., 'us.anthropic.claude-sonnet-4-6', 'eu.anthropic.claude-sonnet-4-6').
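The two ID forms above differ only by the region prefix. The sketch below builds one from the other with a hypothetical helper (not part of the CDK library); it only concatenates strings and does not check that an inference profile actually exists for the given prefix and model.

```csharp
// Hypothetical helper; not part of Amazon.CDK.AWS.BedrockAgentCore.
static class ModelIds
{
    // Prepends a region prefix such as "us" or "eu" to a standard model ID.
    // No validation is performed against the profiles available in an account.
    public static string WithRegionPrefix(string regionPrefix, string modelId) =>
        $"{regionPrefix}.{modelId}";
}
```

Calling `ModelIds.WithRegionPrefix("eu", "anthropic.claude-sonnet-4-6")` yields `eu.anthropic.claude-sonnet-4-6`, which can then be assigned to `ModelId`.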
RatingScale
The rating scale that defines how the evaluator should score agent performance.
EvaluatorRatingScale RatingScale { get; }
Property Value
EvaluatorRatingScale