create_ai_benchmark_job

SageMaker.Client.create_ai_benchmark_job(**kwargs)

Creates a benchmark job that runs performance benchmarks against inference infrastructure using a predefined AI workload configuration. The benchmark job measures metrics such as latency, throughput, and cost for your generative AI inference endpoints.

See also: AWS API Documentation

Request Syntax

response = client.create_ai_benchmark_job(
    AIBenchmarkJobName='string',
    BenchmarkTarget={
        'Endpoint': {
            'Identifier': 'string',
            'TargetContainerHostname': 'string',
            'InferenceComponents': [
                {
                    'Identifier': 'string'
                },
            ]
        }
    },
    OutputConfig={
        'S3OutputLocation': 'string'
    },
    AIWorkloadConfigIdentifier='string',
    RoleArn='string',
    NetworkConfig={
        'VpcConfig': {
            'SecurityGroupIds': [
                'string',
            ],
            'Subnets': [
                'string',
            ]
        }
    },
    Tags=[
        {
            'Key': 'string',
            'Value': 'string'
        },
    ]
)

Parameters:
  • AIBenchmarkJobName (string) –

    [REQUIRED]

    The name of the AI benchmark job. The name must be unique within your Amazon Web Services account in the current Amazon Web Services Region.

  • BenchmarkTarget (dict) –

    [REQUIRED]

    The target endpoint to benchmark. Specify a SageMaker endpoint by providing its name or Amazon Resource Name (ARN).

    Note

    This is a Tagged Union structure. Only one of the following top-level keys can be set: Endpoint.

    • Endpoint (dict) –

      The SageMaker endpoint to benchmark.

      • Identifier (string) – [REQUIRED]

        The name or Amazon Resource Name (ARN) of the SageMaker endpoint to benchmark.

      • TargetContainerHostname (string) –

        The hostname of the specific container to target within a multi-container endpoint.

      • InferenceComponents (list) –

        The list of inference components to benchmark on the endpoint.

        • (dict) –

          An inference component to benchmark.

          • Identifier (string) – [REQUIRED]

            The name or Amazon Resource Name (ARN) of the inference component.

  • OutputConfig (dict) –

    [REQUIRED]

    The output configuration for the benchmark job, including the Amazon S3 location where benchmark results are stored.

    • S3OutputLocation (string) – [REQUIRED]

      The Amazon S3 URI where benchmark results are stored.

  • AIWorkloadConfigIdentifier (string) –

    [REQUIRED]

    The name or Amazon Resource Name (ARN) of the AI workload configuration to use for this benchmark job.

  • RoleArn (string) –

    [REQUIRED]

    The Amazon Resource Name (ARN) of an IAM role that enables Amazon SageMaker AI to perform tasks on your behalf.

  • NetworkConfig (dict) –

    The network configuration for the benchmark job, including VPC settings.

    • VpcConfig (dict) –

      The VPC configuration, including security group IDs and subnet IDs.

      • SecurityGroupIds (list) – [REQUIRED]

        The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field.

        • (string) –

      • Subnets (list) – [REQUIRED]

        The IDs of the subnets in the VPC to which you want to connect the benchmark job. For information about the availability of specific instance types, see Supported Instance Types and Availability Zones.

        • (string) –

  • Tags (list) –

    The metadata that you apply to Amazon Web Services resources to help you categorize and organize them. Each tag consists of a key and a value, both of which you define.

    • (dict) –

      A tag object that consists of a key and a value, used to manage metadata for SageMaker resources.

      You can add tags to notebook instances, training jobs, hyperparameter tuning jobs, batch transform jobs, models, labeling jobs, work teams, endpoint configurations, and endpoints. For more information on adding tags to SageMaker resources, see AddTags.

      For more information on adding metadata to your Amazon Web Services resources with tagging, see Tagging Amazon Web Services resources. For advice on best practices for managing Amazon Web Services resources with tagging, see Tagging Best Practices: Implement an Effective Amazon Web Services Resource Tagging Strategy.

      • Key (string) – [REQUIRED]

        The tag key. Tag keys must be unique per resource.

      • Value (string) – [REQUIRED]

        The tag value.
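As a minimal sketch of how the parameters above fit together, the helper below assembles the keyword arguments as a plain dictionary. The helper name build_benchmark_request, its argument names, and every resource name, ARN, and ID in the sample call are hypothetical placeholders; only the dictionary keys come from the request syntax. Optional keys are left out rather than set to None, because botocore's parameter validation rejects None values.

```python
def build_benchmark_request(
    job_name,
    endpoint,
    workload_config,
    s3_output,
    role_arn,
    inference_components=None,
    container_hostname=None,
    security_group_ids=None,
    subnets=None,
    tags=None,
):
    """Assemble kwargs for create_ai_benchmark_job (hypothetical helper)."""
    # BenchmarkTarget is a tagged union; Endpoint is its only documented key.
    endpoint_target = {"Identifier": endpoint}
    if container_hostname:
        endpoint_target["TargetContainerHostname"] = container_hostname
    if inference_components:
        endpoint_target["InferenceComponents"] = [
            {"Identifier": name} for name in inference_components
        ]

    # The five required top-level parameters.
    request = {
        "AIBenchmarkJobName": job_name,
        "BenchmarkTarget": {"Endpoint": endpoint_target},
        "OutputConfig": {"S3OutputLocation": s3_output},
        "AIWorkloadConfigIdentifier": workload_config,
        "RoleArn": role_arn,
    }

    # VpcConfig marks both SecurityGroupIds and Subnets as required,
    # so supplying only one of them is treated as an error here.
    if security_group_ids or subnets:
        if not (security_group_ids and subnets):
            raise ValueError("VpcConfig needs both SecurityGroupIds and Subnets")
        request["NetworkConfig"] = {
            "VpcConfig": {
                "SecurityGroupIds": list(security_group_ids),
                "Subnets": list(subnets),
            }
        }

    # Tags are passed as a list of {"Key": ..., "Value": ...} dicts.
    if tags:
        request["Tags"] = [{"Key": k, "Value": v} for k, v in tags.items()]
    return request


# Sample call with placeholder values (none of these resources are real).
request = build_benchmark_request(
    job_name="example-benchmark-job",
    endpoint="example-endpoint",
    workload_config="example-workload-config",
    s3_output="s3://example-bucket/benchmark-results/",
    role_arn="arn:aws:iam::123456789012:role/ExampleBenchmarkRole",
    tags={"team": "ml-platform"},
)
```

The resulting dictionary can then be unpacked into the call as client.create_ai_benchmark_job(**request), assuming the installed boto3 version includes this operation.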

Return type:

dict

Returns:

Response Syntax

{
    'AIBenchmarkJobArn': 'string'
}

Response Structure

  • (dict) –

    • AIBenchmarkJobArn (string) –

      The Amazon Resource Name (ARN) of the created benchmark job.
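Because the response holds a single key, reading the result is a one-line lookup. The sketch below shows this with a hypothetical start_benchmark_job wrapper. So that it runs without credentials, it uses a stub client that returns a placeholder string; the stub, the wrapper name, and the placeholder value are illustrative assumptions, and in practice the client would come from boto3.client('sagemaker').

```python
def start_benchmark_job(client, request):
    """Submit the job and return its ARN (hypothetical wrapper)."""
    response = client.create_ai_benchmark_job(**request)
    # The documented response contains only AIBenchmarkJobArn.
    return response["AIBenchmarkJobArn"]


class StubSageMakerClient:
    """Stand-in for a SageMaker client, for illustration only."""

    def create_ai_benchmark_job(self, **kwargs):
        # Record what was sent; return a placeholder, not a real ARN format.
        self.last_request = kwargs
        return {"AIBenchmarkJobArn": "placeholder-arn"}


stub = StubSageMakerClient()
job_arn = start_benchmark_job(
    stub,
    {
        "AIBenchmarkJobName": "example-benchmark-job",
        "BenchmarkTarget": {"Endpoint": {"Identifier": "example-endpoint"}},
        "OutputConfig": {"S3OutputLocation": "s3://example-bucket/results/"},
        "AIWorkloadConfigIdentifier": "example-workload-config",
        "RoleArn": "arn:aws:iam::123456789012:role/ExampleBenchmarkRole",
    },
)
print(job_arn)
```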

Exceptions

  • SageMaker.Client.exceptions.ResourceNotFound

  • SageMaker.Client.exceptions.ResourceInUse

  • SageMaker.Client.exceptions.ResourceLimitExceeded
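
The exceptions listed above are exposed on the client object itself, so they can be caught as client.exceptions.<Name> without extra imports. The following is a rough sketch of that pattern; the function name create_job_or_report and the status strings are hypothetical. The comment on each branch is an assumption inferred from the exception name, since this page lists the exceptions without describing them.

```python
def create_job_or_report(client, request):
    """Try to create the job; return (status, arn_or_None). Hypothetical helper."""
    try:
        response = client.create_ai_benchmark_job(**request)
    except client.exceptions.ResourceInUse:
        # Assumption: a conflicting resource, such as a job with this name, exists.
        return ("in_use", None)
    except client.exceptions.ResourceLimitExceeded:
        # Assumption: an account or Region quota has been reached.
        return ("limit_exceeded", None)
    except client.exceptions.ResourceNotFound:
        # Assumption: a referenced resource, such as the endpoint, was not found.
        return ("not_found", None)
    return ("created", response["AIBenchmarkJobArn"])
```

Returning a status instead of re-raising is only one possible design; callers that prefer exceptions can let them propagate and handle them further up the stack.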