Create a batch inference job


Batch inference is in preview and is subject to change. Batch inference is currently only available through the API. Access batch APIs through the following SDKs.

We recommend that you create a virtual environment to use the SDK. Because batch inference APIs aren't available in the latest SDKs, we recommend that you uninstall the latest version of the SDK from the virtual environment before installing the version with the batch inference APIs. For a guided example, see Code samples.

Request format
POST /model-invocation-job HTTP/1.1
Content-type: application/json

{
   "clientRequestToken": "string",
   "inputDataConfig": {
      "s3InputDataConfig": {
         "s3Uri": "string",
         "s3InputFormat": "JSONL"
      }
   },
   "jobName": "string",
   "modelId": "string",
   "outputDataConfig": {
      "s3OutputDataConfig": {
         "s3Uri": "string"
      }
   },
   "roleArn": "string",
   "tags": [
      {
         "key": "string",
         "value": "string"
      }
   ]
}
Response format
HTTP/1.1 200
Content-type: application/json

{
   "jobArn": "string"
}

To create a batch inference job, send a CreateModelInvocationJob request. Provide the following information.

  • The ARN of a role with permissions to run batch inference in roleArn.

  • Information about the S3 bucket that contains the input data in inputDataConfig and the bucket to which Amazon Bedrock writes the output files in outputDataConfig.

  • The ID of the model to use for inference in modelId (see Amazon Bedrock base model IDs (on-demand throughput) ).

  • A name for the job in jobName.

  • (Optional) Any tags that you want to attach to the job in tags.

The response returns a jobArn that you can use for other batch inference-related API calls.
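The fields above can be assembled into a CreateModelInvocationJob request in Python. The helper below is a minimal sketch that only builds the request body; the commented-out boto3 call assumes an SDK version that includes the batch inference operations, and the role ARN, bucket URIs, model ID, and job name are placeholder values.

```python
def build_batch_job_request(role_arn, input_s3_uri, output_s3_uri,
                            model_id, job_name):
    """Assemble the parameters for a CreateModelInvocationJob request."""
    return {
        "roleArn": role_arn,          # role with batch inference permissions
        "modelId": model_id,          # base model ID
        "jobName": job_name,
        "inputDataConfig": {
            "s3InputDataConfig": {
                "s3Uri": input_s3_uri,
                "s3InputFormat": "JSONL",
            }
        },
        "outputDataConfig": {
            "s3OutputDataConfig": {"s3Uri": output_s3_uri}
        },
    }

# Placeholder values -- substitute your own role, buckets, model, and name.
request = build_batch_job_request(
    role_arn="arn:aws:iam::123456789012:role/my-batch-inference-role",
    input_s3_uri="s3://my-input-bucket/input/",
    output_s3_uri="s3://my-output-bucket/output/",
    model_id="amazon.titan-text-express-v1",
    job_name="my-batch-job",
)

# With an SDK version that includes the batch inference APIs:
# import boto3
# bedrock = boto3.client("bedrock")
# job_arn = bedrock.create_model_invocation_job(**request)["jobArn"]
```

The returned jobArn is what you pass to the status and management calls described below.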

You can check the status of the job with either the GetModelInvocationJob or the ListModelInvocationJobs API.
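The status check can be wrapped in a simple polling loop. The sketch below is SDK-agnostic: it takes any zero-argument callable that returns the job's current status string, so you could pass in a GetModelInvocationJob call. The source confirms the Completed status; the other terminal status names used here (Failed, Stopped) are assumptions.

```python
import time

# Statuses at which polling should stop. "Completed" is documented;
# "Failed" and "Stopped" are assumed terminal statuses.
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}

def wait_for_job(get_status, poll_seconds=60, max_polls=120):
    """Poll a batch inference job until it reaches a terminal status.

    get_status: zero-argument callable returning the current status
    string, e.g. (hypothetical):
        lambda: bedrock.get_model_invocation_job(
            jobIdentifier=job_arn)["status"]
    """
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("Job did not reach a terminal status in time")
```

Passing the status fetch as a callable keeps the loop testable without a live AWS client.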

When the job status is Completed, you can extract the results of the batch inference job from the S3 bucket that you specified in outputDataConfig. The bucket contains the following files:

  1. Output files containing the result of the model inference.

    • If the output is text, Amazon Bedrock generates an output JSONL file for each input JSONL file. The output files contain outputs from the model for each input in the following format. An error object replaces the modelOutput field in any line where there was an error in inference. The format of the modelOutput JSON object matches the body field for the model that you use in the InvokeModel response. For more information, see Inference parameters for foundation models.

      {
         "recordId" : "12 character alphanumeric string",
         "modelInput": {JSON body},
         "modelOutput": {JSON body}
      }

      The following example shows a possible output file.

      {
         "recordId" : "3223593EFGH",
         "modelInput" : {"inputText": "Roses are red, violets are"},
         "modelOutput" : {"inputTextTokenCount": 8, "results": [{"tokenCount": 3, "outputText": "blue\n", "completionReason": "FINISH"}]}
      }
      {
         "recordId" : "1223213ABCDE",
         "modelInput" : {"inputText": "Hello world"},
         "error" : {"errorCode" : 400, "errorMessage" : "bad request"}
      }
    • If the output is an image, Amazon Bedrock generates an output file for each image.

  2. A manifest.json.out file containing a summary of the batch inference job.

    {
       "processedRecordCount" : number,
       "successRecordCount": number,
       "errorRecordCount": number,
       "inputTextTokenCount": number, // For embedding/text-to-text models
       "outputTextTokenCount" : number, // For text-to-text models
       "outputImgCount512x512pStep50": number, // For text-to-image models
       "outputImgCount512x512pStep150" : number, // For text-to-image models
       "outputImgCount512x896pStep50" : number, // For text-to-image models
       "outputImgCount512x896pStep150" : number // For text-to-image models
    }
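The text output files above can be post-processed with a few lines of Python. This sketch splits the records of one downloaded JSONL file into successes and per-record failures, based on whether an error object replaced the modelOutput field; the sample data reuses the example records from this page.

```python
import json

def parse_output_jsonl(jsonl_text):
    """Split batch inference output records into successes and errors.

    A record is an error if the "error" object replaced "modelOutput".
    """
    results, errors = [], []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        (errors if "error" in record else results).append(record)
    return results, errors

# Sample content of an output JSONL file (from the examples above).
sample = (
    '{"recordId": "3223593EFGH", '
    '"modelInput": {"inputText": "Roses are red, violets are"}, '
    '"modelOutput": {"inputTextTokenCount": 8, "results": '
    '[{"tokenCount": 3, "outputText": "blue\\n", "completionReason": "FINISH"}]}}\n'
    '{"recordId": "1223213ABCDE", '
    '"modelInput": {"inputText": "Hello world"}, '
    '"error": {"errorCode": 400, "errorMessage": "bad request"}}'
)
results, errors = parse_output_jsonl(sample)
```

In practice you would first download each output file from the outputDataConfig bucket (for example with an S3 client) and feed its contents to this function.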