Submit a batch of prompts with the OpenAI Batch API
You can run a batch inference job using the OpenAI Create batch API with Amazon Bedrock OpenAI models.
You can call the OpenAI Create batch API with the OpenAI SDK or with direct HTTP requests. Select a topic to learn more:
Supported models and Regions for the OpenAI batch API
You can use the OpenAI Create batch API with all OpenAI models supported in Amazon Bedrock and in the AWS Regions that support these models. For more information about supported models and Regions, see Supported foundation models in Amazon Bedrock.
Prerequisites to use the OpenAI batch API
To see prerequisites for using the OpenAI batch API operations, choose the tab for your preferred method, and then follow the steps:
- OpenAI SDK
  - Authentication – The OpenAI SDK only supports authentication with an Amazon Bedrock API key. Generate an Amazon Bedrock API key to authenticate your request. To learn about Amazon Bedrock API keys and how to generate them, see Generate Amazon Bedrock API keys to easily authenticate to the Amazon Bedrock API.
  - Endpoint – Find the endpoint that corresponds to the AWS Region to use in Amazon Bedrock Runtime endpoints and quotas. If you use an AWS SDK, you might only need to specify the region code and not the whole endpoint when you set up the client.
  - Model access – Request access to an Amazon Bedrock model that supports this feature. For more information, see Add or remove access to Amazon Bedrock foundation models.
  - Install an OpenAI SDK – For more information, see Libraries in the OpenAI documentation.
  - Batch JSONL file uploaded to S3 – Follow the steps at Prepare your batch file in the OpenAI documentation to prepare your batch file with the correct format. Then upload it to an Amazon S3 bucket.
  - IAM permissions – Make sure that you have an IAM identity with permissions to submit the batch job, and a batch inference service role that Amazon Bedrock can assume to access the S3 bucket containing your batch files.
- HTTP request
  - Authentication – You can authenticate with either your AWS credentials or with an Amazon Bedrock API key. Set up your AWS credentials or generate an Amazon Bedrock API key to authenticate your request.
  - Endpoint – Find the endpoint that corresponds to the AWS Region to use in Amazon Bedrock Runtime endpoints and quotas. If you use an AWS SDK, you might only need to specify the region code and not the whole endpoint when you set up the client.
  - Model access – Request access to an Amazon Bedrock model that supports this feature. For more information, see Add or remove access to Amazon Bedrock foundation models.
  - Batch JSONL file uploaded to S3 – Follow the steps at Prepare your batch file in the OpenAI documentation to prepare your batch file with the correct format. Then upload it to an Amazon S3 bucket.
  - IAM permissions – Make sure that you have an IAM identity with permissions to submit the batch job, and a batch inference service role that Amazon Bedrock can assume to access the S3 bucket containing your batch files.
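Each line of the batch JSONL file is a single request with a custom_id, method, url, and body; see Prepare your batch file in the OpenAI documentation for the authoritative schema. As a rough sketch only (the prompts and field values here are illustrative assumptions, and the exact body fields depend on the model you use), you might generate the file like this:

```python
import json

# Each line of the batch file is one request. The custom_id lets you
# match results in the output file back to the original request.
requests = [
    {
        "custom_id": f"request-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": 256,
        },
    }
    for i, prompt in enumerate(["Hello!", "Summarize batch inference."], start=1)
]

# Write one JSON object per line (JSONL format).
with open("openai-input.jsonl", "w") as f:
    for request in requests:
        f.write(json.dumps(request) + "\n")
```

After writing the file, upload it to your S3 bucket, for example with the AWS CLI: aws s3 cp openai-input.jsonl s3://amzn-s3-demo-bucket/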
Create an OpenAI batch job
For details about the OpenAI Create batch API, refer to the following resources in the OpenAI documentation:
- Create batch – Details both the request and response.
- The request output object – Details the fields of the generated output from the batch job. Refer to this documentation when interpreting the results in your S3 bucket.
Form the request
When forming the batch inference request, note the following Amazon Bedrock-specific fields and values:

- X-Amzn-Bedrock-ModelId header – The ID of the Amazon Bedrock model to use for the batch job.
- X-Amzn-Bedrock-RoleArn header – The ARN of the batch inference service role that you set up.
- input_file_id – The Amazon S3 URI of the batch JSONL file that you uploaded, rather than an OpenAI file ID.
Find the generated results
The creation response includes a batch ID. The results and error logging of the batch inference job are written to the S3 folder containing the input file. The results will be in a folder with the same name as the batch ID, as in the following folder structure:
---- {batch_input_folder}
    |---- {batch_input}.jsonl
    |---- {batch_id}
        |---- {batch_input}.jsonl.out
        |---- {batch_input}.jsonl.err
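Each line of the .out file is a JSON record that carries the custom_id from your input file alongside the model's response (see "The request output object" in the OpenAI documentation for the full schema). After downloading the file from S3, a minimal sketch of indexing results by custom_id might look like this; the sample record below is a simplified, hypothetical output line:

```python
import json

def index_batch_output(jsonl_text):
    """Map each custom_id to its result record from a batch output file."""
    results = {}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # Skip blank lines defensively
        record = json.loads(line)
        results[record["custom_id"]] = record
    return results

# Simplified example line; real records follow the "request output
# object" schema in the OpenAI documentation.
sample = '{"custom_id": "request-1", "response": {"status_code": 200}, "error": null}'
indexed = index_batch_output(sample)
print(indexed["request-1"]["response"]["status_code"])
```

Keying results by custom_id matters because batch output lines are not guaranteed to appear in the same order as the input requests.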
To see examples of using the OpenAI Create batch API with different methods, choose the tab for your preferred method, and then follow the steps:
- OpenAI SDK (Python)

  To create a batch job with the OpenAI SDK, do the following:

  1. Import the OpenAI SDK and set up the client with the following fields:
     - base_url – Append /openai/v1 to the Amazon Bedrock Runtime endpoint, as in the following format: https://${bedrock-runtime-endpoint}/openai/v1
     - api_key – Specify an Amazon Bedrock API key.
     - default_headers – If you need to include any headers, you can include them as key-value pairs in this object. You can alternatively specify headers in the extra_headers field when making a specific API call.
  2. Use the batches.create() method with the client.
Before running the following example, replace the placeholders in the following fields:

- api_key – Replace $AWS_BEARER_TOKEN_BEDROCK with your actual API key.
- X-Amzn-Bedrock-RoleArn – Replace arn:aws:iam::123456789012:role/BatchServiceRole with the actual batch inference service role you set up.
- input_file_id – Replace s3://amzn-s3-demo-bucket/openai-input.jsonl with the actual S3 URI to which you uploaded your batch JSONL file.

The example calls the OpenAI Create batch API in us-west-2 and includes one piece of metadata.
from openai import OpenAI

client = OpenAI(
    base_url="https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1",
    api_key="$AWS_BEARER_TOKEN_BEDROCK",  # Replace with actual API key
    default_headers={
        "X-Amzn-Bedrock-RoleArn": "arn:aws:iam::123456789012:role/BatchServiceRole"  # Replace with actual service role ARN
    }
)

job = client.batches.create(
    input_file_id="s3://amzn-s3-demo-bucket/openai-input.jsonl",  # Replace with actual S3 URI
    endpoint="/v1/chat/completions",
    completion_window="24h",
    metadata={
        "description": "test input"
    },
    extra_headers={
        "X-Amzn-Bedrock-ModelId": "openai.gpt-oss-20b-1:0"
    }
)
print(job)
- HTTP request

  To create a batch job with a direct HTTP request, do the following:

  1. Use the POST method and specify the URL by appending /openai/v1/batches to the Amazon Bedrock Runtime endpoint, as in the following format: https://${bedrock-runtime-endpoint}/openai/v1/batches
  2. Specify your AWS credentials or an Amazon Bedrock API key in the Authorization header.
Before running the following example, first replace the placeholders in the following fields:

- Authorization – Replace $AWS_BEARER_TOKEN_BEDROCK with your actual API key.
- X-Amzn-Bedrock-RoleArn – Replace arn:aws:iam::123456789012:role/BatchServiceRole with the actual batch inference service role you set up.
- input_file_id – Replace s3://amzn-s3-demo-bucket/openai-input.jsonl with the actual S3 URI to which you uploaded your batch JSONL file.

The following example calls the OpenAI Create batch API in us-west-2 and includes one piece of metadata:
curl -X POST 'https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1/batches' \
  -H 'Authorization: Bearer $AWS_BEARER_TOKEN_BEDROCK' \
  -H 'Content-Type: application/json' \
  -H 'X-Amzn-Bedrock-ModelId: openai.gpt-oss-20b-1:0' \
  -H 'X-Amzn-Bedrock-RoleArn: arn:aws:iam::123456789012:role/BatchServiceRole' \
  -d '{
    "input_file_id": "s3://amzn-s3-demo-bucket/openai-input.jsonl",
    "endpoint": "/v1/chat/completions",
    "completion_window": "24h",
    "metadata": {"description": "test input"}
  }'
Retrieve an OpenAI batch job
For details about the OpenAI Retrieve batch API request and response, refer to Retrieve batch.
When you make the request, you specify the ID of the batch job for which to get information. The response returns information about a batch job, including the output and error file names that you can look up in your S3 buckets.
To see examples of using the OpenAI Retrieve batch API with different methods, choose the tab for your preferred method, and then follow the steps:
- OpenAI SDK (Python)

  To retrieve a batch job with the OpenAI SDK, do the following:

  1. Import the OpenAI SDK and set up the client with the following fields:
     - base_url – Append /openai/v1 to the Amazon Bedrock Runtime endpoint, as in the following format: https://${bedrock-runtime-endpoint}/openai/v1
     - api_key – Specify an Amazon Bedrock API key.
     - default_headers – If you need to include any headers, you can include them as key-value pairs in this object. You can alternatively specify headers in the extra_headers field when making a specific API call.
  2. Use the batches.retrieve() method with the client and specify the ID of the batch for which to retrieve information.
Before running the following example, replace $AWS_BEARER_TOKEN_BEDROCK with your actual API key and batch_abc123 with the actual ID of your batch job. The example calls the OpenAI Retrieve batch API in us-west-2 on a batch job whose ID is batch_abc123.
from openai import OpenAI

client = OpenAI(
    base_url="https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1",
    api_key="$AWS_BEARER_TOKEN_BEDROCK"  # Replace with actual API key
)

job = client.batches.retrieve(batch_id="batch_abc123")  # Replace with actual ID
print(job)
- HTTP request

  To retrieve a batch job with a direct HTTP request, do the following:

  1. Use the GET method and specify the URL by appending /openai/v1/batches/${batch_id} to the Amazon Bedrock Runtime endpoint, as in the following format: https://${bedrock-runtime-endpoint}/openai/v1/batches/batch_abc123
  2. Specify your AWS credentials or an Amazon Bedrock API key in the Authorization header.
Before running the following example, first replace the placeholders in the following fields:

- Authorization – Replace $AWS_BEARER_TOKEN_BEDROCK with your actual API key.
- batch_abc123 – In the path, replace this value with the actual ID of your batch job.

The following example calls the OpenAI Retrieve batch API in us-west-2 on a batch job whose ID is batch_abc123.
curl -X GET 'https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1/batches/batch_abc123' \
-H 'Authorization: Bearer $AWS_BEARER_TOKEN_BEDROCK'
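Because a batch job runs asynchronously, a common pattern is to poll the Retrieve batch API until the job reaches a terminal status. The following is a generic sketch, not part of either API: wait_for_batch and fetch_status are hypothetical names, and the set of terminal statuses is taken from the OpenAI Batch documentation.

```python
import time

# Statuses after which a batch job no longer changes.
TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}

def wait_for_batch(fetch_status, poll_seconds=30, max_polls=120):
    """Poll a status-returning callable until the batch job finishes."""
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("Batch job did not finish within the polling window")
```

With the OpenAI SDK, fetch_status could be lambda: client.batches.retrieve(batch_id="batch_abc123").status. Keep poll_seconds generous, since batch jobs are designed to complete within the completion window rather than in real time.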
List OpenAI batch jobs
For details about the OpenAI List batches API request and response, refer to List batches. The response returns an array of information about your batch jobs.
When you make the request, you can include query parameters to filter the results. The response returns information about a batch job, including the output and error file names that you can look up in your S3 buckets.
To see examples of using the OpenAI List batches API with different methods, choose the tab for your preferred method, and then follow the steps:
- OpenAI SDK (Python)

  To list batch jobs with the OpenAI SDK, do the following:

  1. Import the OpenAI SDK and set up the client with the following fields:
     - base_url – Append /openai/v1 to the Amazon Bedrock Runtime endpoint, as in the following format: https://${bedrock-runtime-endpoint}/openai/v1
     - api_key – Specify an Amazon Bedrock API key.
     - default_headers – If you need to include any headers, you can include them as key-value pairs in this object. You can alternatively specify headers in the extra_headers field when making a specific API call.
  2. Use the batches.list() method with the client. You can include any of the optional parameters.
Before running the following example, replace $AWS_BEARER_TOKEN_BEDROCK with your actual API key. The example calls the OpenAI List batches API in us-west-2 and specifies a limit of 2 results to return.
from openai import OpenAI

client = OpenAI(
    base_url="https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1",
    api_key="$AWS_BEARER_TOKEN_BEDROCK"  # Replace with actual API key
)

jobs = client.batches.list(limit=2)
print(jobs)
- HTTP request

  To list batch jobs with a direct HTTP request, do the following:

  1. Use the GET method and specify the URL by appending /openai/v1/batches to the Amazon Bedrock Runtime endpoint, as in the following format: https://${bedrock-runtime-endpoint}/openai/v1/batches
     You can include any of the optional query parameters.
  2. Specify your AWS credentials or an Amazon Bedrock API key in the Authorization header.
Before running the following example, replace $AWS_BEARER_TOKEN_BEDROCK with your actual API key. The following example calls the OpenAI List batches API in us-west-2 and specifies a limit of 2 results to return.
curl -X GET 'https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1/batches?limit=2' \
  -H 'Authorization: Bearer $AWS_BEARER_TOKEN_BEDROCK'
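The List batches response is cursor-paginated: each page indicates whether more results remain, and you pass the ID of the last returned item as the after parameter to fetch the next page. A generic sketch of that loop follows; list_page is a hypothetical stand-in for a callable that invokes the API, and the page shape ("data", "has_more") follows the OpenAI List batches response documented above.

```python
def list_all_batches(list_page):
    """Collect every batch job by following cursor-based pagination."""
    jobs = []
    cursor = None
    while True:
        page = list_page(after=cursor)
        jobs.extend(page["data"])
        if not page.get("has_more"):
            return jobs
        # The next page starts after the last item on this one.
        cursor = page["data"][-1]["id"]
```

Note that the OpenAI Python SDK can handle this for you: iterating over the object returned by client.batches.list() fetches subsequent pages automatically.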
Cancel an OpenAI batch job
For details about the OpenAI Cancel batch API request and response, refer to Cancel batch. The response returns information about the cancelled batch job.
When you make the request, you specify the ID of the batch job that you want to cancel.
To see examples of using the OpenAI Cancel batch API with different methods, choose the tab for your preferred method, and then follow the steps:
- OpenAI SDK (Python)

  To cancel a batch job with the OpenAI SDK, do the following:

  1. Import the OpenAI SDK and set up the client with the following fields:
     - base_url – Append /openai/v1 to the Amazon Bedrock Runtime endpoint, as in the following format: https://${bedrock-runtime-endpoint}/openai/v1
     - api_key – Specify an Amazon Bedrock API key.
     - default_headers – If you need to include any headers, you can include them as key-value pairs in this object. You can alternatively specify headers in the extra_headers field when making a specific API call.
  2. Use the batches.cancel() method with the client and specify the ID of the batch job to cancel.
Before running the following example, replace $AWS_BEARER_TOKEN_BEDROCK with your actual API key and batch_abc123 with the actual ID of your batch job. The example calls the OpenAI Cancel batch API in us-west-2 on a batch job whose ID is batch_abc123.
from openai import OpenAI

client = OpenAI(
    base_url="https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1",
    api_key="$AWS_BEARER_TOKEN_BEDROCK"  # Replace with actual API key
)

job = client.batches.cancel(batch_id="batch_abc123")  # Replace with actual ID
print(job)
- HTTP request

  To cancel a batch job with a direct HTTP request, do the following:

  1. Use the POST method and specify the URL by appending /openai/v1/batches/${batch_id}/cancel to the Amazon Bedrock Runtime endpoint, as in the following format: https://${bedrock-runtime-endpoint}/openai/v1/batches/batch_abc123/cancel
  2. Specify your AWS credentials or an Amazon Bedrock API key in the Authorization header.
Before running the following example, first replace the placeholders in the following fields:

- Authorization – Replace $AWS_BEARER_TOKEN_BEDROCK with your actual API key.
- batch_abc123 – In the path, replace this value with the actual ID of your batch job.
The following example calls the OpenAI Cancel batch API in us-west-2 on a batch job whose ID is batch_abc123.

curl -X POST 'https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1/batches/batch_abc123/cancel' \
  -H 'Authorization: Bearer $AWS_BEARER_TOKEN_BEDROCK'