OpenAI models
OpenAI offers the following open-weight models: gpt-oss-20b and gpt-oss-120b.
The following table summarizes information about the models:
Information | gpt-oss-20b | gpt-oss-120b
--- | --- | ---
Release date | August 5, 2025 | August 5, 2025
Model ID | openai.gpt-oss-20b-1:0 | openai.gpt-oss-120b-1:0
Product ID | N/A | N/A
Supported input modalities | Text | Text
Supported output modalities | Text | Text
Context window | 128,000 tokens | 128,000 tokens
These OpenAI models support the features demonstrated in the examples that follow, including the Create chat completion API, the InvokeModel and Converse APIs, guardrails, and batch inference.
OpenAI request body
For information about the parameters in the request body and their descriptions, see Create chat completion in the OpenAI documentation.
Use the request body fields according to the descriptions in that documentation.
OpenAI response body
The response body for OpenAI models conforms to the chat completion object returned by OpenAI. For more information about the response fields, see The chat completion object in the OpenAI documentation.
If you use InvokeModel, the model reasoning, enclosed in <reasoning> tags, precedes the text content of the response.
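If your application should display only the final answer, you can split the reasoning from the text content. A minimal sketch in Python, assuming the text content begins with a <reasoning>...</reasoning> block as described above:

import re

def split_reasoning(text):
    # Separate a leading <reasoning> block from the final answer text.
    match = re.match(r"<reasoning>(.*?)</reasoning>\s*(.*)", text, re.DOTALL)
    if match:
        return match.group(1).strip(), match.group(2).strip()
    return None, text

reasoning, answer = split_reasoning(
    "<reasoning>The user greeted me.</reasoning>Hello! How can I help?"
)
print(answer)  # Hello! How can I help?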
Example usage of OpenAI models
This section provides some examples of how to use the OpenAI models.
Before trying these examples, check that you meet the following prerequisites:
- Authentication – You can authenticate with either your AWS credentials or an Amazon Bedrock API key. Set up your AWS credentials or generate an Amazon Bedrock API key to authenticate your request (see the sketch after this list). If you use the OpenAI Chat Completions API, you can authenticate only with an Amazon Bedrock API key.
- Endpoint – Find the endpoint that corresponds to the AWS Region to use in Amazon Bedrock Runtime endpoints and quotas (https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-rt). If you use an AWS SDK, you might only need to specify the Region code, rather than the whole endpoint, when you set up the client. You must use an endpoint associated with a Region supported by the model used in the example.
- Model access – Request access to the OpenAI models. For more information, see Add or remove access to Amazon Bedrock foundation models.
- (If the example uses an SDK) Install the SDK – After installation, set up your default credentials and a default AWS Region. If you don't set up default credentials or a Region, you must specify them explicitly in the relevant code examples. For more information about standardized credential providers, see AWS SDKs and Tools standardized credential providers. If you use the OpenAI SDK, you can authenticate only with an Amazon Bedrock API key, and you must explicitly set the Amazon Bedrock endpoint.
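For example, with the OpenAI SDK you can read the Amazon Bedrock API key from an environment variable instead of hardcoding it; a minimal sketch, using the AWS_BEARER_TOKEN_BEDROCK variable referenced in the examples below:

import os
from openai import OpenAI

# Read the Amazon Bedrock API key from the environment and point the
# client at the Amazon Bedrock Runtime OpenAI-compatible endpoint.
client = OpenAI(
    base_url="https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1",
    api_key=os.environ["AWS_BEARER_TOKEN_BEDROCK"],
)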
Expand the section for the example that you want to see.
To see examples of using the OpenAI Create chat completion API, choose the tab for your preferred method, and then follow the steps:
- OpenAI SDK (Python)
The following Python script calls the Create chat completion API with the OpenAI Python SDK:
from openai import OpenAI

client = OpenAI(
    base_url="https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1",
    api_key="$AWS_BEARER_TOKEN_BEDROCK"  # Replace with your Amazon Bedrock API key
)

completion = client.chat.completions.create(
    model="openai.gpt-oss-20b-1:0",
    messages=[
        {
            "role": "developer",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",
            "content": "Hello!"
        }
    ]
)

print(completion.choices[0].message)
- HTTP request using curl
You can call the Create chat completion API with curl by running the following command in a terminal:
curl -X POST https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AWS_BEARER_TOKEN_BEDROCK" \
  -d '{
    "model": "openai.gpt-oss-20b-1:0",
    "messages": [
        {
            "role": "developer",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",
            "content": "Hello!"
        }
    ]
  }'
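The endpoint also accepts streaming requests. A minimal sketch with the OpenAI Python SDK, assuming the model accepts the stream parameter (the stream field also appears in the InvokeModel request bodies below):

from openai import OpenAI

client = OpenAI(
    base_url="https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1",
    api_key="$AWS_BEARER_TOKEN_BEDROCK"  # Replace with your Amazon Bedrock API key
)

# Request a streamed chat completion and print text deltas as they arrive.
stream = client.chat.completions.create(
    model="openai.gpt-oss-20b-1:0",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
print()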
To invoke the model directly with the InvokeModel API, choose the tab for your preferred method, and then follow the steps:
- Python
import boto3
import json

# Initialize the Bedrock Runtime client
client = boto3.client('bedrock-runtime')

# Model ID
model_id = 'openai.gpt-oss-20b-1:0'

# Create the request body
native_request = {
    "model": model_id,  # You can omit this field
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "assistant",
            "content": "Hello! How can I help you today?"
        },
        {
            "role": "user",
            "content": "What is the weather like today?"
        }
    ],
    "max_completion_tokens": 150,
    "temperature": 0.7,
    "top_p": 0.9,
    "stream": False  # You can omit this field
}

# Make the InvokeModel request
response = client.invoke_model(
    modelId=model_id,
    body=json.dumps(native_request)
)

# Parse and print the message for each choice in the chat completion
response_body = json.loads(response['body'].read().decode('utf-8'))
for choice in response_body['choices']:
    print(choice['message']['content'])
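The request body above sets "stream": False. To receive tokens incrementally instead, you can use the streaming variant of the API; a minimal sketch, assuming the service streams chat-completion-style chunks for these models:

# Make a streaming request with InvokeModelWithResponseStream,
# reusing the request body from above with streaming enabled.
streaming_response = client.invoke_model_with_response_stream(
    modelId=model_id,
    body=json.dumps({**native_request, "stream": True})
)

# Each event carries a JSON chunk; print the incremental text deltas.
for event in streaming_response["body"]:
    chunk = json.loads(event["chunk"]["bytes"])
    for choice in chunk.get("choices", []):
        print(choice.get("delta", {}).get("content", "") or "", end="")
print()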
When you use the unified Converse API, you must map the OpenAI Create chat completion fields to the corresponding fields in the Converse request body.
For example, compare the following chat completion request body with its corresponding Converse request body:
- Create chat completion request body

{
    "model": "openai.gpt-oss-20b-1:0",
    "messages": [
        {
            "role": "developer",
            "content": "You are a helpful assistant."
        },
        {
            "role": "assistant",
            "content": "Hello! How can I help you today?"
        },
        {
            "role": "user",
            "content": "What is the weather like today?"
        }
    ],
    "max_completion_tokens": 150,
    "temperature": 0.7
}
- Converse request body

{
    "messages": [
        {
            "role": "assistant",
            "content": [
                {
                    "text": "Hello! How can I help you today?"
                }
            ]
        },
        {
            "role": "user",
            "content": [
                {
                    "text": "What is the weather like today?"
                }
            ]
        }
    ],
    "system": [
        {
            "text": "You are a helpful assistant."
        }
    ],
    "inferenceConfig": {
        "maxTokens": 150,
        "temperature": 0.7
    }
}
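If you automate this mapping, a small helper can translate one format to the other. A hypothetical sketch (to_converse_request is not an official utility; it assumes simple text-only messages like those above):

# Hypothetical helper: map an OpenAI Create chat completion body to
# Converse request fields. Assumes text-only message content.
def to_converse_request(chat_request):
    system = []
    messages = []
    for message in chat_request["messages"]:
        if message["role"] in ("system", "developer"):
            # Chat completion system/developer messages map to the
            # Converse top-level system field.
            system.append({"text": message["content"]})
        else:
            messages.append({
                "role": message["role"],
                "content": [{"text": message["content"]}],
            })
    return {
        "messages": messages,
        "system": system,
        "inferenceConfig": {
            "maxTokens": chat_request.get("max_completion_tokens"),
            "temperature": chat_request.get("temperature"),
        },
    }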
To send a request with the Converse API, choose the tab for your preferred method, and then follow the steps:
- Python
# Use the Converse API to send a text message to an OpenAI model.
import boto3
from botocore.exceptions import ClientError

# Initialize the Bedrock Runtime client
client = boto3.client("bedrock-runtime")

# Set the model ID
model_id = "openai.gpt-oss-20b-1:0"

# Set up messages and system message
messages = [
    {
        "role": "assistant",
        "content": [
            {
                "text": "Hello! How can I help you today?"
            }
        ]
    },
    {
        "role": "user",
        "content": [
            {
                "text": "What is the weather like today?"
            }
        ]
    }
]

system = [
    {
        "text": "You are a helpful assistant."
    }
]

try:
    # Send the message to the model, using a basic inference configuration.
    response = client.converse(
        modelId=model_id,
        messages=messages,
        system=system,
        inferenceConfig={
            "maxTokens": 150,
            "temperature": 0.7,
            "topP": 0.9
        },
    )

    # Extract and print the response text.
    for content_block in response["output"]["message"]["content"]:
        print(content_block)

except (ClientError, Exception) as e:
    print(f"ERROR: Can't invoke '{model_id}'. Reason: {e}")
    exit(1)
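For token-by-token output with the same message setup, the Converse API has a streaming counterpart, ConverseStream; a minimal sketch reusing the variables above:

# Stream the response with ConverseStream, reusing messages and system.
streaming_response = client.converse_stream(
    modelId=model_id,
    messages=messages,
    system=system,
    inferenceConfig={"maxTokens": 150, "temperature": 0.7},
)

# Print text deltas as they arrive.
for event in streaming_response["stream"]:
    if "contentBlockDelta" in event:
        print(event["contentBlockDelta"]["delta"].get("text", ""), end="")
print()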
Apply a guardrail when you run model invocation by specifying the guardrail ID, the guardrail version, and whether to enable the guardrail trace in the headers of the model invocation request.
Choose the tab for your preferred method, and then follow the steps:
- Python
import boto3
from botocore.exceptions import ClientError
import json

# Initiate the Amazon Bedrock Runtime client
bedrock_runtime = boto3.client("bedrock-runtime")

# Model ID
model_id = "openai.gpt-oss-20b-1:0"

# Replace with actual values from your guardrail
guardrail_id = "GR12345"
guardrail_version = "DRAFT"

# Create the request body
native_request = {
    "model": model_id,  # You can omit this field
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "assistant",
            "content": "Hello! How can I help you today?"
        },
        {
            "role": "user",
            "content": "What is the weather like today?"
        }
    ],
    "max_completion_tokens": 150,
    "temperature": 0.7,
    "top_p": 0.9,
    "stream": False  # You can omit this field
}

try:
    response = bedrock_runtime.invoke_model(
        modelId=model_id,
        body=json.dumps(native_request),
        guardrailIdentifier=guardrail_id,
        guardrailVersion=guardrail_version,
        trace='ENABLED',
    )

    response_body = json.loads(response.get('body').read())
    print("Received response from InvokeModel API (Request Id: {})".format(response['ResponseMetadata']['RequestId']))
    print(json.dumps(response_body, indent=2))
except ClientError as err:
    print("RequestId = " + err.response['ResponseMetadata']['RequestId'])
    raise err
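With the trace enabled, the response body also reports what the guardrail did. A minimal sketch of inspecting it, assuming the amazon-bedrock-guardrailAction and amazon-bedrock-trace fields documented for InvokeModel with guardrails are present:

# Inspect the guardrail outcome in the parsed response body.
action = response_body.get("amazon-bedrock-guardrailAction")
if action == "INTERVENED":
    print("Guardrail intervened. Trace:")
    print(json.dumps(response_body.get("amazon-bedrock-trace", {}), indent=2))
else:
    print(f"Guardrail action: {action}")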
To see examples of using guardrails with OpenAI chat completions, choose the tab for your preferred method, and then follow the steps:
- OpenAI SDK (Python)
import openai
from openai import OpenAIError

# Endpoint for Amazon Bedrock Runtime
bedrock_endpoint = "https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1"

# Model ID
model_id = "openai.gpt-oss-20b-1:0"

# Replace with actual values
bedrock_api_key = "$AWS_BEARER_TOKEN_BEDROCK"
guardrail_id = "GR12345"
guardrail_version = "DRAFT"

client = openai.OpenAI(
    api_key=bedrock_api_key,
    base_url=bedrock_endpoint,
)

try:
    response = client.chat.completions.create(
        model=model_id,
        # Specify guardrail information in the header
        extra_headers={
            "X-Amzn-Bedrock-GuardrailIdentifier": guardrail_id,
            "X-Amzn-Bedrock-GuardrailVersion": guardrail_version,
            "X-Amzn-Bedrock-Trace": "ENABLED",
        },
        # Additional guardrail information can be specified in the body
        extra_body={
            "amazon-bedrock-guardrailConfig": {
                "tagSuffix": "xyz"  # Used for input tagging
            }
        },
        messages=[
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "assistant",
                "content": "Hello! How can I help you today?"
            },
            {
                "role": "user",
                "content": "What is the weather like today?"
            }
        ]
    )

    request_id = response._request_id
    print(f"Request ID: {request_id}")
    print(response)

except OpenAIError as e:
    print(f"An error occurred: {e}")
    if hasattr(e, 'response') and e.response is not None:
        request_id = e.response.headers.get("x-request-id")
        print(f"Request ID: {request_id}")
- OpenAI SDK (Java)
import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.core.JsonValue;
import com.openai.core.http.HttpResponseFor;
import com.openai.models.chat.completions.ChatCompletion;
import com.openai.models.chat.completions.ChatCompletionCreateParams;

import java.util.Map;

// Endpoint for Amazon Bedrock Runtime
String bedrockEndpoint = "https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1";

// Model ID
String modelId = "openai.gpt-oss-20b-1:0";

// Replace with actual values
String bedrockApiKey = "$AWS_BEARER_TOKEN_BEDROCK";
String guardrailId = "GR12345";
String guardrailVersion = "DRAFT";

OpenAIClient client = OpenAIOkHttpClient.builder()
        .apiKey(bedrockApiKey)
        .baseUrl(bedrockEndpoint)
        .build();

ChatCompletionCreateParams request = ChatCompletionCreateParams.builder()
        .addUserMessage("What is the temperature in Seattle?")
        .model(modelId)
        // Specify additional headers for the guardrail
        .putAdditionalHeader("X-Amzn-Bedrock-GuardrailIdentifier", guardrailId)
        .putAdditionalHeader("X-Amzn-Bedrock-GuardrailVersion", guardrailVersion)
        // Specify additional body parameters for the guardrail
        .putAdditionalBodyProperty(
                "amazon-bedrock-guardrailConfig",
                JsonValue.from(Map.of("tagSuffix", JsonValue.of("xyz"))) // Allows input tagging
        )
        .build();

HttpResponseFor<ChatCompletion> rawChatCompletionResponse =
        client.chat().completions().withRawResponse().create(request);

final ChatCompletion chatCompletion = rawChatCompletionResponse.parse();
System.out.println(chatCompletion);
Batch inference lets you run model inference asynchronously with multiple prompts. To run batch inference with an OpenAI model, do the following:
- Create a JSONL file and populate it with at least the minimum number of JSON objects, each separated by a new line. Each modelInput object must conform to the format of the OpenAI Create chat completion request body. The following shows an example of the first two lines of a JSONL file containing OpenAI request bodies:
{
    "recordId": "RECORD1",
    "modelInput": {
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Can you generate a question with a factual answer?"
            }
        ],
        "max_completion_tokens": 1000
    }
}
{
    "recordId": "RECORD2",
    "modelInput": {
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "What is the weather like today?"
            }
        ],
        "max_completion_tokens": 1000
    }
}
...
The model field is optional; if you omit it, the batch inference service inserts it for you based on the header.
Check that your JSONL file conforms to the batch inference quotas, as described in Format and upload your batch inference data.
- Upload the file to an Amazon S3 bucket.
- Send a CreateModelInvocationJob request to an Amazon Bedrock control plane endpoint, specifying the S3 bucket from the previous step in the inputDataConfig field and the OpenAI model in the modelId field.
For an end-to-end code example, see Code samples for batch inference, substituting the appropriate configurations for the OpenAI models. A minimal sketch of the job submission follows.
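The sketch below submits the job with the AWS SDK for Python; the job name, role ARN, and bucket URIs are placeholders you must replace:

import boto3

# Use the Amazon Bedrock control plane client (not bedrock-runtime).
bedrock = boto3.client("bedrock")

# Placeholder values; replace with your own job name, role, and buckets.
response = bedrock.create_model_invocation_job(
    jobName="my-openai-batch-job",
    roleArn="arn:aws:iam::123456789012:role/MyBatchInferenceRole",
    modelId="openai.gpt-oss-20b-1:0",
    inputDataConfig={
        "s3InputDataConfig": {
            "s3Uri": "s3://amzn-s3-demo-bucket/input/batch.jsonl"
        }
    },
    outputDataConfig={
        "s3OutputDataConfig": {
            "s3Uri": "s3://amzn-s3-demo-bucket/output/"
        }
    },
)
print(response["jobArn"])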