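Advanced API features for imported models
This page provides detailed examples of the advanced features available for models imported after November 11, 2025. These features include structured outputs for controlled generation, enhanced vision support for multi-image processing, log probabilities for confidence insights, and tool calling for GPT-OSS models.
Structured outputs
Structured outputs constrain generation to a specific format, schema, or pattern. This ensures that the model's responses adhere to predefined constraints, which makes the feature well suited to applications that need consistent data formats, API integrations, or automated processing pipelines.
Structured outputs on Custom Model Import are supported through two parameters:
response_format - supports the json_object and json_schema types
structured_outputs - supports the json, regex, choice, and grammar types
When using structured outputs on Custom Model Import, expect a performance trade-off because constraints are validated during generation. Simple constraints such as choice and json_object have minimal impact, while complex constraints such as json_schema and grammar can significantly increase latency and reduce throughput. For best performance, use the simpler constraint types where possible and keep schemas flat rather than deeply nested.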
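To make the advice about flat schemas concrete, the following is a minimal sketch of a client-side check that counts how many container levels (objects and arrays) a JSON Schema has. The helper name `schema_depth` and both sample schemas are illustrative assumptions, not part of the Custom Model Import API, and the helper does not follow `$ref` or `$defs` entries.

```python
def schema_depth(schema: dict) -> int:
    """Count nested container levels (objects and arrays) in a JSON Schema dict.

    Hypothetical helper for illustration only; it ignores $ref/$defs,
    so referenced sub-schemas are not followed.
    """
    # Each property of an object holds a sub-schema.
    children = list(schema.get("properties", {}).values())
    # An array's "items" entry holds a sub-schema in the common single-schema case.
    items = schema.get("items")
    if isinstance(items, dict):
        children.append(items)
    if not children:
        return 0  # a scalar leaf such as {"type": "string"}
    return 1 + max(schema_depth(child) for child in children)


# A flat schema: one object whose properties are all scalars.
flat_schema = {
    "type": "object",
    "properties": {
        "brand": {"type": "string"},
        "model": {"type": "string"},
    },
}

# A nested schema: the same kind of data wrapped in two extra object levels.
nested_schema = {
    "type": "object",
    "properties": {
        "car": {
            "type": "object",
            "properties": {
                "engine": {
                    "type": "object",
                    "properties": {"cylinders": {"type": "integer"}},
                }
            },
        }
    },
}

print(schema_depth(flat_schema), schema_depth(nested_schema))
```

A check like this could run before a request is sent, for example to log a warning when a schema grows deeper than a limit you choose for your own latency budget.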
The following examples demonstrate structured output support across the different API formats. The Pydantic model is defined as:
from enum import Enum

from pydantic import BaseModel


class CarType(str, Enum):
    sedan = "sedan"
    suv = "SUV"
    truck = "Truck"
    coupe = "Coupe"


class CarDescription(BaseModel):
    brand: str
    model: str
    car_type: CarType
- BedrockCompletion
BedrockCompletion supports structured outputs only through the response_format parameter, with the json_object and json_schema types.
Example: JSON schema
import json

import boto3

# Bedrock runtime client reused by the examples on this page
client = boto3.client('bedrock-runtime', region_name='us-east-1')

payload = {
    "prompt": "Generate a JSON with the brand, model and car_type of the most iconic car from the 90's",
    "response_format": {
        "type": "json_schema",
        "json_schema": CarDescription.model_json_schema()
    }
}

response = client.invoke_model(
    modelId='your-model-arn',
    body=json.dumps(payload),
    accept='application/json',
    contentType='application/json'
)
response_body = json.loads(response['body'].read())
Example response:
{
    "generation": "{\n \"brand\": \"Ferrari\",\n \"model\": \"F40\",\n \"car_type\": \"SUV\"\n }",
    "prompt_token_count": 22,
    "generation_token_count": 30,
    "stop_reason": "stop",
    "logprobs": null
}
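Because the `generation` field arrives as a JSON-encoded string, a caller still has to decode it before using the fields. The following is a minimal sketch of that step using only the standard library. The names `parse_car`, `REQUIRED_FIELDS`, and `ALLOWED_CAR_TYPES` are hypothetical, and the allowed values are copied by hand from the CarType enum above.

```python
import json

# Values of the CarType enum defined earlier, copied by hand (an assumption
# made so that this sketch runs without Pydantic installed).
ALLOWED_CAR_TYPES = {"sedan", "SUV", "Truck", "Coupe"}
REQUIRED_FIELDS = ("brand", "model", "car_type")


def parse_car(generation: str) -> dict:
    """Decode the `generation` string and check it against the expected fields."""
    car = json.loads(generation)
    missing = [field for field in REQUIRED_FIELDS if field not in car]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    if car["car_type"] not in ALLOWED_CAR_TYPES:
        raise ValueError(f"unexpected car_type: {car['car_type']!r}")
    return car


# The `generation` value from the example response above.
example_generation = "{\n \"brand\": \"Ferrari\",\n \"model\": \"F40\",\n \"car_type\": \"SUV\"\n }"
car = parse_car(example_generation)
print(car["brand"], car["model"], car["car_type"])
```

The constraint guarantees the shape of the output, not its factual accuracy, so a check like this confirms structure only.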
- OpenAICompletion
OpenAICompletion supports both the response_format (json_object, json_schema) and structured_outputs (json, regex, choice, grammar) parameters. Use max_tokens instead of max_gen_len to route the request to OpenAICompletion.
Example: structured outputs with choice
payload = {
    "prompt": "Classify the sentiment of this sentence. Amazon Bedrock CMI is Amazing!",
    "max_tokens": 10,
    "structured_outputs": {
        "choice": ["positive", "negative"]
    }
}

response = client.invoke_model(
    modelId='your-model-arn',
    body=json.dumps(payload),
    accept='application/json',
    contentType='application/json'
)
response_body = json.loads(response['body'].read())
Example response:
{
    "id": "cmpl-01f94c4652d24870bbb4d5418a01c384",
    "object": "text_completion",
    "choices": [
        {
            "index": 0,
            "text": "positive",
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 17,
        "completion_tokens": 4
    }
}
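With a choice constraint, the completion text should be one of the listed options, so it can be read directly as a label. The following is a minimal sketch of extracting that label and double-checking it on the client side. The helper name `extract_choice` is hypothetical, and the response dictionary is a trimmed copy of the example above.

```python
def extract_choice(response_body: dict, allowed: list) -> str:
    """Return the completion text if it is one of the allowed choices."""
    text = response_body["choices"][0]["text"].strip()
    if text not in allowed:
        # Defensive check only; the constraint is expected to prevent this.
        raise ValueError(f"unexpected label: {text!r}")
    return text


# Trimmed copy of the example response above.
example_response = {
    "object": "text_completion",
    "choices": [{"index": 0, "text": "positive", "finish_reason": "stop"}],
    "usage": {"prompt_tokens": 17, "completion_tokens": 4},
}

label = extract_choice(example_response, ["positive", "negative"])
print(label)
```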
- OpenAIChatCompletion
OpenAIChatCompletion supports both the response_format (json_object, json_schema) and structured_outputs (json, regex, choice, grammar) parameters.
Example: response format with JSON schema
payload = {
    "messages": [
        {"role": "user", "content": "Generate a JSON with the brand, model and car_type of the most iconic car from the 90's"}
    ],
    "max_tokens": 100,
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "car-description",
            "schema": CarDescription.model_json_schema()
        }
    }
}

response = client.invoke_model(
    modelId='your-model-arn',
    body=json.dumps(payload),
    accept='application/json',
    contentType='application/json'
)
response_body = json.loads(response['body'].read())
Example response:
{
    "id": "chatcmpl-cae5a43b0a924b8eb434510cbf978a19",
    "object": "chat.completion",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "{\"brand\": \"Dodge\", \"model\": \"Viper\", \"car_type\": \"Coupe\"}"
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 56,
        "completion_tokens": 23
    }
}
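The examples on this page pass the schema in two slightly different shapes: the BedrockCompletion example places it directly under `json_schema`, while the OpenAIChatCompletion example wraps it in an object with `name` and `schema` keys. The following sketch puts the two shapes side by side. The helper names are hypothetical, and `CAR_SCHEMA` is a hand-written approximation of the CarDescription schema rather than the exact output of `model_json_schema()`.

```python
# Hand-written approximation of the CarDescription schema (an assumption:
# Pydantic's generated schema differs in detail, for example it uses $defs).
CAR_SCHEMA = {
    "type": "object",
    "properties": {
        "brand": {"type": "string"},
        "model": {"type": "string"},
        "car_type": {"type": "string", "enum": ["sedan", "SUV", "Truck", "Coupe"]},
    },
    "required": ["brand", "model", "car_type"],
}

PROMPT = "Generate a JSON with the brand, model and car_type of the most iconic car from the 90's"


def bedrock_response_format(schema: dict) -> dict:
    """Shape used by the BedrockCompletion example: schema directly under json_schema."""
    return {"type": "json_schema", "json_schema": schema}


def chat_response_format(name: str, schema: dict) -> dict:
    """Shape used by the OpenAIChatCompletion example: schema wrapped with a name."""
    return {"type": "json_schema", "json_schema": {"name": name, "schema": schema}}


bedrock_payload = {
    "prompt": PROMPT,
    "response_format": bedrock_response_format(CAR_SCHEMA),
}
chat_payload = {
    "messages": [{"role": "user", "content": PROMPT}],
    "max_tokens": 100,
    "response_format": chat_response_format("car-description", CAR_SCHEMA),
}
```

Keeping the wrapping in small helpers like these makes it harder to send one format's shape to the other by mistake.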
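Vision support
Vision capabilities process images alongside text input, with enhanced multi-image support for complex visual analysis tasks. Custom Model Import now supports up to 3 images per request, up from the previous limit of a single image.
Supported API: OpenAIChatCompletion only. All models imported after November 11, 2025 have vision capabilities enabled by default for this API.
Image requirements:
High-resolution images significantly increase processing time and memory usage, and can cause timeout errors. Multiple high-resolution images compound the performance impact exponentially. For best performance, resize images appropriately and use a lower detail level where possible.
- OpenAIChatCompletion
Example: multi-image processing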
import base64
import json

import boto3

client = boto3.client('bedrock-runtime', region_name='us-east-1')

# Load and encode images
with open('/path/to/car_image_1.jpg', 'rb') as f:
    image_data_1 = base64.b64encode(f.read()).decode('utf-8')
with open('/path/to/car_image_2.jpg', 'rb') as f:
    image_data_2 = base64.b64encode(f.read()).decode('utf-8')

payload = {
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant that can analyze images."
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Spot the difference between the two images?"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{image_data_1}"
                    }
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{image_data_2}"
                    }
                }
            ]
        }
    ],
    "max_tokens": 300,
    "temperature": 0.5
}

response = client.invoke_model(
    modelId='your-model-arn',
    body=json.dumps(payload),
    accept='application/json',
    contentType='application/json'
)
response_body = json.loads(response['body'].read())
Example response:
{
    "id": "chatcmpl-ccae8a67e62f4014a9ffcbedfff96f44",
    "object": "chat.completion",
    "created": 1763167018,
    "model": "667387627229-g6vkuhd609s4",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "There are no differences between the two images provided. They appear to be identical.",
                "refusal": null,
                "annotations": null,
                "audio": null,
                "function_call": null,
                "tool_calls": [],
                "reasoning_content": null
            },
            "logprobs": null,
            "finish_reason": "stop",
            "stop_reason": null,
            "token_ids": null
        }
    ],
    "service_tier": null,
    "system_fingerprint": null,
    "usage": {
        "prompt_tokens": 2795,
        "total_tokens": 2812,
        "completion_tokens": 17,
        "prompt_tokens_details": null
    },
    "prompt_logprobs": null,
    "prompt_token_ids": null,
    "kv_transfer_params": null
}
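The user message in the request above can be assembled by a small helper that also enforces the limit of 3 images per request before anything is sent. This is a minimal sketch. The names `build_image_message` and `MAX_IMAGES` are hypothetical, and the helper assumes every image shares one MIME type.

```python
import base64

MAX_IMAGES = 3  # per-request image limit stated in this section


def build_image_message(prompt: str, images: list, mime: str = "image/jpeg") -> dict:
    """Build a user message with one text part and up to MAX_IMAGES image parts.

    `images` is a list of raw image bytes; each one is base64-encoded into a
    data URL, mirroring the request example above.
    """
    if not images:
        raise ValueError("at least one image is required")
    if len(images) > MAX_IMAGES:
        raise ValueError(f"at most {MAX_IMAGES} images per request, got {len(images)}")
    content = [{"type": "text", "text": prompt}]
    for raw in images:
        encoded = base64.b64encode(raw).decode("utf-8")
        content.append({
            "type": "image_url",
            "image_url": {"url": f"data:{mime};base64,{encoded}"},
        })
    return {"role": "user", "content": content}


# Placeholder bytes stand in for real image files in this sketch.
message = build_image_message(
    "Spot the difference between the two images?",
    [b"abc", b"abc"],
)
print(len(message["content"]))
```

Failing fast on the client avoids a round trip for a request that exceeds the documented limit.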
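Log probabilities
Log probabilities represent the likelihood of each token in a sequence. They are calculated as log(p), where p is the probability of the token at a given position conditioned on the preceding tokens in the context. Because log probabilities are additive, the log probability of a sequence equals the sum of its individual token log probabilities, which makes them useful for ranking generations by average per-token score. Custom Model Import always returns the raw logprob values for the requested tokens.
Key applications include classification tasks, where log probabilities enable custom confidence thresholds; retrieval question-answering systems that use confidence scores to reduce hallucinations; autocomplete suggestions based on token likelihood; and perplexity calculations for comparing model performance across prompts. Log probabilities also provide token-level analysis, letting developers inspect the alternative tokens the model considered.
Logprobs are not cached. For requests that need prompt logprobs, the system ignores the prefix cache and recomputes the prefill for the full prompt to produce the logprobs. This is a notable performance trade-off when using logprobs.
Log probability support varies by API format:
BedrockCompletion - output tokens only
OpenAICompletion - prompt and output tokens
OpenAIChatCompletion - prompt and output tokens
- BedrockCompletion
BedrockCompletion supports output token logprobs only. It returns the top-1 logprob for each output token.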
payload = {
    "prompt": "How is the rainbow formed?",
    "max_gen_len": 10,
    "temperature": 0.5,
    "return_logprobs": True
}

response = client.invoke_model(
    modelId='your-model-arn',
    body=json.dumps(payload),
    accept='application/json',
    contentType='application/json'
)
response_body = json.loads(response['body'].read())
Example response (truncated):
{
    "generation": " A rainbow is formed when sunlight passes through water dro",
    "prompt_token_count": 7,
    "generation_token_count": 10,
    "stop_reason": "length",
    "logprobs": [
        {
            "362": -2.1413702964782715
        },
        {
            "48713": -0.8180374503135681
        },
        {
            "374": -0.09657637774944305
        },
        ...
    ]
}
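The additive property of log probabilities can be applied directly to a response in this format. The following is a minimal sketch that sums the per-token values, averages them, and converts the average to a perplexity as exp(-mean). The helper name `summarize_logprobs` is hypothetical, the input holds only the three entries visible in the truncated example above, and the keys are treated as opaque identifiers.

```python
import math


def summarize_logprobs(logprobs: list) -> dict:
    """Summarize a list of single-entry {key: logprob} dicts.

    Returns the sequence log probability (the sum), the mean per-token
    log probability, and the perplexity exp(-mean).
    """
    values = [value for entry in logprobs for value in entry.values()]
    if not values:
        raise ValueError("no logprobs to summarize")
    total = sum(values)
    mean = total / len(values)
    return {
        "sequence_logprob": total,
        "mean_logprob": mean,
        "perplexity": math.exp(-mean),
    }


# The three entries visible in the truncated example response above.
example_logprobs = [
    {"362": -2.1413702964782715},
    {"48713": -0.8180374503135681},
    {"374": -0.09657637774944305},
]

summary = summarize_logprobs(example_logprobs)
print(summary)
```

Because the mean normalizes for length, it is the quantity to compare when ranking generations that contain different numbers of tokens.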
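- OpenAIChatCompletion
OpenAIChatCompletion supports both prompt and output token logprobs. You can set top_logprobs=N and prompt_logprobs=N, where N is an integer giving the number of most likely tokens for which log probabilities are returned at each position.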
payload = {
    "messages": [
        {
            "role": "user",
            "content": "How is the rainbow formed?"
        }
    ],
    "max_tokens": 10,
    "temperature": 0.5,
    "logprobs": True,
    "top_logprobs": 1,
    "prompt_logprobs": 1
}

response = client.invoke_model(
    modelId='your-model-arn',
    body=json.dumps(payload),
    accept='application/json',
    contentType='application/json'
)
response_body = json.loads(response['body'].read())
Example response (truncated):
{
    "id": "chatcmpl-xxx",
    "object": "chat.completion",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "A rainbow is formed..."
            },
            "logprobs": {
                "content": [
                    {
                        "token": "A",
                        "logprob": -0.07903262227773666,
                        "bytes": [65],
                        "top_logprobs": [
                            {
                                "token": "A",
                                "logprob": -0.07903262227773666,
                                "bytes": [65]
                            }
                        ]
                    },
                    {
                        "token": " rainbow",
                        "logprob": -0.20187227427959442,
                        "bytes": [32, 114, 97, 105, 110, 98, 111, 119],
                        "top_logprobs": [...]
                    },
                    ...
                ]
            },
            "finish_reason": "length"
        }
    ],
    "usage": {
        "prompt_tokens": 41,
        "completion_tokens": 10,
        "total_tokens": 51
    }
}
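One way to use the per-token entries in this response is to convert each logprob back to a probability with exp and flag tokens that fall below a confidence threshold. This is a minimal sketch. The helper name `low_confidence_tokens` and the threshold of 0.9 are illustrative assumptions, and the input contains only the two tokens visible in the truncated example above.

```python
import math


def low_confidence_tokens(response_body: dict, threshold: float) -> list:
    """Return (token, probability) pairs whose probability is below `threshold`."""
    entries = response_body["choices"][0]["logprobs"]["content"]
    flagged = []
    for entry in entries:
        probability = math.exp(entry["logprob"])  # invert log(p) to recover p
        if probability < threshold:
            flagged.append((entry["token"], probability))
    return flagged


# Trimmed copy of the example response above (first two output tokens only).
example_response = {
    "choices": [
        {
            "logprobs": {
                "content": [
                    {"token": "A", "logprob": -0.07903262227773666},
                    {"token": " rainbow", "logprob": -0.20187227427959442},
                ]
            }
        }
    ]
}

flagged = low_confidence_tokens(example_response, threshold=0.9)
print(flagged)
```

A suitable threshold depends on the task, so it is worth calibrating against examples with known answers instead of reusing the value from this sketch.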