DeepSeek models


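DeepSeek's R1 model is a text-to-text model that you can use for inference through the Invoke API (InvokeModel and InvokeModelWithResponseStream) and the Converse API (Converse and ConverseStream).

When you make inference calls with DeepSeek models, you must include a prompt for the model. For general information about creating prompts for the DeepSeek models that Amazon Bedrock supports, see the DeepSeek prompt guide.

Note

You can't remove request access from the Amazon Titan, Amazon Nova, DeepSeek-R1, Mistral AI, and Meta Llama 3 Instruct models. To prevent users from making inference calls to these models, use an IAM policy that specifies the model ID. For more information, see "Deny access for inference of foundation models".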
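
As a rough illustration of the note above, the following sketch builds an IAM policy document that denies inference on a single model. The ARN pattern, the statement ID, and the helper name build_deny_policy are assumptions made for this example, so check them against the IAM topic referenced above before relying on them.

```python
import json

# Assumption: foundation-model ARNs follow the pattern
# arn:aws:bedrock:<region>::foundation-model/<model-id>.
DEEPSEEK_R1_MODEL_ARN = "arn:aws:bedrock:*::foundation-model/deepseek.r1-v1:0"


def build_deny_policy(model_arn):
    """Return an IAM policy document that denies inference on one model."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyDeepSeekR1Inference",  # hypothetical statement ID
                "Effect": "Deny",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                ],
                "Resource": model_arn,
            }
        ],
    }


policy = build_deny_policy(DEEPSEEK_R1_MODEL_ARN)
print(json.dumps(policy, indent=2))
```

The sketch only produces the JSON document. Attaching it to a user, group, or role is a separate step that is not shown here.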

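This section describes the request parameters and response fields for DeepSeek models. Use this information to make inference calls to DeepSeek models with the InvokeModel operation. This section also includes Python code examples that show how to call DeepSeek models.

To use a model in an inference operation, you need its model ID. Because this model is invoked through cross-Region inference, you must use the inference profile ID as the model ID. For example, for the US, use us.deepseek.r1-v1:0.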

Model name: DeepSeek-R1
Text model

For more information about using DeepSeek models with the API, see DeepSeek "Models".

DeepSeek request and response

Request body

DeepSeek has the following inference parameters for a text completion inference call.


{
    "prompt": string,
    "temperature": float, 
    "top_p": float,
    "max_tokens": int,
    "stop": string array
}

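Fields

prompt – (string) Required text input for the prompt.
temperature – (float) A numeric value less than or equal to 1.
top_p – (float) A numeric value less than or equal to 1.
max_tokens – (int) The number of tokens to use, from a minimum of 1 to a maximum of 32,768 tokens.
stop – (string array) A maximum of 10 items.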
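
The limits in the field list can be checked on the client before a request is sent. The following is a minimal sketch of that idea. The helper name build_request_body and its error messages are hypothetical, and because the field list states only an upper bound for temperature and top_p, the sketch checks only that bound.

```python
import json

MAX_TOKENS_LIMIT = 32768  # upper bound for max_tokens from the field list
MAX_STOP_ITEMS = 10       # maximum number of stop items from the field list


def build_request_body(prompt, temperature=None, top_p=None,
                       max_tokens=None, stop=None):
    """Validate the documented limits and return the JSON request body."""
    if not isinstance(prompt, str) or not prompt:
        raise ValueError("prompt is required and must be a non-empty string")
    body = {"prompt": prompt}

    if temperature is not None:
        if temperature > 1:
            raise ValueError("temperature must be 1 or less")
        body["temperature"] = temperature

    if top_p is not None:
        if top_p > 1:
            raise ValueError("top_p must be 1 or less")
        body["top_p"] = top_p

    if max_tokens is not None:
        if not 1 <= max_tokens <= MAX_TOKENS_LIMIT:
            raise ValueError(f"max_tokens must be between 1 and {MAX_TOKENS_LIMIT}")
        body["max_tokens"] = max_tokens

    if stop is not None:
        if len(stop) > MAX_STOP_ITEMS:
            raise ValueError(f"stop accepts at most {MAX_STOP_ITEMS} items")
        body["stop"] = list(stop)

    return json.dumps(body)


print(build_request_body("Hello", temperature=0.5, max_tokens=512))
```

The returned string can be passed as the body argument of invoke_model. Optional fields that are left unset are omitted so that the service defaults apply.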

Response body

DeepSeek has the following response parameters for a text completion inference call. This example is a text completion from DeepSeek and does not return a content reasoning block.


{
    "choices": [
        {
            "text": string,
            "stop_reason": string
        }
    ]
}

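Fields

stop_reason – (string) The reason why the response stopped generating text. The value is stop or length.
- stop – (string) The model finished generating text for the input prompt.
- length – (string) The length of the generated text, in tokens, exceeds the value of max_tokens in the call to InvokeModel (or InvokeModelWithResponseStream, if you are streaming output). The response is truncated to max_tokens. Increase the value of max_tokens and retry the request.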
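
The retry advice for the length stop reason can be sketched as a small helper that inspects the parsed response body. The function name next_max_tokens and the doubling strategy are assumptions chosen for illustration. Only the 32,768 ceiling comes from the field list above.

```python
MAX_TOKENS_LIMIT = 32768  # documented maximum for max_tokens


def next_max_tokens(model_response, current_max_tokens, limit=MAX_TOKENS_LIMIT):
    """Return a larger max_tokens value if any choice was truncated, else None."""
    truncated = any(
        choice.get("stop_reason") == "length"
        for choice in model_response.get("choices", [])
    )
    if not truncated or current_max_tokens >= limit:
        # Either the model finished on its own, or there is no room to grow.
        return None
    # Hypothetical policy: double the budget, capped at the documented limit.
    return min(current_max_tokens * 2, limit)


sample = {"choices": [{"text": "partial answer", "stop_reason": "length"}]}
print(next_max_tokens(sample, 512))
```

A caller would resend the same prompt with the returned value as max_tokens, and stop retrying when the helper returns None.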

Code example

This example shows how to call the model.


# Use the API to send a text message to DeepSeek-R1.

import boto3
import json

from botocore.exceptions import ClientError

# Create a Bedrock Runtime client in the AWS Region of your choice.
client = boto3.client("bedrock-runtime", region_name="us-west-2")

# Set the cross Region inference profile ID for DeepSeek-R1
model_id = "us.deepseek.r1-v1:0"

# Define the prompt for the model.
prompt = "Describe the purpose of a 'hello world' program in one line."

# Embed the prompt in DeepSeek-R1's instruction format.
formatted_prompt = f"""
<｜begin▁of▁sentence｜><｜User｜>{prompt}<｜Assistant｜><think>\n
"""

body = json.dumps({
    "prompt": formatted_prompt,
    "max_tokens": 512,
    "temperature": 0.5,
    "top_p": 0.9,
})

try:
    # Invoke the model with the request.
    response = client.invoke_model(modelId=model_id, body=body)

    # Read the response body.
    model_response = json.loads(response["body"].read())
    
    # Extract choices.
    choices = model_response["choices"]
    
    # Print choices.
    for index, choice in enumerate(choices):
        print(f"Choice {index + 1}\n----------")
        print(f"Text:\n{choice['text']}\n")
        print(f"Stop reason: {choice['stop_reason']}\n")
except (ClientError, Exception) as e:
    print(f"ERROR: Can't invoke '{model_id}'. Reason: {e}")
    exit(1)
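
Because the prompt template in the example ends with an opening <think> tag, the returned text may contain the model's reasoning followed by a closing </think> tag and then the answer. That layout is an assumption about the model's output rather than something this page specifies. Under that assumption, a hypothetical helper such as split_reasoning could separate the two parts:

```python
def split_reasoning(text, closing_tag="</think>"):
    """Split completion text into (reasoning, answer).

    Assumes the reasoning comes first and ends with closing_tag.
    If the tag is missing, reasoning is None and the whole text is returned
    as the answer, because the two parts cannot be told apart.
    """
    reasoning, separator, answer = text.partition(closing_tag)
    if not separator:
        return None, text.strip()
    return reasoning.strip(), answer.strip()


reasoning, answer = split_reasoning(
    "The user wants one line.\n</think>\nIt verifies that a toolchain works."
)
print("Reasoning:", reasoning)
print("Answer:", answer)
```

If a response was cut off with a stop_reason of length, the closing tag may be absent and the returned text may consist entirely of reasoning.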

Converse

Request body - Use this request body example to call the Converse API.


{
    "modelId": string, # us.deepseek.r1-v1:0
    "system": [
        {
            "text": string
        }
    ],
    "messages": [
        {
            "role": string,
            "content": [
                {
                    "text": string
                }
            ]
        }
    ],
    "inferenceConfig": {
        "temperature": float,
        "topP": float,
        "maxTokens": int,
        "stopSequences": string array
    },
    "guardrailConfig": { 
        "guardrailIdentifier":"string",
        "guardrailVersion": "string",
        "trace": "string"
    }
}

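Fields

system – (Optional) The system prompt for the request.
messages – (Required) The input messages.
- role – The role of the conversation turn. Valid values are user and assistant.
- content – (Required) The content of the conversation turn, as an array of objects. Each object contains a type field, for which you can specify one of the following values:
  - text – (Required) If you specify this type, you must include a text field and specify the text prompt as its value.
inferenceConfig
- temperature – (Optional) Values: minimum = 0, maximum = 1.
- topP – (Optional) Values: minimum = 0, maximum = 1.
- maxTokens – (Optional) The maximum number of tokens to generate before stopping. Values: minimum = 0, maximum = 32,768.
- stopSequences – (Optional) Custom text sequences that cause the model to stop generating output. Maximum = 10 items.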
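
The following sketch assembles the keyword arguments for a single-turn Converse call and checks the ranges listed above before anything is sent. The helper name build_converse_request is hypothetical, and the sketch covers only one user message. Multi-turn conversations and guardrailConfig are left out.

```python
def build_converse_request(model_id, user_text, system_text=None,
                           temperature=None, top_p=None,
                           max_tokens=None, stop_sequences=None):
    """Build keyword arguments for a single-turn Converse call."""
    request = {
        "modelId": model_id,
        "messages": [{"role": "user", "content": [{"text": user_text}]}],
    }
    if system_text:
        request["system"] = [{"text": system_text}]

    inference_config = {}
    if temperature is not None:
        if not 0 <= temperature <= 1:
            raise ValueError("temperature must be between 0 and 1")
        inference_config["temperature"] = temperature
    if top_p is not None:
        if not 0 <= top_p <= 1:
            raise ValueError("topP must be between 0 and 1")
        inference_config["topP"] = top_p
    if max_tokens is not None:
        if not 0 <= max_tokens <= 32768:
            raise ValueError("maxTokens must be between 0 and 32768")
        inference_config["maxTokens"] = max_tokens
    if stop_sequences is not None:
        if len(stop_sequences) > 10:
            raise ValueError("stopSequences accepts at most 10 items")
        inference_config["stopSequences"] = list(stop_sequences)

    # Only send inferenceConfig when at least one parameter was set.
    if inference_config:
        request["inferenceConfig"] = inference_config
    return request


request = build_converse_request(
    "us.deepseek.r1-v1:0",
    "Create a list of 3 pop songs.",
    system_text="Only return song names and the artist.",
    temperature=0.5,
    max_tokens=4096,
)
print(request)
```

With a Bedrock Runtime client, the result could be passed as client.converse(**request).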

Response body - The Converse API returns a response body in the following form.


{
    "message": {
        "role" : "assistant",
        "content": [
            {
                "text": string
            },
            {
                "reasoningContent": {
                    "reasoningText": string
                }
            }
        ]
    },
    "stopReason": string,
    "usage": {
        "inputTokens": int,
        "outputTokens": int,
        "totalTokens": int
    },
    "metrics": {
        "latencyMs": int
    }
}

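Fields

message – The response returned from the model.
role – The conversation role of the generated message. The value is always assistant.
content – The content generated by the model, returned as an array. There are two types of content:
- text – The text content of the response.
- reasoningContent – (Optional) The reasoning content from the model response.
  - reasoningText – The reasoning text from the model response.
stopReason – The reason why the model stopped generating the response.
- end_turn – The model reached a stopping point for the turn.
- max_tokens – The generated text exceeded the value of the maxTokens input field or exceeded the maximum number of tokens that the model supports.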
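
The fields above can be read from a response dictionary along the following lines. This sketch follows the access pattern used in this page's sample code, where the message sits under an output key and the reasoning text is read from reasoningText.text. The helper name extract_output and the shape of its return value are assumptions for illustration.

```python
def extract_output(response):
    """Collect answer text, reasoning text, and truncation status."""
    message = response["output"]["message"]
    texts = []
    reasoning = []
    for block in message["content"]:
        if "text" in block:
            texts.append(block["text"])
        elif "reasoningContent" in block:
            # reasoningContent is optional, so read it defensively.
            reasoning_text = block["reasoningContent"].get("reasoningText", {})
            reasoning.append(reasoning_text.get("text", ""))
    return {
        "text": "".join(texts),
        "reasoning": "".join(reasoning),
        "truncated": response.get("stopReason") == "max_tokens",
    }


sample_response = {
    "output": {
        "message": {
            "role": "assistant",
            "content": [
                {"text": "1. Song A by Artist A"},
                {"reasoningContent": {"reasoningText": {"text": "Pick pop songs."}}},
            ],
        }
    },
    "stopReason": "end_turn",
}
print(extract_output(sample_response))
```

When truncated is true, the response stopped because of the max_tokens stop reason, and a retry with a larger maxTokens value may be appropriate.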

Sample code - The following example shows how to call the Converse API with DeepSeek.


# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: Apache-2.0
"""
Shows how to use the Converse API with DeepSeek-R1 (on demand).
"""

import logging
import boto3

from botocore.client import Config
from botocore.exceptions import ClientError


logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)


def generate_conversation(bedrock_client,
                          model_id,
                          system_prompts,
                          messages):
    """
    Sends messages to a model.
    Args:
        bedrock_client: The Boto3 Bedrock runtime client.
        model_id (str): The model ID to use.
        system_prompts (JSON) : The system prompts for the model to use.
        messages (JSON) : The messages to send to the model.

    Returns:
        response (JSON): The conversation that the model generated.

    """

    logger.info("Generating message with model %s", model_id)

    # Inference parameters to use.
    temperature = 0.5
    max_tokens = 4096

    # Base inference parameters to use.
    inference_config = {
        "temperature": temperature,
        "maxTokens": max_tokens,
    }

    # Send the message.
    response = bedrock_client.converse(
        modelId=model_id,
        messages=messages,
        system=system_prompts,
        inferenceConfig=inference_config,
    )

    # Log token usage.
    token_usage = response['usage']
    logger.info("Input tokens: %s", token_usage['inputTokens'])
    logger.info("Output tokens: %s", token_usage['outputTokens'])
    logger.info("Total tokens: %s", token_usage['totalTokens'])
    logger.info("Stop reason: %s", response['stopReason'])

    return response

def main():
    """
    Entrypoint for DeepSeek-R1 example.
    """

    logging.basicConfig(level=logging.INFO,
                        format="%(levelname)s: %(message)s")

    model_id = "us.deepseek.r1-v1:0"

    # Setup the system prompts and messages to send to the model.
    system_prompts = [{"text": "You are an app that creates playlists for a radio station that plays rock and pop music. Only return song names and the artist."}]
    message_1 = {
        "role": "user",
        "content": [{"text": "Create a list of 3 pop songs."}]
    }
    message_2 = {
        "role": "user",
        "content": [{"text": "Make sure the songs are by artists from the United Kingdom."}]
    }
    messages = []

    try:
        # Configure timeout for long responses if needed
        custom_config = Config(connect_timeout=840, read_timeout=840)
        bedrock_client = boto3.client(service_name='bedrock-runtime', config=custom_config)

        # Start the conversation with the 1st message.
        messages.append(message_1)
        response = generate_conversation(
            bedrock_client, model_id, system_prompts, messages)

        # Add the response message to the conversation.
        output_message = response['output']['message']
        
        # Remove reasoning content from the response
        output_contents = []
        for content in output_message["content"]:
            if content.get("reasoningContent"):
                continue
            else:
                output_contents.append(content)
        output_message["content"] = output_contents
        
        messages.append(output_message)

        # Continue the conversation with the 2nd message.
        messages.append(message_2)
        response = generate_conversation(
            bedrock_client, model_id, system_prompts, messages)

        output_message = response['output']['message']
        messages.append(output_message)

        # Show the complete conversation.
        for message in messages:
            print(f"Role: {message['role']}")
            for content in message['content']:
                if content.get("text"):
                    print(f"Text: {content['text']}")
                if content.get("reasoningContent"):
                    reasoning_content = content['reasoningContent']
                    reasoning_text = reasoning_content.get('reasoningText', {})
                    print()
                    print(f"Reasoning Text: {reasoning_text.get('text')}")
            print()

    except ClientError as err:
        message = err.response['Error']['Message']
        logger.error("A client error occurred: %s", message)
        print(f"A client error occurred: {message}")

    else:
        print(
            f"Finished generating text with model {model_id}.")


if __name__ == "__main__":
    main()
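
The introduction also lists ConverseStream, which this page does not demonstrate. The following sketch shows one way the streamed events might be accumulated. The event shapes it reads (contentBlockDelta with a text or reasoningContent delta, and messageStop with a stopReason) are assumptions based on the general Converse streaming format, and collect_stream is a hypothetical helper. It accepts any iterable of event dictionaries, so it can be tried without calling the service.

```python
def collect_stream(events):
    """Accumulate answer text, reasoning text, and stop reason from stream events."""
    text_parts = []
    reasoning_parts = []
    stop_reason = None
    for event in events:
        if "contentBlockDelta" in event:
            delta = event["contentBlockDelta"]["delta"]
            if "text" in delta:
                text_parts.append(delta["text"])
            elif "reasoningContent" in delta:
                # Assumed shape: {"reasoningContent": {"text": "..."}}
                reasoning_parts.append(delta["reasoningContent"].get("text", ""))
        elif "messageStop" in event:
            stop_reason = event["messageStop"].get("stopReason")
    return "".join(text_parts), "".join(reasoning_parts), stop_reason


# Simulated events that stand in for response["stream"].
fake_events = [
    {"contentBlockDelta": {"delta": {"reasoningContent": {"text": "Think. "}}}},
    {"contentBlockDelta": {"delta": {"text": "Hello"}}},
    {"contentBlockDelta": {"delta": {"text": ", world"}}},
    {"messageStop": {"stopReason": "end_turn"}},
]
print(collect_stream(fake_events))
```

With a real client, the events would come from bedrock_client.converse_stream(modelId=model_id, messages=messages)["stream"].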
