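Cohere Command models

You make inference requests to Cohere Command models with InvokeModel or InvokeModelWithResponseStream (streaming). You need the model ID of the model that you want to use. To get the model ID, see Amazon Bedrock model IDs.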

Topics

Request and response
Code example

Request and response

Request

The Cohere Command models have the following inference parameters.


{
    "prompt": string,
    "temperature": float,
    "p": float,
    "k": float,
    "max_tokens": int,
    "stop_sequences": [string],
    "return_likelihoods": "GENERATION|ALL|NONE",
    "stream": boolean,
    "num_generations": int,
    "logit_bias": {token_id: bias},
    "truncate": "NONE|START|END"
}

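The following are required parameters.

prompt – (Required) The input text that serves as the starting point for generating the response.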

The following are the limits on texts per call and characters.

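The following are optional parameters.

return_likelihoods – Specify how and if the token likelihoods are returned with the response. You can specify the following options.
- GENERATION – Only return likelihoods for generated tokens.
- ALL – Return likelihoods for all tokens.
- NONE – (Default) Don't return any likelihoods.

stream – (Required to support streaming) Specify true to return the response piece-by-piece in real time, or false to return the complete response after processing finishes.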

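logit_bias – Prevents the model from generating unwanted tokens or incentivizes the model to include desired tokens. The format is {token_id: bias}, where bias is a float between -10 and 10. Tokens can be obtained from text using any tokenization service, such as Cohere's Tokenize endpoint. For more information, see the Cohere documentation.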

Default	Minimum	Maximum
N/A	-10 (for a token bias)	10 (for a token bias)
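
The logit_bias format above can be prepared client-side before the request is serialized. The following is a minimal sketch, not part of the Bedrock or Cohere SDKs: the helper name build_logit_bias and the token IDs are hypothetical, and real token IDs would come from a tokenization service. It checks the documented -10 to 10 range and converts the IDs to strings, since JSON object keys are always strings.

```python
import json


def build_logit_bias(biases):
    """Validate a {token_id: bias} mapping against the documented -10..10 range."""
    checked = {}
    for token_id, bias in biases.items():
        if not -10 <= bias <= 10:
            raise ValueError(f"bias for token {token_id} must be between -10 and 10")
        # JSON object keys are always strings, so normalize the ID up front.
        checked[str(token_id)] = float(bias)
    return checked


# Hypothetical token IDs, for illustration only.
body = json.dumps({
    "prompt": "Write a tagline for a coffee shop.",
    "logit_bias": build_logit_bias({11: -10, 2023: 5.5}),
})
```

A bias outside the range raises ValueError before any request is sent, which is usually easier to debug than a validation error returned by the service.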

num_generations – The maximum number of generations that the model should return.

Default	Minimum	Maximum
1	1	5

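truncate – Specifies how the API handles inputs longer than the maximum token length. Use one of the following:
- NONE – Returns an error when the input exceeds the maximum input token length.
- START – Discards the start of the input.
- END – (Default) Discards the end of the input.
If you specify START or END, the model discards the input until the remaining input is exactly the maximum input token length for the model.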
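
The three truncate modes can be illustrated on a plain list of tokens. This is a sketch of the documented behavior only, not the service's implementation; the function name apply_truncate and the max_input_tokens argument are hypothetical, and truncation actually happens on the service side.

```python
def apply_truncate(tokens, max_input_tokens, truncate="END"):
    """Mimic the documented truncate modes on a list of tokens."""
    if len(tokens) <= max_input_tokens:
        return list(tokens)  # input already fits, nothing to discard
    if truncate == "NONE":
        raise ValueError("input exceeds the maximum input token length")
    if truncate == "START":
        # Discard the beginning, keeping exactly the last max_input_tokens tokens.
        return list(tokens[len(tokens) - max_input_tokens:])
    if truncate == "END":
        # Discard the end, keeping exactly the first max_input_tokens tokens.
        return list(tokens[:max_input_tokens])
    raise ValueError("truncate must be NONE, START, or END")
```

For example, with six tokens and a limit of four, START keeps the last four tokens and END keeps the first four.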

temperature – Use a lower value to decrease randomness in the response.

Default	Minimum	Maximum
0.9	0	5
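
To see why a lower temperature reduces randomness, the following sketch applies the common temperature-scaled softmax to a list of raw scores. It illustrates the general technique and is not confirmed to be how the Cohere models apply temperature internally. The function name is hypothetical, and treating a temperature of 0 as greedy selection is an assumption.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature < 0:
        raise ValueError("temperature must not be negative")
    if temperature == 0:
        # Assumption: treat 0 as greedy decoding (all mass on the top score).
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the peak for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]
```

With the same scores, a low temperature concentrates probability on the top-scoring token, while a high temperature spreads it more evenly across the candidates.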

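p – Top P. Use a lower value to ignore less probable options. Set to 0 or 1.0 to disable. If both p and k are enabled, p acts after k.

Default	Minimum	Maximum
0.75	0	1

k – Top K. Specify the number of token choices the model uses to generate the next token. If both p and k are enabled, p acts after k.

Default	Minimum	Maximum
0	0	500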
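
The order "p acts after k" can be sketched on a small probability table. This illustrates the general top-k and top-p filtering technique under stated assumptions and is not the model's actual sampler: the function name filter_candidates is hypothetical, and measuring the cumulative mass on the original probabilities rather than renormalizing after the k cut is an assumption.

```python
def filter_candidates(probs, k=0, p=0.75):
    """Apply top-k first, then top-p, to a {token: probability} mapping."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    if k > 0:  # k = 0 means top-k is disabled
        ranked = ranked[:k]
    if 0 < p < 1:  # p = 0 or p = 1 means top-p is disabled
        kept, cumulative = [], 0.0
        for token, prob in ranked:
            kept.append((token, prob))
            cumulative += prob
            if cumulative >= p:
                break  # the smallest prefix whose mass reaches p
        ranked = kept
    total = sum(prob for _, prob in ranked)
    # Renormalize so the surviving candidates sum to 1.
    return {token: prob / total for token, prob in ranked}
```

For instance, with probabilities 0.5, 0.3, 0.15, and 0.05, setting k to 3 drops the last candidate, and p at 0.75 then keeps only the first two, whose combined mass is the first to reach the threshold.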

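max_tokens – Specify the maximum number of tokens to use in the generated response.

Default	Minimum	Maximum
20	1	4096

stop_sequences – Configure up to four sequences that the model recognizes. After a stop sequence, the model stops generating further tokens. The returned text doesn't contain the stop sequence.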
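
The ranges and allowed values listed above can be checked before a request is sent. The following is a minimal client-side sketch. The names LIMITS, ENUMS, and validate_request are hypothetical and not part of any SDK, and the service remains the authority on what it accepts.

```python
# Minimum and maximum values taken from the parameter tables above.
LIMITS = {
    "temperature": (0, 5),
    "p": (0, 1),
    "k": (0, 500),
    "max_tokens": (1, 4096),
    "num_generations": (1, 5),
}

# Allowed values for the enumerated parameters.
ENUMS = {
    "return_likelihoods": {"GENERATION", "ALL", "NONE"},
    "truncate": {"NONE", "START", "END"},
}


def validate_request(body):
    """Return a list of problems found in a request body (empty if none)."""
    errors = []
    if not body.get("prompt"):
        errors.append("prompt is required")
    for name, (low, high) in LIMITS.items():
        if name in body and not low <= body[name] <= high:
            errors.append(f"{name} must be between {low} and {high}")
    for name, allowed in ENUMS.items():
        if name in body and body[name] not in allowed:
            errors.append(f"{name} must be one of {sorted(allowed)}")
    if len(body.get("stop_sequences", [])) > 4:
        errors.append("stop_sequences accepts at most four sequences")
    return errors
```

Returning a list instead of raising on the first problem lets a caller report every invalid parameter at once.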

Response

The response has the following possible fields:


{
    "generations": [
        {
            "finish_reason": "COMPLETE | MAX_TOKENS | ERROR | ERROR_TOXIC",
            "id": string,
            "text": string,
            "likelihood" : float,
            "token_likelihoods" : [{"token" : float}],
            "is_finished" : true | false,
            "index" : integer
           
        }
    ],
    "id": string,
    "prompt": string
}

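generations – A list of generated results along with the likelihoods for the tokens requested. (Always returned). Each generation object in the list contains the following fields.
- id – An identifier for the generation. (Always returned).
- likelihood – The likelihood of the output. The value is the average of the token likelihoods in token_likelihoods. Returned if you specify the return_likelihoods input parameter.
- token_likelihoods – An array of per-token likelihoods. Returned if you specify the return_likelihoods input parameter.
- finish_reason – The reason why the model finished generating tokens. finish_reason is returned only when is_finished is true. (Not always returned). Possible values:
    - COMPLETE – The model sent back a finished reply.
    - MAX_TOKENS – The reply was cut off because the model reached the maximum number of tokens for its context length.
    - ERROR – Something went wrong when generating the reply.
    - ERROR_TOXIC – The model generated a reply that was deemed toxic.
- is_finished – A boolean field used only when stream is true, signifying whether there are additional tokens that will be generated as part of the streaming response. (Not always returned).
- text – The generated text.
- index – In a streaming response, use to determine which generation a given token belongs to. When only one response is streamed, all tokens belong to the same generation and index isn't returned. index is therefore returned only in a streaming request where num_generations is greater than 1.
prompt – The prompt from the input request. (Always returned).
id – An identifier for the request. (Always returned).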
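
The streaming fields described above (text, index, is_finished, finish_reason) can be combined to rebuild each generation from its pieces. The sketch below assumes that every stream event carries a chunk whose bytes decode to a JSON object with those generation-level fields. That payload shape and the function name collect_stream are assumptions for illustration. A missing index is treated as generation 0, consistent with index being omitted when only one response is streamed.

```python
import json
from collections import defaultdict


def collect_stream(events):
    """Group streamed text by generation index and record finish reasons."""
    texts = defaultdict(str)
    finish_reasons = {}
    for event in events:
        chunk = event.get("chunk")
        if not chunk:
            continue  # skip events that carry no payload
        payload = json.loads(chunk["bytes"])
        # index is omitted when a single generation is streamed.
        index = payload.get("index", 0)
        texts[index] += payload.get("text", "")
        if payload.get("is_finished"):
            finish_reasons[index] = payload.get("finish_reason")
    return dict(texts), finish_reasons
```

With boto3, the iterable passed in would typically be the body of the response returned by invoke_model_with_response_stream on a bedrock-runtime client.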

For more information, see Generate in the Cohere documentation.

Code example

This example shows how to call the Cohere Command model.


# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: Apache-2.0
"""
Shows how to generate text using a Cohere model.
"""
import json
import logging

import boto3
from botocore.exceptions import ClientError

# Logging is configured once in main(); a second basicConfig call would be ignored.
logger = logging.getLogger(__name__)


def generate_text(model_id, body):
    """
    Generate text using a Cohere model.
    Args:
        model_id (str): The model ID to use.
        body (str): The request body to use.
    Returns:
        dict: The response from the model.
    """

    logger.info("Generating text with Cohere model %s", model_id)

    accept = 'application/json'
    content_type = 'application/json'

    bedrock = boto3.client(service_name='bedrock-runtime')

    response = bedrock.invoke_model(
        body=body,
        modelId=model_id,
        accept=accept,
        contentType=content_type
    )

    logger.info("Successfully generated text with Cohere model %s", model_id)

    return response


def main():
    """
    Entrypoint for Cohere example.
    """

    logging.basicConfig(level=logging.INFO,
                        format="%(levelname)s: %(message)s")

    model_id = 'cohere.command-text-v14'
    prompt = """Summarize this dialogue: 
"Customer: Please connect me with a support agent.
AI: Hi there, how can I assist you today?
Customer: I forgot my password and lost access to the email affiliated to my account. Can you please help me?
AI: Yes of course. First I'll need to confirm your identity and then I can connect you with one of our support agents.
"""
    try:
        body = json.dumps({
            "prompt": prompt,
            "max_tokens": 200,
            "temperature": 0.6,
            "p": 1,
            "k": 0,
            "num_generations": 2,
            "return_likelihoods": "GENERATION"
        })
        response = generate_text(model_id=model_id,
                                 body=body)

        response_body = json.loads(response.get('body').read())
        generations = response_body.get('generations')

        for index, generation in enumerate(generations):

            print(f"Generation {index + 1}\n------------")
            print(f"Text:\n {generation['text']}\n")
            if 'likelihood' in generation:
                print(f"Likelihood:\n {generation['likelihood']}\n")
            
            print(f"Reason: {generation['finish_reason']}\n\n")

    except ClientError as err:
        message = err.response["Error"]["Message"]
        logger.error("A client error occurred: %s", message)
        print(f"A client error occurred: {message}")
    else:
        print(f"Finished generating text with Cohere model {model_id}.")


if __name__ == "__main__":
    main()
