Invoque Meta Llama 2 en Amazon Bedrock mediante la API Invoke Model con un flujo de respuesta

Los siguientes ejemplos de código muestran cómo enviar un mensaje de texto a Meta Llama 2, mediante la API Invoke Model, e imprimir el flujo de respuestas.

Java

SDK para Java 2.x

nota

Hay más información GitHub. Busque el ejemplo completo y aprenda a configurar y ejecutar en el Repositorio de ejemplos de código de AWS.

Envía tu primer mensaje a Meta Llama 3.


// Send a prompt to Meta Llama 2 and print the response stream in real-time.
public class InvokeModelWithResponseStreamQuickstart {

    public static void main(String[] args) {

        // Create a Bedrock Runtime client in the AWS Region of your choice.
        var client = BedrockRuntimeAsyncClient.builder()
                .region(Region.US_WEST_2)
                .build();

        // Set the model ID, e.g., Llama 2 Chat 13B.
        var modelId = "meta.llama2-13b-chat-v1";

        // Define the user message to send.
        var userMessage = "Describe the purpose of a 'hello world' program in one line.";

        // Embed the message in Llama 2's prompt format.
        var prompt = "<s>[INST] " + userMessage + " [/INST]";

        // Create a JSON payload using the model's native structure.
        var request = new JSONObject()
                .put("prompt", prompt)
                // Optional inference parameters:
                .put("max_gen_len", 512)
                .put("temperature", 0.5F)
                .put("top_p", 0.9F);

        // Create a handler to extract and print the response text in real-time.
        var streamHandler = InvokeModelWithResponseStreamResponseHandler.builder()
                .subscriber(event -> event.accept(
                        InvokeModelWithResponseStreamResponseHandler.Visitor.builder()
                                .onChunk(c -> {
                                    var chunk = new JSONObject(c.bytes().asUtf8String());
                                    if (chunk.has("generation")) {
                                        System.out.print(chunk.getString("generation"));
                                    }
                                }).build())
                ).build();

        // Encode and send the request. Let the stream handler process the response.
        client.invokeModelWithResponseStream(req -> req
                .body(SdkBytes.fromUtf8String(request.toString()))
                .modelId(modelId), streamHandler
        ).join();
    }
}
// Learn more about the Llama 2 prompt format at:
// https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-2

Para obtener más información sobre la API, consulte InvokeModelWithResponseStreamla Referencia AWS SDK for Java 2.x de la API.

JavaScript

SDK para JavaScript (v3)

nota

Hay más información. GitHub Busque el ejemplo completo y aprenda a configurar y ejecutar en el Repositorio de ejemplos de código de AWS.

Envía tu primer mensaje a Meta Llama 3.


// Send a prompt to Meta Llama 2 and print the response stream in real-time.

import {
  BedrockRuntimeClient,
  InvokeModelWithResponseStreamCommand,
} from "@aws-sdk/client-bedrock-runtime";

// Create a Bedrock Runtime client in the AWS Region of your choice.
const client = new BedrockRuntimeClient({ region: "us-west-2" });

// Set the model ID, e.g., Llama 2 Chat 13B.
const modelId = "meta.llama2-13b-chat-v1";

// Define the user message to send.
const userMessage =
  "Describe the purpose of a 'hello world' program in one sentence.";

// Embed the message in Llama 2's prompt format.
const prompt = `<s>[INST] ${userMessage} [/INST]`;

// Format the request payload using the model's native structure.
const request = {
  prompt,
  // Optional inference parameters:
  max_gen_len: 512,
  temperature: 0.5,
  top_p: 0.9,
};

// Encode and send the request.
const responseStream = await client.send(
  new InvokeModelWithResponseStreamCommand({
    contentType: "application/json",
    body: JSON.stringify(request),
    modelId,
  }),
);

// Extract and print the response stream in real-time.
for await (const event of responseStream.body) {
  /** @type {{ generation: string }} */
  const chunk = JSON.parse(new TextDecoder().decode(event.chunk.bytes));
  if (chunk.generation) {
    process.stdout.write(chunk.generation);
  }
}

// Learn more about the Llama 3 prompt format at:
// https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3/#special-tokens-used-with-meta-llama-3

Para obtener más información sobre la API, consulte InvokeModelWithResponseStreamla Referencia AWS SDK for JavaScript de la API.

Python

SDK para Python (Boto3)

nota

Hay más información al respecto GitHub. Busque el ejemplo completo y aprenda a configurar y ejecutar en el Repositorio de ejemplos de código de AWS.

Usa la API Invoke Model para enviar un mensaje de texto e imprimir el flujo de respuestas.


# Use the native inference API to send a text message to Meta Llama 2
# and print the response stream.

import boto3
import json

# Create a Bedrock Runtime client in the AWS Region of your choice.
client = boto3.client("bedrock-runtime", region_name="us-east-1")

# Set the model ID, e.g., Llama 2 Chat 13B.
model_id = "meta.llama2-13b-chat-v1"

# Define the message to send.
user_message = "Describe the purpose of a 'hello world' program in one line."

# Embed the message in Llama 2's prompt format.
prompt = f"<s>[INST] {user_message} [/INST]"

# Format the request payload using the model's native structure.
native_request = {
    "prompt": prompt,
    "max_gen_len": 512,
    "temperature": 0.5,
}

# Convert the native request to JSON.
request = json.dumps(native_request)

# Invoke the model with the request.
streaming_response = client.invoke_model_with_response_stream(
    modelId=model_id, body=request
)

# Extract and print the response text in real-time.
for event in streaming_response["body"]:
    chunk = json.loads(event["chunk"]["bytes"])
    if "generation" in chunk:
        print(chunk["generation"], end="")

Para obtener más información sobre la API, consulta InvokeModelWithResponseStreamla AWS Referencia de API de SDK for Python (Boto3).

Para obtener una lista completa de guías para desarrolladores del AWS SDK y ejemplos de código, consulte. Uso de este servicio con un AWS SDK En este tema también se incluye información sobre cómo comenzar a utilizar el SDK y detalles sobre sus versiones anteriores.

Aviso JavaScript está desactivado o no está disponible en su navegador.

Para utilizar la documentación de AWS, debe estar habilitado JavaScript. Para obtener más información, consulte las páginas de ayuda de su navegador.

Convenciones del documento

Llama 2: Genera texto

Llama 3: Genera texto