Learn about use cases for different model inference methods

You can directly run model inference in the following ways:

  • Amazon Bedrock console playgrounds: Run inference in a user-friendly graphical interface. Convenient for exploration.

  • Converse or ConverseStream: Implement conversational applications with a unified API for model input across models.

  • InvokeModel or InvokeModelWithResponseStream: Submit a single prompt and generate a response synchronously. Useful for generating responses in real time or for search queries.

  • StartAsyncInvoke: Submit a single prompt and generate a response asynchronously. Useful for generating responses at large scale.

  • CreateModelInvocationJob: Prepare a dataset of prompts and generate responses in batches.

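As a sketch of the conversational path, the following Python example calls the Converse API through boto3. The model ID is an assumption (any Converse-capable model works the same way), and the request-building helper is a hypothetical convenience, not part of the SDK:

```python
# Hypothetical model ID used for illustration; substitute any model in
# your account that supports the Converse API.
MODEL_ID = "anthropic.claude-3-haiku-20240307-v1:0"


def build_converse_request(model_id, user_text, max_tokens=512):
    """Assemble keyword arguments for a Converse call.

    The messages/content shape shown here is the unified input format
    that Converse uses regardless of the underlying model.
    """
    return {
        "modelId": model_id,
        "messages": [
            {"role": "user", "content": [{"text": user_text}]},
        ],
        "inferenceConfig": {"maxTokens": max_tokens, "temperature": 0.5},
    }


def converse_once(user_text):
    """Send one user turn and return the model's reply text.

    Requires boto3 and AWS credentials with Bedrock access, so the
    import is kept local to this function.
    """
    import boto3

    client = boto3.client("bedrock-runtime")
    response = client.converse(**build_converse_request(MODEL_ID, user_text))
    return response["output"]["message"]["content"][0]["text"]
```

Because the message format is model-agnostic, switching models only means changing the model ID, not the request structure.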
The following Amazon Bedrock features also use model inference as a step in a larger workflow:

  • Model evaluation uses the model invocation process to evaluate the performance of different models after you submit a CreateEvaluationJob request.

  • Knowledge bases invoke a model during a RetrieveAndGenerate request to generate a response based on results retrieved from the knowledge base.

  • Agents use model invocation to generate responses in various stages during an InvokeAgent request.

  • Flows include Amazon Bedrock resources, such as prompts, knowledge bases, and agents, which use model invocation.

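For instance, the knowledge-base case can be sketched as follows. Note that RetrieveAndGenerate is served by the bedrock-agent-runtime client rather than bedrock-runtime; the knowledge base ID and model ARN are placeholders for your own resources:

```python
def build_rag_request(kb_id, model_arn, question):
    """Assemble a RetrieveAndGenerate request.

    kb_id and model_arn are placeholders; supply the ID of your own
    knowledge base and the ARN of the model that should generate the
    final answer.
    """
    return {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": kb_id,
                "modelArn": model_arn,
            },
        },
    }


def ask_knowledge_base(kb_id, model_arn, question):
    """Retrieve relevant passages and generate a grounded answer.

    Requires boto3 and AWS credentials; the import is local so the
    request-building helper above can be used without the SDK.
    """
    import boto3

    client = boto3.client("bedrock-agent-runtime")
    response = client.retrieve_and_generate(
        **build_rag_request(kb_id, model_arn, question)
    )
    return response["output"]["text"]
```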
After testing different foundation models with various prompts and inference parameters, you can configure your application to call these APIs with the specifications you settled on.
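A single synchronous call with your chosen parameters might look like the sketch below. Unlike Converse, InvokeModel takes a model-specific request body; this example assumes an Anthropic Claude model and its native messages schema, so both the model ID and the body format are assumptions to adapt for other model families:

```python
import json


def build_claude_body(prompt, max_tokens=256, temperature=0.7):
    """Serialize a request body in Anthropic's native messages format.

    Other model families on Bedrock expect their own body schemas, so
    this helper only applies to Claude models.
    """
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "temperature": temperature,
        "messages": [{"role": "user", "content": prompt}],
    })


def invoke_once(model_id, prompt):
    """Submit one prompt synchronously and return the reply text.

    Requires boto3 and AWS credentials with Bedrock model access.
    """
    import boto3

    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(
        modelId=model_id,
        contentType="application/json",
        accept="application/json",
        body=build_claude_body(prompt),
    )
    payload = json.loads(response["body"].read())
    return payload["content"][0]["text"]
```

Tuning happens in one place: the `max_tokens` and `temperature` values you arrived at in the playground go straight into the request body.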