
ListInferenceProfiles

Returns a list of inference profiles that you can use. For more information, see Increase throughput and resilience with cross-region inference in Amazon Bedrock in the Amazon Bedrock User Guide.

Request Syntax

GET /inference-profiles?maxResults=maxResults&nextToken=nextToken&type=typeEquals HTTP/1.1

URI Request Parameters

The request uses the following URI parameters.

maxResults

The maximum number of results to return in the response. If the total number of results is greater than this value, use the token returned in the response in the nextToken field when making another request to return the next batch of results.

Valid Range: Minimum value of 1. Maximum value of 1000.

nextToken

If the total number of results is greater than the maxResults value provided in the request, specify the token returned in the response's nextToken field in this field to return the next batch of results (see the pagination sketch after this parameter list).

Length Constraints: Minimum length of 1. Maximum length of 2048.

Pattern: ^\S*$

typeEquals

Filters for inference profiles that match the type you specify.

  • SYSTEM_DEFINED – The inference profile is defined by Amazon Bedrock. You can route inference requests across Regions with these inference profiles.

  • APPLICATION – The inference profile was created by a user. You can use this type of inference profile to track metrics and costs when invoking the model in it. The inference profile may route requests to one or multiple Regions.

Valid Values: SYSTEM_DEFINED | APPLICATION
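
The maxResults, nextToken, and typeEquals parameters can be combined to page through a filtered list of inference profiles. The following sketch uses the AWS SDK for Python (Boto3) and assumes that the bedrock client's list_inference_profiles method mirrors these parameter and response field names; the Region is only an example.

import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")  # example Region

profiles = []
kwargs = {"maxResults": 100, "typeEquals": "SYSTEM_DEFINED"}  # filter to Bedrock-defined profiles
while True:
    response = bedrock.list_inference_profiles(**kwargs)
    profiles.extend(response["inferenceProfileSummaries"])
    next_token = response.get("nextToken")
    if not next_token:
        break  # no more pages
    kwargs["nextToken"] = next_token  # request the next batch of results

print(f"Found {len(profiles)} system-defined inference profiles")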

Request Body

The request does not have a request body.

Response Syntax

HTTP/1.1 200
Content-type: application/json

{
   "inferenceProfileSummaries": [
      {
         "createdAt": "string",
         "description": "string",
         "inferenceProfileArn": "string",
         "inferenceProfileId": "string",
         "inferenceProfileName": "string",
         "models": [
            {
               "modelArn": "string"
            }
         ],
         "status": "string",
         "type": "string",
         "updatedAt": "string"
      }
   ],
   "nextToken": "string"
}

Response Elements

If the action is successful, the service sends back an HTTP 200 response.

The following data is returned in JSON format by the service.

inferenceProfileSummaries

A list of information about each inference profile that you can use.

Type: Array of InferenceProfileSummary objects

nextToken

If the total number of results is greater than the maxResults value provided in the request, specify this token in the nextToken field when making another request to return the next batch of results.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 2048.

Pattern: ^\S*$
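
Once a response is returned, the summaries and pagination token can be read directly from the fields described above. A minimal sketch in Boto3 (the method name and response field names are assumed to match this operation):

import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")  # example Region
response = bedrock.list_inference_profiles(maxResults=10)

for summary in response["inferenceProfileSummaries"]:
    model_arns = [model["modelArn"] for model in summary["models"]]
    print(summary["inferenceProfileName"])
    print("  ARN:    " + summary["inferenceProfileArn"])
    print("  Status: " + summary["status"])
    print("  Models: " + ", ".join(model_arns))

# A nextToken in the response means more results remain.
if "nextToken" in response:
    print("More results available; pass nextToken in the next request.")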

Errors

For information about the errors that are common to all actions, see Common Errors.

AccessDeniedException

The request is denied because of missing access permissions.

HTTP Status Code: 403

InternalServerException

An internal server error occurred. Retry your request.

HTTP Status Code: 500

ThrottlingException

The number of requests exceeds the limit. Resubmit your request later.

HTTP Status Code: 429

ValidationException

Input validation failed. Check your request parameters and retry the request.

HTTP Status Code: 400
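
When you call this operation through an AWS SDK, these errors surface as service exceptions. The following sketch retries on throttling using Boto3 and botocore; the error code string matches the exception name above, and the backoff values are illustrative:

import time

import boto3
from botocore.exceptions import ClientError

bedrock = boto3.client("bedrock", region_name="us-east-1")  # example Region

def list_profiles_with_retry(max_attempts=3, **kwargs):
    for attempt in range(max_attempts):
        try:
            return bedrock.list_inference_profiles(**kwargs)
        except ClientError as err:
            code = err.response["Error"]["Code"]
            if code == "ThrottlingException" and attempt < max_attempts - 1:
                time.sleep(2 ** attempt)  # simple exponential backoff
                continue
            raise  # AccessDeniedException, ValidationException, etc.

response = list_profiles_with_retry(maxResults=10)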

Examples

List information about inference profiles in your Region

Run the following example to list information for up to 5 inference profiles in your Region:

Sample Request

GET /inference-profiles?maxResults=5 HTTP/1.1
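
An equivalent call with the AWS SDK for Python (Boto3) might look like the following, assuming the list_inference_profiles method accepts maxResults as shown in the request syntax:

import boto3

bedrock = boto3.client("bedrock")  # uses your default Region
response = bedrock.list_inference_profiles(maxResults=5)
for summary in response["inferenceProfileSummaries"]:
    print(summary["inferenceProfileId"] + " - " + summary["inferenceProfileName"])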

See Also

For more information about using this API in one of the language-specific AWS SDKs, see the following: