ListInferenceProfiles
Returns a list of inference profiles that you can use. For more information, see Increase throughput and resilience with cross-region inference in Amazon Bedrock in the Amazon Bedrock User Guide.
Request Syntax
GET /inference-profiles?maxResults=maxResults&nextToken=nextToken&type=typeEquals HTTP/1.1
URI Request Parameters
The request uses the following URI parameters.
- maxResults
-
The maximum number of results to return in the response. If the total number of results is greater than this value, use the token returned in the response in the nextToken field when making another request to return the next batch of results.
Valid Range: Minimum value of 1. Maximum value of 1000.
- nextToken
-
If the total number of results is greater than the maxResults value provided in the request, enter the token returned in the nextToken field in the response in this field to return the next batch of results.
Length Constraints: Minimum length of 1. Maximum length of 2048.
Pattern: ^\S*$
- typeEquals
-
Filters for inference profiles that match the type you specify.
- SYSTEM_DEFINED – The inference profile is defined by Amazon Bedrock. You can route inference requests across regions with these inference profiles.
- APPLICATION – The inference profile was created by a user. This type of inference profile can track metrics and costs when invoking the model in it. The inference profile may route requests to one or multiple regions.
Valid Values: SYSTEM_DEFINED | APPLICATION
Request Body
The request does not have a request body.
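The constraints on the URI parameters above can be checked on the client before a request is sent. The following is a minimal sketch, not an official client: the function name `build_list_inference_profiles_path` and its Python argument names are hypothetical, and it only builds the path and query string (no signing, host, or headers). Note that the `typeEquals` filter is sent under the query key `type`, as shown in the request syntax.

```python
import re
from urllib.parse import urlencode

# Constraints taken from the parameter descriptions above.
VALID_TYPES = {"SYSTEM_DEFINED", "APPLICATION"}
TOKEN_PATTERN = re.compile(r"^\S*$")


def build_list_inference_profiles_path(max_results=None, next_token=None, type_equals=None):
    """Return the path and query string for a ListInferenceProfiles request.

    Hypothetical helper for illustration; raises ValueError when a value
    violates a documented constraint.
    """
    params = {}
    if max_results is not None:
        if not 1 <= max_results <= 1000:
            raise ValueError("maxResults must be between 1 and 1000")
        params["maxResults"] = max_results
    if next_token is not None:
        if not 1 <= len(next_token) <= 2048 or not TOKEN_PATTERN.fullmatch(next_token):
            raise ValueError("nextToken must be 1-2048 non-whitespace characters")
        params["nextToken"] = next_token
    if type_equals is not None:
        if type_equals not in VALID_TYPES:
            raise ValueError("typeEquals must be SYSTEM_DEFINED or APPLICATION")
        # The filter is named typeEquals in the API but sent as `type` in the URI.
        params["type"] = type_equals
    query = urlencode(params)
    return "/inference-profiles" + ("?" + query if query else "")
```

For example, `build_list_inference_profiles_path(max_results=5)` produces the same path as the sample request in the Examples section.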
Response Syntax
HTTP/1.1 200
Content-type: application/json
{
"inferenceProfileSummaries": [
{
"createdAt": "string",
"description": "string",
"inferenceProfileArn": "string",
"inferenceProfileId": "string",
"inferenceProfileName": "string",
"models": [
{
"modelArn": "string"
}
],
"status": "string",
"type": "string",
"updatedAt": "string"
}
],
"nextToken": "string"
}
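As an illustration of how the response shape above might be consumed, the sketch below maps each profile ID to the model ARNs listed under it and returns the pagination token if one is present. The function name `summarize_profiles` is hypothetical, and the sketch assumes `inferenceProfileId` is present on every summary while treating `models` and `nextToken` as possibly absent.

```python
import json


def summarize_profiles(response_body):
    """Map each inferenceProfileId to its list of model ARNs.

    Hypothetical helper for illustration. Returns a (profiles, next_token)
    tuple; next_token is None when the response carries no nextToken.
    """
    document = json.loads(response_body)
    profiles = {}
    for summary in document.get("inferenceProfileSummaries", []):
        model_arns = [model["modelArn"] for model in summary.get("models", [])]
        profiles[summary["inferenceProfileId"]] = model_arns
    return profiles, document.get("nextToken")
```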
Response Elements
If the action is successful, the service sends back an HTTP 200 response.
The following data is returned in JSON format by the service.
- inferenceProfileSummaries
-
A list of information about each inference profile that you can use.
Type: Array of InferenceProfileSummary objects
- nextToken
-
If the total number of results is greater than the maxResults value provided in the request, use this token when making another request in the nextToken field to return the next batch of results.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 2048.
Pattern: ^\S*$
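The nextToken handshake described above amounts to a loop: request a page, consume its summaries, and repeat with the returned token until the response no longer includes one. The following is a minimal sketch of that loop. `iter_inference_profiles` and `fetch_page` are hypothetical names; `fetch_page` stands for any callable that accepts `maxResults`, `nextToken`, and `typeEquals` as keyword arguments and returns the parsed JSON response as a dictionary.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_inference_profiles(
    fetch_page: Callable[..., Dict[str, Any]],
    max_results: int = 100,
    type_equals: Optional[str] = None,
) -> Iterator[Dict[str, Any]]:
    """Yield every inference profile summary, following nextToken across pages.

    fetch_page is an assumed callable (not part of this reference) that
    performs one ListInferenceProfiles call and returns the response dict.
    """
    next_token = None
    while True:
        kwargs: Dict[str, Any] = {"maxResults": max_results}
        if type_equals is not None:
            kwargs["typeEquals"] = type_equals
        if next_token is not None:
            kwargs["nextToken"] = next_token
        page = fetch_page(**kwargs)
        yield from page.get("inferenceProfileSummaries", [])
        # A missing or empty nextToken means there are no more results.
        next_token = page.get("nextToken")
        if not next_token:
            break
```

If your SDK exposes this action as a method that takes the same keyword names, that method could likely be passed as `fetch_page` directly; check the SDK's documentation before relying on this.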
Errors
For information about the errors that are common to all actions, see Common Errors.
- AccessDeniedException
-
The request is denied because of missing access permissions.
HTTP Status Code: 403
- InternalServerException
-
An internal server error occurred. Retry your request.
HTTP Status Code: 500
- ThrottlingException
-
The number of requests exceeds the limit. Resubmit your request later.
HTTP Status Code: 429
- ValidationException
-
Input validation failed. Check your request parameters and retry the request.
HTTP Status Code: 400
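The error descriptions above suggest that InternalServerException (500) and ThrottlingException (429) are worth retrying, while AccessDeniedException (403) and ValidationException (400) call for fixing the request instead. The sketch below shows one way to act on that distinction using exponential backoff with jitter. The backoff policy, the function name `call_with_retries`, and the `send` callable (assumed to return a `(status_code, body)` tuple) are illustrative assumptions, not behavior specified by this reference.

```python
import random
import time

# Status codes whose descriptions above recommend retrying or resubmitting.
RETRYABLE_STATUS = {429, 500}


def call_with_retries(send, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send() until it returns HTTP 200 or a non-retryable status.

    send is an assumed callable returning (status_code, body). The sleep
    argument is injectable so the wait can be replaced in tests.
    """
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status == 200:
            return body
        if status not in RETRYABLE_STATUS or attempt == max_attempts:
            raise RuntimeError(f"request failed with HTTP {status}")
        # Exponential backoff with jitter: 0.5s, 1s, 2s, ... plus a random extra.
        delay = base_delay * 2 ** (attempt - 1)
        sleep(delay + random.uniform(0, delay))
```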
Examples
List information about inference profiles in your Region
Run the following example to list information for up to 5 inference profiles in your region:
Sample Request
GET /inference-profiles?maxResults=5 HTTP/1.1
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following: