Stability.ai Stable Diffusion 3
The Stable Diffusion 3 models and Stable Image Core model have the following inference parameters and
model responses for making inference calls.
Stable Diffusion 3 Large request and response
The request body is passed in the body field of a request to InvokeModel or InvokeModelWithResponseStream.
Model invocation request body field
When you make an InvokeModel call using a Stable Diffusion 3 Large model, fill the body field with a JSON object like the following:
{
    "prompt": "Create an image of a panda"
}
Model invocation responses body field
When you make an InvokeModel call using a Stable Diffusion 3 Large model, the response looks like the following:
{
    "seeds": [2130420379],
    "finish_reasons": [null],
    "images": ["..."]
}
A response with a finish reason that is not null looks like the following:
{
    "finish_reasons": ["Filter reason: prompt"]
}
- seeds – (string) List of seeds used to generate images for the model.
- finish_reasons – Enum indicating whether the request was filtered. null indicates that the request was successful. Current possible values: "Filter reason: prompt", "Filter reason: output image", "Filter reason: input image", "Inference error", null.
- images – A list of generated images in base64 string format.
For more information, see https://platform.stability.ai/docs/api-reference#tag/v1generation.
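As a rough illustration of how the finish_reasons and images fields fit together, the following Python sketch decodes a response body and raises an error when any finish reason is not null. The helper name extract_images is hypothetical and not part of any SDK; it assumes the body has already been read from the InvokeModel response as a JSON string.

```python
import base64
import json


def extract_images(response_body):
    """Return decoded image bytes from a response body (hypothetical helper).

    Raises RuntimeError if any entry in finish_reasons is not null.
    """
    payload = json.loads(response_body)

    # A non-null finish reason means the request was filtered or failed.
    reasons = [r for r in payload.get("finish_reasons", []) if r is not None]
    if reasons:
        raise RuntimeError("Generation did not succeed: " + "; ".join(reasons))

    # Each entry in images is a base64 string; decode it to raw bytes.
    return [base64.b64decode(img) for img in payload.get("images", [])]
```

With boto3, the input would typically be response["body"].read().decode("utf-8"), as in the examples elsewhere on this page.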
Text to image
The Stability.ai Stable Diffusion 3 Large model has the following inference parameters for a text-to-image inference call.
- prompt – (string) What you wish to see in the output image. A strong, descriptive prompt that clearly defines elements, colors, and subjects will lead to better results.
Optional fields
- aspect_ratio – (string) Controls the aspect ratio of the generated image. This parameter is only valid for text-to-image requests. Default: 1:1. Enum: 16:9, 1:1, 21:9, 2:3, 3:2, 4:5, 5:4, 9:16, 9:21.
- mode – Controls whether this is a text-to-image or image-to-image generation, which affects which parameters are required. Default: text-to-image. Enum: image-to-image, text-to-image.
- output_format – Specifies the format of the output image. Supported formats: JPEG, PNG. Supported dimensions: height 640 to 1,536 px, width 640 to 1,536 px.
- seed – (number) A specific value that is used to guide the 'randomness' of the generation. Omit this parameter or pass 0 to use a random seed. Range: 0 to 4294967295.
- negative_prompt – Keywords of what you do not wish to see in the output image. Max: 10,000 characters.
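import base64
import io
import json

import boto3
from PIL import Image

# Create a Bedrock Runtime client in a Region where the model is available.
bedrock = boto3.client('bedrock-runtime', region_name='us-west-2')

# Send a text-to-image request; only the prompt is required.
response = bedrock.invoke_model(
    modelId='stability.sd3-large-v1:0',
    body=json.dumps({
        'prompt': 'A car made out of vegetables.'
    })
)

# Parse the JSON response and decode the first base64-encoded image.
output_body = json.loads(response["body"].read().decode("utf-8"))
base64_output_image = output_body["images"][0]
image_data = base64.b64decode(base64_output_image)

# Load the image bytes with Pillow and save them to disk.
image = Image.open(io.BytesIO(image_data))
image.save("image.png")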
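The optional text-to-image fields can be checked locally before a request is sent. The sketch below is one possible way to do that; build_text_to_image_body is a hypothetical helper, not part of boto3 or the Stability API, and it assumes the negative prompt limit is ten thousand characters. It covers only aspect_ratio, seed, and negative_prompt.

```python
import json

# Allowed values copied from the parameter list for text-to-image requests.
ASPECT_RATIOS = {"16:9", "1:1", "21:9", "2:3", "3:2", "4:5", "5:4", "9:16", "9:21"}
MAX_SEED = 4294967295
MAX_NEGATIVE_PROMPT_CHARS = 10000  # assumption: the documented maximum


def build_text_to_image_body(prompt, aspect_ratio=None, seed=None, negative_prompt=None):
    """Return a JSON request body string (hypothetical helper)."""
    if not prompt:
        raise ValueError("prompt is required")

    body = {"prompt": prompt, "mode": "text-to-image"}

    # Only include optional fields the caller actually supplied.
    if aspect_ratio is not None:
        if aspect_ratio not in ASPECT_RATIOS:
            raise ValueError("unsupported aspect_ratio: " + aspect_ratio)
        body["aspect_ratio"] = aspect_ratio

    if seed is not None:
        if not 0 <= seed <= MAX_SEED:
            raise ValueError("seed must be between 0 and 4294967295")
        body["seed"] = seed

    if negative_prompt is not None:
        if len(negative_prompt) > MAX_NEGATIVE_PROMPT_CHARS:
            raise ValueError("negative_prompt is too long")
        body["negative_prompt"] = negative_prompt

    return json.dumps(body)
```

The returned string could be passed as the body argument of invoke_model in place of the inline json.dumps call.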
Image to image
The Stability.ai Stable Diffusion 3 Large model has the following inference parameters for an image-to-image inference call.
- text_prompts (Required) – An array of text prompts to use for generation. Each element is a JSON object that contains a prompt and a weight for the prompt.
- prompt – (string) What you wish to see in the output image. A strong, descriptive prompt that clearly defines elements, colors, and subjects will lead to better results.
- image – String in base64 format. The image to use as the starting point for the generation. Supported formats: JPEG, PNG, WEBP (WEBP not supported in console). Supported dimensions: width 640 to 1,536 px, height 640 to 1,536 px.
- strength – Numerical. Sometimes referred to as denoising, this parameter controls how much influence the image parameter has on the generated image. A value of 0 yields an image that is identical to the input. A value of 1 is as if you passed in no image at all. Range: [0, 1].
- mode – Must be set to image-to-image.
Optional fields
- aspect_ratio – (string) Controls the aspect ratio of the generated image. This parameter is only valid for text-to-image requests. Default: 1:1. Enum: 16:9, 1:1, 21:9, 2:3, 3:2, 4:5, 5:4, 9:16, 9:21.
- mode – Controls whether this is a text-to-image or image-to-image generation, which affects which parameters are required. Default: text-to-image. Enum: image-to-image, text-to-image.
- output_format – Specifies the format of the output image. Supported formats: JPEG, PNG. Supported dimensions: height 640 to 1,536 px, width 640 to 1,536 px.
- seed – (number) A specific value that is used to guide the 'randomness' of the generation. Omit this parameter or pass 0 to use a random seed. Range: 0 to 4294967295.
- negative_prompt – Keywords of what you do not wish to see in the output image. Max: 10,000 characters.
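import base64
import io
import json

import boto3
from PIL import Image

# Create a Bedrock Runtime client in a Region where the model is available.
bedrock = boto3.client('bedrock-runtime', region_name='us-west-2')

# Read the starting image and encode it as a base64 string.
file_path = 'input_image.png'
with open(file_path, "rb") as image_file:
    image_bytes = image_file.read()
base64_image = base64.b64encode(image_bytes).decode("utf-8")

# Send an image-to-image request; mode must be set explicitly.
response = bedrock.invoke_model(
    modelId='stability.sd3-large-v1:0',
    body=json.dumps({
        'prompt': 'A car made out of fruits',
        'image': base64_image,
        'strength': 0.75,
        'mode': 'image-to-image'
    })
)

# Parse the JSON response and decode the first base64-encoded image.
output_body = json.loads(response["body"].read().decode("utf-8"))
base64_output_image = output_body["images"][0]
image_data = base64.b64decode(base64_output_image)

# Load the image bytes with Pillow and save them to disk.
image = Image.open(io.BytesIO(image_data))
image.save("output_image.png")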
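The image-to-image constraints (strength in [0, 1], input dimensions between 640 and 1,536 px per side) can also be checked before the call. The following is a minimal sketch under those assumptions; build_image_to_image_body is a hypothetical helper, and the width and height arguments are expected to come from whatever image library the caller already uses.

```python
import base64
import json

# Dimension limits taken from the image parameter description.
MIN_SIDE = 640
MAX_SIDE = 1536


def build_image_to_image_body(prompt, image_bytes, strength, width=None, height=None):
    """Return a JSON request body string (hypothetical helper)."""
    if not 0 <= strength <= 1:
        raise ValueError("strength must be in the range [0, 1]")

    # Dimensions are checked only when the caller provides them.
    for name, side in (("width", width), ("height", height)):
        if side is not None and not MIN_SIDE <= side <= MAX_SIDE:
            raise ValueError(name + " must be between 640 and 1536 px")

    return json.dumps({
        "prompt": prompt,
        "image": base64.b64encode(image_bytes).decode("utf-8"),
        "strength": strength,
        "mode": "image-to-image",
    })
```

With Pillow, the dimensions could be read from Image.open(file_path).size and passed as width and height.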