Request and response structure for image generation
The following examples present different image generation use cases. Each example provides an explanation of the fields that are used for the image generation.
- Text-to-image request
  {
      "taskType": "TEXT_IMAGE",
      "textToImageParams": {
          "text": string,
          "negativeText": string
      },
      "imageGenerationConfig": {
          "width": int,
          "height": int,
          "quality": "standard" | "premium",
          "cfgScale": float,
          "seed": int,
          "numberOfImages": int
      }
  }
  The following textToImageParams fields are used in this request:
  - text (Required) – A text prompt to generate the image. The prompt must be 1-1024 characters in length.
  - negativeText (Optional) – A text prompt to define what not to include in the image. This value must be 1-1024 characters in length.
    Note
    Avoid using negating words (“no”, “not”, “without”, etc.) in your text and negativeText values. For example, if you do not want mirrors in an image, instead of including "no mirrors" or "without mirrors" in the text field, use the word "mirrors" in the negativeText field.
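As an illustration of the fields above, the following is a minimal sketch that assembles a TEXT_IMAGE request body as a Python dictionary. The helper name `build_text_to_image_request` and the sample prompt are hypothetical; only the field names and the 1-1024 character limits come from this section.

```python
import json


def build_text_to_image_request(text, negative_text=None, **config):
    """Build a TEXT_IMAGE request body (hypothetical helper, not part of any SDK)."""
    if not 1 <= len(text) <= 1024:
        raise ValueError("text must be 1-1024 characters long")
    params = {"text": text}
    if negative_text is not None:
        if not 1 <= len(negative_text) <= 1024:
            raise ValueError("negativeText must be 1-1024 characters long")
        params["negativeText"] = negative_text
    body = {"taskType": "TEXT_IMAGE", "textToImageParams": params}
    if config:
        # Optional generation settings such as width, height, or seed.
        body["imageGenerationConfig"] = dict(config)
    return body


# Exclude mirrors by naming them in negativeText rather than
# writing "no mirrors" in the text prompt.
request = build_text_to_image_request(
    "a sunlit bathroom with a clawfoot tub",
    negative_text="mirrors",
    width=1024,
    height=1024,
    numberOfImages=1,
)
print(json.dumps(request, indent=2))
```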
- Text-to-image request with image conditioning
  {
      "taskType": "TEXT_IMAGE",
      "textToImageParams": {
          "conditionImage": string (Base64 encoded image),
          "controlMode": "CANNY_EDGE" | "SEGMENTATION",
          "controlStrength": float,
          "text": string,
          "negativeText": string
      },
      "imageGenerationConfig": {
          "width": int,
          "height": int,
          "quality": "standard" | "premium",
          "cfgScale": float,
          "seed": int,
          "numberOfImages": int
      }
  }
  The following textToImageParams fields are used in this request:
  - conditionImage (Required) – A JPEG or PNG image that guides the layout and composition of the generated image. The image must be formatted as a Base64 string. See Input images for image generation for additional requirements.
  - controlMode (Optional) – Specifies which conditioning mode is used. The default value is "CANNY_EDGE".
    - CANNY_EDGE – Elements of the generated image will follow the prominent contours, or "edges", of the condition image closely.
    - SEGMENTATION – The condition image will be automatically analyzed to identify prominent content shapes. This analysis results in a segmentation mask which guides the generation, resulting in a generated image that closely follows the layout of the condition image but allows the model more freedom within the bounds of each content area.
  - controlStrength (Optional) – Specifies how similar the layout and composition of the generated image should be to the conditionImage. The range is 0 to 1.0, and lower values introduce more randomness. The default value is 0.7.
  - text (Required) – A text prompt to generate the image. The prompt must be 1-1024 characters in length.
  - negativeText (Optional) – A text prompt to define what not to include in the image. This value must be 1-1024 characters in length.
    Note
    Avoid using negating words (“no”, “not”, “without”, etc.) in your text and negativeText values. For example, if you do not want mirrors in an image, instead of including "no mirrors" or "without mirrors" in the text field, use the word "mirrors" in the negativeText field.
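The conditioning fields can be sketched as follows, assuming the condition image is already available as raw JPEG or PNG bytes. The helper name and the placeholder bytes are hypothetical; the placeholder is not a valid image and only stands in for real file contents.

```python
import base64


def build_conditioned_request(image_bytes, text,
                              control_mode="CANNY_EDGE", control_strength=0.7):
    """Build a TEXT_IMAGE request with image conditioning (hypothetical helper)."""
    if control_mode not in ("CANNY_EDGE", "SEGMENTATION"):
        raise ValueError("controlMode must be CANNY_EDGE or SEGMENTATION")
    if not 0 <= control_strength <= 1.0:
        raise ValueError("controlStrength must be between 0 and 1.0")
    return {
        "taskType": "TEXT_IMAGE",
        "textToImageParams": {
            # The image must be sent as a Base64 string.
            "conditionImage": base64.b64encode(image_bytes).decode("ascii"),
            "controlMode": control_mode,
            "controlStrength": control_strength,
            "text": text,
        },
    }


# Placeholder bytes; in practice, read them from a real JPEG or PNG file.
placeholder = b"placeholder bytes, not a real image"
request = build_conditioned_request(
    placeholder,
    "a watercolor lighthouse",
    control_mode="SEGMENTATION",
    control_strength=0.5,
)
```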
- Color guided image generation request
  {
      "taskType": "COLOR_GUIDED_GENERATION",
      "colorGuidedGenerationParams": {
          "colors": string[] (list of hexadecimal color values),
          "referenceImage": string (Base64 encoded image),
          "text": string,
          "negativeText": string
      },
      "imageGenerationConfig": {
          "width": int,
          "height": int,
          "quality": "standard" | "premium",
          "cfgScale": float,
          "seed": int,
          "numberOfImages": int
      }
  }
  The following colorGuidedGenerationParams fields are used in this request:
  - colors (Required) – A list of up to 10 color codes that define the desired color palette for your image, expressed as hexadecimal values in the form “#RRGGBB”. For example, "#00FF00" is pure green and "#FCF2AB" is a warm yellow. The colors list has the strongest effect when a referenceImage is not provided. Otherwise, the colors in the list and the colors from the reference image will both be used in the final output.
  - referenceImage (Optional) – A JPEG or PNG image to use as a subject and style reference. The colors of the image will also be incorporated into your final output, along with the colors from the colors list. See Input images for image generation for additional requirements.
  - text (Required) – A text prompt to generate the image. The prompt must be 1-1024 characters in length.
  - negativeText (Optional) – A text prompt to define what not to include in the image. This value must be 1-1024 characters in length.
    Note
    Avoid using negating words (“no”, “not”, “without”, etc.) in your text and negativeText values. For example, if you do not want mirrors in an image, instead of including "no mirrors" or "without mirrors" in the text field, use the word "mirrors" in the negativeText field.
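A client-side check of the colors list might look like the sketch below. The helper name is hypothetical, and the check only enforces what this section states: at most 10 entries, each in the form "#RRGGBB".

```python
import re

# Matches exactly "#" followed by six hexadecimal digits.
HEX_COLOR = re.compile(r"#[0-9A-Fa-f]{6}")


def build_color_guided_request(text, colors, reference_image_b64=None):
    """Build a COLOR_GUIDED_GENERATION request (hypothetical helper)."""
    if len(colors) > 10:
        raise ValueError("colors accepts at most 10 entries")
    for color in colors:
        if not HEX_COLOR.fullmatch(color):
            raise ValueError(f"not a #RRGGBB color: {color!r}")
    params = {"colors": list(colors), "text": text}
    if reference_image_b64 is not None:
        # Optional subject and style reference, already Base64 encoded.
        params["referenceImage"] = reference_image_b64
    return {
        "taskType": "COLOR_GUIDED_GENERATION",
        "colorGuidedGenerationParams": params,
    }


request = build_color_guided_request(
    "a modern living room", ["#00FF00", "#FCF2AB"]
)
```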
- Image variation request
  {
      "taskType": "IMAGE_VARIATION",
      "imageVariationParams": {
          "images": string[] (list of Base64 encoded images),
          "similarityStrength": float,
          "text": string,
          "negativeText": string
      },
      "imageGenerationConfig": {
          "height": int,
          "width": int,
          "cfgScale": float,
          "seed": int,
          "numberOfImages": int
      }
  }
  The following imageVariationParams fields are used in this request:
  - images (Required) – A list of 1–5 images to use as references. Each must be in JPEG or PNG format and encoded as a Base64 string. See Input images for image generation for additional requirements.
  - similarityStrength (Optional) – Specifies how similar the generated image should be to the input images. Valid values are between 0.2 and 1.0, with lower values introducing more randomness.
  - text (Optional) – A text prompt to generate the image. The prompt must be 1-1024 characters in length.
  - negativeText (Optional) – A text prompt to define what not to include in the image. This value must be 1-1024 characters in length.
    Note
    Avoid using negating words (“no”, “not”, “without”, etc.) in your text and negativeText values. For example, if you do not want mirrors in an image, instead of including "no mirrors" or "without mirrors" in the text field, use the word "mirrors" in the negativeText field.
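The limits on images and similarityStrength can be checked before sending a request. The following sketch uses a hypothetical helper and placeholder strings in place of real Base64 image data.

```python
def build_image_variation_request(images_b64, similarity_strength=None, text=None):
    """Build an IMAGE_VARIATION request (hypothetical helper)."""
    if not 1 <= len(images_b64) <= 5:
        raise ValueError("images must contain 1 to 5 entries")
    params = {"images": list(images_b64)}
    if similarity_strength is not None:
        if not 0.2 <= similarity_strength <= 1.0:
            raise ValueError("similarityStrength must be between 0.2 and 1.0")
        params["similarityStrength"] = similarity_strength
    if text is not None:
        params["text"] = text
    return {"taskType": "IMAGE_VARIATION", "imageVariationParams": params}


# Placeholder strings stand in for Base64-encoded JPEG or PNG data.
request = build_image_variation_request(
    ["<Base64 image 1>", "<Base64 image 2>"],
    similarity_strength=0.6,
    text="the same scene in winter",
)
```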
- Inpainting request
  {
      "taskType": "INPAINTING",
      "inPaintingParams": {
          "image": string (Base64 encoded image),
          "maskPrompt": string,
          "maskImage": string (Base64 encoded image),
          "text": string,
          "negativeText": string
      },
      "imageGenerationConfig": {
          "numberOfImages": int,
          "quality": "standard" | "premium",
          "cfgScale": float,
          "seed": int
      }
  }
  The following inPaintingParams fields are used in this request:
  - image (Required) – The JPEG or PNG that you want to modify, formatted as a Base64 string. See Input images for image generation for additional requirements.
  - maskPrompt or maskImage (Required) – You must specify either the maskPrompt or the maskImage parameter, but not both.
    The maskPrompt is a natural language text prompt that describes the regions of the image to edit.
    The maskImage is an image that defines the areas of the image to edit. The mask image must be the same size as the input image. Areas to be edited are shaded pure black and areas to ignore are shaded pure white. No other colors are allowed in the mask image.
  - text (Optional) – A text prompt that describes what to generate within the masked region. The prompt must be 1-1024 characters in length. If you omit this field, the model will remove elements inside the masked area. They will be replaced with a seamless extension of the image background.
  - negativeText (Optional) – A text prompt to define what not to include in the image. This value must be 1-1024 characters in length.
    Note
    Avoid using negating words (“no”, “not”, “without”, etc.) in your text and negativeText values. For example, if you do not want mirrors in an image, instead of including "no mirrors" or "without mirrors" in the text field, use the word "mirrors" in the negativeText field.
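The "either maskPrompt or maskImage, but not both" rule is easy to get wrong, so it can help to enforce it where the request is built. The sketch below uses a hypothetical helper and a placeholder in place of real Base64 image data.

```python
def build_inpainting_request(image_b64, mask_prompt=None, mask_image_b64=None,
                             text=None, negative_text=None):
    """Build an INPAINTING request (hypothetical helper)."""
    # Exactly one of the two mask parameters must be supplied.
    if (mask_prompt is None) == (mask_image_b64 is None):
        raise ValueError("specify exactly one of maskPrompt or maskImage")
    params = {"image": image_b64}
    if mask_prompt is not None:
        params["maskPrompt"] = mask_prompt
    else:
        params["maskImage"] = mask_image_b64
    if text is not None:
        params["text"] = text
    if negative_text is not None:
        params["negativeText"] = negative_text
    return {"taskType": "INPAINTING", "inPaintingParams": params}


# Omitting text asks the model to remove what the mask covers and
# extend the background in its place.
request = build_inpainting_request(
    "<Base64 source image>", mask_prompt="the coffee cup on the table"
)
```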
- Outpainting request
  {
      "taskType": "OUTPAINTING",
      "outPaintingParams": {
          "image": string (Base64 encoded image),
          "maskPrompt": string,
          "maskImage": string (Base64 encoded image),
          "outPaintingMode": "DEFAULT" | "PRECISE",
          "text": string,
          "negativeText": string
      },
      "imageGenerationConfig": {
          "numberOfImages": int,
          "quality": "standard" | "premium",
          "cfgScale": float,
          "seed": int
      }
  }
  The following outPaintingParams fields are used in this request:
  - image (Required) – The JPEG or PNG that you want to modify, formatted as a Base64 string. See Input images for image generation for additional requirements.
  - maskPrompt or maskImage (Required) – You must specify either the maskPrompt or the maskImage parameter, but not both.
    The maskPrompt is a natural language text prompt that describes the regions of the image to edit.
    The maskImage is an image that defines the areas of the image to edit. The mask image must be the same size as the input image. Areas to be edited are shaded pure black and areas to ignore are shaded pure white. No other colors are allowed in the mask image.
  - outPaintingMode – Determines how the mask that you provide is interpreted.
    Use DEFAULT to transition smoothly between the masked area and the non-masked area. Some of the original pixels are used as the starting point for the new background. This mode is generally better when you want the new background to use similar colors as the original background. However, you can get a halo effect if your prompt calls for a new background that is significantly different from the original background.
    Use PRECISE to strictly adhere to the mask boundaries. This mode is generally better when you are making significant changes to the background.
  - text (Optional) – A text prompt that describes what to generate within the masked region. The prompt must be 1-1024 characters in length. If you omit this field, the model will remove elements inside the masked area. They will be replaced with a seamless extension of the image background.
  - negativeText (Optional) – A text prompt to define what not to include in the image. This value must be 1-1024 characters in length.
    Note
    Avoid using negating words (“no”, “not”, “without”, etc.) in your text and negativeText values. For example, if you do not want mirrors in an image, instead of including "no mirrors" or "without mirrors" in the text field, use the word "mirrors" in the negativeText field.
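To make the maskImage rules concrete, the sketch below generates a black-and-white mask with only the Python standard library: a white rectangle marks the area to leave untouched, and everything else is black and therefore open to editing. The function names are hypothetical. Writing the mask as an 8-bit RGB PNG is an assumption, since this section does not say which PNG color types are accepted. The 64x64 size is chosen only to keep the example small and may not be a supported resolution.

```python
import base64
import struct
import zlib


def _png_chunk(tag, data):
    """Serialize one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)


def make_mask_png(width, height, keep_box):
    """Return PNG bytes that are white inside keep_box and black elsewhere.

    keep_box is (left, top, right, bottom), with right and bottom exclusive.
    """
    left, top, right, bottom = keep_box
    rows = bytearray()
    for y in range(height):
        rows.append(0)  # filter type 0 (none) for this scanline
        for x in range(width):
            inside = left <= x < right and top <= y < bottom
            rows.extend(b"\xff\xff\xff" if inside else b"\x00\x00\x00")
    # IHDR: width, height, bit depth 8, color type 2 (RGB), no interlace.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    return (
        b"\x89PNG\r\n\x1a\n"
        + _png_chunk(b"IHDR", ihdr)
        + _png_chunk(b"IDAT", zlib.compress(bytes(rows)))
        + _png_chunk(b"IEND", b"")
    )


mask_png = make_mask_png(64, 64, keep_box=(16, 16, 48, 48))
request = {
    "taskType": "OUTPAINTING",
    "outPaintingParams": {
        "image": "<Base64 of a source image the same size as the mask>",
        "maskImage": base64.b64encode(mask_png).decode("ascii"),
        "outPaintingMode": "PRECISE",
        "text": "a sandy beach at sunset",
    },
}
```

PRECISE is used here because the prompt replaces the background entirely; DEFAULT would be the gentler choice when the new background resembles the original.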
- Background removal request
  {
      "taskType": "BACKGROUND_REMOVAL",
      "backgroundRemovalParams": {
          "image": string (Base64 encoded image)
      }
  }
  The following backgroundRemovalParams field is used in this request:
  - image (Required) – The JPEG or PNG that you want to modify, formatted as a Base64 string. See Input images for image generation for additional requirements.
  The BACKGROUND_REMOVAL task will return a PNG image with full 8-bit transparency. This format gives you smooth, clean isolation of the foreground objects and makes it easy to composite the image with other elements in an image editing app, presentation, or website. The background can easily be changed to a solid color using simple custom code.
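One way to change the transparent background to a solid color is standard alpha compositing, sketched below on plain RGBA tuples. The function name is hypothetical, and the sketch assumes the returned PNG has already been decoded into pixels by an image library of your choice (decoding is not shown).

```python
def flatten_onto_color(rgba_pixels, background):
    """Alpha-composite RGBA pixels over an opaque RGB background color."""
    flattened = []
    for r, g, b, a in rgba_pixels:
        alpha = a / 255  # 8-bit alpha: 0 is transparent, 255 is opaque
        flattened.append(tuple(
            round(channel * alpha + bg * (1 - alpha))
            for channel, bg in zip((r, g, b), background)
        ))
    return flattened


pixels = [
    (255, 0, 0, 255),  # opaque red foreground pixel
    (255, 0, 0, 0),    # fully transparent pixel (removed background)
]
result = flatten_onto_color(pixels, background=(255, 255, 255))
```

Opaque pixels keep their own color, fully transparent pixels take the background color, and partially transparent edge pixels blend between the two.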
- Response body
  {
      "images": string[] (list of Base64 encoded images),
      "error": string
  }
  The response body will contain one or more of the following fields:
  - images – When successful, a list of Base64-encoded strings, one for each generated image, is returned. This list does not always contain the same number of images that you requested. Individual images might be blocked after generation if they do not align with the AWS Responsible AI (RAI) content moderation policy. Only images that align with the RAI policy are returned.
  - error – If any image does not align with the RAI policy, this field is returned. Otherwise, this field is omitted from the response.
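A response body with these two fields could be handled along the following lines. This is a sketch with a hypothetical helper name and a locally constructed sample body; it treats both fields as optional because either may be absent.

```python
import base64
import json


def parse_image_response(body_json):
    """Return (list of decoded image bytes, error message or None)."""
    body = json.loads(body_json)
    images = [base64.b64decode(item) for item in body.get("images", [])]
    return images, body.get("error")


# Sample body built locally for illustration; the bytes are not a real image.
sample = json.dumps(
    {"images": [base64.b64encode(b"fake image bytes").decode("ascii")]}
)
images, error = parse_image_response(sample)
```

Because blocked images are dropped from the list, callers may want to compare `len(images)` with the numberOfImages they requested and inspect `error` when the counts differ.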
The imageGenerationConfig field is common to all task types except BACKGROUND_REMOVAL. It is optional and contains the following fields. If you omit this object, the default configurations are used.
- width and height (Optional) – Define the size and aspect ratio of the generated image. Both default to 1024. For the full list of supported resolutions, see Supported image resolutions.
- quality (Optional) – Specifies the quality to use when generating the image: "standard" (default) or "premium".
- cfgScale (Optional) – Specifies how strongly the generated image should adhere to the prompt. Use a lower value to introduce more randomness in the generation.
    Minimum   Maximum   Default
    1.1       10        6.5
- numberOfImages (Optional) – The number of images to generate.
    Minimum   Maximum   Default
    1         5         1
- seed (Optional) – Determines the initial noise setting for the generation process. Changing the seed value while leaving all other parameters the same will produce a totally new image that still adheres to your prompt, dimensions, and other settings. It is common to experiment with a variety of seed values to find the perfect image.
    Minimum   Maximum       Default
    0         858,993,459   12
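The defaults and ranges listed above can be captured in a small validation helper, sketched here under the assumption that client-side checks are wanted at all. The helper name is hypothetical, and width and height are not checked against the supported resolutions because that list is not part of this section.

```python
# Defaults as listed in this section.
DEFAULTS = {
    "width": 1024,
    "height": 1024,
    "quality": "standard",
    "cfgScale": 6.5,
    "seed": 12,
    "numberOfImages": 1,
}


def build_image_generation_config(**overrides):
    """Merge overrides into the defaults and check the documented ranges."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    config = {**DEFAULTS, **overrides}
    if config["quality"] not in ("standard", "premium"):
        raise ValueError('quality must be "standard" or "premium"')
    if not 1.1 <= config["cfgScale"] <= 10:
        raise ValueError("cfgScale must be between 1.1 and 10")
    if not 1 <= config["numberOfImages"] <= 5:
        raise ValueError("numberOfImages must be between 1 and 5")
    if not 0 <= config["seed"] <= 858_993_459:
        raise ValueError("seed must be between 0 and 858,993,459")
    return config


config = build_image_generation_config(quality="premium", seed=42)
```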
Important
Resolution (width and height), numberOfImages, and quality all have an impact on the time it takes for generation to complete. The AWS SDK has a default read_timeout of 60 seconds, which can easily be exceeded when using higher values for these parameters. Therefore, it is recommended that you increase the read_timeout of your invocation calls to at least 5 minutes (300 seconds). The code examples demonstrate how to do this.
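With the AWS SDK for Python, a longer read timeout can be set through a botocore `Config` object, as in the sketch below. The function name and the `READ_TIMEOUT_SECONDS` constant are hypothetical, and the `"bedrock-runtime"` service name is an assumption, since this section does not name the service. The imports sit inside the function so the snippet can be loaded without the SDK installed.

```python
READ_TIMEOUT_SECONDS = 300  # 5 minutes, per the recommendation above


def make_image_client(region_name, read_timeout=READ_TIMEOUT_SECONDS):
    """Create a client whose read timeout allows for slow generations."""
    # Imported lazily; requires the boto3 package.
    import boto3
    from botocore.config import Config

    return boto3.client(
        "bedrock-runtime",  # assumed service name, not stated in this section
        region_name=region_name,
        config=Config(read_timeout=read_timeout),
    )
```

A caller would then create the client once, for example `client = make_image_client("us-east-1")`, and reuse it for every invocation.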