Stability.ai Diffusion 1.0 image to image
The Stability.ai Diffusion 1.0 model has the following inference parameters and
model response for making image to image inference calls.
Request and Response
The request body is passed in the body
field of a request to
InvokeModel or InvokeModelWithResponseStream.
For more information,
see https://platform.stability.ai/docs/api-reference#tag/v1generation/operation/imageToImage.
- Request
The Stability.ai Diffusion 1.0 model has the following inference parameters for an image to image inference call.
{
    "text_prompts": [
        {
            "text": string,
            "weight": float
        }
    ],
    "init_image": string,
    "init_image_mode": string,
    "image_strength": float,
    "cfg_scale": float,
    "clip_guidance_preset": string,
    "sampler": string,
    "samples": int,
    "seed": int,
    "steps": int,
    "style_preset": string,
    "extras": json object
}
The following are required parameters.
- text_prompts – (Required) An array of text prompts to use for generation. Each element is a JSON object that contains a prompt and a weight for the prompt.
- text – The prompt that you want to pass to the model.
- weight – (Optional) The weight that the model should apply to the prompt. A value that is less than zero declares a negative prompt. Use a negative prompt to tell the model to avoid certain concepts. The default value for weight is one.
- init_image – (Required) The base64 encoded image that you want to use to initialize the diffusion process.
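As a sketch of how these required parameters fit together, the snippet below builds a text_prompts array that pairs a positive prompt with a negatively weighted one (the prompt strings are illustrative, not from the documentation):

```python
import json

# Illustrative prompts; a weight below zero marks a negative prompt,
# which tells the model to avoid those concepts.
text_prompts = [
    {"text": "a lighthouse at dusk, oil painting", "weight": 1.0},
    {"text": "blurry, low quality", "weight": -1.0},
]

print(json.dumps({"text_prompts": text_prompts}, indent=2))
```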
The following are optional parameters.
- init_image_mode – (Optional) Determines whether to use image_strength or step_schedule_* to control how much influence the image in init_image has on the result. Possible values are IMAGE_STRENGTH or STEP_SCHEDULE. The default is IMAGE_STRENGTH.
- image_strength – (Optional) Determines how much influence the source image in init_image has on the diffusion process. Values close to 1 yield images very similar to the source image. Values close to 0 yield images very different from the source image.
- cfg_scale – (Optional) Determines how much the final image portrays the prompt. Use a lower number to increase randomness in the generation.

  Default | Minimum | Maximum
  7 | 0 | 35
- clip_guidance_preset – (Optional) Enum: FAST_BLUE, FAST_GREEN, NONE, SIMPLE, SLOW, SLOWER, SLOWEST.
- sampler – (Optional) The sampler to use for the diffusion process. If this value is omitted, the model automatically selects an appropriate sampler for you. Enum: DDIM, DDPM, K_DPMPP_2M, K_DPMPP_2S_ANCESTRAL, K_DPM_2, K_DPM_2_ANCESTRAL, K_EULER, K_EULER_ANCESTRAL, K_HEUN, K_LMS.
- samples – (Optional) The number of images to generate. Currently Amazon Bedrock supports generating one image. If you supply a value for samples, the value must be one.

  Default | Minimum | Maximum
  1 | 1 | 1
- seed – (Optional) The seed determines the initial noise setting. Use the same seed and the same settings as a previous run to allow inference to create a similar image. If you don't set this value, or the value is 0, it is set as a random number.

  Default | Minimum | Maximum
  0 | 0 | 4294967295
- steps – (Optional) Generation step determines how many times the image is sampled. More steps can result in a more accurate result.

  Default | Minimum | Maximum
  30 | 10 | 50
- style_preset – (Optional) A style preset that guides the image model towards a particular style. This list of style presets is subject to change. Enum: 3d-model, analog-film, anime, cinematic, comic-book, digital-art, enhance, fantasy-art, isometric, line-art, low-poly, modeling-compound, neon-punk, origami, photographic, pixel-art, tile-texture
- extras – (Optional) Extra parameters passed to the engine. Use with caution. These parameters are used for in-development or experimental features and might change without warning.
- Response
The Stability.ai Diffusion 1.0 model returns the following fields for an image to image inference call.
{
"result": string,
"artifacts": [
{
"seed": int,
"base64": string,
"finishReason": string
}
]
}
- result – The result of the operation. If successful, the response is success.
- artifacts – An array of images, one for each requested image.
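A short sketch of handling these response fields, using a fabricated response body in place of an actual InvokeModel result:

```python
import base64

# Fabricated response body standing in for json.loads(response["body"].read()).
response_body = {
    "result": "success",
    "artifacts": [
        {
            "seed": 12345,
            "base64": base64.b64encode(b"fake-image-bytes").decode("ascii"),
            "finishReason": "SUCCESS",  # illustrative value
        }
    ],
}

artifact = response_body["artifacts"][0]
if artifact["finishReason"] in ("ERROR", "CONTENT_FILTERED"):
    raise RuntimeError(f"Generation failed: {artifact['finishReason']}")

# Decode the base64 payload back into raw image bytes.
image_bytes = base64.b64decode(artifact["base64"])
print(response_body["result"], len(image_bytes))
```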
Code example
The following example shows how to run inference with the Stability.ai Diffusion 1.0 model and on-demand throughput. The example submits a text prompt and a reference image to the model, retrieves the response, and displays the generated image.
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: Apache-2.0
"""
Shows how to generate an image from a reference image with SDXL 1.0 (on demand).
"""
import base64
import io
import json
import logging

import boto3
from PIL import Image
from botocore.exceptions import ClientError


class ImageError(Exception):
    "Custom exception for errors returned by SDXL"

    def __init__(self, message):
        self.message = message


logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)


def generate_image(model_id, body):
    """
    Generate an image using SDXL 1.0 on demand.
    Args:
        model_id (str): The model ID to use.
        body (str): The request body to use.
    Returns:
        image_bytes (bytes): The image generated by the model.
    """
    logger.info("Generating image with SDXL model %s", model_id)

    bedrock = boto3.client(service_name='bedrock-runtime')

    accept = "application/json"
    content_type = "application/json"

    response = bedrock.invoke_model(
        body=body, modelId=model_id, accept=accept, contentType=content_type
    )
    response_body = json.loads(response.get("body").read())
    print(response_body['result'])

    base64_image = response_body.get("artifacts")[0].get("base64")
    base64_bytes = base64_image.encode('ascii')
    image_bytes = base64.b64decode(base64_bytes)

    finish_reason = response_body.get("artifacts")[0].get("finishReason")
    if finish_reason in ('ERROR', 'CONTENT_FILTERED'):
        raise ImageError(f"Image generation error. Error code is {finish_reason}")

    logger.info("Successfully generated image with the SDXL 1.0 model %s", model_id)

    return image_bytes


def main():
    """
    Entrypoint for SDXL example.
    """
    logging.basicConfig(level=logging.INFO,
                        format="%(levelname)s: %(message)s")

    model_id = 'stability.stable-diffusion-xl-v1'
    prompt = """A space ship."""

    # Read the reference image from file and encode it as a base64 string.
    with open("/path/to/image", "rb") as image_file:
        init_image = base64.b64encode(image_file.read()).decode('utf8')

    # Create request body.
    body = json.dumps({
        "text_prompts": [
            {
                "text": prompt
            }
        ],
        "init_image": init_image,
        "style_preset": "isometric"
    })

    try:
        image_bytes = generate_image(model_id=model_id, body=body)
        image = Image.open(io.BytesIO(image_bytes))
        image.show()
    except ClientError as err:
        message = err.response["Error"]["Message"]
        logger.error("A client error occurred: %s", message)
        print("A client error occurred: " + format(message))
    except ImageError as err:
        logger.error(err.message)
        print(err.message)
    else:
        print(f"Finished generating image with SDXL model {model_id}.")


if __name__ == "__main__":
    main()