Note
Image content filters for Amazon Bedrock Guardrails are in preview release and are subject to change.
Block harmful images with content filters (Preview)
Amazon Bedrock Guardrails can help block inappropriate or harmful images by enabling image as a modality when you configure content filters in a guardrail.
Prerequisites and Limitations
- Support for detecting and blocking harmful images in content filters is currently in preview and is not recommended for production workloads.
- This capability is supported only for images, and is not supported for images with embedded video content.
- This capability is supported only for the Hate, Insults, Sexual, and Violence categories within content filters, and not for any other categories, including Misconduct and Prompt attacks.
- Users can upload images up to a maximum size of 4 MB each, with a maximum of 20 images in a single request (see the validation sketch after this list).
- Only PNG and JPEG formats are supported for image content.
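If you assemble requests programmatically, you may want to enforce these limits client-side before calling the service. The following is a minimal sketch; the helper name validate_images and the extension-based format check are illustrative assumptions, not part of any Bedrock SDK.

import os

MAX_IMAGE_BYTES = 4 * 1024 * 1024   # 4 MB per image (documented preview limit)
MAX_IMAGES_PER_REQUEST = 20         # documented preview limit
SUPPORTED_EXTENSIONS = {'.png', '.jpg', '.jpeg'}  # only PNG and JPEG are supported

def validate_images(image_paths):
    """Check a batch of image files against the documented preview limits
    before sending them to Amazon Bedrock Guardrails."""
    if len(image_paths) > MAX_IMAGES_PER_REQUEST:
        raise ValueError(f"At most {MAX_IMAGES_PER_REQUEST} images are allowed per request")
    for path in image_paths:
        ext = os.path.splitext(path)[1].lower()
        if ext not in SUPPORTED_EXTENSIONS:
            raise ValueError(f"{path}: only PNG and JPEG formats are supported")
        if os.path.getsize(path) > MAX_IMAGE_BYTES:
            raise ValueError(f"{path}: images must be 4 MB or smaller")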
Overview
The detection and blocking of harmful images is supported for the Hate, Insults, Sexual, and Violence categories within content filters, and for images without any text in them. In addition to text, users can select the image modality for the above categories within content filters while creating a guardrail, and set the respective filtering strength to NONE, LOW, MEDIUM, or HIGH. These thresholds are common to both text and image content for these categories if both text and image are selected. Guardrails evaluates images sent as input by users, or generated as output in model responses.
The four supported categories for detection of harmful image content are described below:
- Hate – Describes content that discriminates, criticizes, insults, denounces, or dehumanizes a person or group on the basis of an identity (such as race, ethnicity, gender, religion, sexual orientation, ability, or national origin). It also includes graphic and real-life visual content displaying symbols of hate groups, hateful symbols, and imagery associated with various organizations promoting discrimination, racism, and intolerance.
- Insults – Describes content that includes demeaning, humiliating, mocking, insulting, or belittling language. This type of language is also labeled as bullying. It also encompasses various forms of rude, disrespectful, or offensive hand gestures intended to express contempt, anger, or disapproval.
- Sexual – Describes content that indicates sexual interest, activity, or arousal using direct or indirect references to body parts, physical traits, or sex. It also includes images displaying private parts and sexual activity involving intercourse. This category also encompasses cartoons, anime, drawings, sketches, and other illustrated content with sexual themes.
- Violence – Describes content that includes glorification of, or threats to inflict, physical pain, hurt, or injury toward a person, group, or thing.
The Amazon Bedrock Guardrails image content filter is supported in the following AWS Regions:

| Region |
|---|
| US East (N. Virginia) |
| US East (Ohio) |
| US West (Oregon) |
| AWS GovCloud (US-West) |
| Europe (Ireland) (gated access) |
| Europe (London) |
| Europe (Frankfurt) |
| Asia Pacific (Mumbai) |
| Asia Pacific (Seoul) |
| Asia Pacific (Tokyo) |
| Asia Pacific (Singapore) (gated access) |
| Asia Pacific (Sydney) |
You can use the Amazon Bedrock Guardrails image content filter with the following models; a short invocation sketch follows the table:

| Model name | Model ID |
|---|---|
| Titan Image Generator G1 | amazon.titan-image-generator-v1 |
| Titan Image Generator G1 v2 | amazon.titan-image-generator-v2:0 |
| SD3 Large 1.0 | stability.sd3-large-v1:0 |
| SDXL 1.0 | stability.stable-diffusion-xl-v1 |
| Stable Image Core 1.0 | stability.stable-image-core-v1 |
| Stable Image Ultra 1.0 | stability.stable-image-ultra-v1 |
| Anthropic Claude 3 Haiku | anthropic.claude-3-haiku-20240307-v1:0 |
| Anthropic Claude 3 Opus | anthropic.claude-3-opus-20240229-v1:0 |
| Anthropic Claude 3 Sonnet | anthropic.claude-3-sonnet-20240229-v1:0 |
| Anthropic Claude 3.5 Sonnet | anthropic.claude-3-5-sonnet-20240620-v1:0 |
| Llama 3.2 11B Instruct | meta.llama3-2-11b-instruct-v1:0 |
| Llama 3.2 90B Instruct | meta.llama3-2-90b-instruct-v1:0 |
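For example, you can attach a guardrail with image content filters enabled when invoking one of these models. The following is a minimal sketch using the Converse API; the guardrail ID and image path are placeholders you would replace with your own values.

import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

with open("/path/to/image.jpg", "rb") as f:
    image_bytes = f.read()

# The guardrail evaluates both the text and the image in the request,
# as well as the model's response, using the configured content filters.
response = bedrock_runtime.converse(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{
        "role": "user",
        "content": [
            {"text": "Describe this image."},
            {"image": {"format": "jpeg", "source": {"bytes": image_bytes}}},
        ],
    }],
    guardrailConfig={
        "guardrailIdentifier": "your-guardrail-id",  # placeholder
        "guardrailVersion": "DRAFT",
    },
)
print(response["output"]["message"]["content"])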
Topics
Using the image content filter
Creating or updating a Guardrail with content filters for images
While creating a new guardrail or updating an existing one, you will now see an option to select image (in preview) in addition to the existing text option. The image option is available for the Hate, Insults, Sexual, and Violence categories. By default, the text option is enabled; the image option must be enabled explicitly. You can choose both text and image, or either one, depending on your use case.
Filter classification and blocking levels
Filtering is done based on the confidence classification of user inputs and FM responses. All user inputs and model responses are classified across four strength levels: None, Low, Medium, and High. The filter strength determines the sensitivity of filtering harmful content; as the filter strength increases, the likelihood of filtering harmful content increases and the probability of seeing harmful content in your application decreases. For example, a High filter strength blocks content classified as harmful with Low, Medium, or High confidence, while a Low filter strength blocks only content classified as harmful with High confidence. When both the image and text options are selected, the same filter strength is applied to both modalities for a particular category.
To configure image and text filters for harmful categories, select Configure harmful categories filter.
Note
Image content filters are in preview and are not available if the model doesn't use images in its prompts or responses.
- Select Text and/or Image to filter text or image content in prompts or responses to and from the model.
- Select None, Low, Medium, or High for the level of filtration you want to apply to each category. A setting of High helps block the most text or images that apply to that category of the filter.
- Select Use the same harmful categories filters for responses to use the same filter settings you used for prompts. You can also choose to have different filter levels for prompts and responses by not selecting this option. Select Reset threshold to reset all the filter levels for prompts or responses.
- Select Review and create or Next to create the guardrail.
Configuring content filters for images with API
You can use the guardrail API to configure the image content filter in Amazon Bedrock Guardrails. The example below shows an Amazon Bedrock Guardrails filter with different harmful content categories and filter strengths applied. You can use this template as an example for your own use case.
In the contentPolicyConfig field of the request, filtersConfig is a list of filter objects, as shown in the following example.
Example Python Boto3 code for creating a Guardrail with Image Content Filters
import boto3
import botocore
import json

def main():
    bedrock = boto3.client('bedrock', region_name='us-east-1')

    try:
        create_guardrail_response = bedrock.create_guardrail(
            name='my-image-guardrail',
            contentPolicyConfig={
                'filtersConfig': [
                    {
                        'type': 'SEXUAL',
                        'inputStrength': 'HIGH',
                        'outputStrength': 'HIGH',
                        # IMAGE is supported only for the Hate, Insults,
                        # Sexual, and Violence categories
                        'inputModalities': ['TEXT', 'IMAGE'],
                        'outputModalities': ['TEXT', 'IMAGE']
                    },
                    {
                        'type': 'VIOLENCE',
                        'inputStrength': 'HIGH',
                        'outputStrength': 'HIGH',
                        'inputModalities': ['TEXT', 'IMAGE'],
                        'outputModalities': ['TEXT', 'IMAGE']
                    },
                    {
                        'type': 'HATE',
                        'inputStrength': 'HIGH',
                        'outputStrength': 'HIGH',
                        'inputModalities': ['TEXT', 'IMAGE'],
                        'outputModalities': ['TEXT', 'IMAGE']
                    },
                    {
                        'type': 'INSULTS',
                        'inputStrength': 'HIGH',
                        'outputStrength': 'HIGH',
                        'inputModalities': ['TEXT', 'IMAGE'],
                        'outputModalities': ['TEXT', 'IMAGE']
                    },
                    {
                        # Misconduct and prompt attacks support text only
                        'type': 'MISCONDUCT',
                        'inputStrength': 'HIGH',
                        'outputStrength': 'HIGH',
                        'inputModalities': ['TEXT'],
                        'outputModalities': ['TEXT']
                    },
                    {
                        'type': 'PROMPT_ATTACK',
                        'inputStrength': 'HIGH',
                        'outputStrength': 'NONE',
                        'inputModalities': ['TEXT'],
                        'outputModalities': ['TEXT']
                    }
                ]
            },
            blockedInputMessaging='Sorry, the model cannot answer this question.',
            blockedOutputsMessaging='Sorry, the model cannot answer this question.',
        )

        # createdAt is a datetime, which isn't JSON serializable, so format it first
        create_guardrail_response['createdAt'] = create_guardrail_response['createdAt'].strftime('%Y-%m-%d %H:%M:%S')

        print("Successfully created guardrail with details:")
        print(json.dumps(create_guardrail_response, indent=2))
    except botocore.exceptions.ClientError as err:
        print("Failed while calling CreateGuardrail API with RequestId = " + err.response['ResponseMetadata']['RequestId'])
        raise err

if __name__ == "__main__":
    main()
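To add the image modality to an existing guardrail instead, you can call the UpdateGuardrail API with an updated contentPolicyConfig. The following is a minimal sketch under the assumption that UpdateGuardrail replaces the existing configuration with what you pass (confirm against the API reference); the guardrail ID is a placeholder.

import boto3

bedrock = boto3.client('bedrock', region_name='us-east-1')

# Pass the full target configuration, including every filter you want to keep.
bedrock.update_guardrail(
    guardrailIdentifier='your-guardrail-id',  # placeholder
    name='my-image-guardrail',
    contentPolicyConfig={
        'filtersConfig': [
            {
                'type': 'VIOLENCE',
                'inputStrength': 'HIGH',
                'outputStrength': 'HIGH',
                # Adding IMAGE here enables the image filter for this category
                'inputModalities': ['TEXT', 'IMAGE'],
                'outputModalities': ['TEXT', 'IMAGE']
            }
        ]
    },
    blockedInputMessaging='Sorry, the model cannot answer this question.',
    blockedOutputsMessaging='Sorry, the model cannot answer this question.',
)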
Configuring the image filter to work with the ApplyGuardrail API
You can use content filters for both image and text content with the ApplyGuardrail API. This option lets you apply the content filter settings without invoking an Amazon Bedrock model. You can update the request payload in the script below for various models by following the inference parameters documentation for each Bedrock foundation model supported by Amazon Bedrock Guardrails.
import boto3
import botocore
import json

guardrail_id = 'guardrail-id'
guardrail_version = 'DRAFT'
content_source = 'INPUT'
image_path = '/path/to/image.jpg'

with open(image_path, 'rb') as image:
    image_bytes = image.read()

content = [
    {
        "text": {
            "text": "Hi, can you explain this image art to me?"
        }
    },
    {
        "image": {
            "format": "jpeg",
            "source": {
                "bytes": image_bytes
            }
        }
    }
]
def main():
    bedrock_runtime_client = boto3.client("bedrock-runtime", region_name="us-east-1")

    try:
        print("Making a call to ApplyGuardrail API now")
        response = bedrock_runtime_client.apply_guardrail(
            guardrailIdentifier=guardrail_id,
            guardrailVersion=guardrail_version,
            source=content_source,
            content=content
        )
        print("Received response from ApplyGuardrail API:")
        print(json.dumps(response, indent=2))
    except botocore.exceptions.ClientError as err:
        print("Failed while calling ApplyGuardrail API with RequestId = " + err.response['ResponseMetadata']['RequestId'])
        raise err

if __name__ == "__main__":
    main()
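The response includes an action field that indicates whether the guardrail intervened, along with per-policy assessments. The following is a sketch of how you might inspect it; the exact assessment fields are best confirmed against the ApplyGuardrail API reference.

# Inspect the outcome of the ApplyGuardrail call
if response['action'] == 'GUARDRAIL_INTERVENED':
    # Each assessment lists the content filters that matched,
    # including the category and the classifier's confidence.
    for assessment in response.get('assessments', []):
        for f in assessment.get('contentPolicy', {}).get('filters', []):
            print(f"Matched {f['type']} filter (confidence: {f['confidence']})")
else:
    print("Guardrail did not intervene")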