Create a guardrail
You create a guardrail by setting up the configurations, defining topics to deny, providing filters to handle harmful and sensitive content, and writing messages for when prompts and model responses are blocked.
A guardrail must contain at least one filter and messaging for when prompts and model responses are blocked. You can opt to use the default messaging. You can add filters and iterate on your guardrail later by following the steps at Modify a guardrail.
Choose the tab for your preferred method, and then follow the steps:
- Console
-
To create a guardrail in the AWS Console
-
Sign in to the AWS Management Console using an IAM role with Amazon Bedrock permissions, and open the Amazon Bedrock console at https://console.aws.amazon.com/bedrock/.
-
From the left navigation pane, select Guardrails.
-
In the Guardrails section, select Create guardrail.
-
On the Provide guardrail details page, do the following:
-
In the Guardrail details section, provide a Name and optional Description for the guardrail.
-
For Blocked messaging for prompts, enter the message to display when the guardrail blocks a prompt. To display the same message when the guardrail blocks a model response, select the Use the same blocked message for responses checkbox.
-
(Optional) By default, your guardrail is encrypted with an AWS managed key. To use your own customer-managed KMS key, select the right arrow next to KMS key selection and select the Customize encryption settings (advanced) checkbox. You can select an existing AWS KMS key or select Create an AWS KMS key to create a new one.
-
For Guardrail creation options, select Quick create with toxicity filters to use the default settings, or select Create your own guardrail to customize your guardrail settings. You can also select View and edit toxicity filters to view or customize the profanity filter and prompt attack filter settings for your guardrail.
-
(Optional) To add tags to your guardrail, select the right arrow next to Tags. Then, select Add new tag and define key-value pairs for your tags. For more information, see Tagging Amazon Bedrock resources.
-
Choose Next.
Note
You must configure at least one filter to create a guardrail. You can then select Create to skip the creation of other filters.
-
-
(Optional) On the Configure content filters page, set up how strongly you want to filter out content related to the categories defined in Block harmful words and conversations with content filters by doing the following:
-
To configure filters for harmful categories, select Configure harmful categories filter. Select Text and/or Image to filter text or image content from prompts or responses to the model. Select None, Low, Medium, or High for the level of filtration you want to apply to each category. You can choose to have different filter levels for prompts or responses. You can select the filter for prompt attacks in the harmful categories. Configure how strict you want each filter to be for prompts that the user provides to the model.
-
To configure filters for prompt attacks, select Enable prompt attacks filter. Configure how strictly you want the filter to detect and block jailbreak and prompt injection attacks.
-
Select Create to create the guardrail or select Use advanced filters to customize the filter settings.
-
-
(Optional) On the Add denied topics page, you can add denied topics or select Skip to Review and create.
-
To define a topic to block, select Add denied topic. Then do the following:
-
Enter a Name for the topic.
-
In the Definition for topic box, define the topic. For guidelines on how to define a denied topic, see Block denied topics to help remove harmful content.
-
(Optional) To add representative input prompts or model responses related to this topic, select the right arrow next to Add sample phrases. Enter a phrase in the box. To add another phrase, select Add phrase.
-
When you're done configuring the denied topic, select Confirm.
-
-
You can perform the following actions on denied topics:
-
To add another topic, select Add denied topic.
-
To edit a topic, select the three dots icon in the same row as the topic in the Actions column. Then select Edit. After you are finished editing, select Confirm.
-
To delete a topic or topics, select the checkboxes for the topics to delete. Select Delete and then select Delete selected.
-
To delete all the topics, select Delete and then select Delete all.
-
To configure the size of each page in the table or the column display in the table, select the settings icon. Set your preferences and then select Confirm.
-
-
When you are finished configuring denied topics, select Next.
-
-
(Optional) On the Add word filters page, do the following:
-
In the Filter profanity section, select Filter profanity to block profanity in prompts and responses. The list of profanity is based on conventional definitions and is continually updated.
-
In the Add custom words and phrases section, select how to add words and phrases for the guardrail to block. If you select to upload a file, each line in the file should contain one word or a phrase of up to three words. Don't include a header. You have the following options:
Option | Instructions
Add words and phrases manually | Directly add words and phrases in the View and edit words and phrases section.
Upload from a local file | To upload a .txt or .csv file containing the words and phrases, select Choose file after selecting this option.
Upload from Amazon S3 object | To upload a file from Amazon S3, specify the S3 object after selecting this option. Each line in the file should contain one word or a phrase of up to three words.
-
You edit the words and phrases for the guardrail to block in the View and edit words and phrases section. You have the following options:
-
If you uploaded a word list from a local file or Amazon S3 object, this section will populate with your word list. To filter for items with errors, select Show errors.
-
To add an item to the word list, select Add word or phrase. Enter a word or a phrase of up to three words in the box and press Enter or select the checkmark icon to confirm the item.
-
To edit an item, select the edit icon next to the item.
-
To delete an item from the word list, select the trash can icon or, if you're editing an item, select the delete icon next to the item.
-
To delete items that contain errors, select Delete all and then select Delete all rows with error.
-
To delete all items, select Delete all and then select Delete all rows.
-
To search for an item, enter an expression in the search bar.
-
To show only items with errors, select the dropdown menu labeled Show all and select Show errors only.
-
To configure the size of each page in the table or the column display in the table, select the settings icon. Set your preferences and then select Confirm.
-
By default, this section displays the Table editor. To switch to a text editor in which you can enter a word or phrase in each line, select Text editor. The Text editor provides the following features:
-
You can copy a word list from another text editor and paste it into this editor.
-
A red X icon appears next to items containing errors, and a list of errors appears below the editor.
-
-
-
Select Skip to review and create to create the guardrail, or select Next to add filters for PII and regex patterns.
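The file rules for custom words and phrases (one word or a phrase of up to three words per line, no header) can be checked locally before you upload. The following is a minimal sketch, not part of any AWS SDK: the function name validate_word_list is hypothetical, treating empty lines as errors is an assumption, and a header line cannot be detected automatically. The returned entries are shaped like the wordsConfig items in the CreateGuardrail request.

```python
def validate_word_list(lines, max_words=3):
    """Split a custom word list into valid entries and errors.

    entries: list of {"text": ...} dicts shaped like wordsConfig items.
    errors:  list of (line_number, reason) tuples, 1-indexed.
    """
    entries, errors = [], []
    for number, raw in enumerate(lines, start=1):
        phrase = " ".join(raw.split())  # trim and collapse whitespace
        if not phrase:
            # Assumption of this sketch: empty lines count as errors.
            errors.append((number, "empty line"))
        elif len(phrase.split(" ")) > max_words:
            errors.append((number, f"more than {max_words} words"))
        else:
            entries.append({"text": phrase})
    return entries, errors


sample = ["badword", "  two  words ", "this has four words", ""]
entries, errors = validate_word_list(sample)
print(entries)
print(errors)
```

In practice you would pass the lines of your .txt or .csv file, for example open(path).read().splitlines(), and fix the reported lines before uploading.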
-
-
(Optional) On the Add sensitive information filters page, configure filters to block or mask sensitive information. For more information, see Remove PII from conversations by using sensitive information filters. Do the following:
-
In the PII types section, configure the personally identifiable information (PII) categories to block or mask. You have the following options:
-
To add a PII type, select Add a PII type. Then, do the following:
-
In the Type column, select a PII type.
-
In the Guardrail behavior column, select whether the guardrail should Block content containing the PII type or Mask it with an identifier.
-
-
To add all PII types, select the dropdown arrow next to Add a PII type. Then select the guardrail behavior to apply to them.
Warning
If you specify a behavior, any existing behavior that you configured for PII types will be overwritten.
-
To delete a PII type, select the trash can icon.
-
To delete rows that contain errors, select Delete all and then select Delete all rows with error.
-
To delete all PII types, select Delete all and then select Delete all rows.
-
To search for a row, enter an expression in the search bar.
-
To show only rows with errors, select the dropdown menu labeled Show all and select Show errors only.
-
To configure the size of each page in the table or the column display in the table, select the settings icon. Set your preferences and then select Confirm.
-
-
In the Regex patterns section, use regular expressions to define patterns for the guardrail to filter. You have the following options:
-
To add a pattern, select Add regex pattern. Configure the following fields:
Field | Description
Name | A name for the pattern
Regex pattern | A regular expression that defines the pattern
Guardrail behavior | Choose whether to Block content containing the pattern or to Mask it with an identifier. To mask the pattern only in logs, select None.
Add description | (Optional) Write a description for the pattern
-
To edit a pattern, select the three dots icon in the same row as the pattern in the Actions column. Then select Edit. After you are finished editing, select Confirm.
-
To delete a pattern or patterns, select the checkboxes for the patterns to delete. Select Delete and then select Delete selected.
-
To delete all the patterns, select Delete and then select Delete all.
-
To search for a pattern, enter an expression in the search bar.
-
To configure the size of each page in the table or the column display in the table, select the settings icon. Set your preferences and then select Confirm.
-
-
When you finish configuring sensitive information filters, select Next or Skip to review and create.
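The PII types and regex patterns configured on this page correspond to the piiEntitiesConfig and regexesConfig lists in the CreateGuardrail request. The sketch below builds those lists in Python under several assumptions: the helper names are hypothetical, the action values BLOCK and ANONYMIZE and the PII type EMAIL are not listed in this section and should be checked against the API reference, and the booking-reference pattern is invented for illustration. The re.compile call only verifies that the pattern is valid in Python's regex dialect, which may differ from what the service accepts.

```python
import re

# Assumed action values; this section only shows "action": "string".
ALLOWED_ACTIONS = {"BLOCK", "ANONYMIZE"}


def pii_entity(pii_type, action):
    """Build one piiEntitiesConfig item."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    return {"type": pii_type, "action": action}


def regex_filter(name, pattern, action, description=None):
    """Build one regexesConfig item after a local syntax check."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    re.compile(pattern)  # raises re.error if the pattern is malformed
    item = {"name": name, "regex": pattern, "action": action}
    if description:
        item["description"] = description
    return item


sensitive_policy = {
    "piiEntitiesConfig": [pii_entity("EMAIL", "ANONYMIZE")],
    "regexesConfig": [
        regex_filter(
            "booking-id",
            r"\bBK-\d{6}\b",  # hypothetical booking reference format
            "BLOCK",
            "Hypothetical booking reference",
        )
    ],
}
print(sensitive_policy)
```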
-
-
(Optional) On the Add contextual grounding check page, configure thresholds to block ungrounded or irrelevant information.
Note
For each type of check, you can move the slider or input a threshold value from 0 to 0.99. Select an appropriate threshold for your uses. A higher threshold requires responses to be grounded or relevant with a high degree of confidence to be allowed. Responses below the threshold will be filtered. To learn more about contextual grounding check, see Use contextual grounding check to filter hallucinations in responses.
-
In the Grounding field, select Enable grounding check to check if model responses are grounded.
-
In the Relevance field, select Enable relevance check to check if model responses are relevant.
-
Select Next.
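The threshold behavior described in the note above can be illustrated with a few lines of Python. This is only a sketch of the stated rule (thresholds range from 0 to 0.99, and responses below the threshold are filtered); the function names and the idea of a single numeric score per response are assumptions made for illustration, not the service's implementation.

```python
def validate_threshold(value):
    """Accept only thresholds in the documented range of 0 to 0.99."""
    if not 0 <= value <= 0.99:
        raise ValueError("threshold must be between 0 and 0.99")
    return float(value)


def is_allowed(score, threshold):
    """Return True when the score meets the threshold.

    Responses that score below the threshold are filtered.
    """
    return score >= validate_threshold(threshold)


print(is_allowed(0.80, 0.75))  # score meets the threshold
print(is_allowed(0.60, 0.75))  # score falls below the threshold
```

A higher threshold lets fewer responses through, which is why the note recommends choosing a value that suits your use case.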
-
-
Review and create – Review the settings for your guardrail.
-
Select Edit in any section you want to make changes to.
-
When you are satisfied with the settings for your guardrail, select Create to create the guardrail.
-
-
- API
-
To create a guardrail, send a CreateGuardrail request. The request format is as follows:
POST /guardrails HTTP/1.1
Content-type: application/json

{
    "blockedInputMessaging": "string",
    "blockedOutputsMessaging": "string",
    "contentPolicyConfig": {
        "filtersConfig": [
            {
                "inputStrength": "NONE | LOW | MEDIUM | HIGH",
                "outputStrength": "NONE | LOW | MEDIUM | HIGH",
                "type": "SEXUAL | VIOLENCE | HATE | INSULTS | MISCONDUCT | PROMPT_ATTACK"
            }
        ]
    },
    "wordPolicyConfig": {
        "wordsConfig": [
            {
                "text": "string"
            }
        ],
        "managedWordListsConfig": [
            {
                "type": "string"
            }
        ]
    },
    "sensitiveInformationPolicyConfig": {
        "piiEntitiesConfig": [
            {
                "type": "string",
                "action": "string"
            }
        ],
        "regexesConfig": [
            {
                "name": "string",
                "description": "string",
                "regex": "string",
                "action": "string"
            }
        ]
    },
    "description": "string",
    "kmsKeyId": "string",
    "name": "string",
    "tags": [
        {
            "key": "string",
            "value": "string"
        }
    ],
    "topicPolicyConfig": {
        "topicsConfig": [
            {
                "definition": "string",
                "examples": [ "string" ],
                "name": "string",
                "type": "DENY"
            }
        ]
    }
}
-
Specify a name and description for the guardrail.
-
Specify messages for when the guardrail successfully blocks a prompt or a model response in the blockedInputMessaging and blockedOutputsMessaging fields.
-
Specify topics for the guardrail to deny in the topicPolicyConfig object. Each item in the topicsConfig list pertains to one topic. For more information about the fields in a topic, see Topic.
-
Give a name and definition so that the guardrail can properly identify the topic.
-
Specify DENY in the type field.
-
(Optional) Provide up to five examples that you would categorize as belonging to the topic in the examples list.
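As an illustration of the topic fields, the sketch below builds one denied topic using the field names from the request format above (topicPolicyConfig, topicsConfig, name, definition, examples, type). The helper name denied_topic and the sample topic text are hypothetical; the limit of five examples comes from the step above.

```python
MAX_EXAMPLES = 5  # "up to five examples" per topic


def denied_topic(name, definition, examples=()):
    """Build one topicsConfig item for a topic the guardrail should deny."""
    examples = list(examples)
    if len(examples) > MAX_EXAMPLES:
        raise ValueError(f"a topic accepts at most {MAX_EXAMPLES} examples")
    topic = {"name": name, "definition": definition, "type": "DENY"}
    if examples:
        topic["examples"] = examples
    return topic


topic_policy = {
    "topicsConfig": [
        denied_topic(
            "Investment advice",  # hypothetical topic
            "Recommendations about buying or selling specific financial products.",
            ["Which stocks should I buy this month?"],
        )
    ]
}
print(topic_policy)
```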
-
-
Specify filter strengths for the harmful categories defined in Amazon Bedrock in the contentPolicyConfig object. Each item in the filtersConfig list pertains to a harmful category. For more information, see Block harmful words and conversations with content filters. For more information about the fields in a content filter, see ContentFilter.
-
Specify the category in the type field.
-
Specify the strength of the filter for prompts in the inputStrength field and for model responses in the outputStrength field.
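A content filter entry can be sketched in the same way, using the field names and allowed values shown in the request format above (filtersConfig, type, inputStrength, outputStrength). The helper name content_filter and the chosen strengths are illustrative assumptions; the service may enforce additional rules for individual categories that this local check does not cover.

```python
STRENGTHS = ("NONE", "LOW", "MEDIUM", "HIGH")
CATEGORIES = ("SEXUAL", "VIOLENCE", "HATE", "INSULTS", "MISCONDUCT", "PROMPT_ATTACK")


def content_filter(category, input_strength, output_strength):
    """Build one filtersConfig item after checking the enum values."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    for strength in (input_strength, output_strength):
        if strength not in STRENGTHS:
            raise ValueError(f"unknown strength: {strength}")
    return {
        "type": category,
        "inputStrength": input_strength,    # applied to prompts
        "outputStrength": output_strength,  # applied to model responses
    }


content_policy = {
    "filtersConfig": [
        content_filter("HATE", "HIGH", "HIGH"),
        content_filter("INSULTS", "MEDIUM", "LOW"),
    ]
}
print(content_policy)
```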
-
-
(Optional) Attach any tags to the guardrail. For more information, see Tagging Amazon Bedrock resources.
-
(Optional) For security, include the ARN of a KMS key in the kmsKeyId field.
The response format is as follows:
HTTP/1.1 202
Content-type: application/json

{
    "createdAt": "string",
    "guardrailArn": "string",
    "guardrailId": "string",
    "version": "string"
}
-
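To tie the API steps together, the following sketch assembles a complete request body in Python and serializes it as JSON. It uses only the top-level field names from the request format above; the function name build_create_guardrail_body, the guardrail name, the messages, and the tag are all hypothetical. The check for at least one policy mirrors the requirement that a guardrail contain at least one filter.

```python
import json

POLICY_KEYS = (
    "contentPolicyConfig",
    "topicPolicyConfig",
    "wordPolicyConfig",
    "sensitiveInformationPolicyConfig",
)


def build_create_guardrail_body(name, blocked_input, blocked_output, *,
                                description=None, kms_key_id=None,
                                tags=None, **policies):
    """Assemble a CreateGuardrail request body as a plain dict."""
    unknown = sorted(set(policies) - set(POLICY_KEYS))
    if unknown:
        raise ValueError(f"unknown policy keys: {unknown}")
    if not any(policies.values()):
        raise ValueError("configure at least one filter")
    body = {
        "name": name,
        "blockedInputMessaging": blocked_input,
        "blockedOutputsMessaging": blocked_output,
    }
    if description:
        body["description"] = description
    if kms_key_id:
        body["kmsKeyId"] = kms_key_id
    if tags:
        body["tags"] = [{"key": k, "value": v} for k, v in tags.items()]
    # Include only the policies that were actually configured.
    body.update({key: value for key, value in policies.items() if value})
    return body


body = build_create_guardrail_body(
    "example-guardrail",
    "Sorry, I can't help with that request.",
    "Sorry, I can't share that response.",
    description="Hypothetical example guardrail",
    tags={"team": "support"},
    wordPolicyConfig={"wordsConfig": [{"text": "badword"}]},
)
print(json.dumps(body, indent=2))
```

If you use the AWS SDK for Python, the same dictionary could be passed as keyword arguments, for example boto3.client("bedrock").create_guardrail(**body). That call is not exercised here and requires boto3, credentials, and Amazon Bedrock permissions.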