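Tutorial: Get started using the Amazon A2I API

This tutorial explains the API operations you can use to get started with Amazon A2I.

To run these operations in a Jupyter notebook, select a notebook from Use Cases and Examples Using Amazon A2I, then see Use SageMaker Notebook Instance with Amazon A2I Jupyter Notebook to learn how to use it in a SageMaker AI notebook instance.

To learn more about the API operations you can use with Amazon A2I, see Use APIs in Amazon Augmented AI.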

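Create a Private Work Team

You can create a private work team and add yourself as a worker so that you can preview Amazon A2I.

If you are not familiar with Amazon Cognito, we recommend that you use the SageMaker AI console to create a private workforce and add yourself as a private worker. For instructions, see Step 1: Create a Work Team.

If you are familiar with Amazon Cognito, you can use the following instructions to create a private work team with the SageMaker API. After you create the work team, note its ARN (WorkteamArn).

To learn more about the private workforce and other available configurations, see Private workforce.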

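Create a private workforce

If you have not created a private workforce yet, you can create one from an Amazon Cognito user pool. Make sure that you have added yourself to this user pool. You can create the private workforce using the AWS SDK for Python (Boto3) create_workforce function. For other language-specific SDKs, see the list in CreateWorkforce.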


    
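    import boto3

    # SageMaker client; uses the AWS Region from your default configuration
    client = boto3.client("sagemaker")

    response = client.create_workforce(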
        CognitoConfig={
            "UserPool": "Pool_ID",
            "ClientId": "app-client-id"
        },
        WorkforceName="workforce-name"
    )

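Create a private work team

After you have created a private workforce in the AWS Region where you will configure and start your human loop, you can create a private work team using the AWS SDK for Python (Boto3) create_workteam function. For other language-specific SDKs, see the list in CreateWorkteam.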



    response = client.create_workteam(
        WorkteamName="work-team-name",
        WorkforceName="workforce-name",
        # Description is a required parameter of CreateWorkteam
        Description="work-team-description",
        MemberDefinitions=[
            {
                "CognitoMemberDefinition": {
                    "UserPool": "<aws-region>_ID",
                    "UserGroup": "user-group",
                    "ClientId": "app-client-id"
                },
            }
        ]
    )

Access your work team ARN as follows:



    workteamArn = response["WorkteamArn"]

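List private work teams in your account

If you have already created a private work team, you can list all work teams in a given AWS Region in your account using the AWS SDK for Python (Boto3) list_workteams function. For other language-specific SDKs, see the list in ListWorkteams.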



    response = client.list_workteams()

If you have many work teams in your account, you may want to use MaxResults, SortBy, and NameContains to filter the results.
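As a rough sketch of how those filters might be combined, the helper below asks for the most recently created work teams whose names contain a given fragment and returns their ARNs. The function name find_workteam_arns and its arguments are illustrative and not part of the SDK; client is assumed to be a Boto3 SageMaker client such as boto3.client("sagemaker").

```python
def find_workteam_arns(client, name_fragment, max_results=10):
    """Return ARNs of work teams whose names contain name_fragment, newest first.

    Hypothetical helper: `client` is assumed to be a Boto3 SageMaker client.
    """
    response = client.list_workteams(
        SortBy="CreateDate",         # the other accepted value is "Name"
        SortOrder="Descending",
        NameContains=name_fragment,
        MaxResults=max_results,
    )
    # Each entry in "Workteams" describes one work team, including its ARN.
    return [team["WorkteamArn"] for team in response["Workteams"]]
```

This sketch reads only the first page of results. If the response includes a NextToken, more work teams are available and the call would need to be repeated with that token.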

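Create a Human Review Workflow

You can create a human review workflow using the Amazon A2I CreateFlowDefinition operation. Before you create the human review workflow, you need to create a human task UI. You can do this with the CreateHumanTaskUi operation.

If you are using Amazon A2I with the Amazon Textract or Amazon Rekognition integrations, you can specify activation conditions using JSON.

Create a Human Task UI

If you are creating a human review workflow for the Amazon Textract or Amazon Rekognition integrations, you need to use and modify a pre-made worker task template. For all custom integrations, you can use your own custom worker task template. Use the following table to learn how to create a human task UI using a worker task template for the two built-in integrations. To customize the request, replace the template with your own.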

Amazon Textract – Key-value pair extraction

To learn more about this template, see Custom Template Example for Amazon Textract.


template = r"""
<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
{% capture s3_uri %}http://s3.amazonaws.com/{{ task.input.aiServiceRequest.document.s3Object.bucket }}/{{ task.input.aiServiceRequest.document.s3Object.name }}{% endcapture %}
<crowd-form>
  <crowd-textract-analyze-document 
      src="{{ s3_uri | grant_read_access }}" 
      initial-value="{{ task.input.selectedAiServiceResponse.blocks }}" 
      header="Review the key-value pairs listed on the right and correct them if they don't match the following document." 
      no-key-edit="" 
      no-geometry-edit="" 
      keys="{{ task.input.humanLoopContext.importantFormKeys }}" 
      block-types='["KEY_VALUE_SET"]'>
    <short-instructions header="Instructions">
        <p>Click on a key-value block to highlight the corresponding key-value pair in the document.
        </p><p><br></p>
        <p>If it is a valid key-value pair, review the content for the value. If the content is incorrect, correct it.
        </p><p><br></p>
        <p>The text of the value is incorrect, correct it.</p>
        <p><img src="https://assets.crowd.aws/images/a2i-console/correct-value-text.png">
        </p><p><br></p>
        <p>A wrong value is identified, correct it.</p>
        <p><img src="https://assets.crowd.aws/images/a2i-console/correct-value.png">
        </p><p><br></p>
        <p>If it is not a valid key-value relationship, choose No.</p>
        <p><img src="https://assets.crowd.aws/images/a2i-console/not-a-key-value-pair.png">
        </p><p><br></p>
        <p>If you can’t find the key in the document, choose Key not found.</p>
        <p><img src="https://assets.crowd.aws/images/a2i-console/key-is-not-found.png">
        </p><p><br></p>
        <p>If the content of a field is empty, choose Value is blank.</p>
        <p><img src="https://assets.crowd.aws/images/a2i-console/value-is-blank.png">
        </p><p><br></p>
        <p><strong>Examples</strong></p>
        <p>Key and value are often displayed next or below to each other.
        </p><p><br></p>
        <p>Key and value displayed in one line.</p>
        <p><img src="https://assets.crowd.aws/images/a2i-console/sample-key-value-pair-1.png">
        </p><p><br></p>
        <p>Key and value displayed in two lines.</p>
        <p><img src="https://assets.crowd.aws/images/a2i-console/sample-key-value-pair-2.png">
        </p><p><br></p>
        <p>If the content of the value has multiple lines, enter all the text without line break. 
        Include all value text even if it extends beyond the highlight box.</p>
        <p><img src="https://assets.crowd.aws/images/a2i-console/multiple-lines.png"></p>
    </short-instructions>
    <full-instructions header="Instructions"></full-instructions>
  </crowd-textract-analyze-document>
</crowd-form>
"""

Amazon Rekognition – Image moderation

To learn more about this template, see Custom Template Example for Amazon Rekognition.


template = r"""
<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
{% capture s3_uri %}http://s3.amazonaws.com/{{ task.input.aiServiceRequest.image.s3Object.bucket }}/{{ task.input.aiServiceRequest.image.s3Object.name }}{% endcapture %}

<crowd-form>
  <crowd-rekognition-detect-moderation-labels
    categories='[
      {% for label in task.input.selectedAiServiceResponse.moderationLabels %}
        {
          name: "{{ label.name }}",
          parentName: "{{ label.parentName }}",
        },
      {% endfor %}
    ]'
    src="{{ s3_uri | grant_read_access }}"
    header="Review the image and choose all applicable categories."
  >
    <short-instructions header="Instructions">
      <style>
        .instructions {
          white-space: pre-wrap;
        }
      </style>
      <p class="instructions">Review the image and choose all applicable categories.
If no categories apply, choose None.

<b>Nudity</b>
Visuals depicting nude male or female person or persons

<b>Partial Nudity</b>
Visuals depicting covered up nudity, for example using hands or pose

<b>Revealing Clothes</b>
Visuals depicting revealing clothes and poses

<b>Physical Violence</b>
Visuals depicting violent physical assault, such as kicking or punching

<b>Weapon Violence</b>
Visuals depicting violence using weapons like firearms or blades, such as shooting

<b>Weapons</b>
Visuals depicting weapons like firearms and blades</p>
    </short-instructions>

    <full-instructions header="Instructions"></full-instructions>
  </crowd-rekognition-detect-moderation-labels>
</crowd-form>"""

Custom Integration

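The following is an example template that you can use in a custom integration. This template is used in this notebook, which demonstrates a custom integration with Amazon Comprehend.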


template = r"""
<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>

<crowd-form>
    <crowd-classifier
      name="sentiment"
      categories='["Positive", "Negative", "Neutral", "Mixed"]'
      initial-value="{{ task.input.initialValue }}"
      header="What sentiment does this text convey?"
    >
      <classification-target>
        {{ task.input.taskObject }}
      </classification-target>
      
      <full-instructions header="Sentiment Analysis Instructions">
        <p><strong>Positive</strong> sentiment include: joy, excitement, delight</p>
        <p><strong>Negative</strong> sentiment include: anger, sarcasm, anxiety</p>
        <p><strong>Neutral</strong>: neither positive or negative, such as stating a fact</p>
        <p><strong>Mixed</strong>: when the sentiment is mixed</p>
      </full-instructions>

      <short-instructions>
       Choose the primary sentiment that is expressed by the text. 
      </short-instructions>
    </crowd-classifier>
</crowd-form>
"""

Using the template specified above, you can create a human task UI with the AWS SDK for Python (Boto3) create_human_task_ui function. For other language-specific SDKs, see the list in CreateHumanTaskUi.


    
    response = client.create_human_task_ui(
        HumanTaskUiName="human-task-ui-name",
        UiTemplate={
            "Content": template
        }
    )

The response contains the human task UI ARN. Save it as follows:



    humanTaskUiArn = response["HumanTaskUiArn"]

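Create JSON to specify activation conditions

For the Amazon Textract and Amazon Rekognition built-in integrations, you can save activation conditions in a JSON object and use it in your CreateFlowDefinition request.

Next, select a tab to see example activation conditions you can use for these built-in integrations. For additional information about activation condition options, see JSON Schema for Human Loop Activation Conditions in Amazon Augmented AI.

Amazon Textract – Key-value pair extraction

This example specifies conditions for specific keys (such as Mail address) in the document. If Amazon Textract's confidence falls outside the thresholds set here, the document is sent to a human for review, and the worker is prompted with the specific keys that initiated the human loop.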



    import json

    humanLoopActivationConditions = json.dumps(
        {
            "Conditions": [
                {
                    "Or": [
                        {
                            "ConditionType": "ImportantFormKeyConfidenceCheck",
                            "ConditionParameters": {
                                "ImportantFormKey": "Mail address",
                                "ImportantFormKeyAliases": ["Mail Address:", "Mail address:", "Mailing Add:", "Mailing Addresses"],
                                "KeyValueBlockConfidenceLessThan": 100,
                                "WordBlockConfidenceLessThan": 100
                            }
                        },
                        {
                            "ConditionType": "MissingImportantFormKey",
                            "ConditionParameters": {
                                "ImportantFormKey": "Mail address",
                                "ImportantFormKeyAliases": ["Mail Address:", "Mail address:", "Mailing Add:", "Mailing Addresses"]
                            }
                        },
                        {
                            "ConditionType": "ImportantFormKeyConfidenceCheck",
                            "ConditionParameters": {
                                "ImportantFormKey": "Phone Number",
                                "ImportantFormKeyAliases": ["Phone number:", "Phone No.:", "Number:"],
                                "KeyValueBlockConfidenceLessThan": 100,
                                "WordBlockConfidenceLessThan": 100
                            }
                        },
                        {
                            "ConditionType": "ImportantFormKeyConfidenceCheck",
                            "ConditionParameters": {
                                "ImportantFormKey": "*",
                                "KeyValueBlockConfidenceLessThan": 100,
                                "WordBlockConfidenceLessThan": 100
                            }
                        },
                        {
                            "ConditionType": "ImportantFormKeyConfidenceCheck",
                            "ConditionParameters": {
                                "ImportantFormKey": "*",
                                "KeyValueBlockConfidenceGreaterThan": 0,
                                "WordBlockConfidenceGreaterThan": 0
                            }
                        }
                    ]
                }
            ]
        }
    )

Amazon Rekognition – Image moderation

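The human loop activation conditions used here are tailored to Amazon Rekognition content moderation. They are based on confidence thresholds for the Suggestive and Female Swimwear Or Underwear moderation labels.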



    import json

    humanLoopActivationConditions = json.dumps(
        {
            "Conditions": [
                {
                    "Or": [
                        {
                            "ConditionType": "ModerationLabelConfidenceCheck",
                            "ConditionParameters": {
                                "ModerationLabelName": "Suggestive",
                                "ConfidenceLessThan": 98
                            }
                        },
                        {
                            "ConditionType": "ModerationLabelConfidenceCheck",
                            "ConditionParameters": {
                                "ModerationLabelName": "Female Swimwear Or Underwear",
                                "ConfidenceGreaterThan": 98
                            }
                        }
                    ]
                }
            ]
        }
    )
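When there are many labels to check, writing the nested JSON by hand can become error-prone. The following is a minimal sketch, not an official utility, of assembling the same structure from small helper functions. The names moderation_label_check, any_of, and conditions_json are hypothetical; only the JSON keys come from the example above.

```python
import json


def moderation_label_check(label_name, less_than=None, greater_than=None):
    """Build one ModerationLabelConfidenceCheck condition as a dict (illustrative helper)."""
    params = {"ModerationLabelName": label_name}
    if less_than is not None:
        params["ConfidenceLessThan"] = less_than
    if greater_than is not None:
        params["ConfidenceGreaterThan"] = greater_than
    if len(params) == 1:
        raise ValueError("at least one confidence threshold is required")
    return {
        "ConditionType": "ModerationLabelConfidenceCheck",
        "ConditionParameters": params,
    }


def any_of(*conditions):
    """Serialize the conditions under a single "Or" group, as in the example above."""
    return json.dumps({"Conditions": [{"Or": list(conditions)}]})


# Same content as the Amazon Rekognition example above
conditions_json = any_of(
    moderation_label_check("Suggestive", less_than=98),
    moderation_label_check("Female Swimwear Or Underwear", greater_than=98),
)
```

The resulting string can be passed as the HumanLoopActivationConditions value in the same way as the hand-written JSON.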

创建人工审核工作流

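This section gives an example of the CreateFlowDefinition request in the AWS SDK for Python (Boto3), using the resources created in the previous sections. For other language-specific SDKs, see the list in CreateFlowDefinition. Use the tabs in the following table to see the requests that create a human review workflow for the Amazon Textract and Amazon Rekognition built-in integrations.

Amazon Textract – Key-value pair extraction

If you use the built-in integration with Amazon Textract, you must specify "AWS/Textract/AnalyzeDocument/Forms/V1" for "AwsManagedHumanLoopRequestSource" in HumanLoopRequestSource.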



    response = client.create_flow_definition(
        FlowDefinitionName="human-review-workflow-name",
        HumanLoopRequestSource={
            "AwsManagedHumanLoopRequestSource": "AWS/Textract/AnalyzeDocument/Forms/V1"
        }, 
        HumanLoopActivationConfig={
            "HumanLoopActivationConditionsConfig": {
                "HumanLoopActivationConditions": humanLoopActivationConditions
            }
        },
        HumanLoopConfig={
            "WorkteamArn": workteamArn,
            "HumanTaskUiArn": humanTaskUiArn,
            "TaskTitle": "Document entry review",
            "TaskDescription": "Review the document and instructions. Complete the task",
            "TaskCount": 1,
            "TaskAvailabilityLifetimeInSeconds": 43200,
            "TaskTimeLimitInSeconds": 3600,
            "TaskKeywords": [
                "document review",
            ],
        },
        OutputConfig={
            "S3OutputPath": "s3://amzn-s3-demo-bucket/prefix/",
        },
        RoleArn="arn:aws:iam::<account-number>:role/<role-name>",
        Tags=[
            {
                "Key": "string",
                "Value": "string"
            },
        ]
    )

Amazon Rekognition – Image moderation

If you use the built-in integration with Amazon Rekognition, you must specify "AWS/Rekognition/DetectModerationLabels/Image/V3" for "AwsManagedHumanLoopRequestSource" in HumanLoopRequestSource.



    response = client.create_flow_definition(
        FlowDefinitionName="human-review-workflow-name",
        HumanLoopRequestSource={
            "AwsManagedHumanLoopRequestSource": "AWS/Rekognition/DetectModerationLabels/Image/V3"
        }, 
        HumanLoopActivationConfig={
            "HumanLoopActivationConditionsConfig": {
                "HumanLoopActivationConditions": humanLoopActivationConditions
            }
        },
        HumanLoopConfig={
            "WorkteamArn": workteamArn,
            "HumanTaskUiArn": humanTaskUiArn,
            "TaskTitle": "Image content moderation",
            "TaskDescription": "Review the image and instructions. Complete the task",
            "TaskCount": 1,
            "TaskAvailabilityLifetimeInSeconds": 43200,
            "TaskTimeLimitInSeconds": 3600,
            "TaskKeywords": [
                "content moderation",
            ],
        },
        OutputConfig={
            "S3OutputPath": "s3://amzn-s3-demo-bucket/prefix/",
        },
        RoleArn="arn:aws:iam::<account-number>:role/<role-name>",
        Tags=[
            {
                "Key": "string",
                "Value": "string"
            },
        ]
    )

Custom Integration

If you use a custom integration, exclude the following parameters: HumanLoopRequestSource, HumanLoopActivationConfig.



    response = client.create_flow_definition(
        FlowDefinitionName="human-review-workflow-name",
        HumanLoopConfig={
            "WorkteamArn": workteamArn,
            "HumanTaskUiArn": humanTaskUiArn,
            "TaskTitle": "Image content moderation",
            "TaskDescription": "Review the image and instructions. Complete the task",
            "TaskCount": 1,
            "TaskAvailabilityLifetimeInSeconds": 43200,
            "TaskTimeLimitInSeconds": 3600,
            "TaskKeywords": [
                "content moderation",
            ],
        },
        OutputConfig={
            "S3OutputPath": "s3://amzn-s3-demo-bucket/prefix/",
        },
        RoleArn="arn:aws:iam::<account-number>:role/<role-name>",
        Tags=[
            {
                "Key": "string",
                "Value": "string"
            },
        ]
    )

After you create a human review workflow, you can retrieve the flow definition ARN from the response:



    humanReviewWorkflowArn = response["FlowDefinitionArn"]
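A newly created flow definition may take a moment to become usable. The sketch below polls until the workflow is ready, assuming the SageMaker describe_flow_definition operation reports a FlowDefinitionStatus of Initializing, Active, or Failed. The helper name wait_for_flow_definition and its timeout arguments are illustrative; client is assumed to be a Boto3 SageMaker client.

```python
import time


def wait_for_flow_definition(client, name, timeout_seconds=60, poll_seconds=2):
    """Poll describe_flow_definition until the workflow is Active, then return its ARN.

    Hypothetical helper. Raises RuntimeError if creation fails and
    TimeoutError if the workflow is not Active before the deadline.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        description = client.describe_flow_definition(FlowDefinitionName=name)
        status = description["FlowDefinitionStatus"]
        if status == "Active":
            return description["FlowDefinitionArn"]
        if status == "Failed":
            raise RuntimeError(description.get("FailureReason", "flow definition failed"))
        if time.monotonic() >= deadline:
            raise TimeoutError(f"flow definition {name!r} is still {status}")
        time.sleep(poll_seconds)
```

For example, wait_for_flow_definition(client, "human-review-workflow-name") could be called right after create_flow_definition and before starting a human loop.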

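Create a Human Loop

The API operation you use to start a human loop depends on the Amazon A2I integration you use.

If you use the Amazon Textract built-in integration, you use the AnalyzeDocument operation.
If you use the Amazon Rekognition built-in integration, you use the DetectModerationLabels operation.
If you use a custom integration, you use the StartHumanLoop operation.

Select your task type in the following table to see example requests for Amazon Textract and Amazon Rekognition that use the AWS SDK for Python (Boto3).

Amazon Textract – Key-value pair extraction

The following example uses the AWS SDK for Python (Boto3) to call analyze_document in us-west-2. Replace the placeholder values with your resources. Include the DataAttributes parameter if you are using the Amazon Mechanical Turk workforce. For more information, see the analyze_document documentation in the AWS SDK for Python (Boto) API Reference.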



    # client is an Amazon Textract client, for example boto3.client("textract", region_name="us-west-2")
    response = client.analyze_document(
        Document={"S3Object": {"Bucket": "amzn-s3-demo-bucket", "Name": "document-name.pdf"}},
        HumanLoopConfig={
            "FlowDefinitionArn": "arn:aws:sagemaker:us-west-2:111122223333:flow-definition/flow-definition-name",
            "HumanLoopName": "human-loop-name",
            # List whichever of the two classifiers apply to your content
            "DataAttributes": {"ContentClassifiers": ["FreeOfPersonallyIdentifiableInformation", "FreeOfAdultContent"]}
        },
        FeatureTypes=["FORMS"]
    )

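A human loop is created only if Amazon Textract's confidence for the document analysis task meets the activation conditions you specified in your human review workflow. You can check the response element to determine whether a human loop was created. To see everything included in this response, see HumanLoopActivationOutput.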



    if "HumanLoopArn" in response["HumanLoopActivationOutput"]:
        # A human loop has been started!
        print(f'A human loop has been started with ARN: {response["HumanLoopActivationOutput"]["HumanLoopArn"]}')

Amazon Rekognition – Image moderation

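The following example uses the AWS SDK for Python (Boto3) to call detect_moderation_labels in us-west-2. Replace the placeholder values with your resources. Include the DataAttributes parameter if you are using the Amazon Mechanical Turk workforce. For more information, see the detect_moderation_labels documentation in the AWS SDK for Python (Boto) API Reference.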



    # client is an Amazon Rekognition client, for example boto3.client("rekognition", region_name="us-west-2")
    response = client.detect_moderation_labels(
        Image={"S3Object": {"Bucket": "amzn-s3-demo-bucket", "Name": "image-name.png"}},
        HumanLoopConfig={
            "FlowDefinitionArn": "arn:aws:sagemaker:us-west-2:111122223333:flow-definition/flow-definition-name",
            "HumanLoopName": "human-loop-name",
            # List whichever of the two classifiers apply to your content
            "DataAttributes": {"ContentClassifiers": ["FreeOfPersonallyIdentifiableInformation", "FreeOfAdultContent"]}
        }
    )

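A human loop is created only if Amazon Rekognition's confidence for the image moderation task meets the activation conditions you specified in your human review workflow. You can check the response element to determine whether a human loop was created. To see everything included in this response, see HumanLoopActivationOutput.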



    if "HumanLoopArn" in response["HumanLoopActivationOutput"]:
        # A human loop has been started!
        print(f'A human loop has been started with ARN: {response["HumanLoopActivationOutput"]["HumanLoopArn"]}')

Custom Integration

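The following example uses the AWS SDK for Python (Boto3) to call start_human_loop in us-west-2. Replace the placeholder values with your resources. Include the DataAttributes parameter if you are using the Amazon Mechanical Turk workforce. For more information, see the start_human_loop documentation in the AWS SDK for Python (Boto) API Reference.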



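    # client is an Amazon A2I runtime client, for example boto3.client("sagemaker-a2i-runtime", region_name="us-west-2")
    response = client.start_human_loop(
        HumanLoopName="human-loop-name",
        FlowDefinitionArn="arn:aws:sagemaker:us-west-2:111122223333:flow-definition/flow-definition-name",
        HumanLoopInput={"InputContent": inputContentJson},
        # List whichever of the two classifiers apply to your content
        DataAttributes={"ContentClassifiers": ["FreeOfPersonallyIdentifiableInformation", "FreeOfAdultContent"]}
    )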

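This example stores the input content in the variable inputContentJson. Assume that the input content contains two elements, a text blurb and a sentiment (such as Positive, Negative, or Neutral), and that it is formatted as follows: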



    # sentiment and blurb are variables that hold your own data
    inputContent = {
        "initialValue": sentiment,
        "taskObject": blurb
    }

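The keys initialValue and taskObject must correspond to the keys used in the Liquid elements of the worker task template. Refer to the custom template in Create a Human Task UI to see an example.

To create inputContentJson, do the following: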
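One way to catch a mismatch between the input keys and the template before starting a human loop is to scan the template for task.input references. The following is a simplified sketch under the assumption that the template reads its inputs only through expressions of the form task.input.<key>. The helper names and the shortened example_template are hypothetical.

```python
import re


def template_input_keys(template):
    """Return the set of top-level task.input.<key> names referenced in a template."""
    return set(re.findall(r"task\.input\.(\w+)", template))


def missing_input_keys(template, input_content):
    """Return the keys the template reads that the input content does not provide."""
    return template_input_keys(template) - set(input_content)


# Shortened stand-in for the custom worker task template (illustrative only)
example_template = """
<crowd-classifier initial-value="{{ task.input.initialValue }}">
  <classification-target>{{ task.input.taskObject }}</classification-target>
</crowd-classifier>
"""
example_input = {"initialValue": "Positive", "taskObject": "I love this product."}

print(missing_input_keys(example_template, example_input))  # prints set()
```

An empty set means every key the template references is present in the input content. The regular expression captures only the first name after task.input, so nested paths are reported by their top-level key.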



    import json
    
    inputContentJson = json.dumps(inputContent)

A human loop starts each time you call start_human_loop. To check the status of your human loop, use describe_human_loop:



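    import boto3

    # Amazon A2I runtime client
    a2i = boto3.client("sagemaker-a2i-runtime")

    human_loop_info = a2i.describe_human_loop(HumanLoopName="human-loop-name")
    print(f'HumanLoop Status: {human_loop_info["HumanLoopStatus"]}')
    # HumanLoopOutput may be absent until the human loop has produced output
    print(f'HumanLoop Output Destination: {human_loop_info.get("HumanLoopOutput")}')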
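Because workers may take a while to finish a task, a single describe_human_loop call often reports a loop that is still in progress. The sketch below repeats the call until the loop reaches a final state, assuming HumanLoopStatus takes the values InProgress, Completed, Failed, Stopping, and Stopped. The helper names and polling intervals are illustrative; client is assumed to be a Boto3 sagemaker-a2i-runtime client.

```python
import time
from urllib.parse import urlparse


def wait_for_human_loop(client, human_loop_name, timeout_seconds=3600, poll_seconds=30):
    """Poll describe_human_loop until the loop is Completed, Failed, or Stopped.

    Hypothetical helper. Returns the final describe_human_loop response.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        info = client.describe_human_loop(HumanLoopName=human_loop_name)
        if info["HumanLoopStatus"] in ("Completed", "Failed", "Stopped"):
            return info
        if time.monotonic() >= deadline:
            raise TimeoutError(f"human loop {human_loop_name!r} did not finish in time")
        time.sleep(poll_seconds)


def split_s3_uri(uri):
    """Split an 's3://bucket/key' URI into a (bucket, key) tuple."""
    parsed = urlparse(uri)
    return parsed.netloc, parsed.path.lstrip("/")
```

Once the loop is Completed, the output location in info["HumanLoopOutput"]["OutputS3Uri"] can be passed to split_s3_uri to obtain the bucket and key of the output object in Amazon S3.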
