Using toxic speech detection

Using toxic speech detection in a batch transcription

To use toxic speech detection with a batch transcription, see the following examples.

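Sign in to the AWS Management Console.
In the navigation pane, choose Transcription jobs, then select Create job (top right). This opens the Specify job details page.
On the Specify job details page, you can also enable PII redaction if you want. Note that the other listed options are not supported with toxicity detection. Select Next. This takes you to the Configure job - optional page. In the Audio settings panel, select Toxicity detection.
Select Create job to run your transcription job.
Once your transcription job is complete, you can download your transcript from the Download drop-down menu on the transcription job's detail page.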

This example uses the start-transcription-job command and the ToxicityDetection parameter. For more information, see StartTranscriptionJob and ToxicityDetection.



aws transcribe start-transcription-job \
--region us-west-2 \
--transcription-job-name my-first-transcription-job \
--media MediaFileUri=s3://DOC-EXAMPLE-BUCKET/my-input-files/my-media-file.flac \
--output-bucket-name DOC-EXAMPLE-BUCKET \
--output-key my-output-files/ \
--language-code en-US \
--toxicity-detection ToxicityCategories=ALL

Here is another example that uses the start-transcription-job command, this time with a request body that includes toxicity detection.



aws transcribe start-transcription-job \
--region us-west-2 \
--cli-input-json file://filepath/my-first-toxicity-job.json

The file my-first-toxicity-job.json contains the following request body.



{
  "TranscriptionJobName": "my-first-transcription-job",
  "Media": {
    "MediaFileUri": "s3://DOC-EXAMPLE-BUCKET/my-input-files/my-media-file.flac"
  },
  "OutputBucketName": "DOC-EXAMPLE-BUCKET",
  "OutputKey": "my-output-files/",
  "LanguageCode": "en-US",
  "ToxicityDetection": [
    {
      "ToxicityCategories": [ "ALL" ]
    }
  ]
}
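
If you would rather generate the request file than write it by hand, the following is a minimal Python sketch. The helper name `build_toxicity_request` is hypothetical and not part of the AWS CLI or any SDK; the field names and values simply mirror the request body shown above.

```python
import json


def build_toxicity_request(job_name, media_uri, output_bucket, output_key):
    """Return a request body with toxicity detection enabled for all categories.

    Hypothetical helper for illustration; the keys mirror the request body above.
    """
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},
        "OutputBucketName": output_bucket,
        "OutputKey": output_key,
        "LanguageCode": "en-US",
        "ToxicityDetection": [{"ToxicityCategories": ["ALL"]}],
    }


request = build_toxicity_request(
    "my-first-transcription-job",
    "s3://DOC-EXAMPLE-BUCKET/my-input-files/my-media-file.flac",
    "DOC-EXAMPLE-BUCKET",
    "my-output-files/",
)

# Write the body to the file that --cli-input-json points at.
with open("my-first-toxicity-job.json", "w") as f:
    json.dump(request, f, indent=2)
```

Building the body in code keeps the job name, bucket, and key in one place when you submit many jobs.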

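This example uses the AWS SDK for Python (Boto3) to enable ToxicityDetection with the start_transcription_job method. For more information, see StartTranscriptionJob and ToxicityDetection.

For additional examples using the AWS SDKs, including feature-specific, scenario, and cross-service examples, refer to the Code examples for Amazon Transcribe using AWS SDKs chapter.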



import time
import boto3

# Create a Transcribe client in the Region where the job should run.
transcribe = boto3.client('transcribe', region_name='us-west-2')
job_name = "my-first-transcription-job"
job_uri = "s3://DOC-EXAMPLE-BUCKET/my-input-files/my-media-file.flac"

# Start the job with toxicity detection enabled for all categories.
transcribe.start_transcription_job(
    TranscriptionJobName = job_name,
    Media = {
        'MediaFileUri': job_uri
    },
    OutputBucketName = 'DOC-EXAMPLE-BUCKET',
    OutputKey = 'my-output-files/',
    LanguageCode = 'en-US',
    ToxicityDetection = [
        {
            'ToxicityCategories': ['ALL']
        }
    ]
)

# Poll every 5 seconds until the job either completes or fails.
while True:
    status = transcribe.get_transcription_job(TranscriptionJobName = job_name)
    if status['TranscriptionJob']['TranscriptionJobStatus'] in ['COMPLETED', 'FAILED']:
        break
    print("Not ready yet...")
    time.sleep(5)
print(status)
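
The polling loop above runs until the job reaches a terminal state, with no upper bound on the wait. A bounded variant can be sketched as follows. The helper `wait_for_job` and its parameters are hypothetical and not part of Boto3; it accepts any zero-argument callable that returns the job status string, so the demo runs without AWS credentials.

```python
import time


def wait_for_job(get_status, timeout_seconds=600, poll_seconds=5):
    """Poll get_status() until it returns COMPLETED or FAILED, or time runs out.

    Hypothetical helper for illustration; get_status is any zero-argument
    callable that returns the current job status as a string.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in ("COMPLETED", "FAILED"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("transcription job did not finish in time")
        time.sleep(poll_seconds)


# Demo with a stand-in status source instead of a real Transcribe client.
fake_statuses = iter(["IN_PROGRESS", "IN_PROGRESS", "COMPLETED"])
result = wait_for_job(lambda: next(fake_statuses), timeout_seconds=10, poll_seconds=0)
print(result)
```

With the client from the example above, the callable could be a lambda that calls `transcribe.get_transcription_job(TranscriptionJobName=job_name)` and returns `['TranscriptionJob']['TranscriptionJobStatus']` from the response.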

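Example output

Toxic speech is tagged and categorized in your transcription output. Each instance of toxic speech is categorized and assigned a confidence score (a value between 0 and 1). A larger confidence value indicates a greater likelihood that the content is toxic speech within the specified category.

The following is an example output in JSON format showing categorized toxic speech with associated confidence scores.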



{
    "jobName": "my-toxicity-job",
    "accountId": "111122223333",
    "results": {
        "transcripts": [...],
        "items":[...],
        "toxicity_detection": [
            {
                "text": "What the * are you doing man? That's why I didn't want to play with your * .  man it was a no, no I'm not calming down * man. I well I spent I spent too much * money on this game.",
                "toxicity": 0.7638,
                "categories": {
                    "profanity": 0.9913,
                    "hate_speech": 0.0382,
                    "sexual": 0.0016,
                    "insult": 0.6572,
                    "violence_or_threat": 0.0024,
                    "graphic": 0.0013,
                    "harassment_or_abuse": 0.0249
                },
                "start_time": 8.92,
                "end_time": 21.45
            },
            Items removed for brevity
            {
                "text": "What? Who? What the * did you just say to me? What's your address? What is your * address? I will pull up right now on your * * man. Take your * back to , tired of this **.",
                "toxicity": 0.9816,
                "categories": {
                    "profanity": 0.9865,
                    "hate_speech": 0.9123,
                    "sexual": 0.0037,
                    "insult": 0.5447,
                    "violence_or_threat": 0.5078,
                    "graphic": 0.0037,
                    "harassment_or_abuse": 0.0613
                },
                "start_time": 43.459,
                "end_time": 54.639
            },
        ]
    },
    ...
    "status": "COMPLETED"
}
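
One way to act on this output is to keep only the segments whose overall toxicity score meets a cutoff and to report the highest-scoring category for each. The sketch below assumes the transcript JSON has already been downloaded and parsed into a dictionary shaped like the sample above. The function name `flag_toxic_segments` and the default cutoff of 0.5 are illustrative assumptions, not AWS recommendations.

```python
def flag_toxic_segments(transcript, threshold=0.5):
    """Return segments whose overall toxicity score is at or above threshold.

    Hypothetical helper; transcript is the parsed output JSON, and the
    threshold default is an arbitrary choice for illustration.
    """
    flagged = []
    for segment in transcript["results"]["toxicity_detection"]:
        if segment["toxicity"] >= threshold:
            categories = segment["categories"]
            flagged.append({
                "start_time": segment["start_time"],
                "end_time": segment["end_time"],
                "toxicity": segment["toxicity"],
                # Category with the highest confidence score for this segment.
                "top_category": max(categories, key=categories.get),
            })
    return flagged


# Trimmed-down stand-in for a parsed transcript; scores copied from the sample output.
sample = {
    "results": {
        "toxicity_detection": [
            {"toxicity": 0.7638, "start_time": 8.92, "end_time": 21.45,
             "categories": {"profanity": 0.9913, "insult": 0.6572}},
            {"toxicity": 0.9816, "start_time": 43.459, "end_time": 54.639,
             "categories": {"profanity": 0.9865, "hate_speech": 0.9123}},
        ]
    }
}

for seg in flag_toxic_segments(sample, threshold=0.9):
    print(seg["start_time"], seg["end_time"], seg["top_category"])
```

Choose the cutoff to suit your use case; a lower value flags more segments for review, and a higher value flags fewer.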
