Adapt your own inference container

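If none of the images listed in "Use pre-built SageMaker Docker images" fits your use case, you can build your own Docker container and use it inside SageMaker for training and inference. To be compatible with SageMaker, your container must have the following characteristics.

The container must have a web server listening on port 8080.
The container must accept POST requests to the /invocations real-time endpoint and GET requests to the /ping real-time endpoint. Requests that you send to these endpoints must return within 60 seconds and have a maximum size of 6 MB.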
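
As a rough illustration of the size limit above, the following sketch checks a request body against the 6 MB cap before it is sent. The helper name `check_payload` is hypothetical, and treating 6 MB as 6 * 1024 * 1024 bytes is an assumption made for this example rather than something the SageMaker documentation spells out.

```python
import json

# Size limit stated in the requirements above. Treating "6 MB" as
# 6 * 1024 * 1024 bytes is an assumption made for this sketch.
MAX_PAYLOAD_BYTES = 6 * 1024 * 1024


def check_payload(body: bytes) -> int:
    """Return the payload size in bytes, or raise ValueError if it is too large."""
    size = len(body)
    if size > MAX_PAYLOAD_BYTES:
        raise ValueError(
            f"payload is {size} bytes; the limit is {MAX_PAYLOAD_BYTES} bytes"
        )
    return size


# Example: a small JSON request body is well under the limit.
payload = json.dumps({"input": "Amazon is based in Seattle."}).encode("utf-8")
print(check_payload(payload))
```

A check like this is most useful on the client side, before the request reaches the endpoint.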

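For more information and an example of how to build your own Docker container for training and inference with SageMaker, see "Building your own algorithm container".

The following guide shows how to use a JupyterLab space in Amazon SageMaker Studio Classic to adapt an inference container to work with SageMaker hosting. The example uses an NGINX web server, Gunicorn as a Python web server gateway interface, and Flask as a web application framework. You can use different applications to adapt your container as long as it meets the requirements listed previously. For more information about using your own inference code, see "Use your own inference code with hosting services".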

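Adapt your inference container

Use the following steps to adapt your own inference container to work with SageMaker hosting. The example shown in the following steps uses a pre-trained named entity recognition (NER) model that uses the spaCy natural language processing (NLP) library for Python, and the following:

A Dockerfile to build the container that contains the NER model.
Inference scripts to serve the NER model.

If you adapt this example to your use case, you must use a Dockerfile and inference scripts that are needed to deploy and serve your model.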

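Create a JupyterLab space in Amazon SageMaker Studio Classic (optional).

You can use any notebook to run the scripts that adapt your inference container to SageMaker hosting. This example shows how to use a JupyterLab space within Amazon SageMaker Studio Classic to launch a JupyterLab application that comes with a SageMaker Distribution image. For more information, see "SageMaker JupyterLab".
Upload a Dockerfile and inference scripts.
1. Create a new folder in your home directory. If you're using JupyterLab, choose the New Folder icon in the upper-left corner, and enter a name for the folder that will contain your Dockerfile. In this example, the folder is called docker_test_folder.
2. Upload a Dockerfile text file into your new folder. The following example Dockerfile creates a Docker container with a pre-trained named entity recognition (NER) model from spaCy, along with the applications and environment variables needed to run this example.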
```
FROM python:3.8

RUN apt-get -y update && apt-get install -y --no-install-recommends \
         wget \
         python3 \
         nginx \
         ca-certificates \
    && rm -rf /var/lib/apt/lists/*

RUN wget https://bootstrap.pypa.io/get-pip.py && python3 get-pip.py && \
    pip install flask gevent gunicorn && \
        rm -rf /root/.cache

#pre-trained model package installation
RUN pip install spacy
RUN python -m spacy download en_core_web_sm


# Set environment variables
ENV PYTHONUNBUFFERED=TRUE
ENV PYTHONDONTWRITEBYTECODE=TRUE
ENV PATH="/opt/program:${PATH}"

COPY NER /opt/program
WORKDIR /opt/program
```
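  In the previous code example, the environment variable PYTHONUNBUFFERED keeps Python from buffering the standard output stream, which delivers logs to the user faster. The environment variable PYTHONDONTWRITEBYTECODE keeps Python from writing compiled bytecode .pyc files, which are unnecessary for this use case. The environment variable PATH is used to identify the location of the train and serve programs when the container is invoked.
3. Create a new directory inside your new folder to contain the scripts that serve your model. This example uses a directory called NER, which contains the following scripts needed to run this example.
  - predictor.py – A Python script that contains the logic to load your model and perform inference.
  - nginx.conf – A script to configure the web server.
  - serve – A script that starts the inference server.
  - wsgi.py – A helper script to serve the model.
  Important
  If you copy your inference scripts into a notebook ending in .ipynb and rename them, the scripts may contain formatting characters that prevent your endpoint from deploying. Instead, create a text file and rename it.
4. Upload a script to make your model available for inference. The following example script, called predictor.py, uses Flask to provide the /ping and /invocations endpoints.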
```
from flask import Flask
import flask
import spacy
import os
import json
import logging

#Load in model
nlp = spacy.load('en_core_web_sm') 
# If you plan to use your own model artifacts,
# store them in /opt/ml/model/


# The flask app for serving predictions
app = Flask(__name__)
@app.route('/ping', methods=['GET'])
def ping():
    # Check if the classifier was loaded correctly
    health = nlp is not None
    status = 200 if health else 404
    return flask.Response(response= '\n', status=status, mimetype='application/json')


@app.route('/invocations', methods=['POST'])
def transformation():
    
    #Process input
    input_json = flask.request.get_json()
    resp = input_json['input']
    
    #NER
    doc = nlp(resp)
    entities = [(X.text, X.label_) for X in doc.ents]

    # Transform predictions to JSON
    result = {
        'output': entities
        }

    resultjson = json.dumps(result)
    return flask.Response(response=resultjson, status=200, mimetype='application/json')
```
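  The /ping endpoint in the previous script example returns a status code of 200 if the model loaded correctly and 404 if it did not. The /invocations endpoint processes a request formatted in JSON, extracts the input field, and uses the NER model to identify entities and store them in the variable entities. The Flask application returns a response that contains these entities. For more information about these required health requests, see "How your container should respond to health check (ping) requests".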
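
  To exercise the same transformation without Flask or spaCy installed, the sketch below mirrors the body of the /invocations handler using a stand-in for the model. The names `fake_nlp` and `handle_invocation` are hypothetical and exist only for this illustration; the stand-in tags a couple of hard-coded words and is not a real NER model. Note that JSON has no tuple type, so each (text, label) pair is serialized as a two-element array.

```python
import json
from types import SimpleNamespace


def fake_nlp(text):
    """Hypothetical stand-in for the spaCy pipeline: tags a few fixed words."""
    known = {"Amazon": "ORG", "Seattle": "GPE"}
    ents = [
        SimpleNamespace(text=word, label_=label)
        for word, label in known.items()
        if word in text
    ]
    return SimpleNamespace(ents=ents)


def handle_invocation(request_body, nlp=fake_nlp):
    """Mirror the /invocations logic: parse JSON, run NER, return JSON."""
    input_json = json.loads(request_body)
    doc = nlp(input_json["input"])
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    return json.dumps({"output": entities})


print(handle_invocation('{"input": "Amazon has offices in Seattle."}'))
# prints {"output": [["Amazon", "ORG"], ["Seattle", "GPE"]]}
```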
5. Upload a script to start the inference server. The following example script, called serve, starts Gunicorn as the application server and Nginx as the web server.
```
#!/usr/bin/env python

# This file implements the scoring service shell. You don't necessarily need to modify it for various
# algorithms. It starts nginx and gunicorn with the correct configurations and then simply waits until
# gunicorn exits.
#
# The flask server is specified to be the app object in wsgi.py
#
# We set the following parameters:
#
# Parameter                Environment Variable              Default Value
# ---------                --------------------              -------------
# number of workers        MODEL_SERVER_WORKERS              the number of CPU cores
# timeout                  MODEL_SERVER_TIMEOUT              60 seconds

import multiprocessing
import os
import signal
import subprocess
import sys

cpu_count = multiprocessing.cpu_count()

model_server_timeout = os.environ.get('MODEL_SERVER_TIMEOUT', 60)
model_server_workers = int(os.environ.get('MODEL_SERVER_WORKERS', cpu_count))

def sigterm_handler(nginx_pid, gunicorn_pid):
    try:
        os.kill(nginx_pid, signal.SIGQUIT)
    except OSError:
        pass
    try:
        os.kill(gunicorn_pid, signal.SIGTERM)
    except OSError:
        pass

    sys.exit(0)

def start_server():
    print('Starting the inference server with {} workers.'.format(model_server_workers))


    # link the log streams to stdout/err so they will be logged to the container logs
    subprocess.check_call(['ln', '-sf', '/dev/stdout', '/var/log/nginx/access.log'])
    subprocess.check_call(['ln', '-sf', '/dev/stderr', '/var/log/nginx/error.log'])

    nginx = subprocess.Popen(['nginx', '-c', '/opt/program/nginx.conf'])
    gunicorn = subprocess.Popen(['gunicorn',
                                 '--timeout', str(model_server_timeout),
                                 '-k', 'sync',
                                 '-b', 'unix:/tmp/gunicorn.sock',
                                 '-w', str(model_server_workers),
                                 'wsgi:app'])

    signal.signal(signal.SIGTERM, lambda a, b: sigterm_handler(nginx.pid, gunicorn.pid))

    # Exit the inference server upon exit of either subprocess
    pids = set([nginx.pid, gunicorn.pid])
    while True:
        pid, _ = os.wait()
        if pid in pids:
            break

    sigterm_handler(nginx.pid, gunicorn.pid)
    print('Inference server exiting')

# The main routine to invoke the start function.

if __name__ == '__main__':
    start_server()
```
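  The previous script example defines a signal handler function, sigterm_handler, which shuts down the Nginx and Gunicorn subprocesses when a SIGTERM signal is received. The start_server function registers the signal handler, starts and monitors the Nginx and Gunicorn subprocesses, and captures the log streams.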
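
  The way serve resolves its two tunables can be sketched in isolation. The function name `resolve_server_settings` is hypothetical; the environment variable names and defaults are the ones from the script above. Unlike the script, this sketch converts the timeout to an integer so that both values have the same type.

```python
import multiprocessing
import os


def resolve_server_settings(environ=None):
    """Return (timeout_seconds, worker_count) using the serve script's defaults."""
    env = os.environ if environ is None else environ
    # Environment values arrive as strings, so convert them explicitly.
    timeout = int(env.get("MODEL_SERVER_TIMEOUT", 60))
    workers = int(env.get("MODEL_SERVER_WORKERS", multiprocessing.cpu_count()))
    return timeout, workers


# With no overrides, the timeout is 60 seconds and workers equal the CPU count.
print(resolve_server_settings({}))
# Overrides are read from the environment mapping.
print(resolve_server_settings({"MODEL_SERVER_TIMEOUT": "120", "MODEL_SERVER_WORKERS": "2"}))
```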
6. Upload a script to configure the web server. The following example script, called nginx.conf, configures an Nginx web server that uses Gunicorn as the application server to serve your model for inference.
```
worker_processes 1;
daemon off; # Prevent forking


pid /tmp/nginx.pid;
error_log /var/log/nginx/error.log;

events {
  # defaults
}

http {
  include /etc/nginx/mime.types;
  default_type application/octet-stream;
  access_log /var/log/nginx/access.log combined;
  
  upstream gunicorn {
    server unix:/tmp/gunicorn.sock;
  }

  server {
    listen 8080 deferred;
    client_max_body_size 5m;

    keepalive_timeout 5;
    proxy_read_timeout 1200s;

    location ~ ^/(ping|invocations) {
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_set_header Host $http_host;
      proxy_redirect off;
      proxy_pass http://gunicorn;
    }

    location / {
      return 404 "{}";
    }
  }
}
```
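  The previous script example configures Nginx to run in the foreground, sets the location of the error_log, and defines upstream as the Gunicorn server's Unix socket. The server block listens on port 8080 and sets limits on the client request body size and timeout values. It forwards requests whose path begins with /ping or /invocations to the Gunicorn server at http://gunicorn, and returns a 404 error for other paths.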
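
  The routing rule can be previewed by applying the same regular expression in Python. This only approximates Nginx location matching for this one pattern, and `is_forwarded` is a hypothetical helper. Because the pattern is anchored only at the start of the path, a path such as /pingfoo is also forwarded.

```python
import re

# Same pattern as the `location ~ ^/(ping|invocations)` block in nginx.conf.
FORWARDED = re.compile(r"^/(ping|invocations)")


def is_forwarded(path: str) -> bool:
    """Return True if the path would be proxied to Gunicorn, False if it gets a 404."""
    return FORWARDED.search(path) is not None


for path in ["/ping", "/invocations", "/", "/health", "/pingfoo"]:
    print(path, "->", "gunicorn" if is_forwarded(path) else "404")
```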
7. Upload any other scripts needed to serve your model. This example needs the following example script, called wsgi.py, to help Gunicorn find your application.
```
import predictor as myapp

# This is just a simple wrapper for gunicorn to find your app.
# If you want to change the algorithm file, simply change "predictor" above to the
# new file.

app = myapp.app
```
Starting from the folder docker_test_folder, your directory structure should contain a Dockerfile and the folder NER. The NER folder should contain the files nginx.conf, predictor.py, serve, and wsgi.py.
Build your own container.

From the folder docker_test_folder, build your Docker container. The following example command builds the Docker container that is configured in your Dockerfile.
```
! docker build -t byo-container-test .
```
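The previous command builds a container called byo-container-test from the current working directory. For more information about Docker build parameters, see "Build arguments".
Note
If you get the following error message saying that Docker cannot find the Dockerfile, make sure that the Dockerfile has the correct name and has been saved to the directory.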
```
unable to prepare context: unable to evaluate symlinks in Dockerfile path:
lstat /home/ec2-user/SageMaker/docker_test_folder/Dockerfile: no such file or directory
```
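Docker looks for a file named exactly Dockerfile, without any extension, in the current directory. If you gave the file a different name, you can pass the file name manually with the -f flag. For example, if you named your Dockerfile Dockerfile-text.txt, build your Docker container using the -f flag followed by your file name, as follows.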
```
! docker build -t byo-container-test -f Dockerfile-text.txt .
```

Push your Docker image to Amazon Elastic Container Registry (Amazon ECR)

In a notebook cell, push your Docker image to ECR. The following code example shows how to build your container locally, log in, and push the image to ECR.


```
%%sh
# Name of algo -> ECR
algorithm_name=sm-pretrained-spacy

#make serve executable
chmod +x NER/serve
account=$(aws sts get-caller-identity --query Account --output text)
# Region, defaults to us-east-1
region=$(aws configure get region)
region=${region:-us-east-1}
fullname="${account}.dkr.ecr.${region}.amazonaws.com/${algorithm_name}:latest"
# If the repository doesn't exist in ECR, create it.
aws ecr describe-repositories --repository-names "${algorithm_name}" > /dev/null 2>&1
if [ $? -ne 0 ]
then
    aws ecr create-repository --repository-name "${algorithm_name}" > /dev/null
fi
# Get the login command from ECR and execute it directly
aws ecr get-login-password --region ${region}|docker login --username AWS --password-stdin ${fullname}
# Build the docker image locally with the image name and then push it to ECR
# with the full name.

docker build  -t ${algorithm_name} .
docker tag ${algorithm_name} ${fullname}

docker push ${fullname}
```

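The previous example performs the following steps, which are needed to push the example Docker container to ECR.

Define the algorithm name as sm-pretrained-spacy.
Make the serve file inside the NER folder executable.
Set the AWS Region.
Create an ECR repository if it doesn't already exist.
Log in to ECR.
Build the Docker container locally.
Push the Docker image to ECR.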
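
The image URI that the shell script assembles in fullname follows a fixed pattern, and the same string is needed again when the model is created. A minimal sketch of that pattern in Python follows; the helper name `ecr_image_uri` is hypothetical and 123456789012 is a placeholder account ID.

```python
def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Build the image URI in the same form as `fullname` in the shell script."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


# 123456789012 is a placeholder account ID.
print(ecr_image_uri("123456789012", "us-east-2", "sm-pretrained-spacy"))
# prints 123456789012.dkr.ecr.us-east-2.amazonaws.com/sm-pretrained-spacy:latest
```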

Set up the SageMaker client

To use SageMaker hosting services for inference, you must create a model, create an endpoint configuration, and create an endpoint. To get inferences from your endpoint, you can use the SageMaker boto3 Runtime client to invoke the endpoint. The following code shows how to set up both the SageMaker client and the SageMaker Runtime client using boto3.
```
import boto3
from sagemaker import get_execution_role

sm_client = boto3.client(service_name='sagemaker')
runtime_sm_client = boto3.client(service_name='sagemaker-runtime')

account_id = boto3.client('sts').get_caller_identity()['Account']
region = boto3.Session().region_name

#used to store model artifacts which SageMaker will extract to /opt/ml/model in the container, 
#in this example case we will not be making use of S3 to store the model artifacts
#s3_bucket = '<S3Bucket>'

role = get_execution_role()
```
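In the previous code example, the Amazon S3 bucket is not used. It is included as a comment to show where model artifacts would be stored.

If you receive a permission error after you run the previous code example, you may need to add permissions to your IAM role. For more information about IAM roles, see "Amazon SageMaker Role Manager". For more information about adding permissions to your current role, see "AWS managed policies for Amazon SageMaker".
Create your model.

To use SageMaker hosting services for inference, you must create a model in SageMaker. The following code example shows how to create the spaCy NER model in SageMaker.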
```
from time import gmtime, strftime

model_name = 'spacy-nermodel-' + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
# MODEL S3 URL containing model artifacts as either model.tar.gz or extracted artifacts.
# This example does not use S3 for model artifacts.
#model_url = 's3://{}/spacy/'.format(s3_bucket) 

container = '{}.dkr.ecr.{}.amazonaws.com/sm-pretrained-spacy:latest'.format(account_id, region)
instance_type = 'ml.c5d.18xlarge'

print('Model name: ' + model_name)
#print('Model data Url: ' + model_url)
print('Container image: ' + container)

container = {
'Image': container
}

create_model_response = sm_client.create_model(
    ModelName = model_name,
    ExecutionRoleArn = role,
    Containers = [container])

print("Model Arn: " + create_model_response['ModelArn'])
```
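The previous code example shows, in comments, how to define model_url from s3_bucket if you were to use the Amazon S3 bucket mentioned in the client setup step, and it defines the ECR URI for the container image. The example defines ml.c5d.18xlarge as the instance type. You can also choose a different instance type. For more information about available instance types, see "Amazon EC2 instance types".

In the previous code example, the Image key points to the container image URI. The create_model_response assignment calls the create_model method with the model name, the role, and a list containing the container information, and the response contains the ARN of the new model.

The following is example output from the previous script.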
```
Model name: spacy-nermodel-YYYY-MM-DD-HH-MM-SS
Container image: 123456789012.dkr.ecr.us-east-2.amazonaws.com/sm-pretrained-spacy:latest
Model Arn: arn:aws:sagemaker:us-east-2:123456789012:model/spacy-nermodel-YYYY-MM-DD-HH-MM-SS
```
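
The model, endpoint configuration, and endpoint names in this walkthrough are all built by appending a UTC timestamp to a prefix. The sketch below wraps that pattern in a hypothetical helper, `timestamped_name`, and adds a sanity check. The naming rule it enforces (alphanumeric characters and hyphens, at most 63 characters) is an assumption here; confirm it against the SageMaker API reference before relying on it.

```python
import re
from time import gmtime, strftime

# Assumed naming rule: alphanumeric characters and hyphens, at most 63 characters.
NAME_PATTERN = re.compile(r"[a-zA-Z0-9](-*[a-zA-Z0-9])*")
MAX_NAME_LENGTH = 63


def timestamped_name(prefix, now=None):
    """Append a UTC timestamp to prefix, as the walkthrough does for its names."""
    stamp = strftime("%Y-%m-%d-%H-%M-%S", gmtime() if now is None else now)
    name = prefix + stamp
    if len(name) > MAX_NAME_LENGTH or not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid resource name: {name!r}")
    return name


print(timestamped_name("spacy-nermodel-"))
```

Generating the name once and reusing the variable, as the walkthrough does with model_name, keeps later calls pointed at the same resource.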
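1. Configure and create an endpoint
  
  To use SageMaker hosting for inference, you must also configure and create an endpoint. SageMaker uses this endpoint for inference. The following configuration example shows how to generate and configure an endpoint with the instance type and model name that you defined previously.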
```
endpoint_config_name = 'spacy-ner-config' + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
print('Endpoint config name: ' + endpoint_config_name)

create_endpoint_config_response = sm_client.create_endpoint_config(
    EndpointConfigName = endpoint_config_name,
    ProductionVariants=[{
        'InstanceType': instance_type,
        'InitialInstanceCount': 1,
        'InitialVariantWeight': 1,
        'ModelName': model_name,
        'VariantName': 'AllTraffic'}])
        
print("Endpoint config Arn: " + create_endpoint_config_response['EndpointConfigArn'])
```
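  In the previous configuration example, create_endpoint_config_response associates model_name with the unique endpoint configuration name endpoint_config_name, which is created with a timestamp.
  
  The following is example output from the previous script.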
```
Endpoint config name: spacy-ner-configYYYY-MM-DD-HH-MM-SS
Endpoint config Arn: arn:aws:sagemaker:us-east-2:123456789012:endpoint-config/spacy-ner-configYYYY-MM-DD-HH-MM-SS
```
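  For more information about endpoint errors, see "Why does my Amazon SageMaker endpoint go into the failed state when I create or update an endpoint?".
2. Create an endpoint and wait for the endpoint to be in service.
  
  The following code example creates the endpoint using the configuration from the previous configuration example and deploys the model.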
```
%%time

import time

endpoint_name = 'spacy-ner-endpoint' + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
print('Endpoint name: ' + endpoint_name)

create_endpoint_response = sm_client.create_endpoint(
    EndpointName=endpoint_name,
    EndpointConfigName=endpoint_config_name)
print('Endpoint Arn: ' + create_endpoint_response['EndpointArn'])

resp = sm_client.describe_endpoint(EndpointName=endpoint_name)
status = resp['EndpointStatus']
print("Endpoint Status: " + status)

print('Waiting for {} endpoint to be in service...'.format(endpoint_name))
waiter = sm_client.get_waiter('endpoint_in_service')
waiter.wait(EndpointName=endpoint_name)
```
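  In the previous code example, the create_endpoint method creates the endpoint with the endpoint name generated in the same example and prints the Amazon Resource Name (ARN) of the endpoint. The describe_endpoint method returns information about the endpoint and its status. A SageMaker waiter waits for the endpoint to be in service.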
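
  What the waiter does can be approximated with an explicit polling loop, which is sometimes easier to debug. The sketch below is not the boto3 waiter implementation. `wait_for_endpoint` is a hypothetical helper that accepts any callable shaped like sm_client.describe_endpoint, which allows the demonstration to run against a fake instead of a real endpoint. The poll interval and attempt count are arbitrary choices.

```python
import time


def wait_for_endpoint(describe, endpoint_name, poll_seconds=30, max_attempts=60):
    """Poll describe(EndpointName=...) until the status is InService or Failed.

    `describe` is any callable with the shape of sm_client.describe_endpoint.
    """
    for _ in range(max_attempts):
        resp = describe(EndpointName=endpoint_name)
        status = resp["EndpointStatus"]
        if status == "InService":
            return resp
        if status == "Failed":
            raise RuntimeError(resp.get("FailureReason", "endpoint creation failed"))
        time.sleep(poll_seconds)
    raise TimeoutError(f"{endpoint_name} was not in service after {max_attempts} checks")


# Demonstration with a fake describe function that succeeds on the third call.
_statuses = iter(["Creating", "Creating", "InService"])


def fake_describe(EndpointName):
    return {"EndpointStatus": next(_statuses)}


print(wait_for_endpoint(fake_describe, "demo-endpoint", poll_seconds=0)["EndpointStatus"])
# prints InService
```

  Against a real endpoint, you would pass sm_client.describe_endpoint as the first argument.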
Test your endpoint.

After your endpoint is in service, send an invocation request to it. The following code example shows how to send a test request to your endpoint.
```
import json
content_type = "application/json"
request_body = {"input": "This is a test with NER in America with \
    Amazon and Microsoft in Seattle, writing random stuff."}

#Serialize data for endpoint
#data = json.loads(json.dumps(request_body))
payload = json.dumps(request_body)

#Endpoint invocation
response = runtime_sm_client.invoke_endpoint(
EndpointName=endpoint_name,
ContentType=content_type,
Body=payload)

#Parse results
result = json.loads(response['Body'].read().decode())['output']
result
```
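In the previous code example, the json.dumps method serializes request_body into a JSON-formatted string and saves it in the variable payload. The SageMaker Runtime client then uses the invoke_endpoint method to send the payload to your endpoint. The result contains the response from your endpoint after the output field is extracted.

The previous code example should return the following output.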
```
[['NER', 'ORG'],
 ['America', 'GPE'],
 ['Amazon', 'ORG'],
 ['Microsoft', 'ORG'],
 ['Seattle', 'GPE']]
```
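Delete your endpoint

After you have finished your invocations, delete your endpoint to conserve resources. The following code example shows how to delete the endpoint, its endpoint configuration, and the model.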
```
sm_client.delete_endpoint(EndpointName=endpoint_name)
sm_client.delete_endpoint_config(EndpointConfigName=endpoint_config_name)
sm_client.delete_model(ModelName=model_name)
```
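For a complete notebook containing the code in this example, see "BYOC-Single-Model".

Troubleshooting your container deployment

If your endpoint did not deploy, check the Amazon CloudWatch logs as follows.

Open the SageMaker console at https://console.aws.amazon.com/sagemaker/. From the navigation pane, choose Inference.
Under Inference, choose Endpoints.
Find your endpoint under Name, and choose the name of the endpoint. In this example, the name follows the naming convention spacy-ner-endpointYYYY-MM-DD-HH-MM-SS.
Under Endpoint summary, choose the link under Model container logs.
In the Log streams box, choose the most recent log stream.

Use the following list to troubleshoot the deployment of your endpoint. If you need further assistance, contact AWS Support or the AWS Developer Forums for Amazon SageMaker.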

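Topics

Name error
Insufficient quota
Upstream timed out error

Name error

If the logs state NameError: name 'null' is not defined, make sure that your scripts were not created in a notebook ending in .ipynb and then renamed to another file name such as Dockerfile. When you create a notebook, formatting characters may prevent your endpoint from deploying. If you get this error and change your scripts to fix it, you may need to restart your kernel for the changes to take effect.

Insufficient quota

If you receive a ResourceLimitExceeded error, you must request additional quota as follows.

Request an AWS Service Quotas increase

Retrieve the instance name, current usage, and required additional quota from the on-screen error message. For example, in the following sample error:
- The instance name is ml.c5d.18xlarge.
- The current usage, from the number following current utilization, is 1 Instances.
- The additional required quota, from the number following request delta, is 1 Instances.
The sample error follows.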
```
ResourceLimitExceeded: An error occurred (ResourceLimitExceeded) 
when calling the CreateEndpoint operation: The account-level service limit
'ml.c5d.18xlarge for endpoint usage' is 1 Instances, with current utilization 
of 1 Instances and a request delta of 1 Instances. Please use AWS Service Quotas
to request an increase for this quota. If AWS Service Quotas is not available, 
contact AWS support to request an increase for this quota.
```
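Sign in to the AWS Management Console and open the Service Quotas console.
In the navigation pane, under Manage quotas, enter Amazon SageMaker.
Choose View quotas.
In the search bar under Service quotas, enter the name of the instance from the first step. For example, using the information in the error message from the first step, enter ml.c5d.18xlarge.
Choose the quota name that appears next to your instance name and ends with for endpoint usage. For example, using the information in the error message from the first step, choose ml.c5d.18xlarge for endpoint usage.
Choose Request increase at account-level.
Under Increase quota value, enter the required quota based on the information in the error message from the first step. Enter the total of current utilization and request delta. In the previous example error, current utilization is 1 Instances and request delta is 1 Instances, so request a quota of 2 to supply the required quota.
Choose Request.
Choose Quota request history from the navigation pane.
When the status changes from Pending to Approved, rerun your job. You may need to refresh your browser to see the change.

For more information about requesting a quota increase, see "Requesting a quota increase".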
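
The quota arithmetic above can also be done programmatically. The following sketch extracts the quota name and computes the value to request (current utilization plus request delta) from an error message. The parsing is based only on the wording of the sample error shown in this section, so treat the regular expressions as assumptions about the message format; `required_quota` is a hypothetical helper.

```python
import re

# Abbreviated copy of the sample error from this section.
SAMPLE_ERROR = (
    "The account-level service limit 'ml.c5d.18xlarge for endpoint usage' is "
    "1 Instances, with current utilization of 1 Instances and a request delta "
    "of 1 Instances."
)


def required_quota(message):
    """Return (quota_name, required_value) parsed from a ResourceLimitExceeded message."""
    name = re.search(r"service limit\s+'([^']+)'", message)
    used = re.search(r"current utilization\s+of\s+(\d+)", message)
    delta = re.search(r"request delta\s+of\s+(\d+)", message)
    if not (name and used and delta):
        raise ValueError("message does not look like the sample error")
    # The value to request is the current utilization plus the request delta.
    return name.group(1), int(used.group(1)) + int(delta.group(1))


print(required_quota(SAMPLE_ERROR))
# prints ('ml.c5d.18xlarge for endpoint usage', 2)
```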

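Upstream timed out error

If you receive an upstream timed out (110: Connection timed out) error, you can try the following.

Reduce the latency of the container or increase the container's timeout limit. SageMaker requires that your container respond to a request within 60 seconds.
Increase the amount of time that your web server waits for a response from the model.

For more information about timeout errors, see "How can I resolve the Amazon SageMaker inference error "upstream timed out (110: Connection timed out) while reading response header from upstream"?".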
