Support for Hugging Face Transformer Models



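Tensor parallelism in the SageMaker model parallelism library provides out-of-the-box support for the following Hugging Face Transformer models:

GPT-2, BERT, and RoBERTa (available in the SageMaker model parallelism library v1.7.0 and later)
GPT-J (available in the SageMaker model parallelism library v1.8.0 and later)
GPT-Neo (available in the SageMaker model parallelism library v1.10.0 and later)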
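The version requirements in the list above can be checked programmatically before launching a training job. The following is a minimal sketch, not part of the library: the helper names `parse_version` and `supports_out_of_the_box` are hypothetical, and the dictionary keys reuse the module names under `smdistributed.modelparallel.torch.nn.huggingface`.

```python
# Minimum library versions, copied from the list of supported models above.
# The helper names in this sketch are hypothetical, not library APIs.
MIN_SMP_VERSION = {
    "gpt2": (1, 7, 0),
    "bert": (1, 7, 0),
    "roberta": (1, 7, 0),
    "gptj": (1, 8, 0),
    "gptneo": (1, 10, 0),
}


def parse_version(version):
    """Turn a string such as 'v1.10.0' into the tuple (1, 10, 0)."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def supports_out_of_the_box(model_family, smp_version):
    """Return True if this library version supports the model family natively."""
    minimum = MIN_SMP_VERSION.get(model_family)
    if minimum is None:
        # Not in the list: tensor parallelism has to be registered manually.
        return False
    return parse_version(smp_version) >= minimum
```

Comparing integer tuples rather than strings matters here: as strings, "1.10.0" sorts before "1.9.0", which would give the wrong answer for GPT-Neo.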

Note

For any other Transformer model, you need to apply tensor parallelism by using the smdistributed.modelparallel.torch.tp_register_with_module() API.

Note

To use tensor parallelism for training Hugging Face Transformer models, make sure you use Hugging Face Deep Learning Containers for PyTorch that include the SageMaker model parallelism library v1.7.0 or later. For more information, see the SageMaker model parallelism library release notes.

Supported Models Out of the Box

For the Hugging Face Transformer models that the library supports out of the box, you do not need to manually implement hooks that translate Transformer APIs to smdistributed transformer layers. You can activate tensor parallelism by creating the model inside the context manager smdistributed.modelparallel.torch.tensor_parallelism() and wrapping it with smdistributed.modelparallel.torch.DistributedModel(). You also do not need to manually register tensor parallelism hooks with the smp.tp_register API.

The state_dict translation functions between Hugging Face Transformers and smdistributed.modelparallel can be accessed as follows.

smdistributed.modelparallel.torch.nn.huggingface.gpt2.translate_state_dict_to_hf_gpt2(state_dict, max_seq_len=None)
smdistributed.modelparallel.torch.nn.huggingface.gpt2.translate_hf_state_dict_to_smdistributed_gpt2(state_dict)
smdistributed.modelparallel.torch.nn.huggingface.bert.translate_state_dict_to_hf_bert(state_dict, max_seq_len=None)
smdistributed.modelparallel.torch.nn.huggingface.bert.translate_hf_state_dict_to_smdistributed_bert(state_dict)
smdistributed.modelparallel.torch.nn.huggingface.roberta.translate_state_dict_to_hf_roberta(state_dict, max_seq_len=None)
smdistributed.modelparallel.torch.nn.huggingface.roberta.translate_hf_state_dict_to_smdistributed_roberta(state_dict)
smdistributed.modelparallel.torch.nn.huggingface.gptj.translate_state_dict_to_hf_gptj(state_dict, max_seq_len=None) (available in the SageMaker model parallelism library v1.8.0 and later)
smdistributed.modelparallel.torch.nn.huggingface.gptj.translate_hf_gptj_state_dict_to_smdistributed_gptj (available in the SageMaker model parallelism library v1.8.0 and later)
smdistributed.modelparallel.torch.nn.huggingface.gptneo.translate_state_dict_to_hf_gptneo(state_dict, max_seq_len=None) (available in the SageMaker model parallelism library v1.10.0 and later)
smdistributed.modelparallel.torch.nn.huggingface.gptneo.translate_hf_state_dict_to_smdistributed_gptneo(state_dict) (available in the SageMaker model parallelism library v1.10.0 and later)
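Because the function names follow a mostly regular pattern, a script that handles several model families may want to look them up by family. The sketch below copies the names from the list above into a table. The helpers `qualified_names` and `load_translate_functions` are hypothetical, and the import only succeeds in an environment where smdistributed is installed. Note that the GPT-J name for the Hugging Face to smdistributed direction does not follow the pattern of the others, so a lookup table is safer than string formatting.

```python
import importlib

HF_PACKAGE = "smdistributed.modelparallel.torch.nn.huggingface"

# (to Hugging Face, from Hugging Face) function names per module,
# copied from the list above. The helpers below are hypothetical.
TRANSLATE_FUNCTIONS = {
    "gpt2": ("translate_state_dict_to_hf_gpt2",
             "translate_hf_state_dict_to_smdistributed_gpt2"),
    "bert": ("translate_state_dict_to_hf_bert",
             "translate_hf_state_dict_to_smdistributed_bert"),
    "roberta": ("translate_state_dict_to_hf_roberta",
                "translate_hf_state_dict_to_smdistributed_roberta"),
    "gptj": ("translate_state_dict_to_hf_gptj",
             "translate_hf_gptj_state_dict_to_smdistributed_gptj"),
    "gptneo": ("translate_state_dict_to_hf_gptneo",
               "translate_hf_state_dict_to_smdistributed_gptneo"),
}


def qualified_names(model_family):
    """Return the fully qualified names of both translation functions."""
    to_hf, from_hf = TRANSLATE_FUNCTIONS[model_family]
    module = f"{HF_PACKAGE}.{model_family}"
    return f"{module}.{to_hf}", f"{module}.{from_hf}"


def load_translate_functions(model_family):
    """Import and return both functions; requires smdistributed to be installed."""
    to_hf, from_hf = TRANSLATE_FUNCTIONS[model_family]
    module = importlib.import_module(f"{HF_PACKAGE}.{model_family}")
    return getattr(module, to_hf), getattr(module, from_hf)
```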

Example Usage of the GPT-2 Translation Function

Start by wrapping the model as shown in the following code.


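import smdistributed.modelparallel.torch as smp
from transformers import AutoModelForCausalLM

# hf_gpt2_config is a Hugging Face GPT-2 configuration object defined elsewhere.
# Creating the model inside this context enables tensor parallelism for it.
with smp.tensor_parallelism():
    model = AutoModelForCausalLM.from_config(hf_gpt2_config)

model = smp.DistributedModel(model)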

Given a state_dict from the DistributedModel object, you can load the weights into the original Hugging Face GPT-2 model by using the translate_state_dict_to_hf_gpt2 function, as shown in the following code.


import smdistributed.modelparallel.torch as smp
from smdistributed.modelparallel.torch.nn.huggingface.gpt2 import (
    translate_state_dict_to_hf_gpt2,
)

max_seq_len = 1024

# [... code block for training ...]

# dist_model is the smp.DistributedModel instance that wraps the GPT-2 model.
if smp.rdp_rank() == 0:
    state_dict = dist_model.state_dict()
    hf_state_dict = translate_state_dict_to_hf_gpt2(state_dict, max_seq_len)

    # hf_state_dict can now be loaded into the original Hugging Face model
    # by calling its load_state_dict(hf_state_dict) method.

Example Usage of the RoBERTa Translation Function

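Similarly, given the state_dict of a supported Hugging Face model, you can use the matching translate_hf_state_dict_to_smdistributed function (for RoBERTa, translate_hf_state_dict_to_smdistributed_roberta) to convert it to a format that smp.DistributedModel can read. This can be useful in transfer learning use cases, where a pre-trained model is loaded into a smp.DistributedModel for model-parallel fine-tuning.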


import smdistributed.modelparallel.torch as smp
from transformers import AutoModelForMaskedLM
from smdistributed.modelparallel.torch.nn.huggingface.roberta import (
    translate_hf_state_dict_to_smdistributed_roberta,
)

# roberta_config is a Hugging Face RoBERTa configuration object defined elsewhere.
model = AutoModelForMaskedLM.from_config(roberta_config)
model = smp.DistributedModel(model)

pretrained_model = AutoModelForMaskedLM.from_pretrained("roberta-large")
translated_state_dict = translate_hf_state_dict_to_smdistributed_roberta(
    pretrained_model.state_dict()
)

# Load the translated pretrained weights into the smp.DistributedModel.
model.load_state_dict(translated_state_dict)

# Start fine-tuning...
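Before loading translated weights, it can help to compare the parameter names in the translated state_dict with the names the target model expects. The following sketch uses only plain dictionaries; `compare_state_dict_keys` is a hypothetical helper, and whether the key sets of a partitioned smp.DistributedModel are directly comparable at a given point in a training script is an assumption to verify in your own environment.

```python
def compare_state_dict_keys(model_state_dict, translated_state_dict):
    """Return (missing, unexpected) parameter names, each as a sorted list.

    missing: names the model expects that the translated dict lacks.
    unexpected: names in the translated dict that the model does not have.
    """
    expected = set(model_state_dict)
    provided = set(translated_state_dict)
    return sorted(expected - provided), sorted(provided - expected)


# Toy dictionaries with made-up parameter names, for illustration only.
missing, unexpected = compare_state_dict_keys(
    {"encoder.weight": 0, "encoder.bias": 0},
    {"encoder.weight": 0, "pooler.weight": 0},
)
```

Two empty lists indicate that every expected parameter has a translated counterpart and nothing extra was produced.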
