Encoder Embeddings for Object2Vec - Amazon SageMaker



GPU Optimization: Encoder Embeddings

An embedding is a mapping from discrete objects, such as words, to vectors of real numbers.

Because GPU memory is scarce, you can specify the INFERENCE_PREFERRED_MODE environment variable to choose whether the inference network for Data Formats for Object2Vec Inference or the encoder embeddings inference network is loaded into the GPU. If the majority of your inference is for encoder embeddings, specify INFERENCE_PREFERRED_MODE=embedding. The following is a batch transform example that uses four ml.p3.2xlarge instances and optimizes for encoder embeddings inference:

transformer = o2v.transformer(
    instance_count=4,
    instance_type="ml.p3.2xlarge",
    max_concurrent_transforms=2,
    max_payload=1,  # 1 MB
    strategy="MultiRecord",
    env={"INFERENCE_PREFERRED_MODE": "embedding"},  # only useful with GPU
    output_path=output_s3_path,
)

Input: Encoder Embeddings

Content-type: application/json; infer_max_seqlens=<FWD-LENGTH>,<BCK-LENGTH>

where <FWD-LENGTH> and <BCK-LENGTH> are integers in the range [1,5000] and specify the maximum sequence lengths for the forward and backward encoders.

{ "instances" : [ {"in0": [6, 17, 606, 19, 53, 67, 52, 12, 5, 10, 15, 10178, 7, 33, 652, 80, 15, 69, 821, 4]}, {"in0": [22, 1016, 32, 13, 25, 11, 5, 64, 573, 45, 5, 80, 15, 67, 21, 7, 9, 107, 4]}, {"in0": [774, 14, 21, 206]} ] }
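A request body in this application/json format can be assembled programmatically before it is sent to the endpoint. The sketch below only builds and serializes the payload; the commented-out invocation, the endpoint name, and the helper name are illustrative assumptions, not part of the Object2Vec API:

```python
import json

def build_embedding_request(token_sequences, input_name="in0"):
    """Serialize tokenized sequences into the application/json
    request schema expected by the Object2Vec encoder (hypothetical helper)."""
    instances = [{input_name: seq} for seq in token_sequences]
    return json.dumps({"instances": instances})

# Two tokenized sequences destined for the "in0" encoder
body = build_embedding_request([[6, 17, 606, 19, 53], [774, 14, 21, 206]])

# Illustrative invocation sketch (requires a deployed endpoint; not run here):
# import boto3
# runtime = boto3.client("sagemaker-runtime")
# response = runtime.invoke_endpoint(
#     EndpointName="my-object2vec-endpoint",  # assumed endpoint name
#     ContentType="application/json; infer_max_seqlens=100,100",
#     Body=body,
# )
```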

Content-type: application/jsonlines; infer_max_seqlens=<FWD-LENGTH>,<BCK-LENGTH>

where <FWD-LENGTH> and <BCK-LENGTH> are integers in the range [1,5000] and specify the maximum sequence lengths for the forward and backward encoders.

{"in0": [6, 17, 606, 19, 53, 67, 52, 12, 5, 10, 15, 10178, 7, 33, 652, 80, 15, 69, 821, 4]}
{"in0": [22, 1016, 32, 13, 25, 11, 5, 64, 573, 45, 5, 80, 15, 67, 21, 7, 9, 107, 4]}
{"in0": [774, 14, 21, 206]}

In both of these formats, you specify only one type of input: "in0" or "in1". The inference service then invokes the corresponding encoder and outputs the embeddings for each instance.

Output: Encoder Embeddings

Content-type: application/json

{
  "predictions": [
    {"embeddings":[0.057368703186511,0.030703511089086,0.099890425801277,0.063688032329082,0.026327300816774,0.003637571120634,0.021305780857801,0.004316598642617,0.0,0.003397724591195,0.0,0.000378780066967,0.0,0.0,0.0,0.007419463712722]},
    {"embeddings":[0.150190666317939,0.05145975202322,0.098204270005226,0.064249359071254,0.056249320507049,0.01513972133398,0.047553978860378,0.0,0.0,0.011533712036907,0.011472506448626,0.010696629062294,0.0,0.0,0.0,0.008508535102009]}
  ]
}

Content-type: application/jsonlines

{"embeddings":[0.057368703186511,0.030703511089086,0.099890425801277,0.063688032329082,0.026327300816774,0.003637571120634,0.021305780857801,0.004316598642617,0.0,0.003397724591195,0.0,0.000378780066967,0.0,0.0,0.0,0.007419463712722]}
{"embeddings":[0.150190666317939,0.05145975202322,0.098204270005226,0.064249359071254,0.056249320507049,0.01513972133398,0.047553978860378,0.0,0.0,0.011533712036907,0.011472506448626,0.010696629062294,0.0,0.0,0.0,0.008508535102009]}
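On the client side, an application/jsonlines response body is one JSON object per line, each carrying an "embeddings" field. A minimal parsing sketch (the function name and the abbreviated sample values are illustrative, not part of the service API):

```python
import json

def parse_embeddings_jsonlines(body):
    """Parse an application/jsonlines response body into a list of
    embedding vectors, one per input instance."""
    return [json.loads(line)["embeddings"]
            for line in body.strip().splitlines() if line]

# Abbreviated sample body in the format shown above
sample_body = (
    '{"embeddings":[0.0573,0.0307,0.0998]}\n'
    '{"embeddings":[0.1501,0.0514,0.0982]}\n'
)
vectors = parse_embeddings_jsonlines(sample_body)
```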

The length of the embeddings vector output by the inference service equals the value of whichever of the following hyperparameters you specified at training time: enc0_token_embedding_dim, enc1_token_embedding_dim, or enc_dim.
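As a quick sanity check, the length of each returned vector can be compared against the dimension configured at training time. In this sketch, enc_dim is assumed to be 16 (matching the 16-element sample vectors above), and the embedding values are synthetic placeholders:

```python
import json

ENC_DIM = 16  # assumed: the enc_dim hyperparameter value used at training time

# A response in the application/json format shown above (synthetic values)
body = json.dumps(
    {"predictions": [{"embeddings": [0.01 * i for i in range(ENC_DIM)]}]}
)

predictions = json.loads(body)["predictions"]
for p in predictions:
    # Every embedding vector should have exactly ENC_DIM components
    assert len(p["embeddings"]) == ENC_DIM
```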