Encoder Embeddings for Object2Vec - Amazon SageMaker



GPU optimization: Encoder Embeddings

An embedding is a mapping from discrete objects, such as words, to vectors of real numbers.

Because GPU memory is scarce, you can set the INFERENCE_PREFERRED_MODE environment variable to choose which inference network is loaded onto the GPU: the one described in Data Formats for Object2Vec Inference, or the encoder embedding inference network. If most of your inference is for encoder embeddings, specify INFERENCE_PREFERRED_MODE=embedding. The following Batch Transform example uses four ml.p2.xlarge instances and is optimized for encoder embedding inference:

transformer = o2v.transformer(
    instance_count=4,
    instance_type="ml.p2.xlarge",
    max_concurrent_transforms=2,
    max_payload=1,  # 1 MB
    strategy='MultiRecord',
    env={'INFERENCE_PREFERRED_MODE': 'embedding'},  # only useful with GPU
    output_path=output_s3_path)

Input: Encoder Embeddings

Content-type: application/json; infer_max_seqlens=<FWD-LENGTH>,<BCK-LENGTH>

Where <FWD-LENGTH> and <BCK-LENGTH> are integers in the range [1,5000] that define the maximum sequence lengths for the forward and backward encoders.

{
  "instances" : [
    {"in0": [6, 17, 606, 19, 53, 67, 52, 12, 5, 10, 15, 10178, 7, 33, 652, 80, 15, 69, 821, 4]},
    {"in0": [22, 1016, 32, 13, 25, 11, 5, 64, 573, 45, 5, 80, 15, 67, 21, 7, 9, 107, 4]},
    {"in0": [774, 14, 21, 206]}
  ]
}
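As a minimal sketch, the content type string described above could be assembled and range-checked before a request is sent. The helper name build_content_type and its parameters are hypothetical; only the string layout and the [1,5000] range come from the text.

```python
def build_content_type(fwd_len, bck_len, base="application/json"):
    """Return a content-type string that carries infer_max_seqlens.

    Hypothetical helper: only the string format and the [1, 5000]
    range are taken from the documentation text.
    """
    for name, value in (("fwd_len", fwd_len), ("bck_len", bck_len)):
        # bool is a subclass of int, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{name} must be an integer, got {value!r}")
        if not 1 <= value <= 5000:
            raise ValueError(f"{name} must be in [1, 5000], got {value}")
    return f"{base}; infer_max_seqlens={fwd_len},{bck_len}"
```

Passing base="application/jsonlines" yields the equivalent string for JSON Lines requests.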

Content-type: application/jsonlines; infer_max_seqlens=<FWD-LENGTH>,<BCK-LENGTH>

Where <FWD-LENGTH> and <BCK-LENGTH> are integers in the range [1,5000] that define the maximum sequence lengths for the forward and backward encoders.

{"in0": [6, 17, 606, 19, 53, 67, 52, 12, 5, 10, 15, 10178, 7, 33, 652, 80, 15, 69, 821, 4]}
{"in0": [22, 1016, 32, 13, 25, 11, 5, 64, 573, 45, 5, 80, 15, 67, 21, 7, 9, 107, 4]}
{"in0": [774, 14, 21, 206]}

In both formats, you specify only one input type: "in0" or "in1". The inference service then invokes the corresponding encoder and outputs the embeddings for each instance.
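The two request layouts can be produced from the same list of token-ID sequences. The following is a sketch under stated assumptions: build_payload is a hypothetical helper, and the only facts taken from the text are the "instances" wrapper, the one-record-per-line JSON Lines layout, and the rule that a request uses a single input type.

```python
import json

VALID_KEYS = ("in0", "in1")  # the two input types named in the text


def build_payload(sequences, key="in0", json_lines=False):
    """Serialize token-ID sequences as a JSON or JSON Lines request body.

    Hypothetical helper. Every record uses the same key, so the
    single-input-type rule holds by construction.
    """
    if key not in VALID_KEYS:
        raise ValueError(f"key must be one of {VALID_KEYS}, got {key!r}")
    records = [{key: [int(token) for token in seq]} for seq in sequences]
    if json_lines:
        # JSON Lines: one JSON object per line
        return "\n".join(json.dumps(record) for record in records)
    return json.dumps({"instances": records})
```

For example, build_payload([[774, 14, 21, 206]], key="in0", json_lines=True) returns a single line holding one "in0" record.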

Output: Encoder Embeddings

Content-type: application/json

{
  "predictions": [
    {"embeddings":[0.057368703186511,0.030703511089086,0.099890425801277,0.063688032329082,0.026327300816774,0.003637571120634,0.021305780857801,0.004316598642617,0.0,0.003397724591195,0.0,0.000378780066967,0.0,0.0,0.0,0.007419463712722]},
    {"embeddings":[0.150190666317939,0.05145975202322,0.098204270005226,0.064249359071254,0.056249320507049,0.01513972133398,0.047553978860378,0.0,0.0,0.011533712036907,0.011472506448626,0.010696629062294,0.0,0.0,0.0,0.008508535102009]}
  ]
}

Content-type: application/jsonlines

{"embeddings":[0.057368703186511,0.030703511089086,0.099890425801277,0.063688032329082,0.026327300816774,0.003637571120634,0.021305780857801,0.004316598642617,0.0,0.003397724591195,0.0,0.000378780066967,0.0,0.0,0.0,0.007419463712722]}
{"embeddings":[0.150190666317939,0.05145975202322,0.098204270005226,0.064249359071254,0.056249320507049,0.01513972133398,0.047553978860378,0.0,0.0,0.011533712036907,0.011472506448626,0.010696629062294,0.0,0.0,0.0,0.008508535102009]}
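Either response layout can be reduced to a plain list of vectors. This sketch assumes the response body is already available as a string; parse_embeddings is a hypothetical helper, and it relies only on the "predictions" and "embeddings" keys shown in the examples above.

```python
import json


def parse_embeddings(body, json_lines=False):
    """Extract embedding vectors from a JSON or JSON Lines response body.

    Hypothetical helper: returns one list of floats per input instance,
    in the order the records appear in the body.
    """
    if json_lines:
        # One JSON object per non-empty line
        records = [json.loads(line) for line in body.splitlines() if line.strip()]
    else:
        records = json.loads(body)["predictions"]
    return [record["embeddings"] for record in records]
```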

The length of each embedding vector output by the inference service is equal to the value of one of the following hyperparameters, which you specify at training time: enc0_token_embedding_dim, enc1_token_embedding_dim, or enc_dim.
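Because the vector length is fixed by a training-time hyperparameter, a quick consistency check on returned vectors can catch a mismatched model or a truncated response. This is a sketch only: check_embedding_dim is a hypothetical helper, and expected_dim must be supplied by the caller from whichever of the hyperparameters above applies to the model.

```python
def check_embedding_dim(embeddings, expected_dim):
    """Return the indices of vectors whose length differs from expected_dim.

    Hypothetical helper: an empty result means every vector has the
    expected length.
    """
    return [i for i, vec in enumerate(embeddings) if len(vec) != expected_dim]
```

The sample vectors shown in the output examples each contain 16 values, so calling the helper on them with expected_dim=16 would report no mismatches.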