P6 Supported DLAMIs - AWS Deep Learning AMIs


P6 Supported DLAMIs

Below are the detailed requirements for running a DLAMI on Amazon EC2 P6-B200 instances and Amazon EC2 P6e-GB200 instances.

P6-B200 Supported DLAMIs

The following DLAMIs support P6-B200 instances:

These DLAMIs contain the following software, which is required to operate P6-B200 instances:

| Software | Minimum version requirement |
| --- | --- |
| NVIDIA CUDA Toolkit | 12.8 |
| NVIDIA Driver | R570 |
| NVLink 5 | R570 |
| Linux Kernel | 6.1 |
| Elastic Fabric Adapter (EFA) | 1.41.0 |
| AWS OFI NCCL Plugin | 1.15.0 |
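
The P6-B200 minimums listed above can be checked on a running instance with a short script. The following is a minimal sketch, not an official AWS tool: the helper names `version_ge` and `check_min` are hypothetical, the comparison assumes GNU `sort -V` is available, and only the kernel and NVIDIA driver are covered, since their versions can be read with `uname -r` and `nvidia-smi --query-gpu=driver_version`.

```shell
#!/bin/sh
# version_ge A B: succeeds when version A >= version B.
# Assumption: GNU sort with -V (version sort) is installed.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_min NAME FOUND MINIMUM: prints a one-line verdict, returns 1 on failure.
check_min() {
    if version_ge "$2" "$3"; then
        echo "OK   $1 $2 (minimum $3)"
    else
        echo "FAIL $1 $2 (minimum $3)"
        return 1
    fi
}

# Kernel: drop everything after the first "-" in the release string.
kernel="$(uname -r | cut -d- -f1)"
check_min "Linux kernel" "$kernel" "6.1" || true

# Driver: R570 is treated here as "driver version 570 or later".
if command -v nvidia-smi >/dev/null 2>&1; then
    driver="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    check_min "NVIDIA driver" "$driver" "570" || true
fi
```

The driver check is skipped when `nvidia-smi` is not on the `PATH`, so the script can also be dry-run on a machine without GPUs.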

P6e-GB200 Supported DLAMIs

The following DLAMIs support P6e-GB200 instances:

These DLAMIs contain the following software, which is required to operate P6e-GB200 instances:

| Software | Minimum version requirement |
| --- | --- |
| NVIDIA CUDA Toolkit | 12.8 |
| NVIDIA Driver | R570 |
| Linux Kernel | 6.12 |
| Elastic Fabric Adapter (EFA) | 1.42.0 |
| AWS OFI NCCL Plugin | 1.15.0 |

Confirm GPU Functionality

To confirm that the GPUs are functional:

  1. Run the following NVIDIA GPU device query test:

    $ /usr/local/cuda/extras/demo_suite/deviceQuery
  2. Confirm that the device query run produces the following output:

    $ /usr/local/cuda/extras/demo_suite/deviceQuery
    /usr/local/cuda/extras/demo_suite/deviceQuery Starting...
    CUDA Device Query (Runtime API)
    Detected 8 CUDA Capable device(s)
    ...
    deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 12.8, CUDA Runtime Version = 12.8, NumDevs = 8, Device0 = NVIDIA B200, Device1 = NVIDIA B200, Device2 = NVIDIA B200, Device3 = NVIDIA B200, Device4 = NVIDIA B200, Device5 = NVIDIA B200, Device6 = NVIDIA B200, Device7 = NVIDIA B200
    Result = PASS
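
The manual comparison in step 2 can also be scripted. The sketch below assumes the `deviceQuery` summary keeps the `NumDevs = N` and `Result = PASS` strings shown above; the function name `check_device_query` is hypothetical, and the expected count of 8 applies to the P6-B200 output shown here.

```shell
#!/bin/sh
# check_device_query EXPECTED_COUNT
# Reads deviceQuery output on stdin and succeeds only when it reports
# exactly EXPECTED_COUNT devices and ends with "Result = PASS".
check_device_query() {
    output="$(cat)"
    # Require a non-digit (or end of line) after the count so 8 does not match 80.
    printf '%s\n' "$output" | grep -Eq "NumDevs = $1([^0-9]|\$)" || return 1
    printf '%s\n' "$output" | grep -q "Result = PASS"
}

# Run the real check only where the CUDA demo suite is installed.
dq=/usr/local/cuda/extras/demo_suite/deviceQuery
if [ -x "$dq" ]; then
    if "$dq" | check_device_query 8; then
        echo "deviceQuery: 8 devices, PASS"
    else
        echo "deviceQuery: unexpected output" >&2
    fi
fi
```

Because the function reads from standard input, saved output from an earlier run can be piped into it as well.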

To confirm that the NVIDIA driver is functional:

  1. Run the NVIDIA System Management Interface:

    $ nvidia-smi
  2. Confirm that the System Management Interface produces the following output:

    +-----------------------------------------------------------------------------------------+
    | NVIDIA-SMI 570.133.20             Driver Version: 570.133.20     CUDA Version: 12.8     |
    |-----------------------------------------+------------------------+----------------------+
    | GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
    |                                         |                        |               MIG M. |
    |=========================================+========================+======================|
    |   0  NVIDIA B200                    Off |   00000000:51:00.0 Off |                    0 |
    | N/A   32C    P0            145W / 1000W |       0MiB / 183359MiB |      0%      Default |
    |                                         |                        |             Disabled |
    +-----------------------------------------+------------------------+----------------------+
    |   1  NVIDIA B200                    Off |   00000000:52:00.0 Off |                    0 |
    | N/A   30C    P0            140W / 1000W |       0MiB / 183359MiB |      0%      Default |
    |                                         |                        |             Disabled |
    +-----------------------------------------+------------------------+----------------------+
    |   2  NVIDIA B200                    Off |   00000000:62:00.0 Off |                    0 |
    | N/A   31C    P0            139W / 1000W |       0MiB / 183359MiB |      0%      Default |
    |                                         |                        |             Disabled |
    +-----------------------------------------+------------------------+----------------------+
    |   3  NVIDIA B200                    Off |   00000000:63:00.0 Off |                    0 |
    | N/A   29C    P0            139W / 1000W |       0MiB / 183359MiB |      0%      Default |
    |                                         |                        |             Disabled |
    +-----------------------------------------+------------------------+----------------------+
    |   4  NVIDIA B200                    Off |   00000000:75:00.0 Off |                    0 |
    | N/A   31C    P0            141W / 1000W |       0MiB / 183359MiB |      0%      Default |
    |                                         |                        |             Disabled |
    +-----------------------------------------+------------------------+----------------------+
    |   5  NVIDIA B200                    Off |   00000000:76:00.0 Off |                    0 |
    | N/A   31C    P0            141W / 1000W |       0MiB / 183359MiB |      0%      Default |
    |                                         |                        |             Disabled |
    +-----------------------------------------+------------------------+----------------------+
    |   6  NVIDIA B200                    Off |   00000000:86:00.0 Off |                    0 |
    | N/A   32C    P0            141W / 1000W |       0MiB / 183359MiB |      0%      Default |
    |                                         |                        |             Disabled |
    +-----------------------------------------+------------------------+----------------------+
    |   7  NVIDIA B200                    Off |   00000000:87:00.0 Off |                    0 |
    | N/A   30C    P0            138W / 1000W |       0MiB / 183359MiB |      0%      Default |
    |                                         |                        |             Disabled |
    +-----------------------------------------+------------------------+----------------------+

    +-----------------------------------------------------------------------------------------+
    | Processes:                                                                              |
    |  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
    |        ID   ID                                                               Usage      |
    |=========================================================================================|
    |  No running processes found                                                             |
    +-----------------------------------------------------------------------------------------+
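
For unattended checks, the table layout above is awkward to parse, and the machine-readable query mode of `nvidia-smi` is usually easier to work with. The following sketch counts how many GPUs report a given product name; the function name `count_gpus_named` is hypothetical, and the expected values ("NVIDIA B200", 8 GPUs) are taken from the sample output above.

```shell
#!/bin/sh
# count_gpus_named NAME
# Reads one GPU name per line on stdin (the format produced by
# `nvidia-smi --query-gpu=name --format=csv,noheader`) and prints
# how many lines match NAME exactly.
count_gpus_named() {
    grep -cxF "$1"
}

# Run the real check only where the NVIDIA driver tools are installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    found="$(nvidia-smi --query-gpu=name --format=csv,noheader | count_gpus_named "NVIDIA B200")"
    if [ "$found" -eq 8 ]; then
        echo "nvidia-smi: 8 x NVIDIA B200 detected"
    else
        echo "nvidia-smi: expected 8 x NVIDIA B200, found $found" >&2
    fi
fi
```

A count below 8 on a P6-B200 instance suggests that the driver does not see every GPU, which is the kind of issue worth including in a support case.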

If you experience issues with your P6-B200 instance, contact AWS Support.