TensorFlow with Horovod - Deep Learning AMI

TensorFlow with Horovod

This tutorial shows how to activate TensorFlow with Horovod on an AWS Deep Learning AMI (DLAMI) with Conda. Horovod is pre-installed in the Conda environments for TensorFlow. The Python3 environment is recommended.

Note

Only P3.*, P2.*, and G3.* instance types are supported.

To activate TensorFlow and test Horovod on the DLAMI with Conda

  1. Launch and connect to an Amazon Elastic Compute Cloud (Amazon EC2) instance of the DLAMI with Conda. For help getting started with a DLAMI, see How to Get Started with the DLAMI.

  2. (Recommended) To activate the environment for TensorFlow 1.15 with Horovod on Python 3 with CUDA 11, run the following command:

    $ source activate tensorflow_p37

  3. Start the IPython terminal:

    (tensorflow_p37)$ ipython

  4. Test importing TensorFlow with Horovod to verify that it's working properly:

    import horovod.tensorflow as hvd
    hvd.init()

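    Output similar to the following may appear on your screen. It is an Open MPI warning that no OpenFabrics network interface was found and that another transport will be used instead; you can ignore it, along with any other warning messages.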

    --------------------------------------------------------------------------
    [[55425,1],0]: A high-performance Open MPI point-to-point messaging module
    was unable to find any relevant network interfaces:

    Module: OpenFabrics (openib)
      Host: ip-172-31-72-4

    Another transport will be used instead, although this may result in
    lower performance.
    --------------------------------------------------------------------------
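The interactive check above can also be run as a small script, which makes it easy to repeat after changing environments. The following is a minimal sketch, not part of the DLAMI: the file name `check_horovod.py` and the helper `horovod_summary` are hypothetical names chosen for illustration. It uses only `hvd.init()`, `hvd.rank()`, `hvd.local_rank()`, and `hvd.size()`, and it returns `None` instead of failing when Horovod or its TensorFlow bindings cannot be imported.

```python
# check_horovod.py: hypothetical helper script, not shipped with the DLAMI.
import importlib.util


def horovod_summary():
    """Return Horovod rank information, or None if Horovod is unavailable."""
    # Skip the check cleanly when the horovod package is not installed.
    if importlib.util.find_spec("horovod") is None:
        return None
    try:
        import horovod.tensorflow as hvd
    except ImportError:
        # Horovod is installed, but its TensorFlow bindings could not be imported.
        return None

    hvd.init()  # must be called before any other Horovod function
    return {
        "rank": hvd.rank(),              # global index of this process
        "local_rank": hvd.local_rank(),  # index of this process on its host
        "size": hvd.size(),              # total number of Horovod processes
    }


if __name__ == "__main__":
    summary = horovod_summary()
    if summary is None:
        print("Horovod with TensorFlow support is not available in this environment.")
    else:
        print("Horovod initialized: rank {rank} (local rank {local_rank}) "
              "of {size} process(es)".format(**summary))
```

Run inside the activated environment with `python check_horovod.py`, a single process should report rank 0 of 1. Assuming the `horovodrun` launcher is available in the environment, a command such as `horovodrun -np 2 python check_horovod.py` would start two processes, each reporting its own rank.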

More Info