Using the SageMaker HyperPod console UI

The following topics provide guidance on how to operate SageMaker HyperPod through the console UI.

Create a SageMaker HyperPod cluster

See the following instructions on creating a new SageMaker HyperPod cluster through the SageMaker HyperPod console UI.

  1. Open the Amazon SageMaker console at https://console.aws.amazon.com/sagemaker/.

  2. Choose HyperPod Clusters in the left navigation pane.

  3. In the SageMaker HyperPod landing page, choose Create cluster.

  4. In Step 1: Cluster settings, set up basic information for the cluster.

    1. For Cluster name, specify a name for the new cluster.

    2. For Tags, add key and value pairs to the new cluster and manage the cluster as an AWS resource. To learn more, see Tagging your AWS resources.

  5. In Step 2: Instance groups, choose Create instance group. Each instance group can be configured differently, so you can create a heterogeneous cluster that consists of multiple instance groups with different instance types. In the Create an instance group configuration pop-up window, fill in the instance group configuration information.

    1. For Instance group name, specify a name for the instance group.

    2. For Select instance type, choose the instance type for the instance group.

    3. For Quantity, specify an integer not exceeding the instance quota for cluster usage.

    4. For Amazon S3 path to lifecycle script files, enter the S3 path in which your lifecycle scripts are stored.

    5. For Directory path to your on-create lifecycle script, enter the file name of the on-create lifecycle script stored under the Amazon S3 path to lifecycle script files.

    6. For IAM role, choose the IAM role you have created for SageMaker HyperPod resources, following the section Set up IAM users and roles for SageMaker HyperPod users and resources.

    7. Under Advanced configuration, you can set up the following optional configurations.

      1. (Optional) For Threads per core, specify 1 to disable multi-threading or 2 to enable it. To find which instance types support multi-threading, see the reference table of CPU cores and threads per CPU core per instance type in the Amazon EC2 User Guide.

      2. (Optional) For Additional instance storage configs, specify an integer between 1 and 16384 to set the size of an additional Elastic Block Store (EBS) volume in gigabytes (GB). The EBS volume is attached to each instance of the instance group. The default mount path for the additional EBS volume is /opt/sagemaker. After the cluster is successfully created, you can SSH into the cluster instances (nodes) and verify that the EBS volume is mounted correctly by running the df -h command. Attaching an additional EBS volume provides stable, off-instance, and independently persisting storage, as described in the Amazon EBS volumes section in the Amazon Elastic Block Store User Guide.

  6. In Step 3: Advanced configuration, configure optional network settings for traffic within the cluster and into and out of the cluster. Select your own VPC if you already have one that gives SageMaker access to your resources under the VPC. If you want to create a new VPC, see Create a default VPC or Create a VPC in the Amazon Virtual Private Cloud User Guide. If you don't make a selection, the cluster uses the default VPC of your account.

    Note

    If you want to use your own VPC, you should add additional permissions to the IAM role for SageMaker HyperPod clusters. To learn more, see (Optional) Set up SageMaker HyperPod with your Amazon VPC.

  7. In Step 4: Review and create, review the configuration you have set from Step 1 to Step 3 and finish submitting the cluster creation request.

  8. After the status of the cluster turns to InService, you can start logging into the cluster nodes. To access the cluster nodes and start running ML workloads, see Run jobs on SageMaker HyperPod clusters.
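
The console steps above can also be scripted. The following is a minimal sketch, assuming the AWS SDK for Python (boto3) and its SageMaker `create_cluster` and `describe_cluster` calls; it is an illustration, not the only way to create a cluster. The helper names (`build_instance_group`, `build_create_cluster_request`, `create_and_wait`) are hypothetical, as are any instance type, bucket, and role values you pass in. The range checks mirror the limits stated in Step 2 (threads per core of 1 or 2, additional EBS volume size of 1 to 16384 GB).

```python
import time
from typing import Any, Callable, Dict, List, Optional


def build_instance_group(
    name: str,
    instance_type: str,
    quantity: int,
    lifecycle_s3_uri: str,
    on_create_script: str,
    role_arn: str,
    threads_per_core: Optional[int] = None,
    extra_ebs_gb: Optional[int] = None,
) -> Dict[str, Any]:
    """Build one InstanceGroups entry from the fields of the Step 2 dialog."""
    if quantity < 1:
        raise ValueError("Quantity must be a positive integer")
    group: Dict[str, Any] = {
        "InstanceGroupName": name,
        "InstanceType": instance_type,
        "InstanceCount": quantity,
        "LifeCycleConfig": {
            "SourceS3Uri": lifecycle_s3_uri,  # Amazon S3 path to lifecycle script files
            "OnCreate": on_create_script,     # file name of the on-create script
        },
        "ExecutionRole": role_arn,
    }
    if threads_per_core is not None:
        if threads_per_core not in (1, 2):
            raise ValueError("Threads per core must be 1 or 2")
        group["ThreadsPerCore"] = threads_per_core
    if extra_ebs_gb is not None:
        if not 1 <= extra_ebs_gb <= 16384:
            raise ValueError("Additional EBS volume size must be 1 to 16384 GB")
        group["InstanceStorageConfigs"] = [
            {"EbsVolumeConfig": {"VolumeSizeInGB": extra_ebs_gb}}
        ]
    return group


def build_create_cluster_request(
    cluster_name: str,
    instance_groups: List[Dict[str, Any]],
    tags: Optional[Dict[str, str]] = None,
    subnet_ids: Optional[List[str]] = None,
    security_group_ids: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for create_cluster (Steps 1 to 3)."""
    request: Dict[str, Any] = {
        "ClusterName": cluster_name,
        "InstanceGroups": list(instance_groups),
    }
    if tags:
        request["Tags"] = [{"Key": k, "Value": v} for k, v in tags.items()]
    # Omitting VpcConfig corresponds to making no VPC selection in Step 3.
    if subnet_ids and security_group_ids:
        request["VpcConfig"] = {
            "Subnets": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
        }
    return request


def create_and_wait(
    request: Dict[str, Any],
    client: Any = None,
    poll_seconds: float = 30.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Submit the request, then poll until the cluster status is InService."""
    if client is None:
        import boto3  # imported lazily; needs AWS credentials and a Region
        client = boto3.client("sagemaker")
    cluster_arn = client.create_cluster(**request)["ClusterArn"]
    for _ in range(max_polls):
        details = client.describe_cluster(ClusterName=request["ClusterName"])
        if details["ClusterStatus"] == "InService":
            return cluster_arn
        if details["ClusterStatus"] == "Failed":
            raise RuntimeError(details.get("FailureMessage", "cluster creation failed"))
        sleep(poll_seconds)
    raise TimeoutError("cluster did not reach InService in time")
```

The request builders are kept separate from the API call so that the configuration can be reviewed, as in Step 4, before anything is submitted.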

Browse your SageMaker HyperPod clusters

The Clusters section on the SageMaker HyperPod console main page lists all created clusters and provides a summary view of their names, ARNs, status, and creation time.
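
The same summary can be retrieved programmatically. The sketch below assumes boto3's SageMaker `list_clusters` call and follows `NextToken` manually to collect every page; the function name `list_cluster_summaries` is hypothetical. The client is passed in so the function can be exercised without AWS access.

```python
from typing import Any, Dict, List, Tuple


def list_cluster_summaries(client: Any) -> List[Tuple[str, str, str, Any]]:
    """Return (name, ARN, status, creation time) for every cluster."""
    rows: List[Tuple[str, str, str, Any]] = []
    kwargs: Dict[str, Any] = {}
    while True:
        page = client.list_clusters(**kwargs)
        for summary in page.get("ClusterSummaries", []):
            rows.append(
                (
                    summary["ClusterName"],
                    summary["ClusterArn"],
                    summary["ClusterStatus"],
                    summary["CreationTime"],
                )
            )
        token = page.get("NextToken")
        if not token:
            return rows
        kwargs["NextToken"] = token  # request the next page


# Example usage (requires boto3, AWS credentials, and a configured Region):
#   import boto3
#   for name, arn, status, created in list_cluster_summaries(boto3.client("sagemaker")):
#       print(name, status, created, arn)
```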

View details of each SageMaker HyperPod cluster

Under Clusters on the console main page, each cluster name is a link. Choose a cluster name to see the details of that cluster.
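
As a rough programmatic counterpart to the details page, the sketch below assumes boto3's SageMaker `describe_cluster` call and formats a few fields of its response. The helper names (`format_cluster_details`, `describe`) are hypothetical, and only a subset of the response is shown.

```python
from typing import Any, Dict, List


def format_cluster_details(details: Dict[str, Any]) -> List[str]:
    """Turn a describe_cluster response into short, readable lines."""
    lines = [
        f"{details['ClusterName']} ({details['ClusterStatus']})",
        f"  ARN: {details['ClusterArn']}",
    ]
    for group in details.get("InstanceGroups", []):
        # CurrentCount is the number of running instances; TargetCount is the goal.
        lines.append(
            f"  {group['InstanceGroupName']}: "
            f"{group['CurrentCount']}/{group['TargetCount']} x {group['InstanceType']}"
        )
    return lines


def describe(cluster_name: str, client: Any) -> List[str]:
    """Fetch and format the details of one cluster."""
    return format_cluster_details(client.describe_cluster(ClusterName=cluster_name))


# Example usage (requires boto3, AWS credentials, and a configured Region):
#   import boto3
#   print("\n".join(describe("my-cluster", boto3.client("sagemaker"))))
```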

Edit a SageMaker HyperPod cluster

  1. Under Clusters, choose the cluster you want to update.

  2. Choose Actions, and then choose Edit cluster.

  3. In the Edit <your-cluster> page, you can edit the configurations of existing instance groups, add more instance groups, and change tags for the cluster. After making changes, choose Submit. Note that currently you cannot reduce or delete existing instance groups.

    1. In the Configure instance groups section, you can add more instance groups by choosing Create instance group.

    2. In the Configure instance groups section, you can choose one of the instance groups, and choose Edit to change its configuration.

    3. In the Tags section, you can update tags for the cluster.
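
A scripted edit might look like the following sketch, which assumes boto3's SageMaker `update_cluster` and `add_tags` calls. The helper names are hypothetical. The pre-check reflects one reading of the restriction above, namely that an existing instance group can be neither removed nor given a lower instance count; that interpretation is an assumption, and the service remains the authority on what it accepts.

```python
from typing import Any, Dict, List


def check_groups_not_shrunk(
    current_counts: Dict[str, int], new_groups: List[Dict[str, Any]]
) -> None:
    """Reject updates that drop an existing group or lower its instance count."""
    new_counts = {g["InstanceGroupName"]: g["InstanceCount"] for g in new_groups}
    for name, count in current_counts.items():
        if name not in new_counts:
            raise ValueError(f"instance group {name!r} cannot be deleted")
        if new_counts[name] < count:
            raise ValueError(f"instance group {name!r} cannot be reduced")


def update_cluster(
    client: Any,
    cluster_name: str,
    current_counts: Dict[str, int],
    new_groups: List[Dict[str, Any]],
) -> str:
    """Validate locally, then submit the full list of instance groups."""
    check_groups_not_shrunk(current_counts, new_groups)
    response = client.update_cluster(ClusterName=cluster_name, InstanceGroups=new_groups)
    return response["ClusterArn"]


def tag_cluster(client: Any, cluster_arn: str, tags: Dict[str, str]) -> None:
    """Add or overwrite tags on the cluster resource."""
    client.add_tags(
        ResourceArn=cluster_arn,
        Tags=[{"Key": k, "Value": v} for k, v in tags.items()],
    )
```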

Delete a SageMaker HyperPod cluster

  1. Under Clusters, choose the cluster you want to delete.

  2. Choose Actions, and choose Delete cluster.

  3. In the pop-up window for cluster deletion, review the cluster information carefully to confirm that you chose the right cluster to delete.

  4. After you have reviewed the cluster information, choose Yes, delete cluster.

  5. In the text field to confirm this deletion, type delete.

  6. Choose Delete in the lower-right corner of the pop-up window to send the cluster deletion request.
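
For scripted cleanup, the sketch below assumes boto3's SageMaker `delete_cluster` call. The typed-confirmation argument is a hypothetical safeguard that imitates the console's confirmation field; it is not part of the API.

```python
from typing import Any


def delete_cluster(client: Any, cluster_name: str, confirmation: str) -> str:
    """Send a deletion request only if the caller typed 'delete'."""
    if confirmation != "delete":
        raise ValueError('type "delete" to confirm cluster deletion')
    return client.delete_cluster(ClusterName=cluster_name)["ClusterArn"]


# Example usage (requires boto3, AWS credentials, and a configured Region):
#   import boto3
#   delete_cluster(boto3.client("sagemaker"), "my-cluster", input("Type delete: "))
```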