Enabling the Apache Spark web UI for AWS Glue jobs

You can use the Apache Spark web UI to monitor and debug AWS Glue ETL jobs running on the AWS Glue job system. You can configure the Spark UI using the AWS Glue console or the AWS Command Line Interface (AWS CLI).

Configuring the Spark UI (console)

Follow these steps to configure the Spark UI using the AWS Management Console.

To create a job with the Spark UI enabled
  1. Sign in to the AWS Management Console and open the AWS Glue console at https://console.aws.amazon.com/glue/.

  2. In the navigation pane, choose Jobs.

  3. Choose Add job.

  4. In Configure the job properties, open the Monitoring options.

  5. In the Spark UI tab, choose Enable.

  6. Specify an Amazon S3 path for storing the Spark event logs for the job. If the job uses a security configuration, its encryption also applies to the Spark UI log file. For more information, see Encrypting data written by AWS Glue.

To edit an existing job to enable the Spark UI
  1. Open the AWS Glue console at https://console.aws.amazon.com/glue/.

  2. In the navigation pane, choose Jobs.

  3. Choose an existing job in the job list.

  4. Choose Action, and then choose Edit job.

  5. Open the Monitoring options.

  6. In the Spark UI tab, choose Enable.

  7. Specify an Amazon S3 path for storing the Spark event logs for the job. If the job uses a security configuration, its encryption also applies to the Spark UI log file. For more information, see Encrypting data written by AWS Glue.

To enable the Spark UI for new jobs through user preferences
  1. Open the AWS Glue console at https://console.aws.amazon.com/glue/.

  2. In the upper-right corner, choose User preferences.

  3. Open the Monitoring options.

  4. In the Spark UI tab, choose Enable.

  5. Specify an Amazon S3 path for storing the Spark event logs for the job. If the job uses a security configuration, its encryption also applies to the Spark UI log file. For more information, see Encrypting data written by AWS Glue.

To enable the Spark UI through the job run options
  1. Open the AWS Glue console at https://console.aws.amazon.com/glue/.

  2. In the navigation pane, choose Jobs.

  3. Choose an existing job in the job list.

  4. Choose Scripts, and then choose Edit Job. The console opens the code pane.

  5. Choose Run job.

  6. Open the Monitoring options.

  7. In the Spark UI tab, choose Enable.

  8. Specify an Amazon S3 path for storing the Spark event logs for the job. If the job uses a security configuration, its encryption also applies to the Spark UI log file. For more information, see Encrypting data written by AWS Glue.

Configuring the Spark UI (AWS CLI)

To enable the Spark UI using the AWS CLI, pass the following job parameters to the AWS Glue job. For more information, see Special Parameters Used by AWS Glue.

'--enable-spark-ui': 'true', '--spark-event-logs-path': 's3://s3-event-log-path'
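
The same two job parameters can also be supplied programmatically. The following is a minimal sketch, assuming the AWS SDK for Python (boto3) is installed and credentials are configured. The helper names `build_spark_ui_arguments` and `start_job_with_spark_ui`, and the job name and bucket in the usage note, are hypothetical and not part of AWS Glue.

```python
def build_spark_ui_arguments(event_logs_path: str) -> dict:
    """Return the two job parameters that enable the Spark UI.

    event_logs_path is the Amazon S3 path where the Spark event logs
    should be written, for example "s3://my-bucket/spark-event-logs/".
    """
    if not event_logs_path.startswith("s3://"):
        raise ValueError("event_logs_path must be an Amazon S3 URI (s3://...)")
    return {
        "--enable-spark-ui": "true",
        "--spark-event-logs-path": event_logs_path,
    }


def start_job_with_spark_ui(job_name: str, event_logs_path: str) -> str:
    """Start a run of an existing job with the Spark UI parameters set.

    Returns the ID of the job run that was started.
    """
    # Imported here so that build_spark_ui_arguments works without boto3.
    import boto3

    glue = boto3.client("glue")
    response = glue.start_job_run(
        JobName=job_name,
        Arguments=build_spark_ui_arguments(event_logs_path),
    )
    return response["JobRunId"]
```

For example, `start_job_with_spark_ui("my-etl-job", "s3://my-bucket/spark-event-logs/")` would start a run of a job named `my-etl-job` (a placeholder) with the Spark UI enabled for that run only.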

Every 30 seconds, AWS Glue flushes the Spark event logs to the Amazon S3 path that you specify.
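
Because the logs are flushed periodically, one way to confirm that the configuration works is to list the objects under the event log path while a job is running. The following is a rough sketch, assuming boto3 is installed and the caller has permission to list the bucket. The function names `split_s3_uri` and `list_event_logs` are hypothetical, and the sketch makes no assumption about how AWS Glue names the log objects.

```python
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> tuple:
    """Split an s3://bucket/prefix URI into (bucket, prefix)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("expected an Amazon S3 URI such as s3://bucket/prefix/")
    # The path starts with "/", which is not part of the object key prefix.
    return parsed.netloc, parsed.path.lstrip("/")


def list_event_logs(event_logs_path: str) -> list:
    """Return (key, last_modified) pairs for objects under the log path."""
    # Imported here so that split_s3_uri works without boto3.
    import boto3

    bucket, prefix = split_s3_uri(event_logs_path)
    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    results = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent from a page when no objects match the prefix.
        for obj in page.get("Contents", []):
            results.append((obj["Key"], obj["LastModified"]))
    return results
```

Calling `list_event_logs` with the same path that was passed in `--spark-event-logs-path` and comparing the `last_modified` values between calls gives a rough indication that new event data is arriving.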