Amazon EC2 console output logs - AWS ParallelCluster

Amazon EC2 console output logs

When AWS ParallelCluster detects that a static compute node instance has terminated unexpectedly, it waits for a configurable period and then attempts to retrieve the Amazon EC2 console output from the terminated instance. This way, even if the compute node was unable to communicate with Amazon CloudWatch, useful troubleshooting information about why the node terminated might still be available from the console output. The console output is recorded in the /var/log/parallelcluster/compute_console_output log on the head node. For more information about the Amazon EC2 console output, see Instance console output in the Amazon EC2 User Guide for Linux Instances.

By default, AWS ParallelCluster retrieves the console output from only a sample subset of terminated nodes. This prevents the cluster head node from being overwhelmed by console output requests when a large number of nodes terminate. Also by default, AWS ParallelCluster waits 5 minutes between termination detection and console output retrieval to give Amazon EC2 time to retrieve the final console output from the nodes.

You can edit the sample size and wait time parameter values in the /etc/parallelcluster/slurm_plugin/parallelcluster_clustermgtd.conf file on the head node.

This feature was added in AWS ParallelCluster version 3.5.0.

Amazon EC2 console output parameters

You can edit the values of the following Amazon EC2 console output parameters in the /etc/parallelcluster/slurm_plugin/parallelcluster_clustermgtd.conf file on the head node.

`compute_console_logging_enabled`

To disable console output log collection, set compute_console_logging_enabled to false. The default is true.

You can update this parameter at any time, without stopping the compute fleet.

`compute_console_logging_max_sample_size`

compute_console_logging_max_sample_size sets the maximum number of compute nodes from which AWS ParallelCluster collects console outputs each time it detects an unexpected termination. If this value is less than 1, AWS ParallelCluster retrieves the console output from all terminated nodes. The default value is 1.

You can update this parameter at any time, without stopping the compute fleet.
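
The sampling rule described above can be illustrated with a minimal sketch. The function name `select_console_output_sample` is hypothetical and is not part of AWS ParallelCluster. This page does not say how the sample is chosen, so the random selection here is an assumption made only for illustration.

```python
import random
from typing import List, Sequence


def select_console_output_sample(
    terminated_nodes: Sequence[str], max_sample_size: int
) -> List[str]:
    """Hypothetical helper: choose which terminated nodes to query."""
    nodes = list(terminated_nodes)
    # A value less than 1 means no limit: collect from every terminated node.
    if max_sample_size < 1 or len(nodes) <= max_sample_size:
        return nodes
    # Otherwise collect from at most max_sample_size nodes.
    # Random choice is an assumption, not documented behavior.
    return random.sample(nodes, max_sample_size)


if __name__ == "__main__":
    failed = ["queue1-st-c5-1", "queue1-st-c5-2", "queue1-st-c5-3"]
    print(select_console_output_sample(failed, 1))  # one node (the default)
    print(select_console_output_sample(failed, 0))  # all three nodes
```

With the default value of 1, a burst of terminations produces a single console output request per detection, which is what keeps the load on the head node low.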

`compute_console_wait_time`

compute_console_wait_time sets the time, in seconds, that AWS ParallelCluster waits between detecting a node failure and collecting the console output from that node. You can increase the wait time if you determine that Amazon EC2 needs more time to collect the final output from the terminated node. The default value is 300 seconds (5 minutes).

You can update this parameter at any time, without stopping the compute fleet.
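
To check which values are in effect, you could parse the configuration file and fall back to the documented defaults for any parameter that is not set. The following is a rough sketch, assuming the file uses INI syntax with a `[clustermgtd]` section. The section name and the helper `read_console_settings` are assumptions, so confirm them against the file on your head node.

```python
import configparser
from typing import Dict, Union

# Assumed section name; verify it in your own configuration file.
SECTION = "clustermgtd"


def read_console_settings(config_text: str) -> Dict[str, Union[bool, int]]:
    """Hypothetical helper: return the three console output parameters,
    using the documented defaults when a parameter is absent."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return {
        "compute_console_logging_enabled": parser.getboolean(
            SECTION, "compute_console_logging_enabled", fallback=True
        ),
        "compute_console_logging_max_sample_size": parser.getint(
            SECTION, "compute_console_logging_max_sample_size", fallback=1
        ),
        "compute_console_wait_time": parser.getint(
            SECTION, "compute_console_wait_time", fallback=300
        ),
    }


if __name__ == "__main__":
    # Illustrative content only; a real file contains other settings too.
    sample = """
[clustermgtd]
compute_console_logging_enabled = false
compute_console_wait_time = 600
"""
    print(read_console_settings(sample))
```

On the head node, you would pass the contents of /etc/parallelcluster/slurm_plugin/parallelcluster_clustermgtd.conf instead of the sample string.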
