App Mesh troubleshooting best practices
We recommend that you follow the best practices in this topic to troubleshoot issues when using App Mesh.
Enable the Envoy proxy administration interface
The Envoy proxy ships with an administration interface that you can use to discover configuration
and statistics and to perform other administrative functions such as connection draining. For more
information, see Administration interface in the Envoy documentation.
If you use the managed Envoy image, the administration endpoint is enabled by default on port 9901.
The examples in App Mesh setup troubleshooting use the administration endpoint URL
http://my-app.default.svc.cluster.local:9901/.
Note
The administration endpoint should never be exposed to the public internet. Additionally, we
recommend monitoring the administration endpoint logs. Their location is set by the
ENVOY_ADMIN_ACCESS_LOG_FILE environment variable, which defaults to /tmp/envoy_admin_access.log.
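As a minimal sketch of querying the administration interface from a host that can reach the proxy, the following Python helper builds admin URLs and fetches them. The function names admin_url and fetch_admin are hypothetical. The sketch assumes the default port 9901 described above and that standard Envoy admin paths such as stats or clusters are available on your image.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

ADMIN_PORT = 9901  # default admin port for the managed Envoy image (see above)


def admin_url(host, path="", port=ADMIN_PORT, **params):
    """Build a URL for an Envoy admin path such as 'stats' or 'clusters'."""
    url = f"http://{host}:{port}/{path.lstrip('/')}"
    if params:
        # Query parameters, for example filter=<regex> on the stats path.
        url += "?" + urlencode(params)
    return url


def fetch_admin(host, path="", timeout=5.0, **params):
    """GET an admin path and return the response body as text.

    Only call this from inside the private network: the admin endpoint
    must not be reachable from the public internet.
    """
    with urlopen(admin_url(host, path, **params), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

For example, fetch_admin("my-app.default.svc.cluster.local", "stats") would return the proxy's statistics as plain text, provided the host resolves from where the script runs.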
Enable Envoy DogStatsD integration for metric offload
The Envoy proxy can be configured to offload statistics for OSI Layer 4 and Layer 7 traffic and
for internal process health. While this topic shows how to use these statistics without offloading
the metrics to sinks like CloudWatch metrics and Prometheus, having these statistics in a
centralized location for all of your applications can help you diagnose issues and confirm behavior
more quickly. For more information, see Using Amazon CloudWatch
Metrics and the Prometheus documentation.
You can configure DogStatsD metrics by setting the parameters defined in DogStatsD variables. For more information
about DogStatsD, see the DogStatsD documentation.
Enable access logs
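We recommend enabling access logs on your virtual nodes and virtual gateways to discover details about traffic transiting between your
applications. For more information, see Access logging in the Envoy documentation.
When you use Envoy’s default format, you can analyze the access logs with CloudWatch Logs Insights using the following parse statement: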
parse @message "[*] \"* * *\" * * * * * * * * * * *" as StartTime, Method, Path, Protocol, ResponseCode, ResponseFlags, BytesReceived, BytesSent, DurationMillis, UpstreamServiceTimeMillis, ForwardedFor, UserAgent, RequestId, Authority, UpstreamHost
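To check the field mapping locally before running the query, the same glob pattern can be applied in Python. The sketch below turns each * into a non-greedy capture group, which approximates how the parse statement splits a line. It is not CloudWatch's implementation. The helper names and the sample line are illustrative, and the sample assumes Envoy's default access log format.

```python
import re

# Glob pattern and field names copied from the parse statement above.
PATTERN = '[*] "* * *" * * * * * * * * * * *'
FIELDS = (
    "StartTime", "Method", "Path", "Protocol", "ResponseCode",
    "ResponseFlags", "BytesReceived", "BytesSent", "DurationMillis",
    "UpstreamServiceTimeMillis", "ForwardedFor", "UserAgent",
    "RequestId", "Authority", "UpstreamHost",
)


def glob_to_regex(pattern):
    """Escape the literal text and turn every '*' into a capture group."""
    literals = [re.escape(part) for part in pattern.split("*")]
    return re.compile("(.*?)".join(literals))


ACCESS_LOG_RE = glob_to_regex(PATTERN)


def parse_access_log(line):
    """Return a dict of field name to raw value, or None if no match."""
    match = ACCESS_LOG_RE.fullmatch(line.strip())
    if match is None:
        return None
    return dict(zip(FIELDS, match.groups()))


# Illustrative line in Envoy's default access log format.
SAMPLE = (
    '[2016-04-15T20:17:00.310Z] "POST /api/v1/locations HTTP/2" 204 - '
    '154 0 226 100 "10.0.35.28" "nsq2http" '
    '"cc21d9b0-cf5c-432b-8c7e-98aeb7988cd2" "locations" "tcp://10.0.2.1:80"'
)

if __name__ == "__main__":
    for name, value in parse_access_log(SAMPLE).items():
        print(f"{name}: {value}")
```

Quoted fields such as UserAgent keep their surrounding quotation marks. Because fields are split on single spaces, a value that itself contains spaces (a browser user agent, for example) can shift the fields that follow it.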
Enable Envoy debug logging in pre-production environments
We recommend setting the Envoy proxy’s log level to debug in a pre-production
environment. Debug logs can help you identify issues before you graduate the associated App Mesh
configuration to your production environment.
If you’re using the Envoy image, you can set the log level to debug through the
ENVOY_LOG_LEVEL environment variable.
Note
We do not recommend using the debug level in production environments. Setting
the level to debug increases log volume, which may affect performance and the
overall cost of logs offloaded to solutions like CloudWatch Logs.
When you use Envoy’s default format, you can analyze the process logs with CloudWatch Logs Insights using the following parse statement:
parse @message "[*][*][*][*] [*] *" as Time, Thread, Level, Name, Source, Message
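The same six fields can be extracted outside CloudWatch, for example to count how many lines a proxy emits at each level after you switch it to debug. The following Python sketch uses a named-group regular expression that mirrors the parse statement above. The function name and the sample lines, including their source paths and messages, are made up for illustration and assume Envoy's default process log format.

```python
import re
from collections import Counter

# One named group per field in the parse statement above.
PROCESS_LOG_RE = re.compile(
    r"^\[(?P<Time>[^\]]*)\]"
    r"\[(?P<Thread>[^\]]*)\]"
    r"\[(?P<Level>[^\]]*)\]"
    r"\[(?P<Name>[^\]]*)\] "
    r"\[(?P<Source>[^\]]*)\] "
    r"(?P<Message>.*)$"
)


def count_levels(lines):
    """Count log lines per level, skipping lines that do not match."""
    counts = Counter()
    for line in lines:
        match = PROCESS_LOG_RE.match(line.rstrip("\n"))
        if match:
            counts[match.group("Level")] += 1
    return counts


# Made-up lines shaped like Envoy's default process log output.
SAMPLE_LINES = [
    "[2024-01-01 12:00:00.000][1][info][main] [source/example/a.cc:10] starting",
    "[2024-01-01 12:00:00.100][7][debug][upstream] [source/example/b.cc:20] added host",
    "[2024-01-01 12:00:00.200][7][debug][config] [source/example/c.cc:30] update accepted",
    "a continuation line that does not match the format",
]

if __name__ == "__main__":
    print(count_levels(SAMPLE_LINES))
```

A sharp rise in the debug count relative to info gives a rough idea of how much extra log volume the debug level adds.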
Monitor the Envoy proxy connectivity with the App Mesh control plane
We recommend that you monitor the Envoy metric control_plane.connected_state to make
sure that the Envoy proxy communicates with the App Mesh control plane to fetch the dynamic
configuration resources. For more information, see Management Server.
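If the metric is not yet offloaded to a sink, one way to check it is to read the plain-text statistics from the administration interface and look for the gauge. The Python sketch below parses that text. It assumes the usual "name: value" layout of Envoy's stats output and that a value of 1 means connected and 0 means disconnected, as the Envoy documentation describes. The function name and the sample text are illustrative.

```python
METRIC = "control_plane.connected_state"


def connected_state(stats_text):
    """Return True or False for the gauge, or None if it is absent."""
    for line in stats_text.splitlines():
        # Split on the first colon only; other stat values may contain colons.
        name, sep, value = line.partition(":")
        if sep and name.strip() == METRIC:
            return value.strip() == "1"
    return None


# Illustrative excerpt shaped like the admin interface's stats output.
SAMPLE_STATS = """\
control_plane.connected_state: 1
control_plane.pending_requests: 0
control_plane.rate_limit_enforced: 0
"""

if __name__ == "__main__":
    state = connected_state(SAMPLE_STATS)
    if state is None:
        print("metric not found")
    else:
        print("connected" if state else "disconnected")
```

A None result is worth treating as a problem in its own right, since it means the statistics you read did not include the gauge at all.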