Search write throttling
To maintain optimal performance and data durability, ElastiCache in durable mode implements write throttling on search traffic when necessary. Throttling helps ensure that automatic backup mechanisms operate effectively without falling behind during periods of high write activity. By temporarily reducing write throughput, the system preserves the integrity of the Multi-AZ transactional log, which is essential for fast database recovery and restart.
Scope of throttling
Only write commands targeting keys that belong to a search index are throttled. Writes to non-indexed keys and all read commands are not affected.
The following commands are subject to throttling when they target indexed keys:
| Category | Commands |
|---|---|
| Hash | HSET, HSETNX, HMSET, HINCRBY, HINCRBYFLOAT, HDEL |
| JSON | JSON.SET, JSON.DEL, JSON.NUMINCRBY, JSON.NUMMULTBY, JSON.STRAPPEND, JSON.ARRAPPEND, JSON.ARRINSERT, JSON.ARRPOP, JSON.ARRTRIM, JSON.TOGGLE, JSON.CLEAR, JSON.MERGE |
| Generic | DEL, UNLINK, RENAME, RENAMENX, COPY, RESTORE |
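As a quick client-side aid, the scope rule above can be written as a lookup. This is only an illustrative sketch: the names `THROTTLED_COMMANDS` and `may_be_throttled` are hypothetical, and whether a key belongs to a search index is something the caller has to know (for example, from the key prefixes used when the index was created). It is not part of any ElastiCache API.

```python
# Commands from the table above. Illustrative only; the service decides what is throttled.
THROTTLED_COMMANDS = frozenset({
    # Hash
    "HSET", "HSETNX", "HMSET", "HINCRBY", "HINCRBYFLOAT", "HDEL",
    # JSON
    "JSON.SET", "JSON.DEL", "JSON.NUMINCRBY", "JSON.NUMMULTBY",
    "JSON.STRAPPEND", "JSON.ARRAPPEND", "JSON.ARRINSERT", "JSON.ARRPOP",
    "JSON.ARRTRIM", "JSON.TOGGLE", "JSON.CLEAR", "JSON.MERGE",
    # Generic
    "DEL", "UNLINK", "RENAME", "RENAMENX", "COPY", "RESTORE",
})


def may_be_throttled(command: str, targets_indexed_key: bool) -> bool:
    """Return True if the command falls within the documented throttling scope.

    Reads and writes to non-indexed keys are never in scope.
    """
    return targets_indexed_key and command.upper() in THROTTLED_COMMANDS
```

A helper like this can be useful for tagging client-side latency measurements, so that slower writes on indexed keys can be told apart from unaffected traffic.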
What clients experience
Throttled commands are delayed, not rejected. Affected writes take longer to complete but still succeed. No errors are returned to the client.
You can observe the impact through the following Amazon CloudWatch metrics:
- SuccessfulWriteRequestLatency and SearchBasedSetCmdsLatency: Reflect increased latency on affected writes.
- SearchWriteThrottleActive, SearchWriteThrottledClientsCount, and SearchWriteThrottleEvents: Indicate whether throttling is active and to what degree. See Monitoring for details.
When throttling activates
The system monitors CPU usage of the search module's writer threads over a rolling 2-hour window. Throttling activates when the average CPU usage during that window exceeds 50%, and adjusts the allowed write rate to bring average utilization back to 50%.
Because the metric is averaged over a 2-hour window, short bursts of elevated CPU usage do not trigger throttling on their own — as long as they are offset by lower usage within the same window.
To prevent over-throttling, the system also evaluates current CPU usage in real time. If current CPU usage is already at 50% or below, the system holds the write rate steady rather than reducing it further, even if the 2-hour average remains above the threshold. This ensures that write capacity never falls below 50% of normal throughput.
When throttling deactivates
Once the 2-hour average CPU usage drops below 50%, the system gradually increases the allowed write rate until full throughput is restored and throttling deactivates.
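The activation, hold, and deactivation rules described in these two sections can be summarized as a small decision function. This is a minimal sketch of the documented behavior, not the service's implementation: the names `RateAction` and `next_action` are hypothetical, the size of each rate adjustment is not documented and is therefore left out, and the handling of an average of exactly 50% is an assumption.

```python
from enum import Enum

THRESHOLD_PERCENT = 50.0  # threshold stated in the text


class RateAction(Enum):
    REDUCE = "reduce"      # lower the allowed write rate
    HOLD = "hold"          # keep the allowed write rate steady
    INCREASE = "increase"  # gradually restore the write rate
    NONE = "none"          # throttling is not active; nothing to do


def next_action(avg_cpu_2h: float, current_cpu: float, throttling_active: bool) -> RateAction:
    """Pick the next adjustment from the 2-hour average and the current CPU usage."""
    if avg_cpu_2h > THRESHOLD_PERCENT:
        # Over-throttling guard: do not reduce further if current usage is already low.
        if current_cpu <= THRESHOLD_PERCENT:
            return RateAction.HOLD
        return RateAction.REDUCE
    if not throttling_active:
        return RateAction.NONE
    if avg_cpu_2h < THRESHOLD_PERCENT:
        return RateAction.INCREASE
    # Assumption: at exactly 50% the rate is held (the text does not specify this case).
    return RateAction.HOLD
```

For example, a 2-hour average of 60% with current usage at 40% yields `RateAction.HOLD`, which matches the over-throttling guard described above.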
Monitoring
The following Amazon CloudWatch metrics are available for monitoring search write throttling:
| Metric | Description | Unit |
|---|---|---|
| SearchWriteThrottleActive | Indicates whether throttling is currently active. 1 = active, 0 = inactive. | Boolean |
| SearchWriteThrottledClientsCount | The number of client connections currently being throttled. | Count |
| SearchWriteThrottleEvents | The number of throttle events within the reporting interval. | Count |
| SearchWriteCPUUtilization | Current CPU utilization of the search writer threads. | Percent |
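The sketch below builds a request for one of these metrics in the shape expected by the CloudWatch `GetMetricStatistics` operation. Several details are assumptions to verify against your own account: the `AWS/ElastiCache` namespace and the `CacheClusterId` dimension are those used by other ElastiCache metrics and may not apply to these metrics or to your deployment type, and `build_metric_request` and the cluster ID are hypothetical names.

```python
from datetime import datetime, timedelta, timezone

# Metric names from the table above.
THROTTLE_METRICS = (
    "SearchWriteThrottleActive",
    "SearchWriteThrottledClientsCount",
    "SearchWriteThrottleEvents",
    "SearchWriteCPUUtilization",
)


def build_metric_request(metric_name, cluster_id, end=None, hours=2, period_seconds=300):
    """Build keyword arguments for CloudWatch GetMetricStatistics.

    Defaults to the last 2 hours, matching the throttling window.
    """
    if metric_name not in THROTTLE_METRICS:
        raise ValueError(f"unknown throttling metric: {metric_name}")
    end = end or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/ElastiCache",  # assumption: verify in the CloudWatch console
        "MetricName": metric_name,
        "Dimensions": [{"Name": "CacheClusterId", "Value": cluster_id}],  # assumption
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period_seconds,
        "Statistics": ["Average", "Maximum"],
    }


# Usage with boto3 (requires AWS credentials; not executed here):
#   import boto3
#   cloudwatch = boto3.client("cloudwatch")
#   request = build_metric_request("SearchWriteCPUUtilization", "my-cluster")
#   datapoints = cloudwatch.get_metric_statistics(**request)["Datapoints"]
```

Keeping the request construction separate from the network call makes it easy to check the time window and dimensions before sending anything to CloudWatch.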
Best practices
- Monitor SearchWriteCPUUtilization: Track your search write CPU usage to understand your workload patterns and anticipate when you might approach the throttling threshold.
- Monitor SearchWriteThrottleActive: Track whether throttling is active so you can investigate and respond promptly.
- Plan sustained ingestion around the 2-hour window: The system uses a 2-hour rolling average, so short bursts of high write activity are fully supported as long as they are offset by lower usage within the same window.
- Scale your cluster if you observe sustained or frequent throttling: If your workload consistently exceeds the threshold and throttling impacts your application's latency requirements, consider scaling to add capacity.
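To plan ingestion around the 2-hour window, it can help to estimate how long a burst can run before the rolling average crosses 50%. The sketch below assumes a simplified model that is not taken from the service: writer-thread CPU sits at a constant baseline, rises to a constant burst level for t minutes, and the window contains nothing else. Under that model the average is (burst × t + baseline × (120 − t)) / 120, which stays at or below 50 when t ≤ 120 × (50 − baseline) / (burst − baseline). The function name `max_burst_minutes` is hypothetical.

```python
WINDOW_MINUTES = 120.0    # 2-hour rolling window from the text
THRESHOLD_PERCENT = 50.0  # throttling threshold from the text


def max_burst_minutes(baseline_cpu: float, burst_cpu: float) -> float:
    """Longest burst (in minutes) that keeps the 2-hour average at or below 50%.

    Simplified model: constant baseline, one constant-level burst per window.
    """
    if burst_cpu <= baseline_cpu:
        raise ValueError("burst_cpu must be higher than baseline_cpu")
    if burst_cpu <= THRESHOLD_PERCENT:
        # Even a window-long burst cannot push the average above the threshold.
        return WINDOW_MINUTES
    if baseline_cpu >= THRESHOLD_PERCENT:
        # The baseline alone already reaches the threshold; no burst budget remains.
        return 0.0
    return WINDOW_MINUTES * (THRESHOLD_PERCENT - baseline_cpu) / (burst_cpu - baseline_cpu)
```

For instance, with a 20% baseline and bursts at 80%, the model allows about 60 minutes of burst per 2-hour window. Real workloads are rarely this regular, so treat the result as a rough planning figure and confirm it against SearchWriteCPUUtilization.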