Optimizing binary log replication for Aurora MySQL
Following, you can learn how to optimize binary log replication performance and troubleshoot related issues in Aurora MySQL.
Tip
This discussion presumes that you are familiar with the MySQL binary log replication mechanism and how it works. For
background information, see Replication Implementation
Multithreaded binary log replication
With multithreaded binary log replication, a SQL thread reads events from the relay log and queues them up for SQL worker threads to apply. The SQL worker threads are managed by a coordinator thread. The binary log events are applied in parallel when possible.
Multithreaded binary log replication is supported in Aurora MySQL version 3, and in Aurora MySQL version 2.12.1 and higher.
When an Aurora MySQL DB instance is configured to use binary log replication, by
default the replica instance uses single-threaded replication for Aurora MySQL
versions lower than 3.04. To enable multithreaded replication, you update the
replica_parallel_workers
parameter to a value greater than zero
in your custom parameter group.
For Aurora MySQL version 3.04 and higher, replication is multithreaded by default, with
replica_parallel_workers
set to 4
. You can modify this parameter in your custom parameter
group.
The following configuration options help you to fine-tune multithreaded replication. For usage information, see Replication and Binary Logging Options and
Variables
Optimal configuration depends on several factors. For example, performance for binary log replication is influenced by your database workload characteristics and the DB instance class the replica is running on. Thus, we recommend that you thoroughly test all changes to these configuration parameters before applying new parameter settings to a production instance:
-
binlog_group_commit_sync_delay
-
binlog_group_commit_sync_no_delay_count
-
binlog_transaction_dependency_history_size
-
binlog_transaction_dependency_tracking
-
replica_preserve_commit_order
-
replica_parallel_type
-
replica_parallel_workers
In Aurora MySQL version 3.06 and higher, you can improve performance for binary log replicas when replicating
transactions for large tables with more than one secondary index. This feature introduces a thread pool to apply
secondary index changes in parallel on a binlog replica. The feature is controlled by the aurora_binlog_replication_sec_index_parallel_workers
DB cluster parameter, which controls the total number
of parallel threads available to apply the secondary index changes. The parameter is set to 0
(disabled) by
default. Enabling this feature doesn't require an instance restart. To enable this feature, stop ongoing replication,
set the desired number of parallel worker threads, and then start replication again.
You can also use this parameter as a global variable, where n
is the number of parallel
worker threads:
SET global aurora_binlog_replication_sec_index_parallel_workers=
n
;
Optimizing binlog replication (Aurora MySQL 2.10 and higher)
In Aurora MySQL 2.10 and higher, Aurora automatically applies an optimization known as the binlog I/O cache to binary log replication. By caching the most recently committed binlog events, this optimization is designed to improve binlog dump thread performance while limiting the impact to foreground transactions on the binlog source instance.
Note
This memory used for this feature is independent of the MySQL binlog_cache
setting.
This feature doesn't apply to Aurora DB instances that use the db.t2
and db.t3
instance classes.
You don't need to adjust any configuration parameters to turn on this optimization. In particular, if you adjust
the configuration parameter aurora_binlog_replication_max_yield_seconds
to a nonzero value in earlier
Aurora MySQL versions, set it back to zero for Aurora MySQL 2.10 and higher.
The status variables aurora_binlog_io_cache_reads
and aurora_binlog_io_cache_read_requests
are available in Aurora MySQL 2.10 and higher. These status variables help you to monitor how often the data is read from
the binlog I/O cache.
-
aurora_binlog_io_cache_read_requests
shows the number of binlog I/O read requests from the cache. -
aurora_binlog_io_cache_reads
shows the number of binlog I/O reads that retrieve information from the cache.
The following SQL query computes the percentage of binlog read requests that take advantage of the cached information. In this case, the closer the ratio is to 100, the better it is.
mysql> SELECT (SELECT VARIABLE_VALUE FROM INFORMATION_SCHEMA.GLOBAL_STATUS WHERE VARIABLE_NAME='aurora_binlog_io_cache_reads') / (SELECT VARIABLE_VALUE FROM INFORMATION_SCHEMA.GLOBAL_STATUS WHERE VARIABLE_NAME='aurora_binlog_io_cache_read_requests') * 100 as binlog_io_cache_hit_ratio; +---------------------------+ | binlog_io_cache_hit_ratio | +---------------------------+ | 99.99847949080622 | +---------------------------+
The binlog I/O cache feature also includes new metrics related to the binlog dump threads. Dump threads are the threads that are created when new binlog replicas are connected to the binlog source instance.
The dump thread metrics are printed to the database log every 60 seconds with the prefix [Dump thread
metrics]
. The metrics include information for each binlog replica such as Secondary_id
,
Secondary_uuid
, binlog file name, and the position that each replica is reading. The metrics also
include Bytes_behind_primary
representing the distance in bytes between replication source and replica.
This metric measures the lag of the replica I/O thread. That figure is different from the lag of the replica SQL applier
thread, which is represented by the seconds_behind_master
metric on the binlog replica. You can determine
whether binlog replicas are catching up to the source or falling behind by checking whether the distance decreases or
increases.
Optimizing binlog replication (Aurora MySQL version 2 through 2.09)
To optimize binary log replication for Aurora MySQL, you adjust the following cluster-level optimization parameters. These parameters help you to specify the right balance between latency on the binlog source instance and replication lag.
-
aurora_binlog_use_large_read_buffer
-
aurora_binlog_read_buffer_size
-
aurora_binlog_replication_max_yield_seconds
Note
For MySQL 5.7-compatible clusters, you can use these parameters in Aurora MySQL version 2 through 2.09.*. In Aurora MySQL 2.10.0 and higher, these parameters are superseded by the binlog I/O cache optimization and you don't need to use them.
Topics
Overview of the large read buffer and max-yield optimizations
You might experience reduced binary log replication performance when the binary log dump thread accesses the
Aurora cluster volume while the cluster processes a high number of transactions. You can use the parameters
aurora_binlog_use_large_read_buffer
, aurora_binlog_replication_max_yield_seconds
, and
aurora_binlog_read_buffer_size
to help minimize this type of contention.
Suppose that you have a situation where aurora_binlog_replication_max_yield_seconds
is set to greater
than 0 and the current binlog file of the dump thread is active. In this case, the binary log dump thread waits up
to a specified number of seconds for the current binlog file to be filled by transactions. This wait period avoids
contention that can arise from replicating each binlog event individually. However, doing so increases the replica
lag for binary log replicas. Those replicas can fall behind the source by the same number of seconds as the
aurora_binlog_replication_max_yield_seconds
setting.
The current binlog file means the binlog file that the dump thread is currently reading to perform replication. We consider that a binlog file is active when the binlog file is updating or open to be updated by incoming transactions. After Aurora MySQL fills up the active binlog file, MySQL creates and switches to a new binlog file. The old binlog file becomes inactive. It isn't updated by incoming transactions any longer.
Note
Before adjusting these parameters, measure your transaction latency and throughput over time. You might find that binary log replication performance is stable and has low latency even if there is occasional contention.
aurora_binlog_use_large_read_buffer
-
If this parameter is set to 1, Aurora MySQL optimizes binary log replication based on the settings of the parameters
aurora_binlog_read_buffer_size
andaurora_binlog_replication_max_yield_seconds
. Ifaurora_binlog_use_large_read_buffer
is 0, Aurora MySQL ignores the values of theaurora_binlog_read_buffer_size
andaurora_binlog_replication_max_yield_seconds
parameters. aurora_binlog_read_buffer_size
-
Binary log dump threads with larger read buffer minimize the number of read I/O operations by reading more events for each I/O. The parameter
aurora_binlog_read_buffer_size
sets the read buffer size. The large read buffer can reduce binary log contention for workloads that generate a large amount of binlog data.Note
This parameter only has an effect when the cluster also has the setting
aurora_binlog_use_large_read_buffer=1
.Increasing the size of the read buffer doesn't affect the performance of binary log replication. Binary log dump threads don't wait for updating transactions to fill up the read buffer.
aurora_binlog_replication_max_yield_seconds
-
If your workload requires low transaction latency, and you can tolerate some replication lag, you can increase the
aurora_binlog_replication_max_yield_seconds
parameter. This parameter controls the maximum yield property of binary log replication in your cluster.Note
This parameter only has an effect when the cluster also has the setting
aurora_binlog_use_large_read_buffer=1
.
Aurora MySQL recognizes any change to the aurora_binlog_replication_max_yield_seconds
parameter value
immediately. You don't need to restart the DB instance. However, when you turn on this setting, the dump thread
only starts to yield when the current binlog file reaches its maximum size of 128 MB and is rotated to a new file.
Related parameters
Use the following DB cluster parameters to turn on binlog optimization.
Parameter | Default | Valid Values | Description |
---|---|---|---|
aurora_binlog_use_large_read_buffer
|
1 | 0, 1 | Switch for turning on the feature of replication improvement. When its value is 1, the binary log
dump thread uses aurora_binlog_read_buffer_size for binary log replication; otherwise
default buffer size (8K) is used. Not used in Aurora MySQL version 3. |
aurora_binlog_read_buffer_size
|
5242880 | 8192-536870912 | Read buffer size used by binary log dump thread when the parameter
aurora_binlog_use_large_read_buffer is set to 1. Not used in Aurora MySQL version
3. |
aurora_binlog_replication_max_yield_seconds
|
0 | 0-36000 |
For Aurora MySQL version 2.07.*, the maximum accepted value is 45. You can tune it to a higher value on 2.09 and later versions. For version 2, this parameter works only when the parameter
|
Enabling the max-yield mechanism for binary log replication
You can turn on the binary log replication max-yield optimization as follows. Doing so minimizes latency for transactions on the binlog source instance. However, you might experience higher replication lag.
To turn on the max-yield binlog optimization for an Aurora MySQL cluster
-
Create or edit a DB cluster parameter group using the following parameter settings:
-
aurora_binlog_use_large_read_buffer
: turn on with a value ofON
or 1. -
aurora_binlog_replication_max_yield_seconds
: specify a value greater than 0.
-
-
Associate the DB cluster parameter group with the Aurora MySQL cluster that works as the binlog source. To do so, follow the procedures in Parameter groups for Amazon Aurora.
-
Confirm that the parameter change takes effect. To do so, run the following query on the binlog source instance.
SELECT @@aurora_binlog_use_large_read_buffer, @@aurora_binlog_replication_max_yield_seconds;
Your output should be similar to the following.
+---------------------------------------+-----------------------------------------------+ | @@aurora_binlog_use_large_read_buffer | @@aurora_binlog_replication_max_yield_seconds | +---------------------------------------+-----------------------------------------------+ | 1 | 45 | +---------------------------------------+-----------------------------------------------+
Turning off the binary log replication max-yield optimization
You can turn off the binary log replication max-yield optimization as follows. Doing so minimizes replication lag. However, you might experience higher latency for transactions on the binlog source instance.
To turn off the max-yield optimization for an Aurora MySQL cluster
-
Make sure that the DB cluster parameter group associated with the Aurora MySQL cluster has
aurora_binlog_replication_max_yield_seconds
set to 0. For more information about setting configuration parameters using parameter groups, see Parameter groups for Amazon Aurora. -
Confirm that the parameter change takes effect. To do so, run the following query on the binlog source instance.
SELECT @@aurora_binlog_replication_max_yield_seconds;
Your output should be similar to the following.
+-----------------------------------------------+ | @@aurora_binlog_replication_max_yield_seconds | +-----------------------------------------------+ | 0 | +-----------------------------------------------+
Turning off the large read buffer
You can turn off the entire large read buffer feature as follows.
To turn off the large binary log read buffer for an Aurora MySQL cluster
-
Reset the
aurora_binlog_use_large_read_buffer
toOFF
or 0.Make sure that the DB cluster parameter group associated with the Aurora MySQL cluster has
aurora_binlog_use_large_read_buffer
set to 0. For more information about setting configuration parameters using parameter groups, see Parameter groups for Amazon Aurora. -
On the binlog source instance, run the following query.
SELECT @@ aurora_binlog_use_large_read_buffer;
Your output should be similar to the following.
+---------------------------------------+ | @@aurora_binlog_use_large_read_buffer | +---------------------------------------+ | 0 | +---------------------------------------+