Troubleshooting for Aurora MySQL in-place upgrade
Use the following tips to help troubleshoot problems with Aurora MySQL in-place upgrades. These tips don't apply to Aurora Serverless DB clusters.
Reason for in-place upgrade being canceled or slow | Effect | Solution to allow in-place upgrade to complete within maintenance window |
---|---|---|
Associated Aurora cross-Region replica isn't patched yet | Aurora cancels the upgrade. | Upgrade the Aurora cross-Region replica and try again. |
Cluster has XA transactions in the prepared state | Aurora cancels the upgrade. | Commit or roll back all prepared XA transactions. |
Cluster is processing any data definition language (DDL) statements | Aurora cancels the upgrade. | Consider waiting and performing the upgrade after all DDL statements are finished. |
Cluster has uncommitted changes for many rows | Upgrade might take a long time. |
The upgrade process rolls back the uncommitted changes. The indicator for this condition is the value of Consider performing the upgrade only after all large transactions are committed or rolled back. |
Cluster has high number of undo records | Upgrade might take a long time. |
Even if the uncommitted transactions don't affect a large number of rows, they might involve a large volume of data. For example, you might be inserting large BLOBs. Aurora doesn't automatically detect or generate an event for this kind of transaction activity. The indicator for this condition is the history list length (HLL). The upgrade process rolls back the uncommitted changes. You can check the HLL in the output from the
You can also monitor the Consider performing the upgrade only after the HLL is smaller. |
Cluster is in the process of committing a large binary log transaction | Upgrade might take a long time. |
The upgrade process waits until the binary log changes are applied. More transactions or DDL statements could start during this period, further slowing down the upgrade process. Schedule the upgrade process when the cluster isn't busy with generating binary log replication changes. Aurora doesn't automatically detect or generate an event for this condition. |
Schema inconsistencies resulting from file removal or corruption | Aurora cancels the upgrade. |
Change the default storage engine for temporary tables from MyISAM to InnoDB. Perform the following steps:
|
Master user was deleted | Aurora cancels the upgrade. |
ImportantDon't delete the master user. However, if for some reason you should happen to delete the master user, restore it using the following SQL commands:
|
For more details on troubleshooting issues that cause upgrade prechecks to fail, see the following blogs:
You can use the following steps to perform your own checks for some of the conditions in the preceding table. That way, you can schedule the upgrade at a time when you know the database is in a state where the upgrade can complete successfully and quickly.
-
You can check for open XA transactions by executing the
XA RECOVER
statement. You can then commit or roll back the XA transactions before starting the upgrade. -
You can check for DDL statements by executing a
SHOW PROCESSLIST
statement and looking forCREATE
,DROP
,ALTER
,RENAME
, andTRUNCATE
statements in the output. Allow all DDL statements to finish before starting the upgrade. -
You can check the total number of uncommitted rows by querying the
INFORMATION_SCHEMA.INNODB_TRX
table. The table contains one row for each transaction. TheTRX_ROWS_MODIFIED
column contains the number of rows modified or inserted by the transaction. -
You can check the length of the InnoDB history list by executing the
SHOW ENGINE INNODB STATUS SQL
statement and looking for theHistory list length
in the output. You can also check the value directly by running the following query:SELECT count FROM information_schema.innodb_metrics WHERE name = 'trx_rseg_history_len';
The length of the history list corresponds to the amount of undo information stored by the database to implement multi-version concurrency control (MVCC).