Appendix II - AWS Prescriptive Guidance

Appendix II

This section provides the benchmarking results for pt-archiver tools in different scenarios. The sysbench tool is used in this testing to put load on the database. All the tests are performed on Amazon RDS for MySQL instance running on db.r6g.8xlarge instance class.

The following sysbench commands were used to prepare and run the load on the database:

sysbench oltp_read_write --db-driver=mysql --mysql-db=<DATABASE> --mysql-user=<USER> --mysql-password=<PASSWORD> --mysql-host=<ENDPOINT> --tables=1000 --table-size=2000000 --threads=500 prepare  sysbench oltp_read_write --db-driver=mysql --mysql-db=<DATABASE> --mysql-user=<USER> --mysql-password=<PASSWORD> --mysql-host=<ENDPOINT> --tables=1000 --rate=500 --threads=500 run 

Archiving a table that has no primary key and only one index (no load on the database)

Started at 2022-11-07T05:29:12, ended at 2022-11-07T06:03:31 Action     Count   Time    Pct commit    600050 1715.3582   83.31 select    300025  166.5470    8.09 inserting   300024  165.4025    8.03 other       0  11.6644    0.57

It took around 34 minutes to archive 300,024 rows. This table had 2 million rows, but the tool archived only the rows with unique data for the indexed column.

Archiving a table that has a primary key (no load on the database)

Started at 2022-11-16T08:53:49, ended at 2022-11-16T12:38:18 Action         Count   Time      Pct commit       4000000 11065.9534      82.16 select       2000000  1278.1854       9.49 inserting    1999999  1050.4961       7.80 other              0    74.1519       0.55

It took around 3 hours, 44 minutes, and 29 seconds to archive 1,999,999 rows.

The following graph shows that pt-archiver consumes very little CPU and resources when run on its own without any load existing in the system.

During the run the percentage increases by 0.2.

Archiving table that has a primary key (with load on the database)

Started at 2022-11-16T17:37:07, ended at 2022-11-17T03:20:43 Action         Count   Time        Pct commit       4000000 19688.8362      56.23 inserting    1999999 13933.4418      39.79 select       2000000  1305.1770       3.73 other              0    89.1787       0.25

It took around 9 hours, 43 minutes, and 36 seconds to archive 1999999 rows.

The following graph shows that during the test, the CPU utilization was up to 15 percent due to the load applied by sysbench. After the load completed, pt-archiver continued to work consuming minimal CPU as expected to complete the archival.

The graphs spike up at the beginning of the run and remain elevated until the run completes.

As is evident from the graphs, pt-archiver doesn't archive aggressively when there is a load on your database.