I got a warning in my AWS console about the InnoDB history list length increasing on one cluster. Taking a look at show engine innodb status, I see that the number in the "Purge done" line isn't changing from run to run:
------------
TRANSACTIONS
------------
Trx id counter 218703222601
Purge done for trx's n:o < 218637806467 undo n:o < 0 state: running but idle <---- Not changing
History list length 29748626
while the trx ID and the history list length are both increasing. There are no open transactions:
mysql> SELECT * FROM information_schema.innodb_trx\G
Empty set (0.00 sec)
AWS's recommendations for fixing this are unhelpful: they talk about long-running transactions or query tuning, neither of which applies here (all running queries are in the sub-second range, and most of them are reads).
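(If anyone wants to double-check that the same way, something like this should surface any statement that has been running for a while on a node; the 60-second cutoff is just an arbitrary threshold for illustration, not anything AWS recommends:)

SELECT id, user, time, LEFT(info, 80) AS query
FROM information_schema.processlist
WHERE command <> 'Sleep' AND time > 60
ORDER BY time DESC;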
mysql> show global variables like 'innodb%purge%';
+--------------------------------------+-------+
| Variable_name                        | Value |
+--------------------------------------+-------+
| innodb_max_purge_lag                 | 0     |
| innodb_max_purge_lag_delay           | 0     |
| innodb_purge_batch_size              | 3600  |
| innodb_purge_rseg_truncate_frequency | 128   |
| innodb_purge_threads                 | 6     |
+--------------------------------------+-------+
Not sure if any of that is helpful, but I don't know what else to look at.
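(For reference, the same history list length number can also be pulled from SQL, which makes it easier to watch over time; I believe the trx_rseg_history_len counter in innodb_metrics is enabled by default:)

SELECT name, count
FROM information_schema.innodb_metrics
WHERE name = 'trx_rseg_history_len';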
AWS support got back to me and asked me to run this to get a list of the oldest transactions (read views) across all nodes:
SELECT server_id,
       IF(session_id = 'master_session_id', 'writer', 'reader') AS role,
       replica_lag_in_msec,
       oldest_read_view_trx_id,
       oldest_read_view_lsn
FROM mysql.ro_replica_status;
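(My understanding from the Aurora docs is that read views held open on the readers can block purge on the writer, so the node whose oldest_read_view_trx_id is not advancing, and is sitting at or below the "Purge done for trx's n:o <" number from the status output, is the one holding things up.)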
That told me that a reader was holding a transaction open. I found a SELECT statement on that reader that had been stuck for several days; killing that query resolved the problem.
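In case it saves someone a lookup: I believe a plain KILL needs privileges the RDS/Aurora master user doesn't have, so the supported way to kill a thread you've found in the processlist on that reader is the rds_kill procedures (the thread id here is a placeholder for whatever the processlist shows):

CALL mysql.rds_kill_query(<thread_id>);  -- ends just the running statement
-- or, to close the whole connection:
CALL mysql.rds_kill(<thread_id>);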