I have a MySQL Galera cluster, using Percona and XtraBackup. The nodes can start stand-alone, or can join the cluster if only an IST is required. However, if an SST is required, it runs to completion and then fails.
The logs show that, after the xtrabackup SST has completed, the script exits with status 22 (Invalid argument), causing the SST to be rolled back and the node to fail to come up.
2018-08-09 00:43:25 860 [Note] WSREP: 0.0 (xmdadb01): State transfer to 1.0 (xmdadb02) complete.
2018-08-09 00:43:25 860 [Note] WSREP: Member 0.0 (xmdadb01) synced with group.
2018-08-09 00:43:25 860 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.93.40.122' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --defaults-group-suffix '' --parent '860' '' : 22 (Invalid argument)
2018-08-09 00:43:25 860 [ERROR] WSREP: Failed to read uuid:seqno from joiner script.
2018-08-09 00:43:25 860 [ERROR] WSREP: SST script aborted with error 22 (Invalid argument)
2018-08-09 00:43:25 860 [ERROR] WSREP: SST failed: 22 (Invalid argument)
2018-08-09 00:43:25 860 [ERROR] Aborting
The relevant parts of the my.cnf:
[mysqld]
wsrep_provider=/usr/lib64/galera3/libgalera_smm.so
wsrep_provider_options="gcache.size=256M;gcs.fc_factor=1.0;gcs.fc_limit=512;gcs.fc_master_slave=YES;pc.checksum=true;"
wsrep_cluster_name="galera01-xmd"
wsrep_cluster_address="gcomm://10.93.40.121:4567,10.93.40.122:4567"
wsrep_node_name=xmdadb02
wsrep_node_address="10.93.40.122"
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sst_user:password-goes-in-here
As the SST runs, I can see the files arriving in /var/lib/mysql/.sst, so I know the transfer itself is working. I have verified that the user and password are correct. So why is xtrabackup-v2 returning 22, and how can I stop it from doing so, so that the SST can complete?
Annoyingly, when this setup was first installed, SST worked without issue. I do not know what changed in the intervening time to prevent SST while still allowing IST to work.
I find that the reasons SST fails usually fall under one of the following: SELinux/AppArmor is enforcing, the SST user was not created on the donor node (and permissions were not updated correctly in the .cnf files), or iptables/firewall restrictions block port 4444. In most cases, correcting those allows SST to work.
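For the firewall case, a quick reachability test from the donor can rule things in or out. This is only a minimal sketch: `port_is_reachable` is a made-up helper name, and 4444 is the SST port mentioned above. Bear in mind that, as far as I know, the joiner only listens on that port while it is waiting for an SST, so a negative result outside that window does not prove a firewall problem.

```python
import socket


def port_is_reachable(host: str, port: int = 4444, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts
        return False
```

Run on the donor while the joiner is waiting for its SST, `port_is_reachable("10.93.40.122")` would check the joiner address from the question.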
Because galera has a creative outlook on what constitutes a meaningful error message, don't expect EINVAL 22 to correspond to a syscall return code.
Take a look at some of the code around this EINVAL text in their source. Fixing it isn't a priority.
There are many reasons why SST and IST can fail, and some have been given by other posters. In our case, however, the problem seems to have been that the xtrabackup SST script is pickier about the my.cnf than MySQL itself, and fails with this error when its parser has issues.
Here, the issue was that some of the config directives appeared in the file more than once (though with the same value). MySQL happily accepts this, but the xtrabackup parser turned the duplicates into a multi-valued array, which was an invalid data type, so it choked.
Removing the additional duplicate config lines solved the issue.
Note this only affected xtrabackup SST -- an IST has always worked fine, and MySQL itself (plus mysqldump etc) are quite happy.
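If the config file is large, hunting for duplicates by eye is tedious. Here is a rough sketch of a checker; it is not what the SST script does internally, and `find_duplicate_options` is a name I made up. It assumes a plain my.cnf-style file, treats dashes and underscores in option names as the same (as MySQL does), and does not follow !include or !includedir directives.

```python
from collections import defaultdict


def find_duplicate_options(text: str) -> dict:
    """Map (section, option) to the line numbers where it is set, for options set more than once."""
    seen = defaultdict(list)
    section = None
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Skip blank lines and comments
        if not line or line.startswith(("#", ";")):
            continue
        # Section header such as [mysqld]
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        # !include / !includedir directives are not followed in this sketch
        if line.startswith("!"):
            continue
        # Option name is everything before the first '=' (bare options have no '=')
        name = line.split("=", 1)[0].strip().replace("-", "_")
        seen[(section, name)].append(lineno)
    return {key: lines for key, lines in seen.items() if len(lines) > 1}
```

Calling it on the contents of /etc/my.cnf returns something like `{("mysqld", "wsrep_sst_method"): [2, 5]}` when an option is repeated. Some MySQL options may legitimately appear several times, so treat the output as a list of candidates to review rather than of errors.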
Try opening the innobackupex log; on Debian, for example, it's located at /var/lib/mysql/innobackup.backup.log
I found that my issue on the donor was
InnoDB: Error number 24 means 'Too many open files'.
so raising the limit with ulimit -n would help :-)
EDIT: I found another line in the log:
xtrabackup: open files limit requested 200000, set to 1024
As a matter of fact, I had already requested a higher limit for xtrabackup, but MySQL reduces it to 1024 (or 5000), so that is another thing to tweak (and the setting in [xtrabackup] can be removed, as it is useless).
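To see which limit a running process actually ended up with, rather than what was requested, you can read /proc/<pid>/limits on Linux. A small sketch, with helper names of my own invention and assuming the usual Linux layout of that file:

```python
from pathlib import Path


def parse_open_files_limit(limits_text: str):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits, or None."""
    label = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(label):
            # The remaining columns are: soft limit, hard limit, units
            soft, hard = line[len(label):].split()[:2]
            return soft, hard
    return None


def open_files_limit_for_pid(pid: int):
    """Read the effective open-files limits of a running process (Linux only)."""
    return parse_open_files_limit(Path(f"/proc/{pid}/limits").read_text())
```

Pass the PID of the running mysqld to `open_files_limit_for_pid`. The values come back as strings because the kernel can also report "unlimited". A soft limit of 1024 here would match the "set to 1024" line in the log above.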