Wednesday, December 14, 2016

In place transparent compression of MySQL binary logs

Any DBA who has administered a busy master knows how fast the disk space occupied by binary logs may grow.  DBAs have no control on this: the growth depends on the workload, and the workload depends on many factors, e.g.:

- application changes (the applications start writing more due to code changes)
- traffic changes (the peak season arrive, your workload doubles in size)
- infrastructure changes (the devops add more servers)
- business changes (new business flows adds to existing workload)

So either you have being thoughtful and have planned in advance for a large enough storage space (to handle the increase in number of binary logs), or,  sooner or later, you will face the usual dilemma - how many retention days dare you give up to accommodate for the binlog growth?

In my case, I was very thoughtful, but the boss didn't listen and gave me servers with very limited binlog storage space and, more important, using internal disks so no way to add more disk space on demand. Pity, as I had previously found a way to move binlogs to a different filesystem without downtime. But unfortunately, there is no "different filesystem" this time. I'm stuck with what I have.

So, the obvious answer at first was to reduce the retention of binary logs via the
expire_logs_days variable. Not the ideal option, as in my situation, I had to reduce it to a mere 2 days, not even enough to survive a weekend. At the same time I didn't like the idea to keep more days and be waken up at 3am by a filesystem full alert.

So the idea of compressing the binary logs came to my mind, but I wanted to do it in a transparent way, preserving the existing binlog retention mechanisms (expire_logs_days variable and PURGE BINARY LOGS command should continue to work as before) and without breaking the MySQL server and/or replication. Ideally, the timestamps of the files themselves should also preserved.

I decided to try and compress the binary logs in place. In place means that when you look at your binlogs folder, you don't notice anything unusual.  Here's (a part of ) the listing of one of our servers with compression active:

-rw-rw---- 1 mysql mysql  13631245 Dec 14 14:17 dbmd01.079826
-rw-rw---- 1 mysql mysql  13909321 Dec 14 14:23 dbmd01.079827
-rw-rw---- 1 mysql mysql  13656948 Dec 14 14:30 dbmd01.079828
-rw-rw---- 1 mysql mysql  13915222 Dec 14 14:37 dbmd01.079829
-rw-rw---- 1 mysql mysql 104883640 Dec 14 14:44 dbmd01.079830
-rw-rw---- 1 mysql mysql 104898032 Dec 14 14:50 dbmd01.079831
-rw-rw---- 1 mysql mysql 104861122 Dec 14 14:57 dbmd01.079832

You can spot which files are compressed by looking at the file size, but everything else is preserved. Looking with file, here's what's in each of them:

dbmd01.079826: gzip compressed data
dbmd01.079827: gzip compressed data
dbmd01.079828: gzip compressed data
dbmd01.079829: gzip compressed data
dbmd01.079830: MySQL replication log
dbmd01.079831: MySQL replication log
dbmd01.079832: MySQL replication log

Basically, after compression, the compressed file retains the original file name. This is to keep MySQL happy - you know, these files are tracked in the index file and read when mysqld starts.
Speaking of which, here's what happens when you start mysqld  with in-place compression:

2016-10-26 03:06:13 0 [Note] /usr/sbin/mysqld (mysqld 5.6.34-79.1-log) starting as process 1407 ...
[ ... ]
2016-10-26 03:06:13 1407 [Note] InnoDB:  Percona XtraDB ( 5.6.34-79.1 started; log sequence number 2722915
2016-10-26 03:06:13 1407 [ERROR] Binlog has bad magic number;  It's not a binary log file that can be used by this version of MySQL
2016-10-26 03:06:13 1407 [ERROR] Binlog has bad magic number;  It's not a binary log file that can be used by this version of MySQL
2016-10-26 03:06:13 1407 [ERROR] Binlog has bad magic number;  It's not a binary log file that can be used by this version of MySQL
2016-10-26 03:06:13 1407 [ERROR] Binlog has bad magic number;  It's not a binary log file that can be used by this version of MySQL

Although these entries are flagged as ERROR, actually mysqld just ignores the compressed files, and more importantly, does not touch them.  

This way, compressed binlog files can be managed exactly like uncompressed, standard ones. Of course, I need to remember that if I need to peruse their content, I have to uncompress them before feeding them to the mysqlbinlog utility.

Last but not least, in order to ensure that we do not inadvertently compress binlogs that haven't been downloaded by all slaves yet,  the following extra safety rules are enforced:

- do not compress any binlog which is not at least 1 hour old 
- only compress binlogs that have a sequence number which is less than the current sequence number  of the most lagging slave

The script runs every 10 minutes, extracts the list of the binary logs which aren't compressed yet, applies the above rules, then it proceeds to compress them.

I have set expire_logs_days back to 7 days and have plenty of disk space for extra growth now....

You can find the which I crafted for the above on my GitHub page.

One caveat  - script hasn't been tested with binlog sequence numbers greater than 999,999. I am actually curious to see what happens to the binlog file name when we reach one million - need to either be very patient, or have a look at the source code... :-)