Home > Uncategorized > That “Oh shit” moment

That “Oh shit” moment

A previous encounter with a failed upgrade that required recovery (another story for another time) left a sour taste in my mouth regarding the state of backups that were in place. As a brief overview, a nightly script attempts to iterate over every database that exists on the read-only slave and, assuming the database is not set to replicate-ignore-db, it tries to run mysqldump against the database. That’s it. No frills, bells, or whistles. Seems fairly strait forward and simple, right? Well, this is no where near good enough for the environment that I work on. First off, if the database is indeed in replicate-ignore-db, the script locks and dumps the data from the live master…. every night…. That means that the database is essentially down without notice every night for as long as the backup takes. For most databases on our servers, this goes unnoticed as they are < 5G and only see maybe a minute of impact. For others though…. well 150G data sets don’t dump instantly. Second, if you read closely, I mentioned that it “tries” to dump the database, if it runs into any issues it fails silently leaving an unusable backup. Third, notice that I didn’t mention a thing about bin-logs, that’s because they don’t get backed up. If this means nothing to you yet, this means that point in time recovery is impossible. And lastly, I mentioned in a previous post that the environment is currently undergoing an upgrade from 5.0 to 5.1. Well if you are not in the know, 5.0 only support statement replication. This setting carried over to 5.1 for the purposes of upgrading (I am hoping to change this in the near future) and thus we have a fair amount data drift that virtually never gets corrected. That’s right! No data consistency checks. This whole setup reminds me of a common FPS phrase “spray ‘n pray”.

So over the past weeks, I have been devising a replacement solution. The solution has to accomplish the following:

  • reliable recovery
  • quick recovery
  • consistent recovery
  • no customer impact
  • ability to give customers text based dumps
  • back up nightly
  • restore from up to 72 hrs ago to any point in time

So with all this information, I got to work…

Problem #1 no customer impact

This was fortunately mostly already solved for me. We already have slaves replicating from the master that we can take backups from at our leisure.


Problem #2 consistent recovery

This definitely needs some attention as we already have some pretty serious data drift. Couple that with customers who are at liberty to write any non deterministic query they so choose… you have a recipe for a shit time. If you don’t already know what a non-deterministic query is, it’s a query that returns different results depending on when it was ran. a basic example would be “INSERT INTO `table` (`var1`) VALUES (UUID());”. As you can see this would be different every time, that includes the replicating slave if it is set to replicate in statement mode. The answer? Well, the one I am looking at is firstly, switch to mixed mode replication. This should replicate most non-deterministic queries as the affected rows data rather than the statements to “attempt” to recreate said data. The second part is to use pt-table-checksum to verify the consistency of each database at a minimum of once every 3 days and generate alerts whenever inconsistencies are found. As for automatically rectifying these inconsistencies, that part is up in the air for now as pt-table-sync’s default method of repair has the potential to destroy your known good copy.


Problem #3 reliable recovery

As I mentioned earlier, the currently implemented solution only “tries” to create a backup, there is know validation and no reporting if one fails and therefor no reliable recovery. So, I say, lets strip this part out. Let’s instead look towards LVM for binary backups, take the snapshot, move it off the server, delete the snapshot…. profit! There are 2 routes that this can be done, and I have ultimately not made a decision which is better yet, but here they are. Option 1) shutdown MySQL, take the snapshot, start MySQL ….ect Option 2) stop the slave thread, flush tables, take the snapshot, start the slave thread (optionally start a second MySQL instance on the snapshot to go through innodb recovery). Let me touch a little on the pros and cons. Option 1 has the advantage of not having to worry about crash recovery at all  and being more simplistic, as well as sharing the same fate as the backup, if MySQL comes up cleanly then the backup should as well. On the down side, option 1 also flushes all the caches so in the need of fail over it may take longer to get up to speed, also, logic would have to be built in to ensure that MySQL is not in a failed over state before shutting it down. You definitely don’t want to bring down your DR box when you are already in a DR situation. Lastly, this would flood our monitoring agents with false alarms every night that MySQL was restarted so these alerts would need to be ignored, but then what if there where a legit restart that got missed? On to option 2…. Option 2 is slightly more complex, though not much. The caches stay warm so the box is ready for fail over at any time. No need for logic to ensure that the box is not live, even if it is, a snapshot will degrade performance but not bring it down (in this environment, that is acceptable). Cons though include either going through innodb crash recovery or assuming that the backup can be restored from easily (I prefer the former), there is also the possibility that the slave may get stopped midway though an operation loading data from a tmp table/location that did not get completed and the tmp data would not get backed up. All in all I am leaning towards Option 2.


Problem #4 quick recovery

Having binary backups from LVM accomplishes this nicely. It doesn’t get much quicker than copying the data files into place and turning on MySQL.


Problem #5 ability to give customers text based dumps

Now this is where some trickery is needed. Since the system backups are binary copies of the data files, we can’t easily just hand them over to customers, especially if they are using innodb (which is our default storage engine). Customers also have come to expect that we will provide them a historical dump upon request even though we clearly state that backups are THEIR responsibility. But…. lets see what we can do. So this piece is still very much a work in progress. As I have mentioned before, one server may have hundreds of databases on it. Keeping that in mind, my options have to preserve the backup in its original backup-y form. My current thought is to export an iSCSI target to each MySQL instance that is used to offload the nightly snapshots. These iSCSI targets would use LVM and whenever a dump is requested, a snapshot of the backup would be taken and mounted R/W to a recovery location, lets say, /mnt/recover-<db_name> (gotta account for multiple concurrent recoveries). So with the snapshot mounted in RW a “recovery” instance of MySQL is started on the recovery location and bin-logs are played to the point in time requested. After bin-logs get ran, in comes mysqldump to spit out all of the customer data to a sql file which gets handed off to the customer. Then, like good lil’ children, we clean up our mess by un-mounting and deleting the snapshot that was used for recovery leaving us the original backup in its pristine state. Like I said though, this is very much a work in progress. A few of the down sides include: hundreds of iSCSI targets, wasted space between each target needed for snapshots, expensive (but that part just comes with binary backups).


Problem #6 nightly backups with 72hr retention

Now this is simply a policy, not much to flush out here, except, now that we are talking about binary backups rather than gzip’d text dumps, we are looking at a 300%+ increase in storage needs (assuming no compression of the binary files). I don’t have a solution for this except… buy storage…. I know, it sucks, but that’s what I got.

So that’s a quick overview of how I plan to revamp the backup plan for my current environment. Got any ideas or suggestions? Let me know!


Leave a Reply

Your email address will not be published.