Data stored in Azure Storage accounts is well protected by Microsoft. There is very little (if any) reason to worry about catastrophic equipment failure causing your data to be lost. It’s the beauty of the cloud! However, Microsoft can’t protect our data from inadvertent errors, made by us or by our software, that corrupt or destroy it. Because of this, backups are still a necessary part of life in the cloud.
Unfortunately, as of 3/1/2016, there was no easy or cost-effective way to back up table data from an Azure Storage account. Microsoft doesn’t offer an app on Azure for tables and, while there are a few backup players, their solutions are pretty costly. Through searches on the interwebs I have found that I am not the only person with this dilemma, so I set out to figure out how to do this quickly, effectively, reliably, and cheaply.
My first stop was the new and cool Azure Storage Data Movement Library (DML). My thought was that I could use the DML in an Azure Web Job. Everything would be contained, everything would be in the cloud, everything would be tight. However, I was disappointed to find that the DML does not yet support tables. Further, offers by the community to add table support to the open-source DML were met with delay: Microsoft’s reps said to wait, as table support would be forthcoming. That’s great, but that doesn’t help me right now.
So, with some poking around, I put together a solution that doesn’t cause too much pain and gets the job done effectively. My solution uses Microsoft’s command-line AzCopy.exe tool and a batch file with a few tweaks to back up a list of tables and blob containers. To make my backup work, I spun up a virtual machine in Azure, using the cheapest available configuration (A0), and loaded AzCopy onto it. I also copied the backup.bat file onto the machine. I then used the Task Scheduler to call my backup.bat file at a given interval. When the scheduler runs the bat file (in my case once a day at midnight), it pulls all of the table and blob data to my virtual machine and then pushes the data back out to a backup storage account for safekeeping.
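The flow above (pull tables and blobs down to the VM, then push everything out to a backup account) can be sketched as follows. This is not my actual backup.bat; it is a minimal Python illustration that only builds the AzCopy command lines. The account names, keys, folder layout, and function names are hypothetical, and the `/Source`, `/Dest`, `/SourceKey`, `/DestKey`, `/S`, and `/Y` switches assume the Windows AzCopy syntax that was current when this was written, so check them against the version you have installed.

```python
def table_export_cmd(account, key, table, dest_dir):
    """Command to export one table to a local folder (AzCopy also writes a manifest there)."""
    return [
        "AzCopy.exe",
        f"/Source:https://{account}.table.core.windows.net/{table}/",
        f"/Dest:{dest_dir}",
        f"/SourceKey:{key}",
        "/Y",  # suppress confirmation prompts so the scheduled task never blocks
    ]


def blob_download_cmd(account, key, container, dest_dir):
    """Command to download every blob in a container to a local folder."""
    return [
        "AzCopy.exe",
        f"/Source:https://{account}.blob.core.windows.net/{container}",
        f"/Dest:{dest_dir}",
        f"/SourceKey:{key}",
        "/S",  # recurse through the whole container
        "/Y",
    ]


def upload_cmd(src_dir, backup_account, backup_key, backup_container, stamp):
    """Command to push the local backup folder to the backup storage account."""
    dest = f"https://{backup_account}.blob.core.windows.net/{backup_container}/{stamp}"
    return [
        "AzCopy.exe",
        f"/Source:{src_dir}",
        f"/Dest:{dest}",
        f"/DestKey:{backup_key}",
        "/S",
        "/Y",
    ]


def build_backup_plan(account, key, tables, containers,
                      backup_account, backup_key, backup_container,
                      work_dir, stamp):
    """Return the ordered list of commands for one backup run."""
    root = f"{work_dir}\\{stamp}"  # one local folder per run (hypothetical layout)
    plan = [table_export_cmd(account, key, t, f"{root}\\tables\\{t}") for t in tables]
    plan += [blob_download_cmd(account, key, c, f"{root}\\blobs\\{c}") for c in containers]
    plan.append(upload_cmd(root, backup_account, backup_key, backup_container, stamp))
    return plan
```

Each entry in the returned list could be handed to `subprocess.run(cmd, check=True)` on the VM, so a failed export stops the run before anything incomplete is uploaded.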
Later, if I experience a catastrophic screw-up, whether of my own making or an infinite loop’s, I can restore the data to a new storage account, do some testing, and then cut over my web app to the new storage.
You can check out the backup and restore batch files here. Bear in mind that I would like to improve the restore batch file at some point so that it is not necessary to spell out every table manifest file to restore. If you have ideas or solutions, please contribute back.
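One possible direction for that restore improvement is to discover the manifest files instead of listing them by hand. The sketch below is an idea, not a tested replacement for the restore batch file. It assumes AzCopy's default manifest naming of `<account name>_<table name>_<timestamp>.manifest`, assumes the `/Manifest` and `/EntityOperation` import switches of the Windows AzCopy syntax, and uses a hypothetical function name and folder layout.

```python
from pathlib import Path


def restore_cmds(backup_dir, dest_account, dest_key):
    """Build one AzCopy table-import command per manifest found under backup_dir."""
    cmds = []
    for manifest in sorted(Path(backup_dir).rglob("*.manifest")):
        # Assumed default name: <account>_<table>_<timestamp>.manifest.
        # Account and table names cannot contain underscores, so a plain split works.
        parts = manifest.stem.split("_")
        if len(parts) != 3:
            continue  # skip anything that does not match the assumed pattern
        _account, table, _stamp = parts
        cmds.append([
            "AzCopy.exe",
            f"/Source:{manifest.parent}",  # folder holding the manifest and its data files
            f"/Dest:https://{dest_account}.table.core.windows.net/{table}/",
            f"/DestKey:{dest_key}",
            f"/Manifest:{manifest.name}",
            "/EntityOperation:InsertOrReplace",
            "/Y",
        ])
    return cmds
```

If a folder holds manifests from several backup runs of the same table, this would restore all of them in timestamp order, so point it at the folder for a single run.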
Photo credit: M i x y via VisualHunt.com / CC BY-NC-SA