There’s nothing worse than the sinking feeling you get when you realize you’ve accidentally deleted that all-important business email, or wiped out a web project you’ve been working hard on for the past couple of weeks. That is why it is crucial to have backups of your hosting account on hand, so that unpleasant feeling doesn’t last long.
What’s not always so cut and dried is the process of backing up your website. Here at FastComet, we’ve offered automatic and on-demand daily backups of your website since day one. We back up your site each and every night, making sure that your important data is safe and secure, and that you can revert to a previous version if you made a change that broke the site, or for any other reason. Using our backup system, you can restore files, databases, email accounts or even your SSL certificates and DNS configuration. And last month, we made our backup points more secure, faster and more efficient. Here is how we did it.
FastComet, where are my backups?
If you were unlucky enough to have trouble getting a fresh copy of your data during the last month, chances are you are already familiar with the nasty hiccup in our backup service and the Maintenance mode that followed. In this article, we’ll explain what happened and how we improved the whole system.
We had been working with a reputable third-party data storage provider for several years, but as our customer base grew rapidly, the volume of website backups grew even faster. Our storage provider’s platform began to have difficulty scaling to keep up with that growth, which hurt both customer experience and support costs. It became a bit of a nightmare.
To explain what went wrong, we first need to describe how our backup service works. It generates incremental daily backups for each of our shared hosting clients. An incremental backup goes through your hosting account and copies only the files that have changed since the last backup. For the unchanged files, the script creates hard links that point to the copies already stored in the previous backup. This way, each backup takes up much less space on the storage units, while it still provides a full backup of your hosting account that can be restored at any point.
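To make the hard-link idea more concrete, here is a minimal Python sketch of the technique. It is not our production backup script: the function names, the directory layout and the "same size and same modification time means unchanged" rule are all assumptions made for illustration.

```python
import os
import shutil
from pathlib import Path
from typing import Optional


def _unchanged(src: Path, old: Path) -> bool:
    """Assumed change test: same size and same modification time (in seconds)."""
    a, b = src.stat(), old.stat()
    return a.st_size == b.st_size and int(a.st_mtime) == int(b.st_mtime)


def incremental_backup(source: Path, previous: Optional[Path], target: Path) -> dict:
    """Build a snapshot of `source` in `target`.

    Files that are unchanged since `previous` are hard-linked to the old copy,
    so they take no extra space. Everything else is copied in full.
    """
    stats = {"copied": 0, "linked": 0}
    for root, _dirs, files in os.walk(source):
        rel_root = Path(root).relative_to(source)
        (target / rel_root).mkdir(parents=True, exist_ok=True)
        for name in files:
            src = Path(root) / name
            dst = target / rel_root / name
            old = previous / rel_root / name if previous is not None else None
            if old is not None and old.is_file() and _unchanged(src, old):
                os.link(old, dst)        # new directory entry, same data on disk
                stats["linked"] += 1
            else:
                shutil.copy2(src, dst)   # copy2 keeps the modification time
                stats["copied"] += 1
    return stats
```

Every snapshot directory produced this way looks like a complete copy of the account, which is why any single day can be restored on its own, yet only the changed files consume new disk space.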
Additionally, the backup script runs only at night, according to the timezone of each shared hosting server. It starts at 1:00 AM, when there is less traffic on the servers, and would normally finish before 7:00 AM, so clients would not even notice the process. This gives us a 6-hour window, and in the best-case scenario the incremental daily backup completes within it.
However, during our proactive service monitoring, we started to notice a delay in the backup script. Our data storage provider confirmed that the delay was caused by the I/O generated on the storage units during the backup process, as well as by the large amount of inbound data transferred daily. As a result, backup times increased: the script started running for more than 24 hours, and the backups could no longer be considered “daily”. Our clients still had 7 backup copies of their hosting accounts, but in some cases the backups were 2-3 or even more days apart.
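The night window described above can be expressed as a small check. This is only a sketch: the function name is hypothetical, and describing each server by a fixed UTC offset (rather than a full timezone with daylight saving rules) is a simplifying assumption.

```python
from datetime import datetime, time, timedelta, timezone

WINDOW_START = time(1, 0)  # 1:00 AM server-local time
WINDOW_END = time(7, 0)    # 7:00 AM server-local time


def in_backup_window(now_utc: datetime, utc_offset_hours: float) -> bool:
    """Return True if the server's local time is inside the 1:00-7:00 AM window.

    `now_utc` must be timezone-aware; `utc_offset_hours` is the server's
    assumed fixed offset from UTC.
    """
    server_tz = timezone(timedelta(hours=utc_offset_hours))
    local = now_utc.astimezone(server_tz)
    return WINDOW_START <= local.time() < WINDOW_END
```

For example, at 06:30 UTC a server at UTC-5 is at 1:30 AM local time and may run its backup, while a server at UTC+2 is already at 8:30 AM and should not.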
Why was Backup Maintenance needed?
Most shared hosting providers would hardly consider this a serious problem that requires further action, as long as they still provide a decent number of backup copies to their customers. The majority of them do not include a free backup service with their web hosting plans at all. FastComet approaches backing up your website differently. Even while investigating the cause of the backup delay, our main focus was to provide a fully functional backup service without restricting the resources our clients use. What’s more, as your hosting provider we have the infrastructure in place to ensure your backups are as they should be: intact, usable and, most of all, kept secure.
We don’t want to see you struggling to put the pieces of your website back together when something goes wrong. We get that your business is your passion and your livelihood, and we understand the opportunities a functional, secure and robust site can provide. That’s why we’ve poured our resources into creating one truly stable backup system.
As outlined above, there were two main reasons for the backup delay, so the backup maintenance we performed was divided into two main phases. Phase one involved adjustments to our backup script to lower the I/O on the data storage units. Phase two dealt with the data storage itself and the volume of inbound traffic it can handle. Our system administration and development teams worked around the clock to complete both phases as quickly as possible.
FastComet’s New Approach to Backups
Phase one of the maintenance did not take much time. Our development team located the bottleneck behind the high I/O on our remote storage and applied a fix that would lower it significantly. However, the QA team quickly determined that after the change, the generated daily backups were no longer incremental. The backup script had started generating a full backup of each client’s account every day and transferring it to our data storage provider. Our data usage quickly grew from around 100TB to over 300TB within a week. That is a threefold jump (an increase of more than 200%), which roughly tripled our disk usage, drove up costs and added further delay to the backup process.
We started debugging the issue and determined that it was caused by the environment of our data storage provider. To create the hard links, the script uses rsync’s --link-dest option, which needs an absolute path to the previously generated backup. The script calculated that absolute path with the readlink command, a standard system utility rather than a part of rsync, and it was not available in our data storage provider’s environment. Without the path, rsync could not link to the previous backup, which prevented the generation of incremental backups.
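As a rough illustration of that dependency, the sketch below builds an rsync command line with an absolute --link-dest path. It is not our actual script: the function name and directory names are hypothetical, and it assumes the path to the previous backup can be resolved on the machine that builds the command. Here `os.path.realpath` plays the role that readlink played for us. The absolute path matters because rsync treats a relative --link-dest directory as relative to the destination directory.

```python
import os
from typing import List


def build_rsync_command(source: str, previous_backup: str, new_backup: str) -> List[str]:
    """Return an rsync argument list for an incremental, hard-linked backup.

    `previous_backup` may be a symlink (for example a "latest" pointer);
    it is resolved to an absolute, symlink-free path before being passed
    to --link-dest.
    """
    link_dest = os.path.realpath(previous_backup)
    return [
        "rsync",
        "-a",                          # archive mode: keep permissions, times, etc.
        f"--link-dest={link_dest}",    # hard-link files unchanged since this backup
        source.rstrip("/") + "/",      # trailing slash: copy the directory contents
        new_backup,
    ]
```

If the path resolution step fails or returns nothing, rsync has no previous backup to link against and simply copies every file, which is exactly the "full backup every day" behavior described above.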
After a thorough review of the situation, and considering the earlier issues with the storage units, we decided that it was time to replace them. We migrated over 300TB of data from the old solution to our new combination of large-capacity HDDs and ultra-fast NVMe drives. Our test results show up to 6x better performance compared to our old storage solution. The new hardware can process many simultaneous disk I/O requests, which speeds up the backup process significantly. Additionally, the new backup storage is resilient and fault tolerant: it is configured with 3x data replication, which keeps the backup data highly available.
Here is what the new setup means for you:
- New data storage that combines large-capacity HDDs with ultra-fast NVMe drives
- Up to 6x better performance
- 3x data replication
- Separate backup storage for each shared hosting server
The result was a hiccup-free and successful bulk conversion of our backup service. The migration from the old solution to the new one is now completed for all our shared hosting servers.
Performance and Testing
At FastComet, we are obsessed with speed. That is why we work only with the best facilities in the world, and we choose the most powerful and innovative hardware to build our infrastructure on. We are now able to back up 3 times more data, 7 times faster, compared to the old backup solution. To put that in numbers, we lowered the time it takes to back up an entire shared hosting server from 30+ hours to just about 4-5 hours. This also allows us to run the backup script only at night, when the load is really low. As the backups now finish within 4-5 hours during the night, the performance of our service is not affected during the day, which is a major improvement.
We are still working on switching all our shared hosting servers to the new backup system, a process that will be completed within the next couple of weeks. As the new system is already well tested and part of our environment is already using it, we do not expect any major issues. The only thing you may notice is a single missing daily backup, as the script will not run for a day while we migrate the data from the old system to the new one. This migration takes less than 24 hours, so the backups will start running normally again right after that.
We hope you never need to use this feature, but we all know mistakes can happen. And when they do, we’ve got your back with our new backups!