Backup

Feb 15, 2010

We spend countless hours designing, coding, and otherwise perfecting our websites so people can enjoy them. The problem is, what happens when the server goes offline? There are many reasons why a website may be offline, but for the purposes of this article we are going to discuss long-term outages. Good examples are RAID rebuilds, datacenter power failures, and other extended failures.

Downtime is unlikely to affect most of us much, as our sites are more experiments and/or hobbies to keep us somewhat busy. The problem is what happens when a friend or ten asks you to host their website. Let's say these ten or more friends' sites happen to be business websites! Downtime in the hour-or-more category can result in endless phone calls from those friends or, worse, clients. They don't care if the outage was caused by a hard drive crash, a power outage at the datacenter, or a loss of Internet connectivity. They just want the website they pay you to keep online back online, fast!

There are quite a few solutions to this problem, but they are all flawed in one form or another, so the question is really which option is right for your needs (and/or your clients' needs). I've devised a somewhat functional method of addressing this problem for my clients, so for those of you needing a plan, here is my method (or madness, depending on your point of view).

This method was developed for my client work, which is hosted on a VPS with private DNS servers (ns1.mydomain.com/ns2.mydomain.com). All domain names hosted on my server must be pointed to these DNS servers if they are to use the backup server. Please note you should have a good understanding of DNS and how to move sites between servers. I use cPanel servers, so all references will be targeted at cPanel. If you know how to do this with another control panel system, please let me know! If you can restore a full cPanel backup on your server, this method should work fine for you or at least be much less complicated.

The Low Down

The key to ensuring this method works for most problems is to have servers in several locations. In my case I keep VPS servers in Chicago and Texas, both with similar configurations, except the backup server is set up without mail services. This is done on purpose, as I will explain when I reach the flaws of this method. The Chicago server is the main server at this point in time as it has more resources and better overseas connections.

I keep this process very manual as I like to have more control over how everything is handled. There are likely ways to automate most of this. If you know of any, shoot me some links so I can learn more about how to simplify things.

The overall process is two-pronged. All websites are built and hosted on the main server. Once all the work is done and the site is launched, I run a full cPanel backup of the account. This backup is copied to the backup server and restored. This will set up the domain name and all related zone files, the databases, and all the files and folders.
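As a rough sketch of how the backup, copy, and restore steps might be scripted, the Python below only builds the command strings and does not run anything. It assumes the WHM command-line scripts /scripts/pkgacct and /scripts/restorepkg are available on both servers, that pkgacct writes its archive to /home/cpmove-USER.tar.gz, and that root SSH access to the backup server is set up. The function name, the account name "alice", and the host backup.example.com are made up for illustration, so check the paths against your own cPanel version before using any of it.

```python
import shlex


def backup_and_restore_commands(user, backup_host, archive_dir="/home"):
    """Build (but do not run) the commands for one account.

    Assumes pkgacct writes <archive_dir>/cpmove-<user>.tar.gz, which may
    differ on your cPanel version.
    """
    archive = f"{archive_dir}/cpmove-{user}.tar.gz"
    quoted_archive = shlex.quote(archive)
    return [
        # 1. Package the full account on the main server.
        f"/scripts/pkgacct {shlex.quote(user)}",
        # 2. Copy the archive to the backup server.
        f"scp {quoted_archive} root@{backup_host}:{quoted_archive}",
        # 3. Restore the account on the backup server.
        f"ssh root@{backup_host} /scripts/restorepkg {quoted_archive}",
    ]


if __name__ == "__main__":
    for command in backup_and_restore_commands("alice", "backup.example.com"):
        print(command)
```

Printing the commands first keeps the process as manual as I like it: you can read each one and paste it yourself instead of trusting a script to run as root.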

This process is repeated for every website launched, so if the main server has an issue I have a working copy of every website from when it launched. These copies can't be reached unless the domain name points to the backup server, so to the rest of the world they don't exist. The trick now is, what do we do when the main server goes down?

Activating the backup server is actually very simple: you just need to change your DNS servers' IP addresses. When you set up the private DNS servers, you create two A records at your registrar and then point your domain name at these records. If you change the IPs of these A records to the backup server, all traffic will switch to the backup server, and all the sites that are set up there will come back to life as the change takes effect across the Internet. Yes, someone is likely to point out that this can take 12-24 hours or so! Well, if you use a low TTL on your domain name/DNS server addresses, this will help speed up the process, as most sites will correct within a few hours.
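To get a feel for what a low TTL buys you, here is a minimal Python sketch that works out the latest moment a well-behaved resolver could still be handing out the old IP after you change a record. The function names and the 300-second TTL are only examples. It assumes every resolver honors the TTL, which is not always true, and records held at the registry level may carry a TTL you don't control, so treat the result as a best case rather than a guarantee.

```python
from datetime import datetime, timedelta


def latest_stale_time(changed_at, ttl_seconds):
    """Latest moment a TTL-respecting cache could still serve the old record.

    A resolver that cached the record just before the change keeps it for
    at most ttl_seconds.
    """
    return changed_at + timedelta(seconds=ttl_seconds)


def still_possibly_stale(now, changed_at, ttl_seconds):
    """True while some TTL-respecting cache might still hold the old IP."""
    return now < latest_stale_time(changed_at, ttl_seconds)


if __name__ == "__main__":
    changed = datetime(2010, 2, 15, 12, 0, 0)
    # Compare a 5 minute TTL with a 24 hour TTL.
    print(latest_stale_time(changed, 300))
    print(latest_stale_time(changed, 86400))
```

With the 5 minute TTL the old address should be gone from honest caches a few minutes after the switch; with the 24 hour TTL you could be waiting until the next day.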

Post Server Switch

Now, if you are still following along, you'll recall that I make the copies of the sites when they launch. So by switching to my backup server I just rolled back all my clients' websites! This is very correct, and some clients may be upset to see an older version of their site pop up from before they fixed pricing errors, changed out pictures, or made who knows what other updates.

If you are keeping regular backups of your clients' sites, as you should, then this isn't really a problem, as you can take this time to restore a newer backup to each client's website. If you use a CMS for your clients, this process can be as simple as updating the database and perhaps some files. If your client's site is file driven, you would just update the files. The amount of time you wish to spend restoring content is entirely up to you and how long your main server is offline.

Another option is to swap out the content on the backup server when you copy your client's site, so that it displays an offline page instead. This page could be branded the same as the client's site and include items such as a Twitter feed, RSS feed, or any other information that lets visitors know what's happening.

Problems

There are a few problems with this system, most of which can be worked around or otherwise addressed. The first and, depending on the client, most important issue is going to be email. Clients rely heavily on email and on not losing any of it. The method above would work great for allowing clients to get email during the outage, but there is one problem. If your client fails to check their email before you switch back to the main server, any email delivered to the backup server will be lost to them. True, it's still on the backup server, but most clients have enough trouble setting up their email once, let alone switching between servers.

There are a few solutions to this problem. The first is to disable mail on the backup server. Any email sent to the client's domain name will be held by the sending mail server and retried for a few days before it's returned to the sender as failed. This method will prevent any email loss provided you restore service to your main server within a day or two, but the client will be unable to access their email during this time frame.

The second solution is to use an external mail server. You may find that some of your clients already use an external mail server such as Google Apps or a hosted Exchange account. Some may even have locally run servers! These clients are going to be happier, as their email will still function normally when you are running on your backup server. They just need the MX records in their domain name pointed to the correct server. This should have already been set up when the site launched, so when you copied their site all the settings needed got copied. I HIGHLY recommend you set up Google Apps or another hosted service for your company email. I have done this with mine, which allows me to stay in touch with clients during an outage.
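If you want a quick way to sort clients into "mail is external" and "mail lives on my server", something like the Python sketch below could help. It is only a heuristic I'm assuming for illustration: it treats any MX host under the client's own domain (such as mail.example.com) as hosted on your server, which is not always true. The function name and hostnames are made up, and it expects you to have already looked up the MX hosts yourself, for example with dig.

```python
def is_external_mail(domain, mx_hosts):
    """Guess whether a domain's mail is handled by an outside provider.

    Heuristic (an assumption, not a rule): an MX host equal to the domain
    or under it is treated as hosted on your own server.
    """
    domain = domain.lower().rstrip(".")
    for host in mx_hosts:
        host = host.lower().rstrip(".")
        if host == domain or host.endswith("." + domain):
            return False
    # No MX hosts at all means there is nothing external to rely on.
    return bool(mx_hosts)


if __name__ == "__main__":
    print(is_external_mail("example.com", ["aspmx.l.google.com."]))
    print(is_external_mail("example.com", ["mail.example.com"]))
```

Clients that come back as external should keep getting mail while you are on the backup server, as long as DNS itself stays reachable.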

The third option would be to set up a dedicated mail server or two and have all email handled by these servers. They would be considered external mail servers and set up as such, and having at least two mail servers in different datacenters would be the perfect fallback solution. I don't run dedicated mail servers, as most of my clients can function fine without email for the better part of a day. You will need to evaluate whether your clients need this level of redundancy.

Advanced

There are some more advanced options you could look into to better handle server outages. If you want to make your hosting solution highly redundant, consider some of these.

Hosted DNS Service - In order for an external mail server to function, it needs to be found by other mail servers. If your DNS server is down, this information can't be found. If your clients all use external mail servers, you may want to look into a hosted DNS option instead of running the DNS server on your main server.

You could also use DNS Clustering in WHM/cPanel. You can learn more about this here: http://docs.cpanel.net/twiki/bin/view/AllDocumentation/WHMDocs/ConfigureCluster

MySQL Replication - If you want to simplify the restoration process, you can set up MySQL Replication. It's kind of cool because any changes to the main server's databases are pushed to the backup server automatically. If you care to spend enough time, you can delay Replication or use it for load balancing. I am investigating this option for my clients as it removes some of my backup requirements. Just keep in mind this is a one-way street, so all updates must happen on the main server. MySQL Clustering may be a better option if you want the ability to update either database.
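If you do go the Replication route, you'll want to know the backup server is actually keeping up before you rely on it. Here is a small Python sketch that checks a status row such as the one returned by SHOW SLAVE STATUS on the backup server. It assumes you have already fetched that row into a dictionary with whatever MySQL client you use. The function name and the 60 second lag limit are my own placeholders.

```python
def replication_healthy(status, max_lag_seconds=60):
    """Check a SHOW SLAVE STATUS row (as a dict) for a usable replica.

    Both replication threads must be running, and the reported lag must
    be known and within max_lag_seconds (a hypothetical threshold).
    """
    io_ok = status.get("Slave_IO_Running") == "Yes"
    sql_ok = status.get("Slave_SQL_Running") == "Yes"
    lag = status.get("Seconds_Behind_Master")
    # A missing or NULL lag means replication is not working.
    if not (io_ok and sql_ok) or lag is None:
        return False
    return int(lag) <= max_lag_seconds


if __name__ == "__main__":
    row = {
        "Slave_IO_Running": "Yes",
        "Slave_SQL_Running": "Yes",
        "Seconds_Behind_Master": 3,
    }
    print(replication_healthy(row))
```

A check like this tells you whether the backup server's data is current enough to switch to without restoring anything by hand.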



© 2008 - 2018 AMDbuilder. The views expressed on this site are mine alone.