In my previous article, I discussed at length moving your disaster recovery to the cloud. I argued that the cloud is a viable option for hosting your disaster recovery site, rather than physically building one. I also emphasized that the cloud is not a one-size-fits-all solution. Now I will discuss the issues you need to consider when moving your disaster recovery to the cloud.
You build a disaster site to make sure that your business stays up and running when the unexpected happens. In other words, you eliminate the single point of failure for your business, which is your current datacenter. To reach your cloud disaster site you need a network connection, which means you also need to eliminate the single point of failure in that connection. So you need to work with more than one Internet Service Provider (ISP) to connect to your disaster site. Beyond the disaster scenarios, you will also benefit from load balancing across the links during normal operations.
Next, you need to plan which applications must be available in a disaster scenario. Once you list the applications, you need to prioritize them. Obviously you need to make your line of business (LOB) applications available at the disaster site, but do you really need the video conferencing application when everything comes down? This is not to say that communication is unimportant; people can use instant messaging (IM) applications, mobile phones and landlines after a disaster. It is only to say that video conferencing is not as essential as the other channels. Looking at your infrastructure, you will see that many applications are not urgent enough to be included in a disaster scenario.
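The prioritization above can be written down as a simple ranked inventory. The following is only a minimal sketch: the tier numbers, the `rto_hours` field (a recovery time objective) and the application names are my own assumptions for illustration, not values from any real environment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Application:
    name: str
    tier: int          # assumption: 1 = must run at the disaster site, higher = can wait
    rto_hours: float   # assumption: hypothetical recovery time objective in hours


def recovery_order(apps, max_tier=1):
    """Return the applications to restore, most urgent first.

    Applications with a tier above `max_tier` are left out of the plan.
    """
    selected = [a for a in apps if a.tier <= max_tier]
    return sorted(selected, key=lambda a: (a.tier, a.rto_hours))


# Hypothetical inventory, for illustration only.
inventory = [
    Application("video-conferencing", tier=3, rto_hours=72),
    Application("lob-order-entry", tier=1, rto_hours=2),
    Application("instant-messaging", tier=2, rto_hours=8),
    Application("lob-billing", tier=1, rto_hours=4),
]

for app in recovery_order(inventory, max_tier=2):
    print(app.tier, app.name)
```

With `max_tier=2`, the LOB applications come first, instant messaging follows, and video conferencing is left out of the disaster plan altogether.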
This brings us to additional considerations: is your cloud vendor capable of running your LOB applications? If they cannot host your environment or do not have the resources, there is no reason to go ahead with that vendor; change it. If the vendor is capable of hosting your environment, then test it. This makes sure that there are no vendor-specific issues when disaster strikes. The test also verifies that availability and performance in the cloud are similar to your on-site levels.
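One way to make "similar to your on-site levels" concrete is to compare measurements from the test against an on-site baseline. This is a rough sketch under my own assumptions: the 20% tolerance, the use of the median and the sample response times are all made up for illustration, and you would replace them with your own service levels and measurements.

```python
from statistics import median


def within_tolerance(on_site_ms, cloud_ms, tolerance=0.20):
    """Return True if the cloud median response time is at most
    `tolerance` (as a fraction) slower than the on-site median."""
    baseline = median(on_site_ms)
    candidate = median(cloud_ms)
    return candidate <= baseline * (1 + tolerance)


# Made-up response times in milliseconds from a hypothetical test run.
on_site = [120, 130, 125, 140, 118]
cloud = [135, 150, 142, 160, 139]

print(within_tolerance(on_site, cloud))                   # 20% tolerance
print(within_tolerance(on_site, cloud, tolerance=0.10))   # stricter 10% tolerance
```

The median is used here so that a single slow request does not fail the whole test; a percentile-based check may suit your applications better.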
Since you are pushing non-essential communications to the second level, you have to make sure that the regular communication channels are available. If you base your disaster recovery scenarios on mobile data and a storm like Hurricane Sandy takes down the cell phone towers, your disaster recovery will be worthless. Although this sounds like a no-brainer, you need to ensure that regular or legacy communication channels are there to be used.
Speaking of keeping legacy communication channels, you also need to keep the legacy installation methods available. Again, this may sound like a no-brainer, but you need to keep CD, DVD or USB installation media available. Keeping the installation media only in your datacenter will not help you in a disaster scenario; they need to be kept in sync with your disaster site. There can also be scenarios where physical copies help, such as a forgotten legacy application that is still in use and has no download link available.
To keep costs down and exploit the scalability of the cloud, you will probably keep the cloud disaster site to a minimum during normal operations and scale it up when disaster strikes. Although spinning up a few servers is easy, configuring them takes time. There is also the issue of server farms, such as SharePoint or Terminal Server farms, which means repeating the same tedious configuration over and over. In such a scale-up scenario, it is essential to keep a deployment service in place. This can be System Center Configuration Manager for Windows or Network Installation Management (NIM) for AIX, which will automate your installations and save you the precious time you need in a disaster.
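Whatever deployment service you use, it helps to know in advance exactly what has to be added when you scale up. The sketch below only computes that difference; it does not call Configuration Manager, NIM or any cloud API, and the role names and server counts are hypothetical.

```python
def scale_up_plan(standby, target):
    """Return how many servers of each role must be added to go from
    the standby footprint to the full disaster footprint."""
    plan = {}
    for role, wanted in target.items():
        missing = wanted - standby.get(role, 0)
        if missing > 0:
            plan[role] = missing
    return plan


# Hypothetical footprints: what runs during normal operations,
# and what is needed once disaster strikes.
standby = {"domain-controller": 1, "sharepoint-web": 1}
target = {"domain-controller": 2, "sharepoint-web": 4, "terminal-server": 6}

for role, count in scale_up_plan(standby, target).items():
    print(f"deploy {count} x {role}")
```

The resulting list is what you would hand over to the deployment service, so that each farm member is built from the same automated configuration instead of by hand.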
Also, do not think of your disaster site as a datacenter of secondary importance. You need at least one trained, highly skilled person working on the disaster scenarios. It is important to keep this person in the same place as the IT staff but away from daily operations. Working alongside the IT staff, they will understand what is going on day to day without losing the perspective of the disaster scenarios. If you assign them to regular daily operations, they will inevitably lose that perspective. If you keep them completely separate from the daily activities, they will not understand how things work in detail and will not be able to keep your disaster site aligned with your business’s daily requirements.
Lastly, businesses tend to think of disaster sites as set-and-forget and/or give them less importance. They tend to think that they will be able to keep the business up and running with a little extra effort when disaster strikes. Although “we have a disaster site” is psychologically calming, it is unreasonable to expect the same level of service from something to which you give less importance and less investment.
We all know that the disaster site is the last resort for a business to continue its existence. The first step has to be taken by the executives. They need to pay attention to their after-disaster scenarios in detail to make sure that the business continues to operate even after the unexpected and the inevitable happens. Otherwise, the executives have no right to expect to be paid by a business that lost all its IT infrastructure in a disaster, when they contributed to the disaster by not paying enough attention. And from the business perspective, losing its information technology infrastructure means that a business has to start from scratch, if it has the financial strength to do so.
- Featured image: http://blog.mcpc.com/
- Inline image: http://www.sidekickinc.com/