Cloud Computing & Hosted PBX News – Dallas, TX

When Cloud Computing Goes Wrong

Technical things go wrong. So what should businesses think about to ensure reliable and consistent operations when cloud adds a layer of complexity? The first step is recognizing that things will go wrong. Whether operations run in an in-house data center, in an external commercial colocation data center, or in a hybrid cloud arrangement with the workload split between in-house and cloud, the principles are the same.

Cloud Isn’t New

No matter what marketing would have us believe, cloud is not a new concept. It is simply remote hosting of some or all of the workload in a data center, and is not dissimilar in principle to the time-sharing services of the 1960s. The difference between 1964 and 2014 is the speed and data capacity of fiber optic cables, which open up a whole host of new possibilities to business owners. But the principle remains the same, as do the principles of resilient design. Because some or all of the workload can be hosted remotely, the most critical new consideration is the communication between the user and the data centers where cloud operations take place.

Securing The Right Data Partner

It is important that businesses choose a high-quality data center with strong data communications and cloud experience to help minimize risks. Any data center which says it has never had an outage of any sort is either too new to have a track record or is not training its sales staff to be honest. Even major players with more money to spend than most businesses can dream of, such as Google, Facebook and Amazon, have experienced very public data center outages in the last five years.

Most recently, in June this year, Microsoft Office’s 50 million users in the US experienced a nationwide two-day outage. Operations managers and architects need to ask the right questions to find out the truth, and to work through how automatic fail-overs or manual switching would operate in the event of something going wrong. Ultimately, it comes down to choosing a data center that you trust.

Moving The Right Workload

Choosing the right workload to move to cloud is also important, especially in the early days when in-house IT staff have less experience of cloud operations. In general, workloads made up of infrequent, small transactions that are not latency-critical work well in cloud. A CRM system is a good example: the submission of a visit report or the retrieval of a customer phone number is infrequent, small, and not time-critical.

On the other hand, voice telephony, which is a continuous stream of time-critical data, is not a good application to move to cloud, except through specialist suppliers who know how to do this and who are located in carrier-rich, carrier-neutral data centers to get the connectivity and diversity they need.

Automatic switching of IP address allocations is a particular problem which needs careful thought. The difficulty of automatically detecting a failure and instantly transferring all the IP addresses to another set of equipment in another location leads many smaller installations to accept a short outage and transfer the addresses manually.
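The detection half of this problem can be illustrated with a small sketch. It assumes a periodic health check against the primary site and switches to a standby only after several consecutive failures, so that a single lost probe does not trigger a transfer. The class name, the site labels and the threshold of three are hypothetical choices for illustration, and the actual transfer of IP addresses is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class FailoverMonitor:
    """Tracks health-check results and decides which site should be active."""

    failure_threshold: int = 3      # assumed: consecutive failed checks before switching
    active_site: str = "primary"    # hypothetical site labels: "primary" and "standby"
    consecutive_failures: int = 0

    def record_check(self, healthy: bool) -> str:
        """Record one health-check result and return the site that should be active."""
        if healthy:
            # A good check resets the counter; there is no automatic fail-back.
            self.consecutive_failures = 0
            return self.active_site
        self.consecutive_failures += 1
        if (self.active_site == "primary"
                and self.consecutive_failures >= self.failure_threshold):
            self.active_site = "standby"
        return self.active_site


def run_checks(results):
    """Feed a sequence of health-check results through a fresh monitor."""
    monitor = FailoverMonitor()
    return [monitor.record_check(r) for r in results]
```

A lower threshold shortens the outage but raises the risk of switching on a transient glitch, which is one reason smaller installations may prefer a deliberate manual transfer.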

In resilient or safety-critical design, every element must be considered, and there is one key question which must be asked of each: “what will happen if this element fails?” The design can then be changed so operations will continue without interruption. If that is not possible, then a plan has to be put in place to deal with the effects of a failure that cannot be mitigated.
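One way to ask that question systematically is to model the system as a set of links and remove each element in turn to see whether users can still reach what they need. The sketch below does this with a breadth-first search. The topology and element names are invented for illustration and do not describe any particular installation.

```python
from collections import deque


def reachable(links, start, goal, failed):
    """Return True if goal can be reached from start with one element failed."""
    if failed in (start, goal):
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in links.get(node, ()):
            if nxt != failed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def single_points_of_failure(links, start, goal):
    """List intermediate elements whose failure alone cuts start off from goal."""
    elements = set(links) | {n for targets in links.values() for n in targets}
    elements -= {start, goal}
    return sorted(e for e in elements if not reachable(links, start, goal, e))


# Hypothetical topology: two carriers and two app servers, but only one
# load balancer and one database in the path.
EXAMPLE_LINKS = {
    "users": ["carrier_a", "carrier_b"],
    "carrier_a": ["load_balancer"],
    "carrier_b": ["load_balancer"],
    "load_balancer": ["app_server_1", "app_server_2"],
    "app_server_1": ["database"],
    "app_server_2": ["database"],
    "database": ["storage"],
}
```

Calling `single_points_of_failure(EXAMPLE_LINKS, "users", "storage")` on this example reports the load balancer and the database, the two elements that would need either redundancy or a documented recovery plan.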

Testing Is Key

Continuous testing is essential, as is reconsidering the effects of each potential failure anew each time the system design or architecture is changed. So are the rehearsal and practice of both automatic fail-overs and manual procedures for dealing with failures. At least once a year, every likely failure should be forced to happen, so that its effect on the overall system operation can be checked. This is one of the main principles of ensuring reliable, continuous operations, and it is the same whether a business is operating an in-house data center or a remotely hosted operation in a cloud data center.
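The once-a-year rule is easy to lose track of as the list of failure scenarios grows. As a minimal sketch, the helper below flags scenarios that have never been forced or were last forced more than a year ago. The scenario names and dates are hypothetical, and the 365-day limit simply encodes the yearly interval suggested above.

```python
from datetime import date


def overdue_drills(last_tested, today, max_age_days=365):
    """Return scenario names never tested or last tested more than max_age_days ago."""
    overdue = []
    for scenario, tested_on in last_tested.items():
        if tested_on is None or (today - tested_on).days > max_age_days:
            overdue.append(scenario)
    return sorted(overdue)


# Hypothetical drill log: scenario name -> date of last forced failure (None = never).
EXAMPLE_LOG = {
    "power_feed_loss": date(2014, 3, 1),
    "carrier_link_loss": date(2013, 6, 1),
    "database_failover": None,
}
```

With `today` set to 1 September 2014, the example log flags the carrier link and database scenarios as due for a forced test.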

Author: Roger Keenan
