
BLU Discuss list archive



[Discuss] What Happens when a cloud service shuts down



On 1/20/2012 4:34 PM, Bill Bogstad wrote:
> I see "cloud" as inherently implying leasing/renting access to other
> peoples' equipment.   In some sense, it is an extension of
> outsourcing.  Except you may only outsource pieces of a long chain of
> steps rather than the entire system and there is an implication that
> you can adjust your outsourcing on the fly.

No criticism of Bill, but if I were placed in charge of a system that 
depended on being able to adjust an outsourced component or process 'on 
the fly', the first thing I would do would be to test how quickly it 
could be transitioned to another vendor or an in-house backup. I have 
seen several instances where a salesman's assurances of quick recoveries 
came to a full stop on-the-fly-in-the-ointment.

In each case, the "root cause" was the fact that the System 
Administration team did not have the training needed to prepare them for 
a multiple-failure scenario: each vendor assumed that they were not 
responsible for /anything/ but the smallest possible interpretation of 
what /they/ were legally obligated to provide, with all questions of 
integration, presentation, security, recoverability of meta-data, and 
total-time-to-repower being left to the customer. In effect, they left 
the companies that had purchased their services holding the bag, looking 
for help, and facing questions from stockholders about how they got into 
the mess.

I liken computing in the "Cloud" to the process of flying in clouds: it 
demands careful preparation, extensive training, and well-maintained 
hardware /before/ it's a reliable way to get from "A" to "B". In the 
case of pilots who have "Instrument" certification, that training 
includes the use of basic instruments such as a compass and airspeed 
indicator to navigate without being able to see anything outside the 
aircraft, since advanced electronics and autopilots can't be relied on 
during systemic failures. In like manner, using "Cloud" computing 
requires well-trained operators and managers who have practiced the 
needed steps to recover, and are prepared to deal with multiple failures 
at the same time. I have seen something as trivial as a clip-on utility 
lamp cause a server shutdown because it added just enough electrical 
load to be the "straw that broke the camel's back": it popped a circuit 
breaker, which cast an entire room into semi-darkness. At the same time, 
it revealed that an apprentice electrician had inserted a "Delta" 
breaker into a panel wired for "Wye" service, thus interrupting only two 
legs of a three-phase power system and leaving the lights in series 
with the servers and CRTs.  Some very interesting effects were visible 
on the monitors and lights while the responsible parties 
waited in the almost-dark for me to dig a flashlight out of my case and 
then lead them to the door.

We recovered the data - at the time, the IS department was a 
soup-to-nuts organization, responsible for /every/ aspect of the 
company's IT, including having nightly backups in the safe, and we took 
a lot of pictures of the electrical panel before the electricians 
arrived, just to avoid questions from the insurance carrier.  The damage 
was limited to a few skinned fingers while we rushed around unplugging 
everything, and one failed ballast in an overhead fixture, plus (of 
course) some expensive server disks.

The moral of the story is that too many corporations switch to "Cloud" 
computing as a way to justify using junior-level employees in 
senior-level positions, and they tend to find out the hard way that 
managing a "Cloud-based" system requires /more/ expertise and training, 
not /less/.

FWIW. YMMV.

Bill

-- 
Bill Horne
339-364-8487




BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.




Boston Linux & Unix / webmaster@blu.org