I am of the generation of “always on”. I don’t remember a time when the Internet did not exist, and I, therefore, take it for granted that the Internet is always there when I need it.
So you can imagine my dismay when, on the first day of a new job, I could not access a Website. I was happily working remotely from a coffee shop, as was the norm at this company. I quickly got on the wireless network and entered the URL of my employer’s site to log in, but instead I got an error message. I spent the next few minutes troubleshooting: double-checking the Website address, trying a different browser, trying to go online from a different computer, and so on. None of it worked. My employer’s site, and in fact several other sites I tried, were unavailable.
Without access to these sites, I could not do my job, so, embarrassed, I composed an apologetic email explaining that I would be unable to work that day. The one thing I could get to work was Gmail (my other Web email accounts did not work).
Little did I know that this was not user error but a massive Amazon Web Services outage that took down thousands of Websites, including those I was trying to access. In an instant, the failure of Amazon Web Services in April of 2011 forever changed my childhood belief that the Internet always worked.
You can understand, in a non-technical sense, why I felt this way. In Seattle, Internet access is everywhere and dependably constant. Hotspots all over the city feed the need for immediate online access. On the off chance that a Website did not work, there was a simple solution: reloading the page or reconnecting to the Internet solved the majority of issues. It was not until that day that I realized the problem could be something more than a local issue.
While I am not an IT person, as evidenced by my Internet connection troubleshooting skills, I have recently learned a great deal about the Internet and what is increasingly called the “cloud.” I now understand it was the cloud that failed me on my first day on the job. While the cloud, like the Internet backbone it is built on, is often described as ubiquitous and always there, behind every cloud is a physical data center. Cloud computing has driven a build-out of data centers at a scale we have never seen before, with millions of servers running across the globe and fans humming to cool them. This massive infrastructure is invisible or “virtual” for most cloud consumers.
Invisible infrastructure is not new, and in fact, most of us have lived with it for many years. For example, power and utility lines are often buried underground, where we never see them. The only time this invisibility cloak comes off is when the infrastructure stops working, like when we have a power outage or the Internet is down.
What I have learned since this major Amazon cloud “crash” is the importance of really understanding the reliability of any cloud platform, and of evaluating the infrastructure and service level agreements behind it. Because they run in centralized data centers, these clouds can be affected by regionally specific disasters, as seen in several outages: one as recently as last month, one this past June, and the one in April 2011.
While I learned a lesson about not always trusting the cloud, at least I did not suffer a major data loss or lose millions in revenue from this cloud outage. And, as great as it would be to continue living in blissful ignorance, believing that the cloud is always on, history has proven this to be a foolish idea.