What is Cloud Storage?

Sure, you’ve heard about “the cloud” for years now, but what the heck does that really mean? Before marketers got ahold of the word, “cloud” actually meant something. Cloud computing is a distributed approach where the computing work is spread over a number of connected devices. The advantage of this approach is that you gain economies of scale, reduce vulnerability, and allow for rapid expansion of resources when necessary. No single system is responsible for any one single operation, so devices in the cloud can come on and go offline as needed.

Unfortunately, in practice, the term “cloud” is now used whenever a company is offering anything as a service over the internet, regardless of the underlying computing architecture. That is especially true for cloud storage, where all of your data may very well be sitting on one device in a data center somewhere in the high desert. Not very cloudy, is it?

Symform actually stands out in this space as a unique provider of what we would consider closer to “true” cloud storage. Before you think we are just a bunch of Seattle elitist hipsters (we are totally from Seattle, but not very hipstery), let me explain why. Symform’s revolutionary peer-to-peer cloud storage architecture spreads user data over a worldwide network of independent devices. As users join the network, cloud storage capacity is added. As they leave, it goes away. Symform acts as the central orchestration service, telling each device where to go to store and retrieve data. But there is no giant data center behind our cloud.
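
To make that idea a little more concrete, here is a minimal sketch in Python. It is not Symform's actual code or protocol. The Orchestrator class, its method names, the device names, and the capacity figures are all made up for illustration, and it skips everything a real system would need (encryption, redundancy, repair). It only shows the shape of the idea: capacity rises and falls as devices join and leave, and the central service keeps track of where things live without holding any data itself.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Orchestrator:
    """Hypothetical central service: it tracks locations, it stores no data."""
    devices: dict = field(default_factory=dict)      # device id -> contributed GB
    placements: dict = field(default_factory=dict)   # file name -> device ids

    def join(self, device_id, contributed_gb):
        # A device joining the network adds its contributed capacity.
        self.devices[device_id] = contributed_gb

    def leave(self, device_id):
        # A device leaving takes its capacity with it.
        self.devices.pop(device_id, None)

    def capacity_gb(self):
        return sum(self.devices.values())

    def place(self, file_name, fragments, rng=random):
        # Pick distinct online devices to hold the fragments of one file.
        if fragments > len(self.devices):
            raise ValueError("not enough devices online")
        chosen = rng.sample(sorted(self.devices), fragments)
        self.placements[file_name] = chosen
        return chosen

    def locate(self, file_name):
        # Tell a client which of the holding devices are still online.
        holders = self.placements.get(file_name, [])
        return [d for d in holders if d in self.devices]


cloud = Orchestrator()
for name, gb in [("office-nas", 500), ("home-pc", 200), ("lab-server", 1000)]:
    cloud.join(name, gb)
print(cloud.capacity_gb())  # 1700

cloud.leave("home-pc")
print(cloud.capacity_gb())  # 1500
```

In this toy version, a device that leaves simply drops out of the locate() results. A real network would also have to re-create the fragments that device was holding somewhere else, which is part of the extra backend work mentioned below.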

Of course this approach means we have to do more work on the backend than our peers, but the advantages are that Symform can be faster, cheaper, more secure, and greener than any other cloud storage out there.

The Cloud is Always On – or so I thought

I am of the generation of “always on”. I don’t remember a time when the Internet did not exist, and I, therefore, take it for granted that the Internet is always there when I need it.

So, you can imagine my dismay when I could not access a Website on the first day of a new job, when I was working remotely. Happily working remotely from a coffee shop, as was the norm of this company, I quickly got on the wireless network and entered the URL of my employer’s site to log in. Only instead I got an error message. The next few minutes I spent troubleshooting – double checking the Website address, trying a different browser, trying to go online from a different computer, etc. But none of these worked. The Website and actually several sites I tried were unavailable.

Without access to these sites, I could not do my job, so, embarrassed, I composed an apologetic email explaining that I could not work for the day. The one thing I could get to work was Gmail (my other Web email accounts did not).

Little did I know that this was not user error but rather a massive outage at Amazon Web Services that took down thousands of Websites, including those I was trying to access. In a single keystroke, the failure of Amazon Web Services in April of 2011 forever changed my childhood belief that the Internet always worked.

You can understand in a non-technical sense why I felt this way. In Seattle, Internet access is everywhere and dependably constant. Hotspots all over feed the need for immediate online access. On the off chance a website did not work, there was a simple solution. Reloading the page or reconnecting to the internet solved the majority of the issues. It was not until this day that I realized there could be something more than a local issue.

While I am not an IT person, as evidenced by my Internet connection troubleshooting skills, I recently have learned a great deal about the Internet and what increasingly is called the “cloud.” I now understand it was the cloud that failed me on my first day on the job. While the cloud, like the Internet backbone it is built on, is often described as ubiquitous and always there, behind every cloud is a physical data center. Cloud computing has driven a build out of data centers at a scale we have never seen before, with millions of servers running and fans humming to cool them across the globe. This massive infrastructure is invisible or “virtual” for most cloud consumers.

Invisible infrastructure is not new, and in fact, most of us have lived with it for many years. For example, power and utility lines are underground, creating the illusion of invisibility. The only time this invisibility cloak comes off is when the infrastructure stops working, like when we have a power outage or the Internet is down.

What I have learned since this major Amazon cloud “crash” is the importance of really understanding the reliability of any cloud platform, and the importance of evaluating the infrastructure and service level agreements behind the cloud. As centralized data centers, these clouds can be impacted by regionally specific disasters, as seen in several outages, including one last month, one this past June, and the one in April 2011.

While I learned a lesson about not always trusting the cloud, at least I did not have a major data loss or lose millions in revenue from this cloud outage. And, as great as it would be to continue living in ignorant bliss that the cloud is always on, history has proven this to be a foolish idea.

The Data Center Behind the Cloud

For months, we have been talking about how cloud computing is driving the biggest build out of data centers this world has ever seen. In fact, I’m speaking on this very topic next week at Data Center World. Every large internet company, as well as many enterprises, is driving its own build out, perhaps creating economies of scale for itself, but adding to the massive footprint around the globe of football-field size data centers that require huge amounts of power, bandwidth, cooling, and security.

New York Times author James Glanz has set this debate on fire with his recent articles on the power-hungry Internet and data centers driving this.  An earlier article touched even closer to home for us in Seattle, when he talked about how IT giants, like Microsoft, are building huge server farms in the farmlands of central Washington State and sucking its cheap hydroelectric and windmill power.

Today, cloud computing-based data centers account for nearly 2% of North American energy consumption and 2% of the world’s carbon footprint.  Sure, that pales in comparison to transportation, which is like 30%, but it’s still a massive amount of power and impact.

My point is not to argue over percentage points or to dismiss the need for data centers.  The reality is we will always need some level of centralized infrastructure, both within our corporate networks and on a larger scale.  My concern and bailiwick is that we literally cannot build data centers fast enough to store all the digital data we are creating nor should we want to.  And most of the servers and storage appliances in these data centers are vastly underutilized, with unused capacity running anywhere from 30 to 80 percent of drive space.

Sure, this argument is a bit self-serving since we are a decentralized, distributed system.  But why did we build our cloud backup system with this architecture?  Because we saw the trends around exponential data growth, high cost of cloud storage, low cost of local drives, and the drive toward more centralized infrastructure.  We just believed there had to be a better and cheaper way to store and back up all this data.

And we’re not alone.  Distributed systems, from something as mainstream as virtualization to more cutting-edge multi-core processing to peer-to-peer telecommunications and many other examples, are taking hold across the IT ecosystem.  Big data solutions are leaning on distributed and decentralized processing to manage large data sets, because they literally cannot be managed with a single, centralized system. Skype used this architecture to build the largest voice and video network in the world, with very little centralized infrastructure.

While data centers are not going away, we have a lot of work to do to ensure their efficiencies and capacity utilization.  But in the meantime, the industry needs to make a mental shift to think about our infrastructure in a new way, in which we better leverage the inherent decentralized architecture of the Internet and all the devices that sit on its edge rather than creating more fortified data centers for our individual or single corporate use.  One of the biggest challenges is getting this distributed model to scale, and for people to “trust” its security (I believe it actually can be more secure).  Skype did it.  And we are building out perhaps the largest secure “virtual” data center in the world.

I’m glad to see this debate go front and center.  I agree with Julie Bort in her defense of Facebook and data centers in that a lot of really smart people are working on this issue.  But frankly, a whole lot more work needs to be done, and if this current debate drives that, it can only mean goodness for our capital pursuits, the Internet and the environment.

Why Your Cloud Provider’s Monetization Impacts Your Privacy

In an earlier blog, I analogized cloud services to the traditional broadcast and entertainment industry, particularly around the ability for you to create content and experience a huge virtual world.  There is also a strong comparison in the requirements and heavy infrastructure costs for cloud service providers, and the ongoing expense for hosting and delivering the “content” as well as building on it (just like the broadcast industry must invest in doing sequels or series episodes).

Not surprisingly, many consumer-facing cloud services are borrowing heavily from the broadcast world in another way — their business model, or specifically how they charge and monetize for their services. In this market, there is a huge focus on customer acquisition costs and retention. Like the Nielsen Ratings for broadcast, the cloud services world focuses on UVs or unique visitors, needing to drive as much traffic to their sites and content as possible.

In this model, rather than charging the customer for the content, advertisers pay for getting the attention of those customers.  In fact, “FREE” is probably the most powerful word in the cloud.  So much so that we as an industry have even coined the term “FREEMIUM”, which is when the bulk of the users get the basic service for free while a small fraction of the users upgrade to the premium version for a subscription fee.

Most of us intellectually get that nothing in this world is really “free”, but we still are drawn to it. What it suggests is that we are willing to do any other form of value exchange as long as we don’t have to open our wallets.  Most of us seem to place much less value on things like time and attention, personal information, our social or professional network, etc.  So, just like the broadcast industry, we accept advertisements being thrown at us for the free content, and those of us who want the ad-free version open our wallets and cough up a few dollars.

Because these services make their revenue via advertising, it is in their best interest to learn as much about you as possible, in order to tailor the ads you see and drive the optimal value for their advertisers. For example, when I search on Google, it knows I’m in Seattle and shows me Seattle ads. Groupon targets me the same way.

Conversely, business-facing cloud services are employing more traditional subscription models to drive their economics and monetization. The reason is fairly simple. Unlike consumers, professionals and businesses expect to pay for the service they receive and in return receive a certain quality of service and service level agreement around reliability, security and privacy.  Even though most business customers pay for the service, the FREEMIUM model is also becoming popular here to get customers in the door and trying the service, thereby lowering the friction and customer acquisition costs.

Just like other areas of business IT, the cloud is experiencing the “consumerization of IT”.  The good thing about this trend is that business users are starting to expect an experience and ease-of-use similar to consumer-facing services.  The downside is the blurring of service providers’ underlying business models, as consumer-facing monetization models (and thus the associated lower reliability, privacy and security) bleed over to the business side.

Consumer-facing service providers have the DNA of collecting and analyzing as much user information as possible to help advertisers (their real customers) target the users to maximize ROI.  Business-facing service providers are expected to deliver on a fairly different level of service in terms of reliability, security and privacy.

Google’s recent announcement of G-drive is a case in point here.  If you view the service as a consumer-facing service, it is a very aggressively priced offer, where the terms of service enable Google to subsidize the price by giving itself access to the user’s information.  This makes complete sense for Google as a consumer-facing cloud provider that generates the majority of its revenues through advertising.

However, the same service used by business users would not be appropriate if their security and privacy expectations aren’t aligned with Google’s terms of service.

Unfortunately, the onus is on the subscriber to read the fine print and understand where a cloud service provider is drawing the line.  The broadcasting world created a rating system for us to understand the potential risks of watching certain content, but we have yet to create a standard for evaluating cloud services and quickly understanding what it means for the security and privacy of our personal information or data.

How the Cloud is Just a Really Smart TV

In one of my recent posts, about how cloud equals service, I observed that cloud is simply shorthand for delivering software as a service over the Internet. While perhaps simple in concept, the cloud has an amazing impact on how we consume technology and how we pay for it.

First, let’s look at how we got here from traditional IT and broadcasting services to today’s cloud-based software services.

The traditional IT product delivery model borrowed heavily from classical physical goods commerce. This was because traditional IT depended on the purchase of physical computing hardware that customers would install and configure for their solution needs.  This model had been so ingrained in the market that even when the independent software industry arose with a new product delivery model via a stream of bits sitting on flimsy floppy disks (eventually CDs & DVDs), the software vendors put those tiny things into ten times larger “shrink wrap” boxes to make the customer feel like they had bought something big and valuable.

With cloud, all this has evaporated.  It is like you turn on your TV and start consuming the software services.  And that isn’t just a good analogy, it is exactly what is happening.

The Internet has turned your computer into an incredibly smart television.  Internet addresses (instead of signal frequencies) represent the new “channels” delivered to you through your computer screen using the browser or a richer client application (e.g. apps on iPhone/iPad, Mac, Windows).  The new channels are far richer and smarter.  Not only can you experience the “content” created by the channel providers (web sites) but you can create, store and distribute your own “content”.

And “content” is not just limited to media, such as music, videos and movies, as it used to be on traditional television.  This television delivers anything you can imagine – it is a whole virtual world at your fingertips.  You don’t need to go to libraries to research anything, you just “Google” it.  Want to read a book, “download” it from Amazon or many other sites.  Want to talk to someone, just “Skype” them.  Want to send a note, just “email” it.

You can also create your own stuff on this TV – documents, music, videos or entire channels (web sites, blog sites, etc.).  Facebook is just a highly personalized social TV channel – you know what is going on in everyone’s life (sometimes whether you want to or not!).

There are other parallels to be drawn here.  The “back-end” of cloud services resembles the traditional broadcast and entertainment industry as well.  Just like creating movies and TV/Radio programming requires significant upfront investment, so does creating a cloud service.  Furthermore, most cloud providers must invest heavily in data center infrastructure to power their cloud solutions.

As such, just as we can end up paying quite a bit for a decent television experience, we also pay a lot for cloud services because of the inherent costs of creating and delivering them, starting with basic Internet access fees (and let’s not forget the mobile service charges if you truly want anywhere access).

In my next blog continuing on this theme, I’ll dig in more on the cloud economics and monetization.

Symform and KOMO team up for CloudCast Radio Series

This morning we kick off a new partnership with KOMO Newsradio with the launch of the CloudCast radio series.  These 60-second spots, airing Monday, Wednesday and Friday mornings during rush hour, will include definitions, real world examples, helpful tips and other information about cloud computing and cloud-based solutions.  The goal is to make these straightforward and educational, not tech speak.

You can hear the CloudCast program on both KOMO Newsradio 1000 AM/97.7 FM and Smart Talk 570 KVI. The program is also available for live streaming at http://www.komonews.com/radio/listen, and we’ll have podcast versions on the Symform website.

I’m excited about this program for many reasons.  First, it’s a great way to help educate the Seattle business community about cloud computing and how they can think about it for their organizations. Second, this fits perfectly with our mission at Symform to create community around cloud storage and backup.  Finally, I’m excited, selfishly, to be back on the radio, even if it is only for a few seconds, as my previous life as a radio broadcaster was one of my favorite jobs.

Here’s the interview from this morning on KVI 570 Smart Talk:

Cloud is just another word for Service

In my last blog, we discussed the advent and hype of cloud and explored centralized and decentralized approaches to providing cloud services.  In closing, I noted how intriguing it is that services such as email and the Web, which were originally designed to align with the core Internet principle of decentralization, are today mostly centralized.

The reason for this is actually pretty simple.  All early email and Web providers were software vendors: Lotus, Netscape, Microsoft and others.  Each saw its primary objective as delivering software that customers would buy, deploy and operate on their own, rather than delivering it as a service, which is what the Internet and the cloud inherently enable. Therein lies the problem.

It takes a special kind of mindset to turn complex technology into something useful so the average user or consumer can benefit from it. That mindset is “services orientation”.

Software companies, and hardware companies for that matter, provide technology components.  They don’t have the services DNA.  That is why early email followed the traditional software model.

Hotmail transformed email in this regard.  This early Web email application enabled people to get email as a free service.  Prior to that, email was mostly used by those at major corporations or universities, and the biggest email provider was AOL – a paid service requiring you to download a piece of software onto your computer.  Today, everyone has email, and typically multiple accounts, including my 70-year-old mother-in-law!

As another example, take the Internet itself.  The equipment was there, but it wasn’t until Internet Service Providers came into existence and took on the burden of managing this complex back-end infrastructure that even the most non-technical end user could surf the Web.  If Internet Service Providers hadn’t come into existence, would the Internet have been as successful as it is?

The services revolution started in the 1990s when Lou Gerstner took over a dying IBM and transformed that mammoth company from being a technology provider to a services company.  It still builds technology but it mostly sells services.  IBM’s growth over the last 20 years speaks for itself.  IBM started this revolution in the pre-Internet boom, so it did it the old school way – using human capital.  IBM Global Services literally took over running IT departments for large corporations and turned them into service organizations.

This service bug is really infectious.  Everyone now wants service on tap.  Even software developers!  Infrastructure-as-a-service providers such as Amazon, SoftLayer, and Rackspace are showing that service orientation helps you create a valuable business and own customers at every layer of the delivery value chain.

Cloud is just a new buzzword for services. It adds the notion that it doesn’t matter where the hardware or software components reside, because the service is now delivered to you over the Internet.

Five Ways to Evaluate if Your Cloud is Green

Ah, the cloud . . . with images of blue sky, white puffy shapes and pure, clean air.  For IT, the cloud was to save us from the evils of infrastructure build out.  But alas, it turns out, the cloud is not so pure after all.  In fact, the cloud itself is driving quite the massive build out of global data centers.

Every day, the news is filled with plans by Facebook, Google, Amazon, and many others for new, huge data centers.  Even with highly efficient servers, power-saving cooling systems and ingenious ways to leverage natural resources, such as locating the data center in the arctic, data centers are still creating a large carbon footprint, hogging electricity and energy resources, and wreaking havoc on the environment.

The EPA raised the flag first, noting that some 2% of North American electricity consumption comes from data centers and servers.  And now, Greenpeace, the warriors of green, have come out with a comprehensive evaluation and scorecard of cloud computing vendors.  It’s not pretty, and shows most companies are not going beyond the bare minimum.

With some input from the Greenpeace research and our own expertise, we’ve come up with five ways to evaluate if your cloud is green:

  1. Energy Consumption:  How many watts is your cloud vendor or your private cloud consuming?  Google consumes 260 million watts continuously, enough to power a city of 200,000 people.  Does your vendor employ virtualization, distributed systems or other efficiency measures?
  2. Electricity Grid Strain:  Is your cloud powered by a data center that is located in one of these growing central data center areas, where multiple data centers are clustered in a typically rural area?  If so, it could be putting a huge strain on the regional electricity grid.
  3. Pollution Contribution:  Just as behind every cloud is a data center – behind every data center is a power station, and ideally, your cloud is powered by a renewable energy source or at least a clean one.  However, Greenpeace reports that many cloud vendors are relying on less than clean sources, such as coal.
  4. Rating on the Greenpeace Score Card:  Greenpeace just came out with new research on “How Clean is Your Cloud” and in it, they evaluate 15 cloud computing vendors and grade them from A to F on key green criteria. In the report, Greenpeace notes that Amazon, Apple & Microsoft are rapidly expanding without adequate regard to source of electricity and rely heavily on dirty energy to power their clouds. Facebook and Google got some greenie points from Greenpeace for prioritizing access to renewable energy.  And Akamai is the first company to report on its carbon intensity.  The complete scorecard is included in the report.
  5. Use of Decentralized or Distributed systems:  Look for solutions that don’t rely on data centers at all.  Skype is a great example.  And so is Symform, with our peer-to-peer storage network.
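
To put the energy consumption figure from the first item in perspective, here is a quick back-of-the-envelope calculation in Python. The two inputs are the numbers quoted above. The variable names and the assumption of a 365-day year are mine, and the results are only as good as those quoted figures.

```python
GOOGLE_WATTS = 260_000_000   # continuous draw quoted in item 1
CITY_POPULATION = 200_000    # city size quoted in item 1
HOURS_PER_YEAR = 24 * 365    # assumption: a non-leap year

# Continuous power per resident of the comparison city.
watts_per_person = GOOGLE_WATTS / CITY_POPULATION

# Energy over a full year, converted from watt-hours to megawatt-hours.
annual_mwh = GOOGLE_WATTS * HOURS_PER_YEAR / 1_000_000

print(watts_per_person)  # 1300.0
print(annual_mwh)        # 2277600.0
```

In other words, the comparison works out to about 1,300 watts per person around the clock, or roughly 2.3 million megawatt-hours over a year if the draw really is continuous.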

Power of the Decentralized Cloud

Amid the hype and reality of cloud computing, we are witnessing the largest build out of centralized data center infrastructure in history. Due to the increased demand for cloud-based services and applications and the incredible growth of digital data, cloud providers are spending billions of dollars to keep pace, and attempting to leverage geographic areas that have an abundance of natural energy or cooling power, such as the Arctic. It’s ironic that the cloud is driving this massive build out, since for many companies, the cloud enables them to minimize infrastructure purchases of hardware and software.

With the growing maturation of cloud computing, two distinct classes of cloud applications have arisen:  centralized and decentralized.

All of the facility build out I am talking about is powering centralized cloud applications. And in spite of the Internet being a truly decentralized network, surprisingly little energy or dollars are being spent on enabling decentralized cloud applications.

Most applications we see on the Internet today are hosted centrally in one or more data centers and take very little advantage of the capabilities on the edge.  Even some previously decentralized models, like email (SMTP), which was designed to be a decentralized application, have evolved into a centralized model, with mega email providers such as Hotmail, Yahoo and Gmail – all of which are built on a centralized data center infrastructure. Business email is still fairly decentralized, although we do see a definite trend towards centralization there as well.

Interestingly, the same can be said about the Web (HTTP) itself.  The word “Web” was intended to define a true mesh across the Internet.  It is a mesh but a highly lopsided one, with the majority of Web Servers hosted across a few data centers.

Skype is one of the few decentralized cloud applications we see in widespread use.  While some users are not aware of this, Skype is a true peer-to-peer network, which routes communications through the bandwidth of network users.  In doing so, Skype has built the largest voice and video communications system without building much centralized infrastructure.

I’m a big believer in decentralized systems in general, so it’s not surprising that we built Symform as a decentralized, peer-to-peer cloud storage network. As I mentioned, the power of the Internet comes from it being decentralized.  It spreads organically and through natural economic drivers.

The Open Source movement is the same type of decentralized model, where no one person or organization is ultimately in charge, and yet, this is where its power and scalability thrive, as all members want to contribute to the movement and, in return, benefit from others’ contributions.

It is worth observing that all decentralized systems do have a central core of some sort that everyone must align to.  Standard specifications like TCP/IP, HTTP and HTML are at the core of the Internet.  Linux is at the core of the Open Source movement. Skype owns and runs the core of its decentralized system. What ultimately made these systems successful was getting more and more people to start using the core in different ways to solve their own problems. What started as viral movements became revenue generating solutions and companies.

We should abstract this model and look across our IT stack to see other areas where decentralization can help drive cost savings, increased scalability and ability to better utilize our existing infrastructure, rather than building out more data centers.

And we should examine what is driving the centralization of services like email and the Web when they were designed to be decentralized. That will be the topic of my next blog post.