I ran across this post on DZone recently in which the author posed a bunch of questions related to “cloud”. I decided the only way to attempt to answer these questions would be in a series of blog posts. So here goes…

What is the Cloud, really?

As the author of the above post noted – it is certainly a buzz word (when you add computing to it) and according to the New Oxford American Dictionary you could even call it “vapor” :)

noun

1. a visible mass of condensed water vapor floating in the atmosphere, typically high above the ground.

But what is the cloud, really? I’m going to plagiarize the following definition from an article on CIO.com

“a pool of abstracted, highly-scalable, and managed compute infrastructure capable of hosting end-customer applications and billed by consumption”

IMHO, the above definition comes close, but it does not quite do the term cloud computing justice. So let’s look at it a little closer:

pool – 2 or more
compute infrastructure (read machines) – could be commodity ($200 workstations – Ugly Betty style) or high-end multi-processor, multi-core (Paris Hilton style) machines. I’ll leave it at that for now and not elaborate further on infrastructure (it’s a loaded word).

Hmmm…so far we have 2 or more machines (starting to look like a cluster or group to me)

Now let’s understand the other words in this context:
abstract – meaning the complexity of the infrastructure is taken away or hidden from the user.
In other words, the user does not care whether the machines (hardware) are commodity or expensive, what flavor of operating system (OS) is being used (it could be one of the many Linux or Windows distros), or whether the machines are running a single OS or housing guest OSes (software).

This leaves us with: A cluster of abstract hardware/software (we’ll call it machines so it’s not a mouthful) hosting end-customer applications.

Going back to a few more words in the definition:
scalable – the ability to expand and handle additional load without having to change the software or business applications. In “Infrastructure Clouds” (another buzz word – I’ll touch on this in a future post), this means adding more machines or more images (using virtualization) to meet growing demands on the end-customer applications.

managed – select, plan, organize, control
i.e. some effort is required in supporting the infrastructure – a system or network administrator in most cases.

Again, this leaves us with:
A cluster of select machines hosting end-customer applications capable of growing when load on the customer applications increases.
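
To make “capable of growing when load increases” a bit more concrete, here is a minimal Python sketch. The function name machines_needed and the figure of 200 requests per second per machine are my own illustrative assumptions, not numbers from any cloud product, and the sketch assumes load spreads evenly across identical machines.

```python
import math


def machines_needed(requests_per_sec: float, capacity_per_machine: float) -> int:
    """Return how many identical machines are needed to serve the load.

    Assumes (hypothetically) that every machine handles the same number of
    requests per second and that the load spreads evenly across them.
    """
    if capacity_per_machine <= 0:
        raise ValueError("capacity_per_machine must be positive")
    # Always keep at least one machine running, even when nearly idle.
    return max(1, math.ceil(requests_per_sec / capacity_per_machine))


# As the load on the application grows, so does the cluster.
for load in (50, 450, 1200):
    print(load, "req/s ->", machines_needed(load, 200), "machine(s)")
```

The point is that the application itself does not change; only the number of machines behind it does.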

Holy potatoes! Does this mean we have been using cloud computing all this while? To a certain extent, yes.

For example, when web-based email first came out, you and I hopped happily on board. We cared little then (and care little today) about how many machines or what flavor of OS was (and still is) servicing our requests. All we care about is that it is available 24/7. Perhaps we should add “Highly Available” to the above definition.

Or how about that application that your company built in-house and deployed on one server, only to add another server and another as the application got more and more popular? The application users do not know or care whether one server or several house your application. Again, all they care about is that it meets their business needs.

Of course, we all know that clustering means having something balance the load across the servers based on some algorithm such as round-robin, CPU load per server, etc.
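
For anyone who has not seen those two algorithms spelled out, here is a rough Python sketch of both. The server names and the helper names (make_round_robin, pick_least_loaded) are made up for illustration; a real load-balancer does far more (health checks, sessions, weights).

```python
from itertools import cycle
from typing import Callable, Dict, List


def make_round_robin(servers: List[str]) -> Callable[[], str]:
    """Return a picker that hands out servers in a fixed rotating order."""
    rotation = cycle(servers)
    return lambda: next(rotation)


def pick_least_loaded(cpu_load: Dict[str, float]) -> str:
    """Pick the server currently reporting the lowest CPU load (0.0 to 1.0)."""
    return min(cpu_load, key=cpu_load.get)


# Hypothetical three-server cluster.
pick = make_round_robin(["server-a", "server-b", "server-c"])
print([pick() for _ in range(5)])

# Hypothetical CPU load readings per server.
print(pick_least_loaded({"server-a": 0.9, "server-b": 0.2, "server-c": 0.5}))
```

Round-robin ignores how busy each server is, while the CPU-based picker needs every server to report its load, which is one reason the choice of algorithm is part of the setup work.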

So is cloud computing plain old clustering front-ended with a load-balancer? Partly true – Clustering with load-balancing is only one aspect of the cloud. And it comes packaged with the cloud!

An integrated clustering solution

Certain cloud products such as Appistry EAF or cloud providers such as Amazon EC2 offer a load-balancing solution out of the box.

In traditional clustering/load-balancing, you would need to set up and configure a load-balancer to spread the load across the machines. This generally proves cumbersome if you are new to the load-balancer product/solution. In other words, you are spending:

  • time – researching, configuring, setting up, managing a load balancer
  • optionally, money – if you have purchased one or are planning on purchasing one

By using a cloud product or cloud provider, you could save time and costs. As a result, the time and cost of bringing your business application to market can be drastically reduced.

Another key point that separates cloud computing from traditional clustering is how static traditional load-balancing/clustering is. Imagine needing a physical or virtual machine to scale out your business application as it gets popular, but not being able to get one quickly because you have to:

  1. procure a machine or create an image
  2. provision the machine/image with your business application
  3. configure the load-balancer to recognize this machine as a part of the cluster and start spreading requests to it

Using a cloud product or cloud provider alleviates such issues since you will have machines/images at your disposal right away. And when you provision and deploy your machine/image the load-balancing solution that is provided out of the box will recognize the new machine/image and start spraying client/application requests to it.
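
Here is a toy Python sketch of that idea: a pool whose membership can change while requests are flowing, so a newly registered machine starts receiving traffic without any separate load-balancer configuration step. The class LoadBalancedPool and the vm-1/vm-2 names are hypothetical and are not the API of Appistry EAF, Amazon EC2, or any other product.

```python
class LoadBalancedPool:
    """Toy round-robin pool whose membership can change at runtime."""

    def __init__(self) -> None:
        self._machines = []  # machines currently eligible for requests
        self._next = 0       # running counter used for round-robin

    def register(self, machine: str) -> None:
        """Add a machine; it becomes eligible for the very next request."""
        if machine not in self._machines:
            self._machines.append(machine)

    def deregister(self, machine: str) -> None:
        """Remove a machine; it stops receiving requests immediately."""
        if machine in self._machines:
            self._machines.remove(machine)

    def route(self) -> str:
        """Return the machine that should handle the next request."""
        if not self._machines:
            raise RuntimeError("no machines in the pool")
        machine = self._machines[self._next % len(self._machines)]
        self._next += 1
        return machine


pool = LoadBalancedPool()
pool.register("vm-1")
print([pool.route() for _ in range(2)])  # only vm-1 exists so far

pool.register("vm-2")                    # scale out: no reconfiguration step
print([pool.route() for _ in range(4)])  # requests now reach both machines
```

In the traditional setup, the register step is the manual item 3 in the list above; here it is a single call that the platform would make for you.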

Food for thought: imagine that your application no longer needs the machine/image you just added, and you want to take it out of the cluster and add it to a different cluster of machines serving another business application. Elasticity, anyone?
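
As a very rough illustration of that kind of elasticity, the Python sketch below moves a machine from one application's cluster to another. The cluster names, the vm-N names, and the move_machine helper are all invented for the example; a real platform would also drain in-flight requests and re-provision the machine.

```python
def move_machine(machine: str, source: set, target: set) -> None:
    """Move `machine` out of the `source` cluster and into the `target` cluster."""
    if machine not in source:
        raise ValueError(machine + " is not in the source cluster")
    source.remove(machine)
    target.add(machine)


# Two hypothetical clusters, each serving a different business application.
clusters = {
    "orders-app": {"vm-1", "vm-2", "vm-3"},
    "reports-app": {"vm-4"},
}

# The orders application no longer needs vm-3, so hand it to reports.
move_machine("vm-3", clusters["orders-app"], clusters["reports-app"])
print(sorted(clusters["orders-app"]), sorted(clusters["reports-app"]))
```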

I’ll stop here for now on traditional clustering vs. cloud computing. A more detailed look at the pros and cons will follow in another post, perhaps.

So who can benefit from this right away?

  • Companies that do not have the time and resources to invest in an IT department to manage their applications/data. Such companies could either use Amazon EC2 or some other public cloud (another buzz word – more on this in a later post) offering.
  • Software-as-a-Service (more buzz words) or business application vendors who want to stay focussed on their core business and want to have a hassle-free option of hosting their applications. Like the above, such companies stand to benefit quickly from cloud offerings.
  • Businesses who are spending huge amounts of money out-sourcing the management of their data centers to old fashioned IT Consulting firms who either host the data/applications off-site or place pricey consultants on-site. Such companies could also go with a public cloud or private cloud offering.
  • What about developers? As a developer I’ve worked on prototypes for customer demonstrations, proofs of concept, or testing some theory. During this time, if I needed multiple machines, they were generally hard to find, or I would need to jump through hoops to get my hands on them. Suddenly having a cloud of resources certainly sounds appealing, doesn’t it! And not having to deal with load-balancers is a bonus.

In this post I talked about only one aspect of cloud – clustering with load-balancing, which isn’t anything new. But what separates traditional clustering from cloud-level clustering is the dynamic nature:

  • The ability to add/remove machines and provision applications easily and quickly
  • The luxury of not having to deal with the complexities of a load-balancer but still having a load-balancing solution underneath the covers

So for companies not wanting to invest time in researching, setting up clusters and managing load-balancers, but still wanting to bring their products to market quickly, cloud is a great solution.

In the next post in this series, I’ll talk about another characteristic of cloud – scalability made easy.