1 Comment

The following post is the first in a series on cloud computing and the enterprise by John Ellis, Chief Architect for vCloud at BlueLock.


As a software architect, developing solutions in the public cloud has made my life much easier. I can spin up new servers in mere minutes and occupy only ethereal, virtual real estate. I can only imagine how huge of an eye roll I would have received from a capital expense report spanning fourteen physical servers – but today I can spin up fourteen virtual servers and never leave the comfort of my operational budget. 


Scalability in the Cloud

The move from physical to virtual makes economic sense by casting aside physical hardware, yet it also makes design and development sense when considering infrastructure architecture. Since the number of servers you can provision is no longer bound by the physical size of your server closet, a whole new way of scaling your infrastructure can be unlocked.

Public cloud hosting providers such as BlueLock offer a unique value to the overwhelming majority of organizations that develop their own software today, regardless of whether the software is for internal use only or offered as a solution to the outside world. To really build an infrastructure that can efficiently scale in the cloud, one should look past the traditional methods of horizontal scaling or vertical scaling. Efficient infrastructure scaling in the cloud is achieved by performing both horizontal and vertical scaling, using what John Allspaw of Flickr coined “diagonal scaling.” 


Beefing Up with Vertical Scaling

Vertical scaling is the process of beefing up a server by adding more CPUs, more memory or faster disks. Additional CPUs may help batch jobs execute faster. More memory may allow your local cache regions to grow and fetch more data quickly. Faster disks may reduce I/O times and speed random access times. Scaling vertically in this way allows you to speed up individual applications and single threads without having to fight with the latency or coordination overhead associated with massive parallelism. Depending on the nature of the application, however, adding more hardware can quickly have diminishing returns. At a certain point adding another ounce of hardware doesn't provide any performance benefit, leaving us to find another way to scale our system and add more throughput. 


Multiplying with Horizontal Scaling

Horizontal scaling grants us more throughput at the cost of complexity. A simple example is a web site that just serves up static content; if one server can handle 10 page requests per second, then load-balancing between two servers should provide 20 page requests per second. Things get a bit more sticky as you farm out work to application servers that might need to manage some sort of stateful task, but the theory is still the same. Adding more concurrently running servers empowers you to execute more concurrent workloads. 


Growing Large with Diagonal Scaling

Scaling diagonally by enabling both vertical and horizontal scalability is something unique about clouds powered by VMware vCloud Director. No longer do I have to worry that my server is too small or underpowered. Instead, I can dynamically grow my servers until I hit the limit of diminishing returns. Once my servers can grow no further, I just clone the same server over and over again to handle more concurrent requests. This kind of flexibility is invaluable – it removes the fear that we might make the wrong scalability decision at the onset of our architecture layout. It's nearly impossible to truly understand how an application will need to scale in the future – especially when your developers haven't written the application yet! 


Designing to Scale: The Creation

As an example let us envision a simple, n-tier web application with a HTTP server and a second, Spring-powered tcServer. We may initially size the HTTP server to have only 512 MB of RAM with 2 vCPUs, while the application server resides within 1 GB of RAM and 1 CPU. Our application is young, no one is really using it aside from potential investors and (since we picked a fantastic public cloud provider) we just want to pay a minimum amount to start. 


Building to Scale: The Launch

Our application goes into public beta and users begin to sign up. We bump up the number of concurrent threads our HTTP server is allowed to spin up and hot-add another 512 MB of RAM. CPU usage on our application server goes from 10% to 50%, so we add another vCPU to allow our thread pool to spread out a bit.


A few weeks later our developers add local caching to our application, so we expand the memory from 1 GB of RAM to 16 GB of RAM. The servers are now larger and handling requests nicely; in fact the application really doesn't need to use more than 12 GB at any point in time. 


Growing to Scale: The Rush

One month later our application is about to be featured on a prominent talk show. The application needs to handle five times the number of concurrent users…before lunch. Adding additional hardware won't help us scale to the meet the throughput we require, but luckily our savvy developers wrote our application so that multiple instances can share sessions and work together. Within minutes the application can horizontally scale by cloning our HTTP and application servers six times over. Since our clones are exact duplicates of our existing servers with brand new IP addresses, there is no need to re-configure the servers in order to have them ready to go.



Ready, Set, Grow

Working with a public cloud provider allows your infrastructure to scale in a way that best fits your application. There is no “one size fits all” strategy to infrastructure architecture – you need to expand alongside the unique personality of the applications you host. Infrastructure powered by a VMware vCloud Datacenter partner can help take away the worry about locking yourself into a single scalability pattern – letting you grow along with your application's own unique needs.