Cloud scaling considerations
Amazon’s new Relational Database Service (RDS) has generated quite a bit of buzz of late. This move propels Amazon forward as an application services provider in the cloud computing arena. I’ve briefly written about different types of cloud services in an earlier post and outlined the differences between Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS), Software-as-a-Service (SaaS), and IT-as-a-Service (ITaaS). As cloud-based services gain maturity and adoption, the lines between the different “as-a-Service” offerings begin to blur as providers evolve their services.
Under the covers, RDS instances are essentially EC2 images running MySQL, with added services that automate backup and scaling. Scaling, or elasticity, has been one of the lures of placing workloads in the cloud. Cloud computing will reach nirvana when compute resources are automagically provisioned and de-provisioned as workloads increase and decrease.
Today, cloud providers offer a mechanism to scale resources in a linear fashion:
- add another server… in the case of web & application servers where workloads can be divided amongst a pool of servers;
- move to a bigger server… in the case of database servers where a single server is responsible for processing the entire workload.
However, is linear scaling the right approach to handling an increase in workload? I think not. Not all bottlenecks are created equal. For workloads to get the most out of cloud computing, it’s necessary to ensure that the systems servicing those workloads make full use of the resources they already have.
While your cloud system may be outfitted with the latest & fastest physical components, are these being fully utilized by the OS & software stacks? Here’s a well-written technote from Facebook engineering on how they modified Linux and memcached to use memory & processors more efficiently, delivering a 4x improvement in request handling within a single system.
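To put a rough number on why per-server efficiency matters, here is a small back-of-the-envelope sketch. The request rates are made-up figures for illustration only (they are not from the Facebook technote), and `servers_needed` is a hypothetical helper, not part of any cloud API.

```python
import math


def servers_needed(peak_rps: float, per_server_rps: float) -> int:
    """Servers required when load divides evenly across a pool."""
    if per_server_rps <= 0:
        raise ValueError("per_server_rps must be positive")
    return math.ceil(peak_rps / per_server_rps)


# Hypothetical numbers, chosen only to illustrate the arithmetic.
peak = 200_000       # peak requests per second across the service
baseline = 50_000    # requests per second one untuned server handles
tuned = baseline * 4 # the same server after a 4x efficiency gain

print(servers_needed(peak, baseline))  # 4
print(servers_needed(peak, tuned))     # 1
```

Under these assumptions, tuning the stack removes three of the four boxes that linear scaling would have added.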
Sometimes it’s necessary to separate functional elements of the application stack. We’ve seen that in multitier web applications where the presentation & business logic components are delegated to different application servers. We’ve also seen that in database systems where reads & writes are handled by separate servers.
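As an illustration of the read/write split, here is a minimal sketch of a router that sends `SELECT` statements to a pool of read replicas and everything else to a single primary. The class name and server names are hypothetical, and a real deployment would also have to deal with replication lag, transactions, and statements such as `SELECT ... FOR UPDATE`, which this sketch ignores.

```python
import itertools


class ReadWriteRouter:
    """Pick a server for a SQL statement: replicas for reads, primary for writes."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Rotate through replicas round-robin; no replicas means the
        # primary serves reads as well.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql):
        words = sql.lstrip().split(None, 1)
        verb = words[0].upper() if words else ""
        if verb == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        return self.primary


# Hypothetical server names, for illustration only.
router = ReadWriteRouter("db-primary", ["db-replica-1", "db-replica-2"])
print(router.route("SELECT * FROM posts"))        # db-replica-1
print(router.route("UPDATE posts SET hits = 1"))  # db-primary
print(router.route("select id FROM users"))       # db-replica-2
```

With this split, read capacity grows by adding replicas, while writes still funnel through one server, so the two halves of the workload hit different limits.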
As we continue to move workloads into the cloud, it’s necessary to establish a process to continually evaluate & capture the limits within the system to maximize resource utilization. Simply “throwing” in another box is not always the best solution to scaling challenges.
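One simple way to start such a process is to record utilization per resource and flag whichever one is closest to its limit before deciding how to scale. The sketch below assumes utilization is already collected as fractions between 0 and 1; the function name, the 0.8 threshold, and the sample figures are all hypothetical.

```python
def find_bottleneck(utilization, threshold=0.8):
    """Return the busiest resource if it is at or above threshold, else None."""
    if not utilization:
        return None
    name, value = max(utilization.items(), key=lambda item: item[1])
    return name if value >= threshold else None


# Hypothetical sample: memory is saturated while the CPU sits mostly idle,
# so adding another identical box would mostly add more idle CPU.
sample = {"cpu": 0.35, "memory": 0.92, "disk_io": 0.40, "network": 0.15}
print(find_bottleneck(sample))  # memory
```

A result of `None` means no single resource is near its limit, which is itself worth knowing before paying for more capacity.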