Scalability Part 1: Up And Out

Everyone wants to grow. Whether that’s serving more customers, expanding into new markets, or taking care of internal IT for more employees, growth is a measure of health in business, economies, and nature itself.

Growing your IT infrastructure can be a tragedy of riches. As you’re successful in your endeavors, you need to move from shared hosting to dedicated, from dedicated hosting to cabinets, from cabinets to cages, and from cages to multiple data centers.

Whether you can keep pace with your business efficiently and effectively comes down to whether your infrastructure is scalable.

In an ideal world, planning would be simple and linear, and adding twice the amount of capacity would require twice the hardware. Reality is never so simple. On the one hand you may be able to gain additional efficiencies as you scale, and on the other there’s significant additional complexity with which to contend. Let’s dive into the basics of what it means to scale up and scale out.

Although there is a dizzying array of options for growing your IT base (traditional colocation, managed servers, *aaS, and the cloud), most things scale in one of two ways: up or out.

Scaling Up

Scaling up (also known as vertical scalability) is the easier, “organic” method. If you need twice the capacity, you buy a server that’s twice as powerful:

[Figure: We’re scalin’ on up!]

It’s very predictable and simple to maintain. However, scaling up to ever-larger hardware poses some serious problems.

The larger a server gets, the more expensive it becomes. It’s more difficult to design a ten-core processor than a dual-core, and it’s more difficult to create a four-processor server than one with a single CPU. As such, the cost for a given amount of processing power tends to increase as the size of the server increases. Eventually, as you reach into the largest servers, the number of vendors decreases and you can become locked into specific platforms.
As the size of the server increases, you’re placing more and more computing resources into a single basket. What happens if that server fails? If it’s redundant, that’s another large node you have to keep available as insurance. The larger the server, the more exposed you are to failure.
Most systems only scale up so far before diminishing returns set in. One process may have to wait on another, or have a series of tasks to process in sequence. The more programs and threads that run sequentially rather than in parallel, the less likely you’ll be to take advantage of the additional processor power and memory provided by scaling up.
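
One common way to put rough numbers on that last point is Amdahl's law. The original article does not name it, so treat this as an illustrative model rather than a sizing tool. The sketch below assumes a hypothetical workload in which a fixed fraction of the work can run in parallel; the function name and the 80% figure are made up for the example.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Theoretical speedup from Amdahl's law for a fixed-size workload."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time no matter how many cores exist;
    # only the parallel part shrinks as cores are added.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical workload: 80% of the work can run in parallel.
    for cores in (1, 2, 4, 8, 16, 64):
        print(f"{cores:>3} cores -> {amdahl_speedup(0.8, cores):.2f}x")
```

Under this model, a workload that is 80% parallel can never run more than five times faster, however many cores the larger server provides.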

Scaling Out

The alternative to scaling up is scaling out (also known as horizontal scalability). Rather than buy progressively larger servers, you buy additional servers:

[Figure: To the east side!]

Web servers are the classic example. If you set up several web servers with identical content, you can send traffic to any of them. If you need more capacity, you add another server. This addresses the problems of vertical scalability well and even has additional benefits:

Servers stay relatively small because you’re buying more servers, not larger ones. Pricing tends to be contained and predictable.
Scaling out yields immediate benefits to redundancy. In the web server example, if a server dies, you lose only a fraction of your capacity: one tenth in a pool of ten servers, for instance.
The servers are only so large, so you stand a much better chance of efficiently utilizing all of the hardware on each server and making better use of IT resources. One server stuck waiting on data won’t impact its neighbors.
With a large number of identical servers, it’s easier to perform reversible maintenance. Individual servers can be pulled out of the pool and upgraded with negligible impact, whether the upgrades are security patches or major functional releases. Once verified working, they can be moved back into the pool and the process repeated. If anything goes awry, only a small portion of the total capacity is impacted while the server is restored to its previous state.
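
The redundancy and maintenance points above can be sketched in a few lines. This is a minimal illustration, not a real deployment tool: remaining_capacity and rolling_upgrade are hypothetical names, and the upgrade, verify, and rollback callbacks stand in for whatever your own tooling does.

```python
from typing import Callable, Optional, Set


def remaining_capacity(pool_size: int, out_of_service: int = 1) -> float:
    """Fraction of capacity left when some identical servers are out of the pool."""
    if pool_size < 1 or not 0 <= out_of_service <= pool_size:
        raise ValueError("invalid pool size or out-of-service count")
    return (pool_size - out_of_service) / pool_size


def rolling_upgrade(
    in_service: Set[str],
    upgrade: Callable[[str], None],
    verify: Callable[[str], bool],
    rollback: Callable[[str], None],
) -> Optional[str]:
    """Upgrade one server at a time; return the first server that failed, if any."""
    for server in sorted(in_service):   # snapshot, since the set is mutated below
        in_service.discard(server)      # pull the server out of the pool
        upgrade(server)
        healthy = verify(server)
        if not healthy:
            rollback(server)            # restore the previous state
        in_service.add(server)          # back into the pool either way
        if not healthy:
            return server               # stop so a bad release spreads no further
    return None
```

Stopping at the first failed verification is a design choice made for this sketch; it keeps a bad release confined to one server, which is the point of doing the work one node at a time.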

So, what’s the catch? In short, complexity.

More of Everything

More servers means more supporting infrastructure, such as switches, hardware load balancers, or shared SAN storage. Every server has to connect to the network at a minimum, which can drive you to larger switches or more of them. Dedicated load balancers, devices that distribute traffic across a group of servers in an intelligent manner, may be required. Servers may all need to access the same storage, which means moving from local disks to a shared NAS or SAN, as well as the high-performance interconnect to support it.
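
To make the load balancer's job concrete, here is a minimal sketch of the simplest distribution policy, round robin, with unhealthy servers skipped. Real load balancers offer far more sophisticated policies; the class and method names below are hypothetical and chosen only for this illustration.

```python
from typing import Iterable, List, Set


class RoundRobinBalancer:
    """Hands out servers in turn, skipping any that are marked down."""

    def __init__(self, servers: Iterable[str]) -> None:
        self._servers: List[str] = list(servers)
        if not self._servers:
            raise ValueError("at least one server is required")
        self._down: Set[str] = set()
        self._next = 0

    def mark_down(self, server: str) -> None:
        self._down.add(server)

    def mark_up(self, server: str) -> None:
        self._down.discard(server)

    def pick(self) -> str:
        # Try each server at most once per call, starting where we left off.
        for _ in range(len(self._servers)):
            server = self._servers[self._next]
            self._next = (self._next + 1) % len(self._servers)
            if server not in self._down:
                return server
        raise RuntimeError("no healthy servers available")
```

Adding capacity in this model amounts to adding one more name to the list, which is what makes a pool of identical servers attractive.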

Managing Complexity

Managing a large number of servers can be very difficult. More servers means more operating system images, each of which requires patches and security audits. Content needs to be synced between systems. Updates have to be coordinated. Typically, a configuration management system (CMS) is required to maintain all servers in lockstep.
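
At its core, keeping servers in lockstep means comparing what each server should look like with what it actually looks like. The sketch below shows that comparison in its simplest form, assuming each server's configuration can be reduced to a dictionary of setting names and values. The function name and the sample settings are hypothetical; real configuration management systems do a great deal more.

```python
from typing import Dict, List


def find_drift(
    desired: Dict[str, str],
    actual_by_server: Dict[str, Dict[str, str]],
) -> Dict[str, List[str]]:
    """Return, per server, the sorted setting names that differ from the desired state."""
    drift: Dict[str, List[str]] = {}
    for server, actual in actual_by_server.items():
        # Check settings that are wrong, missing, or present but not wanted.
        changed = sorted(
            key
            for key in desired.keys() | actual.keys()
            if desired.get(key) != actual.get(key)
        )
        if changed:
            drift[server] = changed
    return drift


if __name__ == "__main__":
    # Hypothetical desired state and two servers, one of which has drifted.
    desired = {"pkg_a": "1.2", "ntp": "on"}
    actual = {
        "web1": {"pkg_a": "1.2", "ntp": "on"},
        "web2": {"pkg_a": "1.1", "ntp": "on", "extra": "yes"},
    }
    print(find_drift(desired, actual))
```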

Monitoring Everything

Health checks, security audits, and reports all need to be automated and centralized to keep up with the ever-increasing server footprint. And if you’re scaling out in a massive sense, you’ll also need to automate server provisioning and build-out. Creating a patched, pre-configured, and hardened server should be a push-button affair.
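
As a rough picture of what centralized health checking involves, the sketch below runs a set of named checks against every server and collects only the failures in one place. The function name and the check callables are hypothetical; a production monitoring system would add scheduling, alerting, and history on top of this.

```python
from typing import Callable, Dict, Iterable, List


def run_health_checks(
    servers: Iterable[str],
    checks: Dict[str, Callable[[str], bool]],
) -> Dict[str, List[str]]:
    """Run every check against every server; return failed check names per server."""
    failures: Dict[str, List[str]] = {}
    for server in servers:
        failed: List[str] = []
        for name, check in checks.items():
            try:
                passed = check(server)
            except Exception:
                passed = False  # a check that crashes counts as a failure
            if not passed:
                failed.append(name)
        if failed:
            failures[server] = failed
    return failures
```

Reporting only the failures is a deliberate choice for this sketch: as the pool grows, a report that lists every healthy server becomes noise.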

Application Architecture

Not every application or system is designed to scale out. Issues of data storage, synchronization, and inter-application communication are critical to resolve.

To scale out, each server in the pool needs to be interchangeable. Another way of putting this is that servers need to be “stateless”, meaning no unique data is kept on the server. For instance, an application server may be involved in a transaction, but once that transaction is complete the details are logged elsewhere, typically to a database server.
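
The idea can be sketched as follows, assuming an in-memory dictionary stands in for the shared database server. TransactionStore, AppServer, and their methods are hypothetical names invented for this example; the point is only that the application servers hold no transaction data of their own.

```python
from typing import Any, Dict


class TransactionStore:
    """Stand-in for a shared database; an in-memory dict for illustration only."""

    def __init__(self) -> None:
        self._records: Dict[str, Dict[str, Any]] = {}

    def save(self, txn_id: str, details: Dict[str, Any]) -> None:
        self._records[txn_id] = dict(details)

    def load(self, txn_id: str) -> Dict[str, Any]:
        return dict(self._records[txn_id])


class AppServer:
    """Stateless application server: nothing about a transaction is kept locally."""

    def __init__(self, name: str, store: TransactionStore) -> None:
        self.name = name
        self._store = store  # all transaction data lives in the shared store

    def handle_purchase(self, txn_id: str, amount: float) -> None:
        # Once the transaction completes, its details are logged elsewhere.
        self._store.save(txn_id, {"amount": amount, "handled_by": self.name})

    def lookup(self, txn_id: str) -> Dict[str, Any]:
        return self._store.load(txn_id)


if __name__ == "__main__":
    store = TransactionStore()
    app1, app2 = AppServer("app1", store), AppServer("app2", store)
    app1.handle_purchase("txn-1", 25.0)
    print(app2.lookup("txn-1"))  # any server in the pool can answer
```

Because either server can answer for a transaction the other one handled, a load balancer is free to send each request wherever capacity is available.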

For servers that must maintain state—database servers, for instance—scaling out requires they keep this state in sync amongst themselves. This can be straightforward or fiendishly complex depending on the nature of the state and software involved. For this reason, some systems may still need to scale up despite the benefits of scaling out.

Summary

In many ways, it’s harder to scale from 2 to 20 servers than it is to scale from 20 to 200. There’s a lot of groundwork to lay to make that initial jump:

Build out the underlying infrastructure to support a large number of servers, including the network, load balancing, and storage architecture.
Implement robust configuration management systems, monitoring and reporting, and automated provisioning so that a large number of discrete systems can be maintained.
Rearchitect applications where needed, factoring servers into those that must maintain state and those that can become stateless, and adding synchronization or otherwise altering how systems interact.

The payoff is worth it. Overcoming these challenges not only leaves you with a more robust and scalable infrastructure today, but also gives you a blueprint for further scaling and expansion.