Horizontal vs. Vertical Scaling
Demand for an application programming interface (API) is rarely static. It rises and falls over time. In certain cases, such as holiday season spikes in traffic, an API might need to handle a significantly higher volume of requests than it does at other times. To meet such an increase in traffic, it is necessary to scale the API. System admins have two choices when it comes to scaling an API. They can go with horizontal scaling, which refers to adding more API instances to a cluster, or vertical scaling, which means adding to the computing capacity of the machine that supports the API. This article explores the differences between the two and which one is better for API scalability.
Horizontal scaling, also known as “scaling out,” is the process of deploying additional virtual machines (VMs) so there will be more API capacity to handle an increased load. (Shrinking capacity is known as “scaling in.”) As more capacity is needed, system admins can add more VMs to the cluster. This approach does require additional software, typically a load balancer, to distribute incoming API calls across the VM instances in the cluster and keep the load balanced.
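The routing step described above can be sketched in a few lines of Python. This is a minimal illustration, assuming a simple round-robin policy. The class name RoundRobinBalancer and the VM names are hypothetical, and real load balancers also track instance health and current load.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical balancer that hands requests to VMs in a fixed rotation."""

    def __init__(self, instances):
        self.instances = list(instances)
        self._rotation = cycle(self.instances)

    def add_instance(self, name):
        # Scaling out: add a VM to the pool and restart the rotation.
        self.instances.append(name)
        self._rotation = cycle(self.instances)

    def remove_instance(self, name):
        # Scaling in: drop a VM from the pool and restart the rotation.
        self.instances.remove(name)
        self._rotation = cycle(self.instances)

    def route(self, request):
        # The request content is ignored here; each call gets the next VM.
        return next(self._rotation)


balancer = RoundRobinBalancer(["vm-1", "vm-2"])
print([balancer.route(f"req-{i}") for i in range(4)])
# ['vm-1', 'vm-2', 'vm-1', 'vm-2']
```

Calling add_instance("vm-3") afterwards brings a third VM into the rotation without any change on the client side.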
Vertical scaling is the process of adding resources to a single node. It stands in contrast to horizontal scaling, which adds nodes. Vertical scaling, also called “scaling up” (reducing resources is “scaling down”), means adding resources such as central processing unit (CPU) capacity, memory, or storage to a server. In the case of APIs, vertical scaling usually involves adding computing capacity to the VM that hosts the API.
For example, if an API is hosted on a VM that’s been allocated one CPU core and 512 megabytes of random access memory (RAM), then scaling that API up could mean doubling the core count and RAM. The API would then have two CPU cores and 1,024 megabytes of RAM. With this new configuration, the API should be able to handle roughly double the load, though constraints on network bandwidth, storage speed, and other factors may reduce the impact of vertical scaling. There is also a resource management challenge with vertical scaling, but specialized software can typically handle this issue.
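To see why doubling the CPU may not double throughput, here is a rough back-of-the-envelope sketch in Python. All numbers are made up for illustration. The function name and the assumption that capacity is set by the single slowest resource are simplifications, and RAM is not modelled.

```python
def estimated_capacity(cpu_cores, per_core_rps, network_limit_rps):
    """Requests per second the VM can serve, capped by the slowest resource."""
    cpu_limit_rps = cpu_cores * per_core_rps
    return min(cpu_limit_rps, network_limit_rps)


# Assumed figures: each core serves 100 requests/s, the network tops out at 150.
before = estimated_capacity(cpu_cores=1, per_core_rps=100, network_limit_rps=150)
after = estimated_capacity(cpu_cores=2, per_core_rps=100, network_limit_rps=150)
print(before, after)
# 100 150
```

Under these assumptions the second core raises capacity by 50 percent rather than 100 percent, because the network becomes the bottleneck.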
Why API statelessness matters for scaling
Which method of scaling is better for APIs? Answering that question requires understanding the impact of an API’s stateless architecture, which in turn calls for a brief look at the difference between stateless and stateful applications. A stateful application stores data from one request to the next. It keeps track of requests and uses the data later. For instance, a stateful app might save information about a client in local memory. This might occur for the purpose of session management or security.
There is nothing wrong with being stateful. Indeed, it may be essential to the desired functioning of the app. However, it is significantly more complicated to execute horizontal scaling for a stateful app. Doing so would require copying stored data from the original version of the app to new instances.
A stateless app or API, in contrast, is one that does not retain data from one request to the next. It does not hold onto session data in memory. Each time a request arrives, it’s as if the app is meeting the client for the first time. After the response is sent, it’s “goodbye,” with no memory of the exchange.
Horizontal scaling is straightforward for a stateless app because it doesn’t matter which VM responds to a given API call. The API client’s requests can be served by any number of VMs hosting the API, and the result will be the same. System admins can add or remove VMs as needed without affecting the operation of the API.
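The difference can be made concrete with a toy Python sketch. Both classes are hypothetical. The stateful one keeps logins in local memory, while the stateless one reads everything it needs from the request itself (in a real API this would typically be a verifiable token rather than a bare user name, which is assumed here for brevity).

```python
class StatefulInstance:
    """Remembers logged-in users in this instance's local memory."""

    def __init__(self):
        self.sessions = set()

    def login(self, user):
        self.sessions.add(user)

    def greet(self, user):
        # Only the instance that handled the login knows this user.
        return f"hello {user}" if user in self.sessions else "who are you?"


class StatelessInstance:
    """Keeps nothing between requests; the request carries the context."""

    def greet(self, request):
        return f"hello {request['user']}"


vm_a, vm_b = StatefulInstance(), StatefulInstance()
vm_a.login("ana")
print(vm_a.greet("ana"), "|", vm_b.greet("ana"))
# hello ana | who are you?

vm_c, vm_d = StatelessInstance(), StatelessInstance()
request = {"user": "ana"}
print(vm_c.greet(request) == vm_d.greet(request))
# True
```

With the stateful version, a newly added VM gives a different answer until the session data is copied to it. With the stateless version, any instance can answer any request.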
Horizontal scaling, the right choice for APIs
Given that APIs are typically stateless, horizontal scaling emerges as the right way to scale them. Adding more VMs to increase capacity works well with stateless APIs. Admins can create VM clusters that scale out as API demand grows.
Furthermore, while it is possible to scale APIs vertically, horizontal scaling is preferable because the resource allocation issues in vertical scaling make it comparatively harder to carry out. Horizontal auto-scaling, by contrast, fits APIs well. When systems management tools detect a spike in API traffic, they can automatically add VMs to host more instances of the API in a cluster. Automating the same response is considerably more difficult with vertical scaling.
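The auto-scaling decision described above can be approximated with a small Python sketch. It assumes a simple rule: divide observed traffic by a target load per instance, round up, and clamp the result to configured bounds. The function name, parameters, and thresholds are illustrative and do not reflect any particular tool’s API.

```python
import math


def desired_instances(total_rps, target_rps_per_instance,
                      min_instances=1, max_instances=10):
    """Number of VMs needed to keep each one at or below its target load."""
    needed = math.ceil(total_rps / target_rps_per_instance)
    # Never scale in below the minimum or out beyond the maximum.
    return max(min_instances, min(max_instances, needed))


print(desired_instances(450, 100))   # traffic spike
# 5
print(desired_instances(50, 100))    # quiet period
# 1
```

A monitoring loop would call a function like this periodically and add or remove VMs to match the result.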
Demand for APIs will inevitably change over time. It will be necessary to scale API capacity up or down. Horizontal and vertical scaling are both available options for APIs. However, the stateless nature of APIs, coupled with the relative ease of horizontal auto-scaling, favors horizontal scaling as the right approach to scaling APIs.