Skip to Primary Menu Skip to Utility Menu Skip to Main Content Skip to Footer
Noname Security enters into agreement to be acquired by Akamai
Learn more
Noname Security Logo
Horizontal vs. Vertical Scaling: Which Is Best for APIs?

Horizontal vs. Vertical Scaling: Which Is Best for APIs?

Harold Bell
Share this article

Key Takeaways

Horizontal scaling adds more machines to a system, distributing the load. Vertical scaling upgrades a single machine’s CPU, RAM, and storage. The choice depends on system architecture and application needs.

Demand for an application programming interface (API) is usually not static. It will go up and down over time. In certain cases, such as holiday season spikes in traffic, an API might need to handle a significantly higher volume of requests than it does at other times. To meet such an increase in traffic, it is necessary to scale up the API. System admins have two choices when it comes to scaling an API. They can go with horizontal scaling, which refers to adding more API instances to a cluster, or vertical scaling, which means adding to the computing capacity of the machine that supports the API. This article explores the differences between the two and which one is best for API scalability.

Horizontal scaling

Horizontal scaling, also known as “scaling out,” is the process of deploying additional virtual machines (VMs) so there will be more API capacity to handle an increased load. (Shrinking capacity is known as “scaling in.”) As more capacity is needed, system admins can add more VMs to the cluster. Specialized resource management software is required, however, to manage the load of API calls and route them to the right VM instance in the cluster and maintain balance.

Vertical scaling

Vertical scaling is the process of adding resources to a single node. It stands in contrast to horizontal scaling, which adds nodes. Vertical scaling, also called scaling up or scaling down, means adding resources such as central processing unit (CPU) capacity, memory, or storage to a server. In the case of APIs, vertical scaling usually involves adding computing capacity to the VM that hosts the API.

For example, if an API is hosted on a VM that’s been allocated one CPU core and 512 megabytes of random access memory (RAM), then scaling that API up could mean doubling the core count and RAM. The API would then have two dedicated CPU processor cores and 1024 megabytes of RAM. With this new configuration, the API should be able to handle roughly double the load—though constraints on network bandwidth, storage speed, and other factors may reduce the impact of vertical scaling. There is also a resource management challenge with vertical scaling, but specialized software can typically handle this issue.

Why APIs statelessness matters for scaling

Which method of scaling is better for APIs? To answer that, it’s first necessary to understand the impact of an API’s stateless architecture. To do that, let’s briefly discuss the difference between stateless and stateful applications. A stateful application stores data from one request to the next. It keeps track of requests and uses the data later. For instance, a stateful app might save information about a client in local memory. This might occur for the purpose of session management or security.

There is nothing wrong with being stateful. Indeed, it may be essential to the desired functioning of the app. However, it is significantly more complicated to execute horizontal scaling for a stateful app. Doing so would require copying stored data from the original version of the app to new instances.

A stateless app or API, in contrast, is one that does not store request data. It does not hold onto session data in memory. Each time a session starts, it’s as if the app is meeting the client for the first time. After the session is over, it’s “goodbye,” with no memory of the session.

Horizontal scaling is possible for a stateless app because it doesn’t matter which VM is responding to API calls. The API client can call on an infinite number of VMs hosting the API, and it will never matter. System admins can add or remove as many VMs as they want without affecting the operation of the API.

Horizontal scaling, the right choice for APIs

Given that APIs are stateless, horizontal scaling emerges as the right way to scale them. Adding more VMs to increase capacity works well with stateless APIs. Admins can create VM clusters that scale out as API demand grows.

Furthermore, while it is possible to scale APIs vertically, horizontal scaling is preferable because the resource allocation issues in vertical scaling make it comparatively harder to do. In contrast, horizontal auto-scaling works easily with APIs. As systems management tools detect a spike in API traffic, they can automatically add VMs to host more instances of the API in a cluster. This is quite a bit more difficult in a vertical scaling situation, as auto-scaling does not work well in vertical scaling.

Demand for APIs will inevitably change over time. It will be necessary to scale API capacity up or down. Horizontal and vertical scaling are both available options for APIs. However, the stateless nature of APIs, coupled with the relative ease of horizontal auto-scaling, favors horizontal scaling as the right approach to scaling up APIs.

Horizontal vs. Vertical Scaling FAQs

How does horizontal scaling affect an API’s performance?

Horizontal scaling has a positive impact on an API’s performance. This approach involves adding more servers to your infrastructure and distributing the load among them. As a result, the system can handle more concurrent requests, leading to improved response times and enhanced overall performance. 

In addition to performance benefits, horizontal scaling contributes to the redundancy and reliability of the system. If one server fails, the load balancer diverts traffic to remaining servers that can seamlessly take over, ensuring continuous operation. This aligns with API security best practices and enhances the system’s resilience against potential failures. Read the API security checklist to protect your API against potential threats.

How does vertical scaling affect an API’s performance?

Vertical scaling involves upgrading the resources of a single server to enhance its performance. This can improve the processing speed for API requests, as a more powerful server can handle a higher load. However, it’s important to note that vertical scaling may have diminishing returns. There are limits to how much you can upgrade a single machine. Beyond a certain point, the performance gains may not be proportional to the investment.

When considering horizontal vs. vertical scaling, weighing the potential benefits against the risks is crucial. Keep in mind things like REST API security implications and ensure that the system remains resilient to failures.

When should I use horizontal scaling vs. vertical scaling?

It’s important to understand the difference between horizontal and vertical scaling to determine when to use each one. Use vertical scaling when: 

  • Your application is in its early stages or has a light load, which a single powerful server can effectively handle. This is especially true when the demand for API calls is low.
  • The application or API doesn’t support distributed computing well. In cases where the architecture or design doesn’t align with horizontal scaling principles, vertical scaling can be a suitable alternative. 

Use horizontal scaling when: 

  • Your application is experiencing growth, and demand is likely to spike. Horizontal scaling allows the infrastructure to adapt to changing loans by adding more servers as needed. 
  • You need to ensure high availability and redundancy. Horizontal scaling provides redundancy, making the system more resilient to failures and ensuring continuous operation even if one or more servers go down. 

Is horizontal scaling or vertical scaling more cost-effective?

The cost-effectiveness of horizontal scaling vs. vertical scaling depends on factors like the application’s specific requirements, deployment scale, and overall demand. Vertical scaling is typically more cost-effective because you only buy one or two new components when you scale. However, it becomes less cost-effective as you approach the limits of the server’s capacity.

Harold Bell

Harold Bell was the Director of Content Marketing at Noname Security. He has over a decade of experience in the IT industry with leading organizations such as Cisco, Nutanix, and Rubrik, and has been featured as an executive ghostwriter in Forbes Technology Council and Hacker News.

All Harold Bell posts
Get Started Now (Tab to skip section.)

Get Started Now

Experience the speed, scale, and security that only Noname can provide. You’ll never look at APIs the same way again.