Massively Scalable Applications

Scaling Applications is different from Massively Scaling Applications. Over the past few years, TechFerry has helped many companies scale their web based products and applications, be it high concurrency, high volume or high velocity - ranging from massive RDBMS to Big Data. This article is our attempt to pen down our experience in application scalability in general, with a focus on Massively Scalable Applications.

We will cover the following topics in this article:

Scalable vs Massively Scalable

Scalability is ability of a system, network, or process to handle a growing amount of work in a capable manner or its ability to be enlarged to accommodate that growth. Scalability in the context of web based applications, has been to deal with tens or hundreds of thousands of simultaneous requests, performing thousands of concurrent tasks or transactions.

Massively Scalable Applications, on the other hand, are highly concurrent (millions of transactions in a second), are capable of handling twitter kind of load (millions of tweets pouring in every second). To quantify, any system that scales beyond 1 Million transactions per second is Massively Scalable.

Benchmark: 1 Million TRX per second
  • 1 Million Requests per second
  • 1 Million Messages per second
  • 1 Million DB Transactions per second
To put things into perspective,

1 Million/sec = 1 Billion TRX in 17 minutes = 86.4 Billion TRX a day

Scale out or Scale up?

To achieve, such a level of massive scalability, we have two choices. Scale out (Horizontal Scaling) or scale up (Vertical Scaling).

Scale out

Add more machines to the cluster.
Let us assume that for a particular application, 1 CPU core serves upto 1000 requests per second.
1 CPU Core = 1000 requests/sec
To massively scale (1 Million request/second), we need 1000 cores.
50 machines 20 cores each.
Good idea or stupid idea? What about the costs?

Scale up

Scale each machine to take on more load.
Can one machine scale to a million transactions per second?
The Answer is YES.
Our commodity hardware is very powerful.
What is the bottleneck then?
What do we do to save tons of money being wasted in scaling out?

Computing Spectrum

Let us review the computing spectrum to understand different archtectural approaches people are taking to architect massively scalable applications.

Distributed Computing

  • Distribute load on multiple machines.
  • Make sure there are no bottlenecks or single point of failures.
  • Can we achieve End to End Distribution, from messaging to processing to databases?

Concurrent Programming

  • One CPU core currently handles 1000 trx/sec.
  • Can one core handle 1000 trx in a millisecond instead? That is 1M trx/sec.
  • Can we remove context switching overheads and synchronous, I/O idling?

Parallel Programming

  • Throw more CPU cores for different tasks.

Symmetric Multi Processing

  • A single problem or a single task (eg. a DB query), it takes 2 milliseconds on a core.
  • Can I use two cores and complete this single task in 1 ms?

Distributed Computing

Distribute workload between two or more computing devices or machines connected by some type of network. For example, clustered architecture with multiple machines.

However, in real life web applications, we need to distribute workload on
  • application servers,
  • database servers,
  • perform real-time computations or analytics.

End to End Distributed Computing

The challenge in distributed computing for web applications is to achieve end to end distribution that includes Distributed Storage, Distributed Messaging, and Distributed Analytics (Real Time and Batch).

Traditional vs New Approach

Spot the Bottleneck node / single point of failure in Traditional vs New approach to distributed computing.

Traditional: Load Balancer (L), Master DB (M) | New: ??

Distributed Computing - Tools

Distributed Messaging

Apache Kafka, RabbitMQ, Apache ActiveMQ
A detailed comparison from Linked on these distribued technologies can be found in

Distributed Analytics

Apache Storm (Real Time), Apache Spark (Batch)

Distributed Storage


Distributed Computing - Use Cases

A couple use cases for end to end distributed computing.

Concurrent Programming

Concurrent Programming is a form of computing in which several computations are executing during overlapping time periods - concurrently - instead of sequentially.
Software code that facilitates the performance of multiple computing tasks at the same time.

Architectural Concepts

  • Events, Threads or Actors?
  • Asynchronous Programming
  • Functional Programming

Events vs Threads, Actors

TechFerry Innovation Labs conducted an independent study of performance comparison of multi-threaded synchronous technology using Spring/Hibernate, vs event based, single process, asynchronous technology using NodeJS.


The report is available at NodeJS vs J2EE - A performance comparison study.

Asynchronous Programming

Similar to end to end distributed computing, end to end Asysnchronous programming has its own merits in building massively scalable applications. Aysnchronous programming is not just a server side thing but can also be achieved at database and UI layers too. The adjacent image shows how asynchronous progamming with non-blocking callbacks can be implemented in web applications.

The power of asynchronous programming and its ability to handle concurrent requests is demonstrated in our report on NodeJS vs J2EE - A performance comparison study.

It may not be required to implement an end to end asynchronous programming in most use-cases. You may just require server side asynchronous programming. In some cases you may just need asynchronous functionality at database or UI layers too.

End to End Asynchronous Programming

Functional Programming

A programming paradigm, a style of building the structure and elements of computer programs, that treats computation as the evaluation of mathematical functions and avoids changing-state and mutable data.
  • Routines can easily be moved to a different CPU core.
  • Scala/Akka Actors

Symmetric Multi Processing

Symmetric Multi Processing (SMP) is the processing of programs by multiple processors that share a common operating system and memory.
  • The processors share memory and the I/O bus or data path.
  • A single copy of the operating system is in charge of all the processors.

Asymmetric vs Symmetric

Asymmetric Multiprocessing

  • The different CPU take on different job

Symmetric Multi Processing (SMP)

  • All CPU run in parallel, doing the same job
  • CPUs share the same memory

Innovation Labs @ TechFerry

At TechFerry Innovation Labs, we are conducting cutting edge research on building massively scalable applications. Areas covered include concurrent programming, end to end distributed computing and asynchronous programming, symmetric multi-processing, and Big Data. We have scaled one single cloud machine to process up to 1 Million DB transactions in a second.


  1. A detailed comparison from LinkedIn on Distributed Messaging Frameworks
  2. NodeJS vs J2EE - A performance comparison study