Big Data Architecture: Understanding the Lambda Architecture with Detailed Explanation

Serigne DIAW
9 min read · Mar 12, 2023

There are many tools for handling massive amounts of data, whether for storage, analysis or dissemination. But how do you put these tools together to build an architecture that scales, is fault-tolerant and is easily extensible, all without blowing up costs?

In this article, I’ll introduce you to a very popular architecture model that can be applied to almost any situation involving massive amounts of data. It’s called the Lambda Architecture.

It is a model that lets you design an architecture that fits your needs while keeping a modular structure. I will present this generic model in detail, as well as concrete technical choices that meet the specifications of its different components.

Lambda architecture definition

The Lambda Architecture is a deployment model for data processing that organisations use to combine a traditional batch pipeline with a fast real-time stream pipeline for data access. It has become a common model in the toolkits of IT and development organisations as businesses strive to become more data-driven and event-driven in the face of massive volumes of rapidly generated data.
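To make the idea of combining the two pipelines more concrete, here is a minimal sketch in plain Python. It is only an illustration under simplifying assumptions: the event data, the `BATCH_CUTOFF` value and the function names `batch_view`, `realtime_view` and `query` are all hypothetical, and a real deployment would rely on distributed storage and processing tools rather than in-memory lists.

```python
from collections import Counter

# Hypothetical event log of (timestamp, page) pairs; illustrative data only.
events = [
    (1, "home"), (2, "home"), (3, "pricing"),
    (4, "home"), (5, "pricing"),
]

# Assumption: the last batch run covered every event with timestamp <= 3.
BATCH_CUTOFF = 3


def batch_view(all_events, cutoff):
    """Batch pipeline: recompute counts from scratch up to the cutoff."""
    return Counter(page for ts, page in all_events if ts <= cutoff)


def realtime_view(all_events, cutoff):
    """Stream pipeline: count only events the batch run has not seen yet."""
    view = Counter()
    for ts, page in all_events:
        if ts > cutoff:
            view[page] += 1  # incremental update, one event at a time
    return view


def query(page, batch, realtime):
    """Data access: answer a query by adding the results of both views."""
    return batch[page] + realtime[page]


batch = batch_view(events, BATCH_CUTOFF)
realtime = realtime_view(events, BATCH_CUTOFF)
print(query("home", batch, realtime))     # 2 from batch + 1 from stream = 3
print(query("pricing", batch, realtime))  # 1 from batch + 1 from stream = 2
```

The point of the split is that the batch computation can be slow but complete, while the stream computation only has to cover the short window of data that the batch run has not processed yet.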

The design of a Lambda Architecture is guided by the following constraints:

  • Scaling: the proposed architecture must be able to scale horizontally, i.e. by adding servers. This growth must be done while…
