Mastering Your Data with Medallion Architecture: The Three-Layer Design for Data Management

Serigne DIAW
8 min read · May 14, 2023

In my previous article, Benchmarking database architectures: Data Warehouse, Data Lake and Data Lakehouse, I compared these three database architectures.

Data is the backbone of any organization, and properly organizing and managing it is critical to ensuring its practical use. One way to organize and manage data is to use a Data Lakehouse architecture.

This article looks at the Data Lakehouse architecture in more detail through one of its design patterns, the Medallion Architecture, and shows how it fits the current state of the art, especially in the context of data processing approaches.

Medallion Architecture is one of the Data Lakehouse design patterns. When deployed, it gives data a simple path through specific Data Lakehouse layers. At each layer, the data and its structure are augmented, enhanced, cleaned and aggregated, so that end users finally receive high-quality data products that can be used for Business Intelligence reporting and Machine Learning.

A medallion architecture consists of three layers: Bronze, Silver and Gold. Data flows from one layer to the next, gradually moving from raw, unstructured data to high-quality, refined data that is ready to be used.
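The flow described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not how a real Lakehouse is implemented: the sample records, the field names (`order_id`, `country`, `amount`) and the helper functions `to_silver` and `to_gold` are all assumptions made up for the example, and in-memory lists stand in for tables in a storage layer.

```python
from collections import defaultdict

# Bronze: raw records kept exactly as ingested (hypothetical sample data).
bronze = [
    {"order_id": "1", "country": " fr ", "amount": "10.50"},
    {"order_id": "2", "country": "SN", "amount": "n/a"},      # invalid amount
    {"order_id": "1", "country": " fr ", "amount": "10.50"},  # duplicate order
    {"order_id": "3", "country": "sn", "amount": "4.50"},
]


def to_silver(rows):
    """Validate, type and deduplicate Bronze rows (hypothetical rules)."""
    seen, silver = set(), []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows whose amount cannot be parsed
        if row["order_id"] in seen:
            continue  # skip orders already promoted to Silver
        seen.add(row["order_id"])
        silver.append({
            "order_id": int(row["order_id"]),
            "country": row["country"].strip().upper(),
            "amount": amount,
        })
    return silver


def to_gold(rows):
    """Aggregate Silver rows into a reporting-ready total per country."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["country"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)  # {'FR': 10.5, 'SN': 4.5}
```

Note that the Bronze list is never modified: each layer is derived from the previous one, so the raw data stays available if the cleaning or aggregation rules change later.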

Let’s take a closer look at each layer:

The purpose of the Data Lakehouse architecture is to provide reliable, flexible data storage, optimized for both storing and processing highly structured data as well as semi-structured and unstructured data. At the same time, it is very cost-effective compared to standard Data Warehouses.

The Medallion Architecture describes a series of data layers that denote the quality of data stored in the Lakehouse. This architecture guarantees the ACID properties (atomicity, consistency, isolation, and durability) as data passes through multiple layers of validations and transformations before being stored in a layout optimized for efficient analytics.

Data Lakehouse architecture is a modern approach to building Data Warehousing systems. It takes the standard Data Warehousing approach and combines it with the advantages of a Data Lake.

Written by Serigne DIAW

Data Engineer / Data Architect / Data Scientist
