The Lakehouse: An Uplift of Data Warehouse Architecture

In short, the initial architecture of the data warehouse was designed to provide analytical insights by collecting data from various heterogeneous data sources into the centralized repository and acted as a fulcrum for decision support and business intelligence (BI). But it has continued with numerous challenges, like more time consumed on data model designing because it only supports schema-on-write, the inability to store unstructured data, tight integration of computing, and storage into an on-premises appliance, etc.

This article intends to highlight how the architectural pattern is enhanced to transform the traditional data warehouse by rolling over the second-generation platform data lake and eventually turning it into a lakehouse. Although the present data warehouse supports three-tier architecture with an online analytical processing (OLAP) server as the middle tier, it is still a consolidated platform for machine learning and data science with metadata, caching, and indexing layers that are not yet available as a separate tier.