Geo-Distributed Data Lakes Explained

Geo-Distributed Data Lake is quite the mouthful. It’s a pretty interesting topic, and I think you will agree after finishing this breakdown. There is a lot to say about how awesome it is to combine the flexibility of a data lake with the power of a distributed architecture, but I’ll get into the benefits of the two as a joint solution later. To start, I want to look at geo-distributed data lakes in two parts before we marry them together; for my non-developer brain, that made the most sense! No time to waste, let’s kick things off with the one and only… data lakes.

It’s a Data LAKE, Not a Warehouse!

It shouldn’t be a shock to the system to point out that we are living in a data-driven world going into 2021. Because of this, 'data lake' is a fitting term for the amount of data companies are collecting. In my opinion, we could probably start calling them data oceans: expansive and seemingly never-ending. So what is a data lake exactly?