A Data Lake is a storage repository that can store large
amount of structured, semi-structured, and unstructured
data.
It is a place to store every type of data in its native
format with no fixed limits on account size or file.
It offers high data quantity to increase analytic
performance and native integration.
Data Lake is like a large container which is very similar
to real lake and rivers.
Just like in a lake you have multiple tributaries coming
in, a data lake has structured data, unstructured data,
machine to machine, logs flowing through in real-time .
Unlike a hierarchal Data Warehouse where data is stored in
Files and Folder, Data lake has a flat architecture.
Every data elements in a Data Lake is given a unique
identifier and tagged with a set of metadata information.