Maestro Data Lake

Maestro Data Lake Solution

What is Data Lake?

Think of Maestro as a gas station (the Data Lake) where you fuel your car (your Data Products): different types of fuel are on offer, but they all come from the same place. The Data Lake is a large pool of enterprise data that fuels your data products with data from various source systems.

The Data Lake offers:

  • Large amounts of structured, semi-structured, and unstructured data.
  • Configured, semi-configured, and unconfigured data.
  • Data in its native format, with no fixed limits on account or file size, offering high data volumes to improve analytic performance and native integration.
  • A unique identifier and metadata for every data element.

Why Data Lake?

The main objective of building a data lake is to offer multiple data sources within the same space, allowing our users to get an unrefined view of the enterprise data pool at Maersk.

How Can I Get Started?

Why use Data Lake?

  • Easy storing! With the advent of storage engines, there is no need to model data into an enterprise-wide schema, which makes storing disparate information easy.
  • Increased quality! As data volume, data quality, and metadata increase, so does the quality of analysis.
  • Agility! With no data silos and no tedious data discovery, the Data Lake gives a 360-degree view of our data.

What types of data can you find in the Data Lake?

Raw

Stores a copy of the source system data.

Users:

  • Engineers
  • Scientists

Cleansed

Data that has been standardized, deduplicated, and compressed for efficient storage and optimized reads.

Users:

  • Engineers
  • Scientists
  • Analysts
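
The cleansing steps named above (standardize, remove duplicates, compress) can be illustrated with a minimal Python sketch. It uses only the standard library; the field names, container IDs, and function names are hypothetical and are not part of Maestro.

```python
import gzip
import json


def cleanse(records):
    """Standardize records and drop exact duplicates, keeping first-seen order."""
    seen = set()
    cleansed = []
    for record in records:
        # Standardize: lower-case the keys and trim whitespace from string values.
        standardized = {
            key.strip().lower(): value.strip() if isinstance(value, str) else value
            for key, value in record.items()
        }
        # A sorted JSON serialization serves as the duplicate-detection key.
        fingerprint = json.dumps(standardized, sort_keys=True)
        if fingerprint not in seen:
            seen.add(fingerprint)
            cleansed.append(standardized)
    return cleansed


def compress(records):
    """Serialize records as JSON lines and gzip them for storage."""
    payload = "\n".join(json.dumps(r, sort_keys=True) for r in records)
    return gzip.compress(payload.encode("utf-8"))


# Hypothetical raw records: the first two differ only in formatting.
raw = [
    {"Container_ID": " MSKU1234567 ", "Status": "loaded"},
    {"container_id": "MSKU1234567", "status": "loaded"},
    {"container_id": "MSKU7654321", "status": "empty"},
]
cleansed = cleanse(raw)
blob = compress(cleansed)
```

Here the first two raw records collapse into one after standardization, so `cleansed` holds two records and `blob` holds their gzip-compressed form.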

Ontology

Certified and integrated business data objects for self-service or for building products, for example Shipment, Container, and Cargo.

Users:

  • Engineers
  • Scientists
  • Analysts
  • Business

Publish

Product-specific data models for reporting or building apps.

Users:

  • Engineers
  • Scientists
  • Analysts
  • Business
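
The four data types and their intended users can be summarized as a simple lookup. The sketch below only restates the lists above as a Python mapping; the names `LAYER_AUDIENCES` and `layers_for` are illustrative and do not refer to any Maestro API.

```python
# Intended users per data type, as listed in this section.
LAYER_AUDIENCES = {
    "raw": {"engineers", "scientists"},
    "cleansed": {"engineers", "scientists", "analysts"},
    "ontology": {"engineers", "scientists", "analysts", "business"},
    "publish": {"engineers", "scientists", "analysts", "business"},
}


def layers_for(audience):
    """Return the data types intended for the given user group, in lake order."""
    return [
        layer
        for layer, audiences in LAYER_AUDIENCES.items()
        if audience in audiences
    ]


analyst_layers = layers_for("analysts")
business_layers = layers_for("business")
```

For example, analysts are listed for Cleansed, Ontology, and Publish data, while business users are listed only for Ontology and Publish.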

Maestro Data Lake Components

Data Ingestion

Data Ingestion allows you to load data into the Data Lake from different data sources.

Ingestion types:

  • Batch
  • Real-time
  • One-time load

Data Lineage

Data lineage allows you to see the data origin and track its usage, which is crucial for mitigating any errors that may occur.
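
To make the idea of tracing origin and usage concrete, here is a minimal Python sketch that models lineage as a graph of datasets. The dataset names and the `LINEAGE` structure are hypothetical examples, not Maestro's actual lineage model or API.

```python
from collections import deque

# Hypothetical lineage: each dataset maps to the datasets it was derived from.
LINEAGE = {
    "publish.shipment_report": ["ontology.shipment"],
    "ontology.shipment": ["cleansed.bookings", "cleansed.containers"],
    "cleansed.bookings": ["raw.booking_system"],
    "cleansed.containers": ["raw.container_system"],
}


def trace_origins(dataset, lineage):
    """Return every upstream dataset of `dataset`, nearest first (breadth-first)."""
    upstream = []
    visited = {dataset}
    queue = deque([dataset])
    while queue:
        current = queue.popleft()
        for parent in lineage.get(current, []):
            if parent not in visited:
                visited.add(parent)
                upstream.append(parent)
                queue.append(parent)
    return upstream


def find_consumers(dataset, lineage):
    """Return the datasets that are derived directly from `dataset`."""
    return sorted(child for child, parents in lineage.items() if dataset in parents)


origins = trace_origins("publish.shipment_report", LINEAGE)
consumers = find_consumers("cleansed.bookings", LINEAGE)
```

`trace_origins` answers "where did this data come from?", and `find_consumers` answers "what is affected if this dataset contains an error?".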

Data Governance & Audit

Data governance and audit covers managing the availability, usability, security, and integrity of the data, as well as any associated risk and compliance.

Metadata

Maestro metadata is information about the data stored in the Maestro Data Lake.
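
As an illustration of what such metadata might look like, the sketch below attaches a unique identifier and a few descriptive fields to a data element. The class, field names, and in-memory catalog are assumptions made for this example and do not describe Maestro's actual metadata schema.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class DataElementMetadata:
    """Hypothetical metadata record for one data element in the lake."""
    name: str
    layer: str
    source_system: str
    # A generated UUID stands in for the element's unique identifier.
    element_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    registered_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


# A plain dict acts as a stand-in for a metadata catalog, keyed by identifier.
catalog = {}


def register(name, layer, source_system):
    """Create a metadata record and store it in the catalog."""
    entry = DataElementMetadata(name, layer, source_system)
    catalog[entry.element_id] = entry
    return entry


entry = register("bookings", "raw", "booking_system")
```

Looking up `catalog[entry.element_id]` then returns the descriptive record for that element without touching the data itself.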