Databricks is an industry-leading, cloud-based data engineering tool used for processing and transforming massive quantities of data and for exploring that data through BI tools or machine learning models. Provided through the Azure cloud ecosystem, it can be used for data wrangling by connecting to Azure Data Lake.
This Apache Spark-based platform runs a distributed system behind the scenes: the workload is automatically split across multiple processors, and the cluster scales up and down on demand.
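The underlying idea is divide-and-conquer: split a dataset into partitions, process each partition in parallel, then combine the partial results. Spark does this automatically across a cluster; the sketch below is only an illustrative analogy on a single machine using Python's standard library (the function names are invented for the example, not Spark APIs).

```python
from concurrent.futures import ThreadPoolExecutor

def process_partition(partition):
    # Stand-in for a per-partition task (e.g. a map/aggregate step).
    return sum(x * x for x in partition)

def run_distributed(data, num_partitions=4):
    # Split the data into roughly equal partitions, as Spark would.
    size = max(1, len(data) // num_partitions)
    partitions = [data[i:i + size] for i in range(0, len(data), size)]
    # Process partitions in parallel, then combine the partial results.
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        return sum(pool.map(process_partition, partitions))

total = run_distributed(list(range(1000)))
print(total)  # same answer as the sequential sum of squares
```

On Databricks none of this bookkeeping is written by hand; partitioning, scheduling, and recombination are handled by the Spark engine.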
Azure Databricks SQL Analytics provides an easy-to-use platform for analysts who want to run SQL queries on their data lake, create multiple visualization types to explore query results from different perspectives, and build and share dashboards.
Databricks provides collaborative workspaces for data scientists, engineers, and business analysts; deploys production jobs (including a built-in scheduler); integrates with version control systems such as Git; and runs on an optimized Databricks Spark compute engine. Notebooks can be written in popular languages such as Python, Scala, Java, and SQL.
This Spark-based system can connect to sources including on-premises SQL Server instances, CSV files, and JSON files, as well as other data sources such as MongoDB, Parquet files, ORC files, Avro files, and Couchbase.
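To sketch what ingesting a couple of these formats looks like, the hypothetical example below writes and reads small CSV and JSON samples with Python's standard library; in a Databricks notebook the equivalent loads would be `spark.read.csv(path, header=True)` and `spark.read.json(path)` (the file names and data here are invented for the example).

```python
import csv
import json
import os
import tempfile

# Tiny CSV sample; in Databricks: spark.read.csv(path, header=True)
csv_path = os.path.join(tempfile.mkdtemp(), "sales.csv")
with open(csv_path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["region", "amount"])
    writer.writerow(["EMEA", "120"])
    writer.writerow(["APAC", "80"])

with open(csv_path, newline="") as f:
    rows = list(csv.DictReader(f))

# Tiny JSON-lines sample; in Databricks: spark.read.json(path)
json_path = os.path.join(tempfile.mkdtemp(), "sales.json")
with open(json_path, "w") as f:
    for row in rows:
        f.write(json.dumps({"region": row["region"],
                            "amount": int(row["amount"])}) + "\n")

with open(json_path) as f:
    records = [json.loads(line) for line in f]

total = sum(r["amount"] for r in records)
print(total)  # 200
```

Binary formats like Parquet, ORC, and Avro need dedicated readers, which is exactly what Spark's `spark.read.format(...)` interface provides out of the box.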
Below you can find information about the typical users of Databricks and how to get started.
Databricks is a platform provided through the Azure cloud. It is used by big data analysts, data scientists, and ML engineers for processing massive data in their day-to-day work. Further, Delta Lake by Databricks provides an open-format storage layer that delivers reliability, security, and performance on Azure Data Lake, for both streaming and batch operations. With the provisioning of SQL Analytics, business analysts get a simple experience for running quick ad-hoc queries on their data lake, creating multiple visualization types to explore query results from different perspectives, and building and sharing dashboards.
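To illustrate the kind of ad-hoc query an analyst might run, here is a sketch using Python's built-in sqlite3 module as a local stand-in; on Databricks the same SQL would run against lake tables through SQL Analytics or `spark.sql(...)`. The table and column names are invented for the example.

```python
import sqlite3

# In-memory database as a stand-in for a lake table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, action TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "view"), (1, "click"), (2, "view"), (3, "view"), (3, "click")],
)

# Ad-hoc aggregation; on Databricks roughly:
#   spark.sql("SELECT action, COUNT(*) AS n FROM events GROUP BY action")
rows = conn.execute(
    "SELECT action, COUNT(*) AS n FROM events GROUP BY action ORDER BY action"
).fetchall()
print(rows)  # [('click', 2), ('view', 3)]
```

The point is that the SQL itself is ordinary; SQL Analytics adds the managed compute, lake connectivity, visualizations, and dashboard sharing on top.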
Databricks is provided through either a shared workspace or a dedicated workspace. If you need a dedicated workspace, you must register a product in Maestro and obtain the workspace in the registered resource group. Further, for DevOps pipelines, you can use the Maestro application to set up your project and Databricks repository in Azure DevOps, as well as code-deployment pipelines.