
Databricks Certified Data Engineer Associate Cheat Sheet

Level: intermediate

A free Databricks Data Engineer Associate cheat sheet: the five topic areas, Delta Lake, Auto Loader, pipelines, Structured Streaming and Unity Catalog.

By The Exam Atlas Editorial Team · Verified 2026-06-06

A final-revision summary for the Databricks Certified Data Engineer Associate exam. It is a study aid only: notes are not allowed in the proctored exam.

The five topic areas

Databricks does not publish a percentage weight for each area, so cover all five.

# | Topic area
1 | Databricks Data Intelligence Platform
2 | Development and Ingestion
3 | Data Processing and Transformations
4 | Productionizing Data Pipelines
5 | Data Governance and Quality

The lakehouse workflow (left to right)

Ingest (Auto Loader) → Bronze (raw Delta) → Silver (clean, Spark SQL/PySpark) → Gold (aggregates) → Orchestrate (Jobs) → Govern (Unity Catalog).
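
The bronze, silver and gold steps of this flow can be sketched in plain Python. This is a toy illustration only, assuming in-memory lists of dicts in place of Delta tables; the sample records and the function names `to_bronze`, `to_silver` and `to_gold` are hypothetical and not part of any Databricks API.

```python
from collections import defaultdict

# Hypothetical raw records, standing in for files landed by an ingestion step.
RAW_EVENTS = [
    {"order_id": "1", "country": "de", "amount": "10.50"},
    {"order_id": "2", "country": "DE", "amount": "4.50"},
    {"order_id": "3", "country": "fr", "amount": "not-a-number"},
    {"order_id": "4", "country": "FR", "amount": "20.00"},
]


def to_bronze(raw_rows):
    """Bronze: keep every raw record unchanged (no cleaning yet)."""
    return [dict(row) for row in raw_rows]


def to_silver(bronze_rows):
    """Silver: cast types, normalise values and skip rows that cannot be parsed."""
    silver = []
    for row in bronze_rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # unparseable amount: leave the row behind in bronze
        silver.append(
            {
                "order_id": int(row["order_id"]),
                "country": row["country"].upper(),
                "amount": amount,
            }
        )
    return silver


def to_gold(silver_rows):
    """Gold: aggregate clean rows into revenue per country."""
    totals = defaultdict(float)
    for row in silver_rows:
        totals[row["country"]] += row["amount"]
    return dict(totals)


bronze = to_bronze(RAW_EVENTS)
silver = to_silver(bronze)
gold = to_gold(silver)
```

Each layer reads only from the layer before it, so the raw data stays available in bronze even when a row is rejected on the way to silver.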

Platform basics

Term | Idea
Lakehouse | Open storage + warehouse reliability and governance in one platform
Cluster / compute | Spark resource that runs code; all-purpose vs job clusters
Medallion | Bronze → silver → gold layering of data
Databricks SQL | SQL interface and warehouses for querying and dashboards

Development and ingestion - Delta Lake

Term | Idea
Delta Lake | ACID, schema enforcement and time travel over files
Managed vs external table | DROP deletes files (managed) vs leaves files (external)
Time travel | Query an earlier table version by version or timestamp
Auto Loader | Incremental file ingestion (cloudFiles) from cloud storage
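
The time travel and Auto Loader entries above can be sketched as follows. This is a minimal sketch, not a definitive implementation: the helper names, the table name and the paths are hypothetical, and `read_with_auto_loader` assumes a `spark` session on a Databricks runtime, where the `cloudFiles` source is available.

```python
def time_travel_query(table, version=None, timestamp=None):
    """Build a Delta time travel query for exactly one of version or timestamp."""
    if (version is None) == (timestamp is None):
        raise ValueError("pass exactly one of version or timestamp")
    if version is not None:
        return f"SELECT * FROM {table} VERSION AS OF {int(version)}"
    return f"SELECT * FROM {table} TIMESTAMP AS OF '{timestamp}'"


def auto_loader_options(file_format, schema_location):
    """Options commonly set on an Auto Loader (cloudFiles) stream."""
    return {
        "cloudFiles.format": file_format,  # format of the incoming files
        "cloudFiles.schemaLocation": schema_location,  # where the inferred schema is tracked
    }


def read_with_auto_loader(spark, source_path, file_format, schema_location):
    """Return a streaming DataFrame over new files in source_path.

    Assumes `spark` is a SparkSession on Databricks; it is not created here.
    """
    reader = spark.readStream.format("cloudFiles")
    for key, value in auto_loader_options(file_format, schema_location).items():
        reader = reader.option(key, value)
    return reader.load(source_path)


# Hypothetical three-level table name, used only for illustration.
by_version = time_travel_query("main.sales.orders", version=3)
by_timestamp = time_travel_query("main.sales.orders", timestamp="2024-01-01")
```

On a cluster, the generated statements could be passed to `spark.sql(...)`, and the streaming DataFrame would typically be written to a bronze Delta table with a checkpoint location.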

Processing and transformations

Term | Idea
Spark SQL | SQL over the Spark engine and lakehouse
PySpark | Python API for reading, transforming and writing data
Structured Streaming | Incremental processing of data as it arrives
Batch vs streaming | One-off run vs continuous/incremental processing
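
The batch vs streaming distinction above can be sketched in PySpark. This is a sketch under stated assumptions: both functions expect an existing `spark` session, and the function names, table names and checkpoint path are hypothetical.

```python
def copy_batch(spark, source_table, target_table):
    """Batch: read the whole source table once and append it to the target."""
    df = spark.read.table(source_table)
    df.write.mode("append").saveAsTable(target_table)


def copy_incremental(spark, source_table, target_table, checkpoint_path):
    """Streaming: process only the data not yet recorded in the checkpoint."""
    df = spark.readStream.table(source_table)
    return (
        df.writeStream
        .option("checkpointLocation", checkpoint_path)  # remembers progress between runs
        .trigger(availableNow=True)  # process what is available, then stop
        .toTable(target_table)
    )
```

Running `copy_batch` twice appends the source rows twice, whereas running `copy_incremental` twice with the same checkpoint picks up only data that arrived since the first run.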

Productionizing pipelines

Term | Idea
Lakeflow Declarative Pipelines (formerly Delta Live Tables, DLT) | Declare tables; the platform builds and maintains them
Expectations | Data-quality rules that flag (warn), drop or fail the update on bad rows
Jobs / Workflows | Schedule and orchestrate tasks (notebooks, pipelines)
Multi-task job | A job with dependent tasks run in order
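
The three behaviours of an expectation can be illustrated with a plain-Python simulation. This is not the pipeline API: in a real pipeline the rules are declared with decorators such as `expect`, `expect_or_drop` and `expect_or_fail` from the `dlt` module. The function `apply_expectation` and the sample rows below are hypothetical.

```python
def apply_expectation(rows, name, predicate, action="warn"):
    """Apply one data-quality rule and return (kept_rows, violation_count).

    warn: keep every row and only count violations
    drop: remove rows that violate the rule
    fail: raise an error if any row violates the rule
    """
    if action not in {"warn", "drop", "fail"}:
        raise ValueError(f"unknown action: {action!r}")
    violations = [row for row in rows if not predicate(row)]
    if violations and action == "fail":
        raise ValueError(f"expectation {name!r} failed for {len(violations)} row(s)")
    if action == "drop":
        kept = [row for row in rows if predicate(row)]
    else:
        kept = list(rows)
    return kept, len(violations)


def has_positive_amount(row):
    """Example rule: the amount must be greater than zero."""
    return row["amount"] > 0


# Hypothetical sample rows; the second one violates the rule.
orders = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": -5.0}]

warned, warn_count = apply_expectation(orders, "positive_amount", has_positive_amount, "warn")
dropped, drop_count = apply_expectation(orders, "positive_amount", has_positive_amount, "drop")
```

With `action="fail"` the same input raises an error instead of returning rows, which mirrors a pipeline update stopping on invalid data.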

Governance and quality - Unity Catalog

Term | Idea
Unity Catalog | Central governance across workspaces
Namespace | Three levels: catalog.schema.table
Permissions | GRANT/REVOKE access on securable objects
Lineage | Tracked flow of data from source to output
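
The namespace and permission entries above can be sketched as small helpers that build SQL statements. This is a minimal sketch: the helper names, the catalog `main`, the schema `sales`, the table `orders` and the group `analysts` are hypothetical.

```python
def full_name(catalog, schema, table):
    """Three-level Unity Catalog name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


def grant_statement(privilege, securable_type, securable_name, principal):
    """Build a GRANT statement for one privilege on one securable object."""
    return f"GRANT {privilege} ON {securable_type} {securable_name} TO `{principal}`"


def revoke_statement(privilege, securable_type, securable_name, principal):
    """Build the matching REVOKE statement."""
    return f"REVOKE {privilege} ON {securable_type} {securable_name} FROM `{principal}`"


def read_access_grants(catalog, schema, table, principal):
    """Reading a table needs access at all three levels, not only SELECT."""
    return [
        grant_statement("USE CATALOG", "CATALOG", catalog, principal),
        grant_statement("USE SCHEMA", "SCHEMA", f"{catalog}.{schema}", principal),
        grant_statement("SELECT", "TABLE", full_name(catalog, schema, table), principal),
    ]


grants = read_access_grants("main", "sales", "orders", "analysts")
revoke = revoke_statement("SELECT", "TABLE", full_name("main", "sales", "orders"), "analysts")
```

On a cluster or SQL warehouse, each generated statement could be run with `spark.sql(...)` or pasted into a SQL editor.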

Exam facts at a glance

Item | Value
Duration | 90 minutes
Questions | 45 scored (plus a few unscored)
Passing score | Not published by Databricks
Exam code | None published by Databricks
Format | Online, proctored; multiple-choice
Fee | US$200 per attempt (+ tax); confirm current pricing
Validity | 2 years (retake to recertify)

Databricks vs another platform - pick by your stack

This exam certifies Databricks skills. The concepts (ingestion, layering, transformations, orchestration, governance) overlap with Snowflake or BigQuery, but the tools do not: Delta Lake, Auto Loader, Lakeflow Declarative Pipelines and Unity Catalog belong to the Databricks stack. Choose the certification that matches the platform on the job postings you are targeting, not the one that looks easier.

FAQ

Can I use notes in the Databricks Data Engineer Associate exam?
No. It is an online proctored exam, so notes and second screens are not allowed. Use this sheet only for final revision before exam day.
