Databricks Certified Data Engineer Associate Cheat Sheet
intermediate
A free Databricks Data Engineer Associate cheat sheet: the five topic areas, Delta Lake, Auto Loader, pipelines, Structured Streaming and Unity Catalog.
By The Exam Atlas Editorial Team · Verified 2026-06-06
A final-revision summary for the Databricks Data Engineer Associate. Study aid only - no notes in the proctored exam.
The five topic areas
Databricks does not publish a percentage weight for each area, so cover all five.
Open storage + warehouse reliability and governance in one platform
Cluster / compute | Spark resource that runs code; all-purpose vs job clusters
Medallion | Bronze → silver → gold layering of data
Databricks SQL | SQL interface and warehouses for querying and dashboards
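The medallion layering can be pictured with a small plain-Python sketch. It only illustrates the idea and is not Databricks code: the record fields (`order_id`, `customer`, `amount`) and the function names are hypothetical, and on the platform each layer would typically be a table rather than a list.

```python
from collections import defaultdict


def to_bronze(raw_records):
    """Bronze: keep the raw records as they arrived (copied, not cleaned)."""
    return [dict(r) for r in raw_records]


def to_silver(bronze):
    """Silver: drop rows without a key, remove duplicates, normalize types."""
    seen = set()
    silver = []
    for row in bronze:
        order_id = row.get("order_id")
        if order_id is None or order_id in seen:
            continue
        seen.add(order_id)
        silver.append({
            "order_id": order_id,
            "customer": str(row["customer"]).strip().lower(),
            "amount": float(row["amount"]),
        })
    return silver


def to_gold(silver):
    """Gold: a business-level aggregate, here total amount per customer."""
    totals = defaultdict(float)
    for row in silver:
        totals[row["customer"]] += row["amount"]
    return dict(totals)
```

Each function reads only from the layer before it, which is the point of the pattern: raw data stays replayable in bronze while silver and gold can be rebuilt.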
Development and ingestion - Delta Lake
Term | Idea
Delta Lake | ACID, schema enforcement and time travel over files
Managed vs external table | DROP TABLE deletes the data files (managed) vs removes only the metadata and leaves the files (external)
Time travel | Query an earlier table version by version or timestamp
Auto Loader | Incremental file ingestion (cloudFiles) from cloud storage
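Time travel and Auto Loader can be sketched in a few lines of Python. This is a minimal sketch, not a reference implementation: the helper names are hypothetical, `spark` is assumed to be an existing SparkSession on Databricks (the `cloudFiles` source is not available in plain open-source Spark), and the schema location is an assumed choice for letting Auto Loader track the inferred schema.

```python
def time_travel_query(table, version=None, timestamp=None):
    """Build a Delta time-travel SELECT for exactly one of version/timestamp."""
    if (version is None) == (timestamp is None):
        raise ValueError("pass exactly one of version or timestamp")
    if version is not None:
        return f"SELECT * FROM {table} VERSION AS OF {int(version)}"
    return f"SELECT * FROM {table} TIMESTAMP AS OF '{timestamp}'"


def auto_loader_options(file_format, schema_location):
    """Options for the cloudFiles source (hypothetical helper)."""
    return {
        "cloudFiles.format": file_format,
        "cloudFiles.schemaLocation": schema_location,
    }


def read_with_auto_loader(spark, source_path, file_format, schema_location):
    """Return a streaming DataFrame that picks up new files incrementally."""
    return (
        spark.readStream.format("cloudFiles")
        .options(**auto_loader_options(file_format, schema_location))
        .load(source_path)
    )
```

On a cluster, `spark.sql(time_travel_query("my_table", version=3))` would return the table as of version 3; `my_table` is a placeholder name.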
Processing and transformations
Term | Idea
Spark SQL | SQL over the Spark engine and lakehouse
PySpark | Python API for reading, transforming and writing data
Structured Streaming | Incremental processing of data as it arrives
Batch vs streaming | One-off run vs continuous/incremental processing
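The batch vs streaming distinction can be shown with two short PySpark functions. This is a sketch under assumptions: `spark` is an existing SparkSession, the table names and checkpoint path passed in are placeholders, and the `availableNow` trigger is one possible choice (it needs a recent Spark version).

```python
def run_batch(spark, source_table, target_table):
    """Batch: read everything currently in the source once and append it."""
    df = spark.read.table(source_table)
    df.write.mode("append").saveAsTable(target_table)


def run_incremental(spark, source_table, target_table, checkpoint_path):
    """Streaming: process only data not yet seen, tracked by the checkpoint."""
    stream = spark.readStream.table(source_table)
    return (
        stream.writeStream
        .option("checkpointLocation", checkpoint_path)
        .trigger(availableNow=True)  # process what is available, then stop
        .toTable(target_table)
    )
```

Running `run_batch` twice appends the same rows twice, while `run_incremental` resumes from its checkpoint and skips data it has already processed. Dropping the `trigger` call would leave the query running with the default micro-batch behaviour.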
Productionizing pipelines
Term | Idea
Lakeflow Declarative Pipelines (formerly Delta Live Tables, DLT) | Declare tables; the platform builds and maintains them
Expectations | Data-quality rules that flag (and keep), drop, or fail on bad rows
Jobs / Workflows | Schedule and orchestrate tasks (notebooks, pipelines)
Multi-task job | A job whose tasks run in dependency order
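The three expectation behaviours are easier to remember with a plain-Python model. This is not the pipeline API: `Expectation` and `apply_expectations` are hypothetical names that only imitate the semantics. In a real pipeline the behaviours correspond to the `expect`, `expect_or_drop` and `expect_or_fail` decorators of the `dlt` Python module.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Expectation:
    name: str
    predicate: Callable[[dict], bool]  # True when the row is valid
    action: str = "warn"               # "warn", "drop" or "fail"


def apply_expectations(rows, expectations):
    """Return (kept_rows, violation_counts) after applying every rule."""
    kept = []
    violations = {e.name: 0 for e in expectations}
    for row in rows:
        keep = True
        for e in expectations:
            if e.predicate(row):
                continue
            violations[e.name] += 1
            if e.action == "fail":
                # fail: stop processing on the first invalid row
                raise ValueError(f"expectation {e.name!r} failed for {row!r}")
            if e.action == "drop":
                # drop: the invalid row is excluded from the output
                keep = False
            # warn: the row is kept and only counted as a violation
        if keep:
            kept.append(row)
    return kept, violations
```

With a `drop` rule on a missing id and a `warn` rule on a negative amount, a row with a missing id disappears from the output, while a row with a negative amount stays and only raises the violation count.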
Governance and quality - Unity Catalog
Term | Idea
Unity Catalog | Central governance across workspaces
Namespace | Three levels: catalog.schema.table
Permissions | GRANT/REVOKE access on securable objects
Lineage | Tracked flow of data from source to output
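The three-level namespace and GRANT/REVOKE fit together as sketched below. The helper names are hypothetical and the functions only build SQL text; the catalog, schema, table and group names are placeholders supplied by the caller. The sketch assumes the usual Unity Catalog rule that reading a table also requires usage privileges on its parent catalog and schema.

```python
def qualified_name(catalog, schema, table):
    """Three-level Unity Catalog name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


def grant_read_statements(catalog, schema, table, principal):
    """SQL needed for a principal to read one table (hypothetical helper)."""
    return [
        f"GRANT USE CATALOG ON CATALOG {catalog} TO `{principal}`",
        f"GRANT USE SCHEMA ON SCHEMA {catalog}.{schema} TO `{principal}`",
        f"GRANT SELECT ON TABLE {qualified_name(catalog, schema, table)} TO `{principal}`",
    ]


def revoke_select_statement(catalog, schema, table, principal):
    """SQL that removes read access on the table again."""
    return f"REVOKE SELECT ON TABLE {qualified_name(catalog, schema, table)} FROM `{principal}`"
```

Each statement could then be passed to `spark.sql(...)` or run in a SQL editor by a user who is allowed to manage the object.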
Exam facts at a glance
Item | Value
Duration | 90 minutes
Questions | 45 scored (plus a few unscored)
Passing score | Not published by Databricks
Exam code | None published by Databricks
Format | Online, proctored; multiple-choice
Fee | US$200 per attempt (+ tax); confirm current pricing
Validity | 2 years (retake to recertify)
Databricks vs another platform - pick by your stack
This exam certifies Databricks skills. The concepts (ingestion, layering, transformations, orchestration, governance) overlap with Snowflake or BigQuery, but the tools do not: Delta Lake, Auto Loader, Lakeflow Declarative Pipelines and Unity Catalog are Databricks tools. Choose the certification that matches the platform named in the job postings you are targeting, not the one that looks easier.
FAQ
Can I use notes in the Databricks Data Engineer Associate exam?
No. It is an online proctored exam, so notes and second screens are not allowed. Use this sheet only for final revision before exam day.