
Databricks Certified Data Engineer Associate: Practice Questions

Intermediate · 30 questions

Original practice questions for the Databricks Certified Data Engineer Associate. Each answer is explained, including why each of the other options is wrong. These are concept checks, not questions from the certification exam.

By The Exam Atlas Editorial Team · Verified 2026-06-06 · ~38 min

  1. Databricks Data Intelligence Platform easy

    Which open table format gives Databricks tables ACID transactions, schema enforcement and time travel on top of files in cloud storage?

  2. Databricks Data Intelligence Platform easy

    In the medallion architecture, which layer holds raw data ingested as-is from the source?

  3. Databricks Data Intelligence Platform medium

    A team needs an interactive cluster to develop and test notebook code together during the day. Which compute is the best fit?

  4. Databricks Data Intelligence Platform medium

    Why is the lakehouse described as combining a data lake with a data warehouse?

  5. Databricks Data Intelligence Platform medium

    Databricks SQL is primarily used to:

  6. Databricks Data Intelligence Platform medium

    You query a Delta table 'AS OF' an earlier version to recover data that was overwritten this morning. Which Delta Lake capability is this?

  7. Development and Ingestion medium

    You drop a managed table in Databricks. What happens to the underlying data files?

  8. Development and Ingestion medium

    You need to continuously load only newly arrived files from a cloud storage folder, without reprocessing old ones. Which Databricks feature is built for this?
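The core idea in this scenario, picking up only files that have not been seen before, can be sketched in plain Python. This is an analogy under stated assumptions, not the Databricks feature itself: `IncrementalLoader` and `load_new` are hypothetical names, and the in-memory set stands in for whatever state a real ingestion tool persists.

```python
class IncrementalLoader:
    """Toy loader that remembers which file names it has already processed."""

    def __init__(self):
        # Hypothetical stand-in for persisted ingestion state.
        self.processed = set()

    def load_new(self, listing):
        """Return only the files in `listing` not seen on earlier calls."""
        new_files = [name for name in sorted(listing) if name not in self.processed]
        self.processed.update(new_files)
        return new_files


loader = IncrementalLoader()
first_batch = loader.load_new(["a.json", "b.json"])
# The folder now holds one extra file; the two old ones are skipped.
second_batch = loader.load_new(["a.json", "b.json", "c.json"])
```

Here the second call returns only `c.json`, because the two earlier files are already recorded as processed.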

  9. Development and Ingestion medium

    What is the key difference between a managed table and an external (unmanaged) table?

  10. Development and Ingestion medium

    Which SQL command would you use to load query results into a new managed Delta table from existing data?

  11. Development and Ingestion medium

    When ingesting raw source files into the bronze layer, the usual goal is to:

  12. Development and Ingestion hard

    A write to an existing Delta table fails because incoming data has a column type that does not match the table definition. Which Delta Lake behaviour caused this?

  13. Data Processing and Transformations easy

    Which library provides the Python API for working with Spark DataFrames in Databricks?

  14. Data Processing and Transformations medium

    How does Structured Streaming differ from a standard batch query?

  15. Data Processing and Transformations medium

    You want to combine two DataFrames by matching rows on a shared key column, adding columns from both. In PySpark you would use a:

  16. Data Processing and Transformations medium

    When transforming bronze data into a silver table, a typical step is to:

  17. Data Processing and Transformations hard

    A streaming query that reads from a source and writes to a Delta sink uses a checkpoint location mainly to:
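One way to picture what a checkpoint records is a small file holding the last processed position, read again after a restart. The sketch below is a plain-Python analogy, not Structured Streaming: `process_from_checkpoint` and the `offset.json` layout are hypothetical, and a real checkpoint stores considerably more than a single offset.

```python
import json
import os
import tempfile


def process_from_checkpoint(records, checkpoint_path):
    """Process records after the last saved offset, then save the new offset."""
    offset = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            offset = json.load(f)["offset"]

    processed = records[offset:]  # skip everything handled by an earlier run

    with open(checkpoint_path, "w") as f:
        json.dump({"offset": len(records)}, f)
    return processed


checkpoint_dir = tempfile.mkdtemp()
checkpoint_file = os.path.join(checkpoint_dir, "offset.json")

run_1 = process_from_checkpoint(["r1", "r2"], checkpoint_file)
# Simulated restart: the source now has one more record.
run_2 = process_from_checkpoint(["r1", "r2", "r3"], checkpoint_file)
```

The second run picks up only `r3`, since the saved offset tells it where the first run stopped.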

  18. Data Processing and Transformations hard

    In Spark, a transformation such as filter() is described as 'lazy'. This means it:
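Laziness can be observed without Spark, because Python's built-in `filter` is also lazy. The sketch below is only an analogy for the Spark behaviour the question describes: `is_even` and the `calls` list are illustrative names, and `list(...)` plays the role of the step that finally forces the work to run.

```python
calls = []


def is_even(n):
    calls.append(n)  # record each time the predicate actually runs
    return n % 2 == 0


# Building the filter does no work yet: the predicate has not been called.
lazy_result = filter(is_even, range(6))
calls_before = list(calls)

# Consuming the result forces evaluation of every element.
materialised = list(lazy_result)
calls_after = list(calls)
```

Comparing `calls_before` with `calls_after` shows that the predicate runs only once the result is consumed.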

  19. Productionizing Data Pipelines medium

    Which Databricks feature lets you define target tables and transformations declaratively while the platform manages dependencies and execution?

  20. Productionizing Data Pipelines medium

    In a Lakeflow Declarative Pipeline (DLT), what do 'expectations' do?

  21. Productionizing Data Pipelines easy

    Which tool do you use to schedule and orchestrate several dependent tasks - notebooks, a pipeline and a script - to run in order?

  22. Productionizing Data Pipelines hard

    How does a Lakeflow Declarative Pipeline differ from a Databricks Job?

  23. Productionizing Data Pipelines medium

    You want a production job to run on fresh, automatically terminated compute rather than a shared interactive cluster. You should configure it to use:

  24. Productionizing Data Pipelines hard

    If one task in a multi-task Databricks Job fails, a sensible production practice is to:

  25. Data Governance and Quality easy

    Which Databricks component provides centralised governance - permissions, lineage and a unified namespace - across workspaces?

  26. Data Governance and Quality medium

    What is the correct three-level namespace used to reference an object in Unity Catalog?

  27. Data Governance and Quality medium

    To let an analyst read a specific table but not modify it, which approach fits Unity Catalog?

  28. Data Governance and Quality hard

    Within a Lakeflow Declarative Pipeline, you add a rule that fails the update if a primary-key column contains nulls. This is best described as:
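The behaviour described here, a rule that stops the whole update when a row violates it, can be sketched in plain Python. This is an illustration only and does not use the Lakeflow API: `ExpectationFailed`, `expect_or_fail`, and the `order_id` column are hypothetical names chosen for the example.

```python
class ExpectationFailed(Exception):
    """Raised when a row violates a rule that must hold for every row."""


def expect_or_fail(rows, rule_name, predicate):
    """Return rows unchanged if all pass the predicate; otherwise abort."""
    for row in rows:
        if not predicate(row):
            raise ExpectationFailed(f"{rule_name} violated by row {row!r}")
    return rows


def has_primary_key(row):
    # Hypothetical primary-key column for this example.
    return row.get("order_id") is not None


good_rows = [{"order_id": 1}, {"order_id": 2}]
bad_rows = [{"order_id": 1}, {"order_id": None}]

passed = expect_or_fail(good_rows, "pk_not_null", has_primary_key)

try:
    expect_or_fail(bad_rows, "pk_not_null", has_primary_key)
    update_failed = False
except ExpectationFailed:
    update_failed = True
```

The design choice to raise rather than drop the offending row mirrors the "fail the update" wording in the question; a rule that filtered rows out instead would behave differently.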

  29. Data Governance and Quality medium

    A regulator asks how a particular gold table was built and which source tables fed it. Which Unity Catalog feature answers this fastest?

  30. Data Governance and Quality hard

    Your organisation wants the same table permissions and names to apply consistently when users work from several different Databricks workspaces. Unity Catalog supports this because it provides:

Practice questions FAQ

Are these real Databricks DE Associate exam questions?
No. These are original study questions written to test understanding. They are not real exam questions, exam dumps, or copied from any provider.
How should I use these practice questions?
Answer each one, read the explanation (including why the wrong options are wrong), and use your per-domain score to focus your revision on weak areas. Revisit the questions before exam day.
How many questions should I do before the exam?
Enough to score consistently across every domain, alongside full-length practice from official or reputable providers. Understanding why each answer is right matters more than raw volume.
What score means I am ready?
A good signal is consistently scoring around 80% or higher across all domains on questions you have not seen before, and being able to explain why the wrong options are wrong.
Should I use exam dumps?
No. Dumps (real or leaked questions) breach provider policy, can void your certification, and do not build the understanding the exam actually tests.
