Google Cloud Professional Data Engineer (PDE) Cheat Sheet

A free Google Cloud Professional Data Engineer (PDE) cheat sheet: the five domains, service-selection rules, BigQuery tuning, pipelines and governance for revision.

By The Exam Atlas Editorial Team · Verified 2026-06-06

A final-revision summary for the Professional Data Engineer exam. This is a study aid only: notes are not allowed in the proctored exam.

The five domains (official weights)

# | Domain | Weight
1 | Designing data processing systems | 22%
2 | Ingesting and processing the data | 25%
3 | Storing the data | 20%
4 | Preparing and using data for analysis | 15%
5 | Maintaining and automating data workloads | 18%
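
One way to use the weights is to split revision time in proportion to them. The sketch below is only an illustration of that arithmetic: the function name `allocate_hours` and the 40-hour budget are assumptions, not part of the exam guide.

```python
# Domain weights copied from the table above (percent of the exam).
DOMAIN_WEIGHTS = {
    "Designing data processing systems": 22,
    "Ingesting and processing the data": 25,
    "Storing the data": 20,
    "Preparing and using data for analysis": 15,
    "Maintaining and automating data workloads": 18,
}


def allocate_hours(total_hours):
    """Split a revision budget across domains in proportion to weight."""
    total_weight = sum(DOMAIN_WEIGHTS.values())
    return {
        domain: round(total_hours * weight / total_weight, 1)
        for domain, weight in DOMAIN_WEIGHTS.items()
    }


if __name__ == "__main__":
    # Hypothetical 40-hour revision budget.
    for domain, hours in allocate_hours(40).items():
        print(f"{hours:>5} h  {domain}")
```

Weighting by percentage is only a starting point; shifting hours towards your weakest domain is a reasonable adjustment.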

The data flow (left to right)

Ingest (Pub/Sub, batch loads) -> Process (Dataflow / Dataproc) -> Store (BigQuery, Bigtable, Cloud Storage) -> Analyse & govern (Dataplex, BI) -> Operate (Composer, Monitoring).

Pick the storage service

Need | Use
SQL analytics over large datasets | BigQuery
High-throughput, low-latency key-based NoSQL (time-series, IoT) | Bigtable
Objects, files, data-lake landing zone | Cloud Storage
Transactional relational app DB | Cloud SQL
Global, horizontally scaling relational DB | Spanner
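
The table above can be read as a small decision rule. The following is a minimal sketch of that rule, not an official selection algorithm: the `Workload` fields and the order in which the checks run are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical scenario flags; the field names are illustrative only."""
    sql_analytics: bool = False
    key_based_low_latency: bool = False
    object_or_file_storage: bool = False
    relational_transactions: bool = False
    global_scale: bool = False


def pick_storage(w):
    """Map a scenario to the service named in the table above."""
    if w.relational_transactions:
        # Relational and transactional: scale decides between the two.
        return "Spanner" if w.global_scale else "Cloud SQL"
    if w.key_based_low_latency:
        return "Bigtable"
    if w.sql_analytics:
        return "BigQuery"
    if w.object_or_file_storage:
        return "Cloud Storage"
    raise ValueError("no rule matches; re-read the scenario")


if __name__ == "__main__":
    print(pick_storage(Workload(sql_analytics=True)))
    print(pick_storage(Workload(key_based_low_latency=True)))
```

Real scenarios often mix requirements, so treat the result as a first guess and check it against the keywords in the question.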

Pick the processing service

Need | Use
Serverless batch or streaming pipelines | Dataflow (Apache Beam)
Existing Hadoop/Spark or open-source jobs | Dataproc
Ingest and decouple event streams | Pub/Sub
SQL-based ELT transformations in BigQuery | Dataform
Orchestrate and schedule pipelines | Cloud Composer (Airflow)

BigQuery cost and performance

Lever | Idea
Partitioning | Prune by date or integer range so queries scan fewer partitions
Clustering | Sort within partitions to cut bytes scanned
Bytes scanned | Drives on-demand cost - reduce it, reduce the bill
Slots / reservations | Capacity-based pricing as an alternative to on-demand
Materialised view | Precomputed result for frequent, repeated queries
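
To see why partition pruning lowers the on-demand bill, the sketch below models a date-partitioned table as a dictionary of partition sizes and sums only the partitions a date filter touches. It is a plain-Python simulation, not a BigQuery API call. The price per TiB, the table layout and all names are assumptions; check current pricing for your region.

```python
from datetime import date, timedelta

TIB = 2 ** 40
GIB = 2 ** 30
# Assumed placeholder price in USD per TiB scanned; confirm the current rate.
PRICE_PER_TIB_USD = 6.25


def bytes_scanned(partitions, start, end):
    """Sum the sizes of daily partitions whose date falls in [start, end]."""
    return sum(size for day, size in partitions.items() if start <= day <= end)


def on_demand_cost(scanned_bytes, price_per_tib=PRICE_PER_TIB_USD):
    """On-demand cost is proportional to bytes scanned."""
    return scanned_bytes / TIB * price_per_tib


# Hypothetical table: 365 daily partitions of 10 GiB each.
first_day = date(2025, 1, 1)
table = {first_day + timedelta(days=i): 10 * GIB for i in range(365)}

# No partition filter: every partition is read.
full_scan = bytes_scanned(table, date.min, date.max)
# Filter on the last 7 days: only 7 of 365 partitions are read.
last_week = bytes_scanned(table, date(2025, 12, 25), date(2025, 12, 31))

if __name__ == "__main__":
    print(f"full scan: ${on_demand_cost(full_scan):.2f}")
    print(f"last week: ${on_demand_cost(last_week):.2f}")
```

The model ignores free-tier allowances and per-query minimums, so it shows the proportion saved rather than an exact invoice.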

Streaming vs batch

Term | Idea
Streaming | Continuous processing as events arrive (Pub/Sub + Dataflow)
Batch | Scheduled bulk loads and transforms
Windowing | Grouping streaming data into time windows for aggregation
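
Windowing is easier to remember with a concrete case. The sketch below counts events per key in fixed (tumbling) windows using plain Python. It is an illustration of the idea only, not Apache Beam or Dataflow code, and it ignores watermarks and late data; the function name and sample events are assumptions.

```python
from collections import defaultdict


def fixed_window_counts(events, window_seconds):
    """Count events per (window_start, key).

    `events` is an iterable of (epoch_seconds, key) pairs. Each event is
    assigned to the window that starts at the nearest lower multiple of
    `window_seconds`.
    """
    counts = defaultdict(int)
    for timestamp, key in events:
        window_start = timestamp - (timestamp % window_seconds)
        counts[(window_start, key)] += 1
    return dict(counts)


# Hypothetical sensor events: (seconds since start, sensor id).
events = [(0, "a"), (30, "a"), (59, "b"), (60, "a"), (125, "b")]
result = fixed_window_counts(events, 60)

if __name__ == "__main__":
    for (window_start, key), count in sorted(result.items()):
        print(window_start, key, count)
```

Fixed windows do not overlap, so each event lands in exactly one window. Sliding and session windows group events differently and are worth reviewing separately.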

Governance and operations

Term | Idea
IAM | Roles and least-privilege access for users and service accounts
Dataplex | Organise, govern and discover data across lakes and warehouses
Cloud Monitoring / Logging | Metrics, alerts and logs to operate pipelines
Service account | Identity that pipelines use to authenticate to Google Cloud

Exam facts at a glance

Item | Value
Duration | 120 minutes (2 hours)
Questions | 40-50 (per the official exam guide)
Passing score | Not published by Google (pass/fail)
Format | Multiple choice and multiple select; online-proctored or test centre
Fee | US$200 + tax (recertification US$100 - confirm)
Validity | 2 years
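
The duration and question count above give a rough pacing target. This is a small arithmetic sketch; the function name is an assumption, and the inputs are the figures from the table.

```python
def minutes_per_question(duration_minutes, questions):
    """Average time available per question."""
    return duration_minutes / questions


# Figures from the table above: 120 minutes, 40 to 50 questions.
slowest_pace = minutes_per_question(120, 40)
fastest_pace = minutes_per_question(120, 50)

if __name__ == "__main__":
    print(f"{fastest_pace:.1f} to {slowest_pace:.1f} minutes per question")
```

Planning around the tighter figure leaves time to revisit flagged questions.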

BigQuery vs Bigtable - the one to get right

BigQuery is the serverless warehouse for SQL analytics on large datasets; Bigtable is wide-column NoSQL for high-throughput, low-latency key-based access such as time-series or IoT. The exam tests when to use each, not depth in one. If the scenario is “analyse with SQL,” think BigQuery; if it is “millions of reads/writes by key with low latency,” think Bigtable.

FAQ

Can I use notes in the Professional Data Engineer exam?
No. It is proctored, online or at a test centre. Use this for final revision before exam day only.
