The Professional Data Engineer exam is organised into five domains. This is a plain-English summary with the official Google Cloud weightings; the Google Cloud exam guide is authoritative.
| # | Domain | Official weight |
|---|---|---|
| 1 | Designing data processing systems | 22% |
| 2 | Ingesting and processing the data | 25% |
| 3 | Storing the data | 20% |
| 4 | Preparing and using data for analysis | 15% |
| 5 | Maintaining and automating data workloads | 18% |
1 - Designing data processing systems (22%)
Select storage and processing services for the requirements, and design for reliability, security, compliance, flexibility, portability and cost. This is the “which service, and why” backbone: knowing the boundaries between BigQuery, Bigtable, Cloud Storage, Cloud SQL/Spanner, Dataflow and Dataproc.
2 - Ingesting and processing the data (25%)
Plan and build batch and streaming pipelines with Dataflow (Apache Beam), ingest streams through Pub/Sub, run Hadoop/Spark on Dataproc when needed, and orchestrate with Cloud Composer (Airflow). Covers the trade-offs between streaming and batch processing, and windowing of streaming data.
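To make windowing concrete, here is a minimal sketch in plain Python of how fixed (tumbling) windows group timestamped events. It is not the Apache Beam API; the function names and the sample events are hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Window = Tuple[int, int]  # half-open interval [start, end) in seconds


def assign_fixed_window(event_ts: float, size_s: int) -> Window:
    """Return the fixed window of length size_s that contains event_ts."""
    start = int(event_ts // size_s) * size_s
    return (start, start + size_s)


def count_per_window(events: Iterable[Tuple[float, str]], size_s: int = 60) -> Dict[Window, int]:
    """Count events per fixed window; each event is a (timestamp_seconds, payload) pair."""
    counts: Dict[Window, int] = defaultdict(int)
    for ts, _payload in events:
        counts[assign_fixed_window(ts, size_s)] += 1
    return dict(counts)


# Hypothetical events: timestamps in seconds since an arbitrary start.
events = [(5, "a"), (42, "b"), (61, "c"), (119, "d"), (120, "e")]
window_counts = count_per_window(events, size_s=60)
```

An event at exactly 120 seconds falls into the window starting at 120, not the one ending there, because each window is half-open. A streaming engine additionally has to handle late and out-of-order data, which this sketch ignores.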
3 - Storing the data (20%)
Select and tune storage systems - BigQuery, Bigtable, Cloud Storage and others - and plan data warehouses and data lakes. Includes partitioning and clustering in BigQuery, row-key design in Bigtable, and storage classes and lifecycle in Cloud Storage.
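Row-key design matters because Bigtable stores rows sorted lexicographically by key, so the key layout decides both which rows sit next to each other and how evenly writes spread. The sketch below shows one commonly recommended pattern, an identifier followed by a reversed timestamp, in plain Python. The `sensor-42` identifier, the `#` separator and the `MAX_TS_MS` bound are assumptions made for illustration, not part of any Bigtable API.

```python
# Assumed upper bound for millisecond timestamps (13 digits); used only to reverse them.
MAX_TS_MS = 9_999_999_999_999


def make_row_key(device_id: str, event_ts_ms: int) -> str:
    """Build a key that groups rows by device and sorts the newest events first.

    Leading with the device id avoids keys that start with a timestamp, which
    would send all new writes to the same end of the key range.
    """
    reversed_ts = MAX_TS_MS - event_ts_ms
    # Zero-pad so that string order matches numeric order.
    return f"{device_id}#{reversed_ts:013d}"


older = make_row_key("sensor-42", 1_700_000_000_000)
newer = make_row_key("sensor-42", 1_700_000_060_000)
```

With this layout a prefix scan on `sensor-42#` returns that device's most recent readings first. Whether a reversed timestamp is the right choice depends on the query pattern, so treat this as one option rather than a rule.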
4 - Preparing and using data for analysis (15%)
Prepare data for visualisation and sharing, enable secure analytics access, and govern data with discovery and quality tooling such as Dataplex. This is the smallest domain by weight; it connects pipelines to the people who consume the data.
5 - Maintaining and automating data workloads (18%)
Optimise resources, monitor, log and troubleshoot pipelines, and automate repeatable workloads so that they are resilient to failure. Includes Cloud Monitoring and Cloud Logging, and scheduling with Cloud Composer.
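One generic building block behind resilient automation is retrying a failed task with exponential backoff. The sketch below illustrates the idea in plain Python; it is not taken from the exam guide and is not Cloud Composer or Airflow code, and `run_with_retries`, its parameters and the flaky task are all hypothetical.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def run_with_retries(
    task: Callable[[], T],
    max_attempts: int = 3,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run task, retrying on any exception and doubling the delay each time."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay_s * 2 ** (attempt - 1))
    raise AssertionError("unreachable")


# Usage: a hypothetical task that fails twice before succeeding.
calls = {"n": 0}
delays: List[float] = []


def flaky_task() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "done"


# Record the delays instead of actually sleeping.
result = run_with_retries(flaky_task, sleep=delays.append)
```

Passing `sleep` as a parameter keeps the example quick to run and easy to check. Managed orchestrators usually expose retries as task configuration, so hand-written loops like this are mainly useful for understanding the behaviour.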