How to Pass the Google Cloud Professional Data Engineer Exam: Tips and Tricks

Namrata D
6 min read · Oct 16, 2020

*This article covers the most recent exam syllabus.*

I passed this certification exam after nearly two months of studying. I do not have the 3+ years of experience that Google recommends for this exam, and it has been only 4 months since I started learning cloud. I think the key to my success was simply studying the right way for this exam.

I had to learn data engineering skills from scratch. I studied extra hard for this exam, and to some extent I think harder than necessary. I came across the Google Cloud Professional Data Engineer certification and took several online courses to gain better insight into the technology and its place in data analytics.

You don't need to be an IT professional to pass the exam, but you do need to know the basics of coding, object-oriented programming, systems, product lifecycle management, code design, staging and deployment. There is a lot of content and theoretical knowledge that you need to memorize for the exam. Being able to apply that knowledge is essential for tackling the questions in the certification exam. I took the following courses for the GCP data engineering exam.

Courses

1. Coursera: Data Engineering with Google Cloud

**Rating (6.5/10), but essential**

Cost: $49 USD/month with a 7-day trial

Coursera was the first online course I took, and it is taught by Google professionals. The course is divided into six short courses with lots of presentations, hands-on labs and demos. I found it quite advanced for someone without any prior experience. I wasn’t even aware of technologies like Hadoop, MapReduce, data pipelines, etc. I watched all the videos a couple of times to familiarize myself with the unfamiliar terminology. I highly recommend taking this course, but it is not enough on its own to pass the exam.

2a. Cloud Guru: Google Certified Professional Data Engineer by Matthew Ulasien (originally on Linux Academy)

**Rating (9/10)**

Cost: $50 USD/month (1-week free trial)

The course has several highlights. It covers each topic separately, which makes it very easy to understand. Moreover, the explanations are very clear and concise. The course comes with the Data Dossier eBook (essentially the collated course materials), which is an in-depth Lucidchart for all the systems. It has labs, practice quizzes and one final exam. I must say that this is a phenomenal resource for understanding the basic concepts of GCP if you are a beginner. The lectures and the quizzes after each module are also very helpful.

2b. Cloud Guru: Google Certified Professional Data Engineer by Tim Berry

**Rating (8.5/10)**

The next one is Google Certified Professional Data Engineer by Tim Berry. The modules were recently updated and are also very effective for the exam. The material in this course is somewhat different from Matthew Ulasien's course on the same platform. It gives key highlights of what is expected in the exam. It also has a practice exam, labs and quizzes.

If you are new to GCP, these two courses help strike a balance between breadth and depth and give a solid understanding of what is expected in the Professional Data Engineer certification exam.

Note: You can take both courses for the same $50 USD/month subscription (1-week free trial). The quizzes and exam were still available even after the subscription ended.

3. Machine Learning Crash Course (free)

**Rating 9/10**

This course is Google’s fast-paced, practical introduction to machine learning. I used it as a quick refresher since I had already covered some of the algorithms and concepts in another certification. It is well structured and gives you a good foundation in machine learning.

I would say these three resources gave me enough content for the exam. Google Cloud Platform gives $300 in free credit to anyone who signs up to use its products. Leverage it and feel free to explore.

Disclaimer: Delete the resources after you are done with the labs so that they do not keep using up your credit.

4. Google Documentation: I know it is impossible to read the entire documentation, but whenever you are unsure about a question, make sure to look up the answer in the documentation.

Optional:

Linux Academy SQL Primer. You will also need some SQL knowledge for BigQuery and BigQuery ML.

Important:

The questions focused mainly on the following topics:

BigQuery best practices (https://cloud.google.com/bigquery/docs/best-practices-costs), BigQuery ML, the syntax for wildcard tables, and partitioned tables, which can be partitioned by ingestion time or by a timestamp or date column. IAM roles and authorized views (https://cloud.google.com/bigquery/docs/access-control). The difference between Viewer credentials and Owner credentials. BigQuery Data Transfer Service is also very important. BigQuery and Data Studio, along with caching: learn the difference between default caching and pre-fetch caching, and how to connect Data Studio to storage solutions. How to link BigQuery and GCS with permanent tables and temporary tables (https://cloud.google.com/bigquery/external-data-cloud-storage).
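To make the wildcard and partition syntax a bit more concrete, here is a minimal Python sketch that only builds the SQL text; it does not call BigQuery. The project, dataset, table and column names are hypothetical, and I am assuming date-sharded tables named like `events_20200101` for the wildcard case and a daily ingestion-time partitioned table for the partition case. `_TABLE_SUFFIX` and `_PARTITIONDATE` are the pseudo-columns BigQuery provides for these two cases.

```python
from datetime import date
from typing import Sequence


def wildcard_query(table_prefix: str, start: date, end: date, columns: Sequence[str]) -> str:
    """Build a standard SQL query over date-sharded tables (e.g. events_20200101)."""
    cols = ", ".join(columns)  # name the columns you need instead of SELECT *
    return (
        f"SELECT {cols}\n"
        f"FROM `{table_prefix}*`\n"
        # _TABLE_SUFFIX holds whatever the * matched, so this limits the tables scanned
        f"WHERE _TABLE_SUFFIX BETWEEN '{start:%Y%m%d}' AND '{end:%Y%m%d}'"
    )


def partition_query(table: str, day: date, columns: Sequence[str]) -> str:
    """Build a query that reads one day from an ingestion-time partitioned table."""
    cols = ", ".join(columns)
    return (
        f"SELECT {cols}\n"
        f"FROM `{table}`\n"
        # filtering on the partition pseudo-column prunes all other partitions
        f"WHERE _PARTITIONDATE = DATE '{day:%Y-%m-%d}'"
    )


# Hypothetical table names, for illustration only.
print(wildcard_query("my-project.logs.events_", date(2020, 1, 1), date(2020, 1, 31), ["user_id", "event"]))
print(partition_query("my-project.logs.events", date(2020, 10, 16), ["user_id"]))
```

Both queries follow the cost best practices from the first link: select only the columns you need and filter on the table suffix or partition so that less data is scanned.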

Difference between Kafka and Pub/Sub, along with the connectors. https://cloud.google.com/pubsub/docs/overview

Dataproc: It is a best practice to use Google Cloud Storage instead of HDFS, so that you can delete the cluster after data processing. You should also know when to use HDFS instead of GCS. https://cloud.google.com/dataproc/docs/concepts/dataproc-hdfs

Pipelines with Cloud Dataflow: Learn the fundamentals of data pipelines, such as PCollection, PTransform, ParDo, CoGroupByKey, GroupByKey, Combine, Flatten, Pipeline I/O, etc. Expect to see a couple of questions on these topics.
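If these transform names are new to you, the following plain-Python sketch may help you remember what each one does. It is not the Apache Beam API and it runs on ordinary lists on one machine; the function names are my own stand-ins that only imitate the behaviour of the corresponding transforms.

```python
from collections import defaultdict
from itertools import chain


def par_do(elements, fn):
    """ParDo-like step: fn may emit zero, one, or many outputs per element."""
    return [out for element in elements for out in fn(element)]


def group_by_key(pairs):
    """GroupByKey-like step: (key, value) pairs -> {key: [values]}."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def combine_per_key(pairs, combine_fn):
    """Combine-like step: reduce the values of each key to a single result."""
    return {key: combine_fn(values) for key, values in group_by_key(pairs).items()}


def co_group_by_key(**tagged):
    """CoGroupByKey-like step: join several keyed collections on their shared key."""
    keys = {key for pairs in tagged.values() for key, _ in pairs}
    result = {key: {tag: [] for tag in tagged} for key in keys}
    for tag, pairs in tagged.items():
        for key, value in pairs:
            result[key][tag].append(value)
    return result


def flatten(*collections):
    """Flatten-like step: merge several collections of the same type into one."""
    return list(chain.from_iterable(collections))


# Word count: split lines into (word, 1) pairs, then sum per word.
lines = flatten(["big query"], ["big table"])
words = par_do(lines, lambda line: [(word, 1) for word in line.split()])
print(combine_per_key(words, sum))  # {'big': 2, 'query': 1, 'table': 1}

# Join two keyed collections; keys missing on one side get an empty list.
print(co_group_by_key(emails=[("amy", "a@x.com")], phones=[("amy", "111"), ("bob", "222")]))
```

In a real Dataflow pipeline the same steps run in parallel over a PCollection, which is why the exam cares about which transform needs a key and which one needs a shuffle.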

Datalab and Dataprep basics, which are covered in the Cloud Guru courses.

NoSQL database with Cloud Bigtable and when to use it. Designing the row keys and how to avoid hotspotting (read the documentation): https://cloud.google.com/bigtable/docs/schema-design
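As a rough illustration of the row key idea, here is a small Python sketch that only builds key strings; it does not talk to Bigtable. The device IDs, the `#` delimiter and the `MAX_MILLIS` constant are assumptions I picked for the example. The point is that Bigtable stores rows sorted by key, so a key that starts with the timestamp sends all new writes to the same part of the table, while putting an identifier first spreads them out.

```python
# Hypothetical upper bound: the largest 13-digit epoch-millisecond value.
MAX_MILLIS = 9_999_999_999_999


def hotspot_prone_key(device_id: str, ts_millis: int) -> str:
    """Timestamp first: consecutive writes sort next to each other (hotspot risk)."""
    return f"{ts_millis}#{device_id}"


def promoted_key(device_id: str, ts_millis: int) -> str:
    """Identifier first, then a reversed timestamp so the newest row sorts first."""
    reversed_ts = MAX_MILLIS - ts_millis
    # zero-pad so that string order matches numeric order
    return f"{device_id}#{reversed_ts:013d}"


t = 1602806400000  # 2020-10-16 00:00:00 UTC in milliseconds
print(hotspot_prone_key("sensor-42", t))  # 1602806400000#sensor-42
print(promoted_key("sensor-42", t))       # sensor-42#8397193599999

# A reading taken one second later sorts before the older one for the same device.
print(promoted_key("sensor-42", t + 1000) < promoted_key("sensor-42", t))  # True
```

With the second layout, a scan over the prefix `sensor-42#` returns that device's readings with the most recent first, and writes from different devices land in different parts of the key space.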

The above topics cover more than 60–70% of the exam questions.

Data security: encryption and decryption, including CMEK, CSEK and client-side encryption. https://cloud.google.com/kms/docs/

Machine Learning: Different models in ML. How to do feature scaling, and how to recognize overfitting and underfitting. What is regularization? The difference between testing and training data sets. Understand L1/L2 regularization, etc.
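For feature scaling and L1/L2 regularization, a tiny worked example helped me more than the definitions. The sketch below is plain Python with no ML library; the function names and the `lam` (regularization strength) parameter are my own choices for illustration. Min-max scaling maps a feature into the range 0 to 1, the L1 penalty adds the sum of absolute weights to the loss, and the L2 penalty adds the sum of squared weights.

```python
def min_max_scale(values):
    """Rescale a feature to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # a constant feature carries no information
    return [(v - lo) / (hi - lo) for v in values]


def l1_penalty(weights, lam):
    """L1 (lasso) penalty: lam * sum of |w|; tends to push weights to exactly zero."""
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam):
    """L2 (ridge) penalty: lam * sum of w^2; tends to shrink large weights."""
    return lam * sum(w * w for w in weights)


def regularized_loss(data_loss, weights, lam, kind="l2"):
    """Training loss plus the chosen penalty; a larger lam means a simpler model."""
    penalty = l1_penalty(weights, lam) if kind == "l1" else l2_penalty(weights, lam)
    return data_loss + penalty


weights = [1.0, -2.0, 0.0]
print(min_max_scale([10, 20, 30]))                # [0.0, 0.5, 1.0]
print(l1_penalty(weights, 0.5))                   # 1.5
print(l2_penalty(weights, 0.5))                   # 2.5
print(regularized_loss(1.0, weights, 0.5, "l1"))  # 2.5
```

Adding the penalty to the training loss is how regularization fights overfitting: the model is discouraged from using large weights just to fit noise in the training set.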

Cloud Composer DAGs and workflow orchestration.

Difference between Cloud Spanner and Cloud SQL, along with read replicas and failover replicas.

Datastore: backup and migration, and Firestore.

Pre-trained ML Cloud APIs are also very important, especially the Cloud Vision API, Text-to-Speech API, Cloud Video Intelligence API, Dialogflow, etc.

Important from an exam perspective:

I can’t stress this enough: the practice exams are extremely important. I personally went through the practice questions on Linux Academy, Cloud Guru and Google again and again to make sure I understood the answers. I took the practice exams from these resources multiple times until I was able to score more than 90%. I am also attaching links to various other practice exams, which are extremely helpful for understanding the type of questions to expect in the exam.

My experience:

Many individuals have passed the Professional Data Engineer exam and shared their experience online about how to prepare for it. I feel only a few come from a non-technical background, since the recommended experience for the exam is 3 years in industry and 1 year designing and managing solutions using GCP. Though I had to study extra hard, I was able to pass the exam with 3 months of preparation. If I can do it, so can you.

Google's official documentation is a bit too lengthy, but it is still advisable to know the best practices for all the products mentioned in the exam guide. The exam is challenging, but if you study well you can become certified too.

Last but not least, take the practice exams as if you were taking the real certification exam. It helps!

Most important:

Read the official Google exam guide properly to find out the changes in the syllabus.
