Google Cloud Professional Data Engineer Certification

Experience from August 2020

There are many articles on WHY and WHAT, so I will just explain my motivation as perhaps it will resonate with someone.

There is an alternative angle as well, if you come from a Cloud Architect or DevOps background. A lot of ML/AI logic is now being abstracted away by the AI offerings of general-purpose cloud providers (e.g. GCP AutoML and the Computer Vision and NLP APIs) and is rapidly becoming a commodity. Having a good data set is now the main problem, not creating the model (well, to be honest, it was always like this). For folks with strong DevOps or architectural experience, Data Engineering may be an opportunity to move into Data Science roles by adding some AI/ML knowledge to the mix.

Let’s talk about the value of experience. The official exam guide recommends “3+ years of industry experience including 1+ years designing and managing solutions using GCP”.

If I use the formula stated above, where Data Engineering is a mix of Cloud Architecture, DevOps and ML/AI, then my only directly relevant experience was 6 years of ML practice and training, on top of a PhD 15 years ago that involved a good deal of regression and statistics. You can expect the exam to have 5–8 questions on ML/AI aspects, but generally they won’t be too deep. As a result, if you come from an ML background you can expect up to 15–20% of the exam to be easy. That said, I still encourage ML practitioners not to take it for granted and to take a close look at GCP AI/ML products to understand their specifications and use cases.

On the Architecture side, I worked closely with several cloud teams developing on GCP over the last few years. Through daily interactions I picked up a sense of patterns and anti-patterns. After some time you realize that data pipelines are not that diverse. In fact there are only a few recommended patterns, so if you have seen a couple of working systems on GCP you should not have problems building the right mental model.

Finally, the DevOps aspect was probably the most challenging for me: IAM, deployment, Logging and Monitoring. I had to spend quite some time on it.

So, the key takeaway is to leverage your background. Keep the three areas above in mind, know your strengths and weaknesses, and create a study plan accordingly.

Courses may provide a foundation, but don’t expect them to be sufficient on their own to pass the exam. I think at best you can become 60–70% ready for the exam from courses alone.

Coursera Data Engineering Specialization

Overall I found it to focus too much on general patterns. It won’t give you the level of depth you will need for the exam. Also, when I took it, it was quite outdated, which, by the way, is likely a problem for all courses at the moment. Coursera partnered with QwikLabs which, if you are not familiar, provides a controlled environment to play with cloud platforms and also has a built-in grader. These labs are a great way (in theory) to get practical knowledge, but I found most of them to be fairly basic. You don’t have the opportunity to experiment, you are expected to be a monkey following the script (and the grader does not like it if you deviate), and the labs are full of bugs. On three occasions I had major issues with labs. On one occasion I waited more than a week for QwikLabs to fix the lab grader so that I could pass the course and move on, and everyone had the same issue.

The one main (only?) highlight in my opinion is the first course in the specialization, which is on Machine Learning. It was a great overview, and even after a dozen ML courses on multiple MOOC platforms I found some of its explanations very intuitive, succinct and well matched to the exam scope. So I definitely recommend it if you are relatively new to ML and don’t want to go through longer introductory courses like Andrew Ng’s, for instance.

Linux Academy (now merged with A Cloud Guru)

I saw someone recommending this platform and I wish I had seen it before spending time (and money) on the Coursera specialization. If you need to take one course, I highly recommend this platform. Matthew Ulasien did a fantastic job as instructor, achieving very high information density in the course. It covers more products and goes deeper than Coursera. It is also a bit more up to date. Labs are less fancy than QwikLabs (still a monkey script), but they actually work and give you exposure to a wide variety of products.

Other benefits of this platform include: a full-duration 2hr practice exam with 50 questions (I’ll talk about practice exams in the next section), a 100+ page Data Dossier PDF with all the notes, which you can download (the platform has an interactive version as long as you are an active subscriber), and a dozen Anki/flashcard collections created by users (you need to be an active subscriber for these as well).

Linux Academy is $50/month. The course has about 12hrs of videos, split by product (e.g. BigQuery, Bigtable) or concept (Machine Learning / AI). Overall, this was money well spent, and if I decide to continue with cloud certifications I’ll likely stick to this platform.

I’ll list some resources which I used after courses to finish my preparation.

Cheat Sheets / Linux Academy Data Dossier

Linux Academy Flashcards / Anki cards

This is one of the most useful features of Linux Academy. There are about 6–8 decks available, some having up to 300 cards. You can rank them by popularity, and the top 2 or 3 are definitely worth studying. I had a very busy schedule during preparation and several multi-day pauses, so these cards helped me bring memories back. They have the right level of detail, but won’t cover all products. Also be aware that quite a few of them are outdated (e.g. on whether BigQuery allows table-level permissions or whether Bigtable requires three nodes for production).

Practice Exams

I suggest using practice exams strategically. In a way they should become your training, validation and test data sets. I took the Coursera exam first, then I didn’t take the Linux Academy exam immediately after I finished the course. I studied for several days and only then took their exam. Finally, I left the Google practice exam for the end and took it a few days before the real exam to confirm that I could handle unseen questions with a good level of comfort.

Syllabus

Going through a detailed syllabus is also a great way to assess your readiness. There are several extended versions available, which I won’t recreate. Here is a good example. The idea is to go over every item and assess your knowledge of use cases, limitations and key specifications.

GCP Product How-To Guides

Well, this is trivial advice, but you’ll actually need to study the official product pages. Read the How-To guides. Pay attention to Beta and General Availability announcements, to product name changes, and to quota and configuration changes.

Due to the certification terms and conditions I cannot talk too much about the exam or provide a “brain dump”. I have read some people estimating the exam to be 20% more complex than the practice exams, which I agree with.

Some tips:

Good luck!
