ML for the GCP PDE Exam
Intro
This post is part of a series of notes I’m writing while studying for Google’s Professional Data Engineer certification.
This particular post covers ML topics.
Disclaimer
Please read this disclaimer.
Key Concepts
- L1 vs. L2 (loss and regularization):
  - L1 uses absolute values
    - As a loss, it estimates the median of the data, so it is robust to outliers
    - As a regularizer (lasso), it drives the weights of low-value features to zero
    - Good when only certain features contribute to the success of the model
  - L2 uses squared values
    - As a loss, it estimates the mean of the data
    - As a regularizer (ridge), it shrinks all weights to avoid overfitting but rarely to exactly zero, so it is not recommended for feature selection
    - Good when all features contribute relatively equally to the success of a model
- TensorFlow
  - Know how TensorFlow can be deployed (also, cost vs. value)
    - CPUs, GPUs, TPUs
- BQ ML
  - Understand the basic flow when using BQ ML:
- Vertex AI
  - Create computation graph and training app
  - Package app
  - Start Vertex AI job to run packaged app
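To make the L1 vs. L2 notes above more concrete, here is a minimal sketch in plain Python. It is not from any exam material: the sample data, the weights, and the penalty strength `lam` are all made up for illustration, and the two shrinkage functions are the closed-form solutions for a single weight only (a simplifying assumption, not a full lasso or ridge solver).

```python
import statistics


def l1_loss(center, data):
    """Sum of absolute deviations from `center` (minimized by the median)."""
    return sum(abs(x - center) for x in data)


def l2_loss(center, data):
    """Sum of squared deviations from `center` (minimized by the mean)."""
    return sum((x - center) ** 2 for x in data)


def l1_shrink(w, lam):
    """Single-weight L1 penalty: argmin over v of 0.5*(v - w)**2 + lam*abs(v)."""
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0  # small weights are set exactly to zero


def l2_shrink(w, lam):
    """Single-weight L2 penalty: argmin over v of 0.5*(v - w)**2 + 0.5*lam*v**2."""
    return w / (1.0 + lam)


# Hypothetical data with one outlier: the median barely notices it, the mean does.
data = [1.0, 2.0, 3.0, 4.0, 100.0]
median = statistics.median(data)
mean = statistics.mean(data)

# Hypothetical unregularized weights: two strong features, two weak ones.
weights = [3.0, 0.2, -0.1, -2.0]
lam = 0.5
l1_weights = [l1_shrink(w, lam) for w in weights]
l2_weights = [l2_shrink(w, lam) for w in weights]

print("median:", median, "mean:", mean)
print("L1-penalized weights:", l1_weights)
print("L2-penalized weights:", l2_weights)
```

In this toy setup the L1 penalty zeroes out the two weak weights (which is why it doubles as feature selection), while the L2 penalty only scales every weight down and keeps all of them.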
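The BQ ML bullet above does not spell out the flow, so the following is only my assumption of a typical one: create a model with `CREATE MODEL`, check it with `ML.EVALUATE`, then score new rows with `ML.PREDICT`. The sketch just builds the SQL strings in Python; the helper `bqml_flow` and every dataset, table, model, and column name are hypothetical placeholders.

```python
def bqml_flow(dataset, model, train_table, eval_table, predict_table, label):
    """Return the three SQL statements of an assumed train/evaluate/predict flow."""
    model_ref = f"`{dataset}.{model}`"

    # Step 1: train a model directly from a table (linear regression as an example).
    create = (
        f"CREATE OR REPLACE MODEL {model_ref}\n"
        f"OPTIONS(model_type='linear_reg', input_label_cols=['{label}']) AS\n"
        f"SELECT * FROM `{dataset}.{train_table}`"
    )

    # Step 2: compute evaluation metrics on held-out rows.
    evaluate = (
        f"SELECT * FROM ML.EVALUATE(MODEL {model_ref}, "
        f"TABLE `{dataset}.{eval_table}`)"
    )

    # Step 3: generate predictions for new rows.
    predict = (
        f"SELECT * FROM ML.PREDICT(MODEL {model_ref}, "
        f"TABLE `{dataset}.{predict_table}`)"
    )
    return [create, evaluate, predict]


# Hypothetical names, for illustration only.
statements = bqml_flow(
    dataset="my_dataset",
    model="trip_duration_model",
    train_table="trips_train",
    eval_table="trips_eval",
    predict_table="trips_new",
    label="duration_minutes",
)
for statement in statements:
    print(statement + ";\n")
```

Each printed statement could then be pasted into the BigQuery console or submitted through a client library; the point of the sketch is only the order of the steps.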