ML for the GCP PDE Exam

Intro

This post is part of a series of posts with notes as I’m studying for Google’s Professional Data Engineer Certification.

This particular post covers ML topics.

L1 vs. L2 regression:
- L1 estimates the median of the data using absolute value
  - Reduces low-value features
  - Robust to outliers
  - Good when only certain features contribute to success of model
- L2 estimates the mean of the data to avoid overfitting
  - Not recommended for feature selection
  - Good when all features contribute relatively equally to the success of a model
TensorFlow
- Know how TensorFlow can be deployed (also, cost vs. value)
  - CPUs, GPUs, TPUs
BQ ML
- Understand the basic flow when using BQ ML:
Vertex AI
- Create computation graph and training app
- Package app
- Start Vertex AI job to run packaged app