Pub/Sub in a Nutshell
Intro
This post is part of a series of notes I’m taking while studying for Google’s Professional Data Engineer certification.
This particular post covers Pub/Sub in a nutshell.
Disclaimer
Please read this disclaimer.
Pub/Sub
- Fully managed data ingestion and distribution system
- Asynchronous, real-time message bus
- Good solution for buffering changes in loosely coupled architectures
- Pub/Sub decouples publishers and subscribers
- HIPAA compliant
- Stores data for up to 31 days
  - Default is 7 days
- Has topics and subscriptions
  - Data is published to a topic
  - Subscriptions dictate who receives content for which topics
  - The relationship between topics and subscriptions can be one:one or one:many
  - The relationship between publishers and subscribers can be one:many, many:one, or many:many
- Multiple subscribers can work on a single subscription
- Guarantees “at least once” delivery
- May send duplicate and/or out-of-order messages
- Two options for delivery:
  - Push (from Pub/Sub to subscriber)
  - Pull (from subscriber to Pub/Sub)
    - Synchronous: “Give me n messages”
    - Asynchronous: higher throughput; better for latency-sensitive applications
- Two primary use cases:
  - Streaming analytics/ingestion of data into analytic systems
  - Asynchronous workflows with decoupled publishers and subscribers
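
To make the topic/subscription and “at least once” notes above a bit more concrete, here is a small in-memory sketch. It is a toy model, not the Google Cloud client library: `ToyPubSub`, `expire_ack_deadlines`, `run_demo`, and the topic and subscription names are all made up for illustration, and the real service behaves in more nuanced ways. It shows one topic fanning out to two subscriptions, two workers sharing one subscription via synchronous pulls, and an unacknowledged message being redelivered and then skipped by an idempotent handler.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Message:
    message_id: int
    data: str


class ToyPubSub:
    """In-memory stand-in for the ideas above; NOT the real Pub/Sub API."""

    def __init__(self):
        self._ids = count(1)
        self._subs_by_topic = defaultdict(list)  # topic -> subscription names
        self._queues = {}                        # subscription -> messages waiting for delivery
        self._unacked = {}                       # subscription -> {message_id: Message}

    def create_subscription(self, topic, subscription):
        self._subs_by_topic[topic].append(subscription)
        self._queues[subscription] = deque()
        self._unacked[subscription] = {}

    def publish(self, topic, data):
        # One topic to many subscriptions: each subscription gets its own copy.
        msg = Message(next(self._ids), data)
        for sub in self._subs_by_topic[topic]:
            self._queues[sub].append(msg)
        return msg.message_id

    def pull(self, subscription, max_messages):
        # Synchronous pull: "give me up to n messages".
        queue = self._queues[subscription]
        batch = []
        while queue and len(batch) < max_messages:
            msg = queue.popleft()
            self._unacked[subscription][msg.message_id] = msg
            batch.append(msg)
        return batch

    def ack(self, subscription, message_id):
        self._unacked[subscription].pop(message_id, None)

    def expire_ack_deadlines(self, subscription):
        # Hypothetical helper: pretend the ack deadline passed, so every
        # delivered-but-unacked message is queued for redelivery.
        pending = self._unacked[subscription]
        self._queues[subscription].extend(pending.values())
        pending.clear()


def run_demo():
    bus = ToyPubSub()
    bus.create_subscription("orders", "billing")
    bus.create_subscription("orders", "analytics")
    for item in ["a", "b", "c"]:
        bus.publish("orders", item)

    # Fan-out: the analytics subscription sees every message on the topic.
    analytics = [m.data for m in bus.pull("analytics", 10)]

    # Two workers share the billing subscription and split the messages.
    worker_1 = bus.pull("billing", 2)
    worker_2 = bus.pull("billing", 2)

    seen, processed = set(), []

    def handle(msg):
        # Idempotent handler: duplicates are dropped by message ID.
        if msg.message_id not in seen:
            seen.add(msg.message_id)
            processed.append(msg.data)

    for msg in worker_1:
        handle(msg)
    bus.ack("billing", worker_1[0].message_id)  # second ack is "lost"

    for msg in worker_2:
        handle(msg)
        bus.ack("billing", msg.message_id)

    # The unacked message comes back: at-least-once delivery in action.
    bus.expire_ack_deadlines("billing")
    redelivered = bus.pull("billing", 10)
    for msg in redelivered:
        handle(msg)
        bus.ack("billing", msg.message_id)

    return analytics, [m.data for m in redelivered], processed


if __name__ == "__main__":
    print(run_demo())
```

In this run, the message whose acknowledgement was skipped is delivered a second time, but the handler processes each message only once. That is the usual reason subscribers are written to be idempotent when delivery is only guaranteed “at least once”.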