Natality Streaming
Create a streaming pipeline with Dataflow using the Natality dataset.
PubSub can be used to provide data sources and data sinks within a Dataflow pipeline, where a consumer is a data source and a publisher is a data sink.
Example
We’ll reuse the Natality dataset to create a pipeline with Dataflow, but for the streaming version, we’ll use a PubSub consumer as the input data source rather than a BigQuery result set.
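As a rough illustration of swapping the BigQuery source for a PubSub consumer, the sketch below reads messages from a topic and decodes each one into a record. It is a minimal sketch under stated assumptions, not the lesson's own code: it assumes each message is a UTF-8 JSON object of Natality features, and the names `parse_message`, `run`, and the topic path are hypothetical. The Apache Beam imports sit inside `run` only so that the decoding helper can be tried without Beam installed.

```python
import json


def parse_message(payload: bytes) -> dict:
    """Decode one PubSub payload into a record dict.

    Assumption: the payload is a UTF-8 encoded JSON object.
    """
    return json.loads(payload.decode("utf-8"))


def run(topic: str, pipeline_args=None) -> None:
    """Build and run a streaming pipeline that consumes from a PubSub topic.

    `topic` is a full topic path, e.g. the hypothetical
    "projects/your-project/topics/natality".
    """
    # Imported here so parse_message stays usable without Apache Beam.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import (
        PipelineOptions,
        StandardOptions,
    )

    options = PipelineOptions(pipeline_args)
    # PubSub is an unbounded source, so the pipeline must run in streaming mode.
    options.view_as(StandardOptions).streaming = True

    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            # By default ReadFromPubSub emits each message payload as bytes.
            | "ReadFromPubSub" >> beam.io.ReadFromPubSub(topic=topic)
            | "ParseJson" >> beam.Map(parse_message)
            # Placeholder step; a model-scoring DoFn would go here instead.
            | "Print" >> beam.Map(print)
        )
```

Calling `run("projects/your-project/topics/natality")` with a real topic path would start the pipeline, which keeps running until it is cancelled because the source is unbounded.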
Defining functions
For the output, we’ll publish predictions to Datastore and reuse the publish DoFn from the previous chapter.