Need for containers

When deploying data science models, it’s important to be able to use the same environment for both training and serving. In the Models as Web Endpoints chapter, we used the same machine for both environments, and in Chapter 3: Models as Serverless Functions we used a requirements.txt file to ensure that the serverless ecosystem used for serving the model matched our development environment. Container systems such as Docker provide a tool for building reproducible environments, and they are much lighter weight than alternative approaches such as virtual machines.
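To make the requirements.txt idea concrete, here is a minimal sketch of how a serving environment could be checked against the versions pinned during development. It is an illustration rather than part of the course code: the helper names `parse_pins` and `find_mismatches` are hypothetical, the package versions in the example are placeholders, and only simple `name==version` lines are handled (extras, environment markers, and version ranges are ignored).

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Return {package: version} for lines of the form 'name==version'."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # skip blank lines and requirements that are not pinned
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins, installed_version=metadata.version):
    """Return {package: (pinned, installed)} for packages whose versions differ.

    A package that is not installed is reported with installed=None.
    """
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = installed_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


if __name__ == "__main__":
    # Hypothetical pins; in practice, read the text from requirements.txt.
    example = "pandas==1.0.3\nscikit-learn==0.22.2\n"
    print(find_mismatches(parse_pins(example)))
```

An empty result means that every pinned package is installed at the expected version. A check like this covers only Python packages; a container also fixes the operating system libraries and the Python runtime itself, which is part of why containers are the more complete route to reproducibility.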
