Endpoints and Deployment

Learn about Azure-supported endpoints and deployment.

How do we integrate our ML models into our product? Let’s talk about deploying the model. Deployment has multiple approaches and multiple aspects to consider. The first aspect is online vs. offline (batch) deployment. Let’s compare the two methods.

Online vs. Batch Deployment

Online deployment

  • Used when we want the model to return results immediately.
  • Examples include serving search results and making real-time predictions.
  • In online deployment mode, the service must respond quickly, with low latency.
  • A lightweight database retrieval system and a lightweight model are preferred in such cases.

Batch deployment

  • Used when the model doesn’t need to run in real time; we invoke the model offline to generate the results.
  • An example use case is processing and indexing documents. We can generate recommendations offline and leverage the output in the online service.
  • Results are delayed: a batch run can take hours or days to complete.
  • This mode is used when we have to process a huge amount of data that requires a lot of processing time.
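To make the contrast concrete, here is a minimal Python sketch of the two calling patterns. The endpoint URL, the helper names, and the input file format are hypothetical placeholders, and the model is assumed to expose a scikit-learn-style `predict`; the point is only that online scoring answers one low-latency request at a time over HTTP, while batch scoring iterates over a stored dataset offline.

```python
import json

import requests  # assumed available; used only to illustrate the calling pattern

# Hypothetical real-time endpoint URL; a deployed service provides its own.
ONLINE_ENDPOINT = "https://my-service.example.com/score"


def score_online(record: dict) -> dict:
    """Online mode: one low-latency HTTP call per incoming request."""
    response = requests.post(ONLINE_ENDPOINT, json=record, timeout=2)
    response.raise_for_status()
    return response.json()


def score_batch(input_path: str, output_path: str, model) -> None:
    """Batch mode: score a whole file offline; results are consumed later."""
    with open(input_path) as src, open(output_path, "w") as dst:
        for line in src:
            record = json.loads(line)  # assumes one JSON record per line
            prediction = model.predict([record["features"]])[0]
            dst.write(json.dumps({"id": record["id"], "prediction": prediction}) + "\n")
```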

Components in Service Deployment

What are the essential aspects needed for running a service? Let's break them down below.

An ML model needs the following details for deployment:

  • Environment configuration: Specifications of which software and packages to install.

  • Scoring code: The script that loads the model and uses it to score incoming request data.

  • Inference configuration: Specifications of the scoring script and the environment needed to run it.
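As a concrete illustration, the sketch below wires these three pieces together with the Azure Machine Learning Python SDK (v1). The file names (`score.py`, `conda.yml`), the service and model names, and the use of joblib for a scikit-learn model are assumptions for this example; the `init`/`run` entry points, `Environment`, `InferenceConfig`, and `Model.deploy` are the SDK's standard pattern.

```python
# score.py — the scoring code: loads the model once, then scores each request.
import json
import os

import joblib  # assumes the model was saved with joblib


def init():
    # Azure ML sets AZUREML_MODEL_DIR to the folder containing the registered model.
    global model
    model_path = os.path.join(os.environ["AZUREML_MODEL_DIR"], "model.pkl")
    model = joblib.load(model_path)


def run(raw_data):
    # Called for every request; raw_data is the JSON payload sent to the endpoint.
    data = json.loads(raw_data)["data"]
    predictions = model.predict(data)
    return predictions.tolist()
```

```python
# deploy.py — environment configuration + inference configuration + deployment.
from azureml.core import Environment, Workspace
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AciWebservice

ws = Workspace.from_config()  # assumes a local config.json for the workspace

# Environment configuration: which software and packages to install.
env = Environment.from_conda_specification(name="sklearn-env", file_path="conda.yml")

# Inference configuration: the scoring script plus the environment to run it in.
inference_config = InferenceConfig(entry_script="score.py", environment=env)

# Deploy the registered model as a web service (ACI is used here for simplicity).
deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)
service = Model.deploy(
    workspace=ws,
    name="my-ml-service",                 # hypothetical service name
    models=[Model(ws, name="my-model")],  # assumes a model registered as "my-model"
    inference_config=inference_config,
    deployment_config=deployment_config,
)
service.wait_for_deployment(show_output=True)
```

Once deployment completes, the service exposes a scoring URI that clients can call over HTTP, which is exactly the online deployment mode described above.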

...