High-Level Workflow

Learn about the workflow of a typical deepfake setup.

Fake content generation is a complex task with many components and steps that together produce believable content. Although this space sees a lot of research and hacks that improve the overall results, most setups can be explained using a few common building blocks. In this lesson, we'll discuss a high-level flow that describes how a deepfake setup uses data to train and generate fake content. We'll also touch upon a few architectures that many works use as basic building blocks.

As discussed earlier, a deepfake setup requires a source identity (x_s) that drives the target identity (x_t) to generate fake content (x_g). To understand the high-level flow, we'll continue with this notation, along with the concepts related to the key feature set discussed in the previous section. The steps are as follows:

  • Input processing: The input image (x_s or x_t) is processed using a face detector that identifies and crops the face. The cropped face is then used to extract intermediate representations or features.

  • Generation: The intermediate representation, along with a driving signal (x_s or another face), is used to generate a new face.

  • Blending: A blending function then merges the generated face into the target as cleanly as possible.
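
To make the three steps concrete, here is a minimal sketch in Python with NumPy. It is a toy under stated assumptions, not an actual deepfake implementation: every function name (crop_face, extract_features, generate_face, blend) is a hypothetical placeholder, the face box is passed in by hand instead of coming from a real face detector, the "features" are just per-channel mean colors instead of a learned representation, and the "generator" is a simple color shift instead of a trained network. Only the data flow between the steps mirrors the description above.

```python
import numpy as np

# All function names below are hypothetical placeholders for illustration,
# not the API of any real deepfake library.


def crop_face(image, box):
    """Input processing: crop the face given a (top, left, height, width) box.

    A real setup would obtain `box` from a face detector; here it is an input.
    """
    top, left, h, w = box
    return image[top:top + h, left:left + w].copy()


def extract_features(face):
    """Stand-in for an encoder: per-channel mean intensity of the cropped face."""
    return face.reshape(-1, face.shape[-1]).mean(axis=0)


def generate_face(features, driving_face):
    """Stand-in for a generator: keep the driving face's structure and shift
    its colors toward the given features (clipped to the valid [0, 1] range).
    """
    driving_mean = extract_features(driving_face)
    return np.clip(driving_face + (features - driving_mean), 0.0, 1.0)


def blend(target_image, generated_face, box, alpha=0.8):
    """Blending: alpha-blend the generated face into the target at `box`."""
    top, left, h, w = box
    out = target_image.copy()
    region = out[top:top + h, left:left + w]
    out[top:top + h, left:left + w] = alpha * generated_face + (1.0 - alpha) * region
    return out


# Random arrays stand in for real images with values in [0, 1].
rng = np.random.default_rng(0)
x_s = rng.random((64, 64, 3))  # source image (drives the result)
x_t = rng.random((64, 64, 3))  # target image (receives the generated face)
box = (16, 16, 32, 32)         # assumed face location in both images

source_face = crop_face(x_s, box)                  # input processing
target_face = crop_face(x_t, box)
features = extract_features(target_face)           # intermediate representation
generated = generate_face(features, source_face)   # generation
x_g = blend(x_t, generated, box)                   # blending
```

In this sketch, x_g has the same shape as x_t and differs from it only inside the face box. A fixed alpha leaves a visible seam at the box boundary, which is one reason practical setups tend to use softer masks or more elaborate blending functions.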

Individual works employ additional intermediate or post-processing steps to improve the overall results. The figure below depicts the main steps in detail:
