Audio Enhancement

Explore sound enhancement techniques, including source separation, repair, and inpainting.

Broadly speaking, audio enhancement is an umbrella term used by the research community for improving the quality of an audio signal. Audio signals can be degraded in many ways: interference from other audio signals, network failures that result in lost packets, severe compression, and even physical deterioration of the medium on which they are stored. Audio enhancement can take forms such as source separation, audio repair, and audio inpainting.

Every day, our ears receive an audio mixture made up of all the sounds, from inside and outside our immediate environment, that reach us. Even so, we are able to actively ignore other signals and focus on the one we are interested in. For example, listen carefully for sounds in your environment that you were not aware of, and notice how you can focus on one sound at a time.

Source separation

Machine listening, on the other hand, has no inherent ability to separate the target source from the audio mixture. This brings us to one of the first approaches to audio enhancement: source separation.

In source separation, the audio signal is assumed to be a mixture created by adding multiple audio signals together. Because the relationship between the target signal and the noise is additive, we can extract the target audio by subtracting the noise signal from the mixture, provided the noise signal is known or can be estimated.

The accompanying screenshot, produced with iZotope RX, illustrates the waveform of a noise signal at 16 kHz.
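The additive mixing model described above can be sketched in a few lines of Python. This is a minimal illustration, not a practical separation method: the 440 Hz tone standing in for the target, the Gaussian white noise, the 16 kHz sample rate, and the helper names `sine_wave` and `white_noise` are all assumptions made for the example. It also assumes the noise signal is known exactly, which is rarely true in practice.

```python
import math
import random

SAMPLE_RATE = 16_000  # samples per second (assumed for this sketch)
N_SAMPLES = 160       # 10 ms of audio at 16 kHz


def sine_wave(freq_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Return a pure tone as a list of float samples (stand-in for the target)."""
    return [
        math.sin(2 * math.pi * freq_hz * n / sample_rate)
        for n in range(n_samples)
    ]


def white_noise(n_samples, scale=0.3, seed=0):
    """Return seeded Gaussian white noise (stand-in for the interfering signal)."""
    rng = random.Random(seed)
    return [scale * rng.gauss(0.0, 1.0) for _ in range(n_samples)]


target = sine_wave(440.0, N_SAMPLES)
noise = white_noise(N_SAMPLES)

# Additive mixing model: mixture[n] = target[n] + noise[n]
mixture = [t + v for t, v in zip(target, noise)]

# If the noise were known exactly, subtracting it would recover the target.
recovered = [m - v for m, v in zip(mixture, noise)]

# The difference from the original target is limited to floating-point rounding.
max_error = max(abs(r - t) for r, t in zip(recovered, target))
print(f"max reconstruction error: {max_error:.2e}")
```

In a real recording the noise is not available as a separate signal, so a source separation system has to estimate it (or the target directly) from the mixture alone. The subtraction step here only demonstrates why the additive assumption makes the problem well defined.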
