Can Artificial Intelligence Perform Sound Mixing Better Than a Human?

In the latest episode of “Film Science,” Syrp Lab shows off the prowess of artificial intelligence (AI) in sound mixing. Along with the human vs. AI comparison, Syrp Lab’s new video explains how video editors can achieve better sound through sound mixing, whether or not they use AI-powered tools.

Before seeing what AI sound mixing technology can achieve, it is important to discuss the fundamentals of sound. Broadly speaking, a sound is a vibration that travels as energy, or an acoustic wave, through a transmitting medium, such as air. The human ear and brain work together to convert sound waves that enter the ear into electrical impulses that can be interpreted as intelligible sound.

Sound is measured by amplitude and frequency. Amplitude is the height of the sound wave or the loudness of a sound. Sound with greater amplitude is louder. Frequency, sometimes called pitch, is the number of times a sound pressure wave repeats per second. The more it repeats, the higher its frequency and pitch. Amplitude is measured in decibels (dB) and frequency in hertz (Hz).
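As a quick illustration of how the decibel scale works (a sketch for this article, not something from the video), converting a linear amplitude ratio to dB uses a base-10 logarithm, which is why doubling a sound’s amplitude adds roughly 6 dB rather than doubling the dB figure:

```python
import math

def amplitude_to_db(amplitude, reference=1.0):
    """Convert a linear amplitude ratio to decibels (dB).

    The scale is logarithmic: doubling the amplitude adds about 6 dB,
    and a tenfold increase adds exactly 20 dB.
    """
    return 20 * math.log10(amplitude / reference)

print(amplitude_to_db(2.0))   # ≈ 6.02 dB (double the amplitude)
print(amplitude_to_db(10.0))  # 20.0 dB (ten times the amplitude)
```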

Sound perception chart
A kick drum has a lower frequency than the highest note on a piano (C8), for example, so the drum’s sound pressure wave repeats fewer times per second than the C8 note.

Sound mixing involves modifying parts of an audio recording to improve its overall quality. This can include changing the amplitude of specific sounds to expand or reduce the audio’s dynamic range. Alternatively, it may involve adjusting frequencies to change the pitch of different audio sources. As with editing a photo’s basic parameters, such as exposure and color balance, sound mixing can be motivated by technical reasons, such as making a person easier to hear and understand, or artistic ones, like conveying different emotions by altering a soundscape.

Syrp Lab turned to Alex Knickerbocker, a professional recordist, sound engineer, and audio mixer, to explain how sound mixing adjusts amplitude and frequency to produce better-quality sound.

“In terms of mixing specifically, it’s about taking really well-done source material and balancing it out to accentuate its cinematic feel and calming feel,” Knickerbocker explains.

Although mixing sound for YouTube differs from Knickerbocker’s work for major movie studios, the same basic principles apply.

“You want to record good, clean dialogue. Once you’ve got the dialogue in, it’s about making it as listenable as possible,” he adds.

This includes removing clicks, pops, and other distracting noises. It also includes the use of equalization (EQ) to alter the shape of sound (amplitude and frequency) to make voices easier to hear and understand and more enjoyable to listen to.

When editing a video that includes dialogue and a background music track, a common scenario for many content creators, a good sound mix is not just about adjusting the volume (amplitude) of the music so the dialogue is easy to hear; it also involves altering the frequencies of each audio track so they complement each other rather than competing for the viewer’s ear.
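To make that idea concrete, here is a minimal, hypothetical sketch (not from the video) of “carving out” the speech-critical band, roughly 1–4 kHz, from a music track so dialogue laid on top stays intelligible. Real mixes use EQ filters rather than FFTs, but the frequency-domain version below shows the same principle:

```python
import numpy as np

def duck_speech_band(music, sample_rate=48000, low=1000, high=4000, gain_db=-6.0):
    """Attenuate a music track in the speech-critical band (~1-4 kHz)
    so dialogue mixed on top stays intelligible.

    A crude frequency-domain sketch: transform to the frequency domain,
    reduce the bins the voice needs, and transform back.
    """
    spectrum = np.fft.rfft(music)
    freqs = np.fft.rfftfreq(len(music), d=1.0 / sample_rate)
    band = (freqs >= low) & (freqs <= high)
    spectrum[band] *= 10 ** (gain_db / 20)  # -6 dB halves the amplitude
    return np.fft.irfft(spectrum, n=len(music))

# A 2 kHz test tone (inside the speech band) loses about half its amplitude:
t = np.arange(48000) / 48000
tone = np.sin(2 * np.pi * 2000 * t)       # peak amplitude 1.0
ducked = duck_speech_band(tone)
print(round(float(np.abs(ducked).max()), 2))  # ~0.5
```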

This ties into the three basic stages of a typical audio mix, as Syrp Lab describes. The first step is to apply EQ to the microphone track, making sure there are no harsh frequencies that are unpleasant to listen to. Next is compression, which evens out the overall amplitude of the audio to avoid particularly quiet or loud passages. The last step is to apply EQ to the background sounds to eliminate competing frequencies.

Audio mixing steps
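Compression, the second of those stages, turns peaks down by a ratio once they cross a loudness threshold. The toy example below (a hypothetical illustration, not Knickerbocker’s workflow) shows only that core gain math; real compressors also smooth the gain changes with attack and release times:

```python
import numpy as np

def compress(samples, threshold_db=-20.0, ratio=4.0):
    """A toy downward compressor: samples louder than the threshold are
    scaled so they exceed it by only 1/ratio as many decibels.
    """
    eps = 1e-12  # avoids log(0) on silent samples
    level_db = 20 * np.log10(np.abs(samples) + eps)
    over = np.maximum(level_db - threshold_db, 0.0)  # dB above threshold
    gain_db = -over * (1.0 - 1.0 / ratio)            # shave off the excess
    return samples * 10 ** (gain_db / 20)

loud = np.array([1.0])    # 0 dBFS, 20 dB over the -20 dB threshold
quiet = np.array([0.05])  # about -26 dBFS, under the threshold
print(compress(loud)[0])  # ≈ 0.178: now only 5 dB over the threshold
print(compress(quiet)[0]) # 0.05: quiet sounds pass through unchanged
```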

However, perhaps AI sound mixing tools, such as the AI Dialogue Leveler and AI Voice Isolation features released in DaVinci Resolve 18.1, make manual mixing skills unnecessary. Syrp Lab tests this by comparing three hand-mixed audio samples against the AI sound mixing tools in DaVinci Resolve and Adobe Audition.

As it turns out, according to the judges (Syrp Lab employees), DaVinci Resolve’s AI tools perform exceptionally well at isolating dialogue in noisy environments. However, there is still a subjective element to sound quality that the AI can’t handle, at least not yet.

Knickerbocker says the best use case for AI in audio mixing right now is cleaning up audio. This is especially true with poorly recorded audio that makes it hard to hear someone speaking.

“Artificial intelligence tools can help people who really don’t know anything about audio restoration get something passable out of something that was rubbish,” says Knickerbocker.

Image credits: Syrp Lab
