Music, often paired with singing, is an important part of broadcast content, be it in TV dramas or concert live streams. In many cases, music and dialogue occur at the same time – just like in real life. This specific situation is one of dialogue enhancement’s greatest challenges. Common technologies struggle to detect singing within a film’s music and to exclude it from enhancement. As a result, singing is often enhanced unintentionally while the other musical elements are attenuated. This can cause audible artifacts, degrade music quality, and reduce the intelligibility of both dialogue and singing.
Fraunhofer IIS has now solved this issue with the latest feature of MPEG-H Dialog+. MPEG-H Dialog+ is probably the first technology that prevents dialogue enhancement from being applied automatically to singing. This retains the original quality of music sequences while still allowing dialogue passages to be customized. The result: consistently enhanced dialogue while maintaining the high sound quality of the background singing.
Public broadcaster ARTE has been relying on MPEG-H Dialog+ and Fraunhofer IIS implementation support for its streaming service arte.tv for quite some time. The audience can switch between the original production and a dialogue-enhanced version, labeled “Klare Sprache” for German content and “Confort Audio” for French content. By adopting the new feature, which is now available on arte.tv, ARTE is taking another important step toward accessible streaming. The German public broadcaster ARD also uses MPEG-H Dialog+ in its “Mediathek” VoD platform.
“MPEG-H Dialog+ is the perfect solution for content providers, such as broadcasters, to enhance audio material when only the final audio mix is available,” says Marc Gayer, Head of the Audio and Media Technologies’ Business Department at Fraunhofer IIS. “Audiences can enjoy their preferred version by switching between the original mix and the dialogue-enhanced version.”
“With MPEG-H Dialog+, we can now deliver the best possible audio and video content to our viewers,” says Kemal Görgülütz, CTO of ARTE. “With the option to select ‘Confort Audio’, users are now able to choose their preferred audio version. This improves accessibility and delivers custom streaming experiences.”
About MPEG-H Dialog+
MPEG-H Dialog+ is based on artificial intelligence and uses a deep neural network to automatically separate speech from the background (music, effects, ambience) of a final audio mix. The technology attenuates the background whenever speech is present and automatically remixes the content into a new, dialogue-enhanced version. MPEG-H Dialog+ thus offers an alternative to the original audio mix that viewers can select if they want clearer dialogue, giving people with hearing disabilities, for instance, an easier-to-understand option.
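The remixing step described above can be illustrated with a minimal Python sketch. It is not the Fraunhofer IIS implementation: it assumes the speech and background stems have already been separated (the neural-network separation itself is not shown), and the function names, the frame length, the speech threshold, and the attenuation of 6 dB are hypothetical values chosen purely for illustration.

```python
import math


def db_to_gain(db):
    """Convert a level change in decibels to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def frame_rms(samples):
    """Root-mean-square level of one frame of samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def remix(speech, background, frame_len=4, threshold=0.01, attenuation_db=-6.0):
    """Remix two already-separated stems into a dialogue-enhanced signal.

    All parameters are illustrative assumptions, not values used by
    MPEG-H Dialog+. The background is attenuated only in frames where
    the speech stem exceeds the RMS threshold.
    """
    if len(speech) != len(background):
        raise ValueError("stems must have the same length")
    duck = db_to_gain(attenuation_db)
    out = []
    for start in range(0, len(speech), frame_len):
        sp = speech[start:start + frame_len]
        bg = background[start:start + frame_len]
        # Attenuate the background only while speech is present in this frame.
        gain = duck if frame_rms(sp) > threshold else 1.0
        out.extend(s + gain * b for s, b in zip(sp, bg))
    return out


# Toy stems: one silent frame followed by one frame containing speech.
speech = [0.0] * 4 + [0.5] * 4
background = [0.2] * 8
enhanced = remix(speech, background)
```

In this toy input, the first frame passes the background through unchanged, while the second frame mixes the speech with an attenuated background. A production system would presumably smooth the gain over time to avoid audible jumps at frame boundaries; the sketch switches the gain abruptly to keep the idea visible.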