How the media industry can take advantage of the AI wave

Do not adjust your television sets; what you are about to read has the potential to change how you produce and consume content. While almost every industry is talking about the transformative potential of generative AI, it’s perhaps most visually obvious in film and TV. Here, artificial intelligence-generated content (AIGC) is reshaping the way we produce and edit digital content.

A key enabler of this is the cloud, which is the only real way to supply the amount of compute power that AI needs. As generative AI models are used more widely, that compute capacity will need to increase significantly while latency is reduced to almost nothing. We’re already on our way towards that future: by 2030, it’s estimated that 90% of content will be generated by AI.

“The media and entertainment industry is a pioneer for innovation and has undergone a digital transformation over the past few years,” said Jamy Lyu, president of Huawei Cloud Media Services.

“The newspaper used to be the main media for most people to obtain information, but then movies and TV sparked a leap in entertainment. People started to spend more time on video and in this era, high-quality content was the key to business success.

“Now users are spending more time on mobile. But, most importantly, people are not just watching, they have also become content creators. So now it is user-generated content that is making the real difference.”

While digital transformation within the media industry is well underway, the next few years will see mass adoption of AIGC for video content. From rapid translations to virtual humans presenting news, what we watch is about to change before our very eyes.

Introducing Huawei Cloud Solutions

Huawei has been at the forefront of much of this technology, developing AIGC applications with partners for a number of years. Much of its work has been about taking popular content from China and making it more accessible to a global audience.

A good example of this is AI video translation, which can reproduce the original sounds and tones of a character from a movie or TV show and replicate them in multiple languages. It does this so quickly that the time needed for translating one piece of work is reduced from months to mere days.

Traditionally, voice cloning has worked by recording and labeling hundreds of sentences spoken by an individual to capture the range of sounds they make when speaking. The Huawei Cloud Media Model significantly simplifies this process. With its powerful speech generation capability, it can create a lifelike, high-quality cloned voice that retains the emotion and nuances of the original recording.

There are many scenarios in which a cloned voice can come into play, such as chatbots, news reporting, and live streaming. Huawei suggests that you can always find a voice that fits your specific needs.

Take, for example, filmmakers in China who are embracing the international market. They see more efficient and precise translations of their works as essential to succeeding on the global stage. From dubbing dialogue to the true representation of a character’s voice, Huawei Cloud Media Model’s AI-powered translation is not just about language. It’s also able to overcome one of the biggest challenges in dubbing: a person’s facial expressions and lip movements being out of sync with the dialogue. Huawei’s visual AI technology is able to reproduce these so that they match the dubbed dialogue, in languages including English, French, German, Spanish, and Arabic.

This can be seen in action in the Chinese documentary To the Summit, which won the 2019 Laureus Sporting Moment of the Year. The film follows double-leg amputee Xia Boyu on his fourth attempt to reach the top of Mount Everest. The story of his relentless pursuit has been translated into numerous languages, including English, via the Huawei Cloud Pangu Media Model, to make his inspiring story a global one. The model also creates natural-looking lip movement, taking Xia’s Chinese dialogue and making it appear as fluent English.

A world of pure AI-imagination

One of the most time-consuming forms of media is animation, but generative AI is taking the artistic process and drastically reducing production time.

Huawei’s AI-to-animation conversion capability, for example, can be applied to short films and even classic movies. The software allows the smallest of character details and motions, such as dancing, to be reproduced in the animation, with the production cycle slashed from months to days.

The system is trained on dozens of images of specific aesthetic styles from animation, comics, and games so that it can quickly generate an animated video in a specific style from an original video input.

Currently, animated video generation requires stylizing each frame and then blending those frames into one video. But with Huawei’s system, an animated video of any desired length can be generated to suit, which presents a new, rapid method of animation production.

This state-of-the-art technology has already helped win awards: the short film To Dear Me won the Best Film award in the AIGC Short Film Section of the 14th Beijing International Film Festival. The animated movie was built on a partnership between the Communication University of China, Ainimate Lab, and Huawei Cloud.

The film, which was converted from a real-person short video, features a considerable amount of dancing. With traditional animation techniques, this would have been very difficult to replicate, but Huawei’s cutting-edge technology smoothed the process significantly.

Virtual people

While we immerse ourselves in movies and animations, we are still drawn to the real world and more local and personalized storytelling. But AI is being used to great effect there too.

The news publication China Youth Daily works with Huawei Cloud MetaStudio to create virtual avatars for its journalists. Videos for political news, or stories about science, education, and sports are presented on various platforms, diversifying news content and breathing new life into traditional media productions. The main selling point is that AI-infused video production saves a large amount of time and effort spent on scene creation, video shooting, and editing, leaving the journalists more time to do what they love – finding and telling great stories.

As the technology evolves, Huawei sees multiple use cases for virtual humans, such as virtual news anchors, program hosts, or even marketing promoters. What’s more, MetaStudio already supports modeling for complex scenarios such as walking and holding objects. The virtual humans created on MetaStudio will feature a lip sync accuracy of more than 95%, and they will be capable of expressing emotions and speaking multiple languages almost instantly.

Generative AI has the capacity to revolutionize the media industry, opening doors to projects and techniques never before thought possible.

ITPro is a global business technology website providing the latest news, analysis, and business insight for IT decision-makers. Whether it's cyber security, cloud computing, IT infrastructure, or business strategy, we aim to equip leaders with the data they need to make informed IT investments.
