JAKARTA - YouTube Shorts will soon have automatically generated video descriptions, thanks to the merger of the DeepMind and Brain teams into one major Artificial Intelligence (AI) team at Google.

Dubbed Google DeepMind, the combined team has detailed how one of its Visual Language Models (VLMs) is used to generate descriptions for the TikTok-rivaling service.

"YouTube Shorts are created in just a few minutes and often don't include useful descriptions and titles, making them harder to find through search. So we introduced our visual language model, Flamingo, to help generate descriptions," the Google DeepMind team said on its official blog, quoted Thursday, May 25.

Flamingo creates a description by analyzing the initial frames of a video to explain what is happening. The tool is capable of generating descriptions for all new Shorts videos.
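Flamingo itself has not been released publicly, but the basic idea can be sketched with an open-source captioning model. The snippet below is a minimal illustration, assuming OpenCV for frame extraction and Hugging Face's BLIP model as a stand-in for Flamingo; it is not YouTube's actual pipeline.

```python
# Minimal sketch: caption an early video frame with an open-source VLM.
# BLIP is a stand-in here for Flamingo, which is not publicly released;
# this is illustrative only, not YouTube's production pipeline.
import cv2
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

def describe_short(video_path: str) -> str:
    """Read the first frame of a video and generate a one-line description."""
    cap = cv2.VideoCapture(video_path)
    ok, frame = cap.read()  # first frame; Flamingo analyzes initial frames
    cap.release()
    if not ok:
        raise ValueError(f"could not read a frame from {video_path}")
    image = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    inputs = processor(images=image, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=30)
    return processor.decode(out[0], skip_special_tokens=True)

print(describe_short("my_short.mp4"))  # e.g. "a dog catching a frisbee in a park"
```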

These text descriptions are stored as metadata to better categorize videos and match search results to viewer queries.
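As a rough illustration of how stored descriptions could be matched against a viewer's query, here is a toy keyword-similarity search using scikit-learn's TF-IDF; the video IDs and descriptions are invented, and YouTube's real retrieval and ranking stack is proprietary and far more sophisticated.

```python
# Toy search over Shorts description metadata (illustrative only).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical metadata store: video_id -> generated description
metadata = {
    "a1": "a dog catching a frisbee in a park",
    "b2": "homemade pasta recipe with tomato sauce",
    "c3": "sunset time-lapse over a city skyline",
}

ids = list(metadata)
vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(metadata.values())

def search(query: str, top_k: int = 2):
    """Rank videos by cosine similarity between the query and descriptions."""
    q_vec = vectorizer.transform([query])
    scores = cosine_similarity(q_vec, doc_matrix)[0]
    ranked = sorted(zip(ids, scores), key=lambda p: p[1], reverse=True)
    return ranked[:top_k]

print(search("dog playing frisbee"))  # "a1" should rank first
```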

"Now, users can watch more relevant videos and find more easily what they are looking for from various global creators," said the Google DeepMind team.

In addition, Google DeepMind is applying its AI research to improve the YouTube experience more broadly by working with YouTube's engineering and product teams.

"We have helped optimize the decision-making process that improves security, reduce latency, and improve audience experience, creators and advertisers for all," said Google DeepMind team.

The Google DeepMind team notes that total internet traffic is expected to keep growing, making video compression an increasingly important problem.

The team explored applying its MuZero AI model to improve VP9, a codec that compresses video for delivery over the internet. It then deployed MuZero on a portion of YouTube's live traffic.
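MuZero's job in this setting is rate control: choosing a quantization parameter (QP) for each frame so the encoded stream tracks a target bitrate without wasting bits. The toy loop below uses a simulated encoder and a simple proportional rule rather than MuZero or libvpx; it only illustrates the sequential decision problem the learned agent solves.

```python
# Toy per-frame rate controller. NOT MuZero and NOT libvpx/VP9: a simulated
# encoder plus a proportional update rule, just to show the decision problem
# (pick a QP per frame so cumulative bits track a target budget).

def fake_encode(complexity: float, qp: int) -> float:
    """Simulated encoder: harder frames and lower QP cost more bits."""
    return complexity * 100_000 / qp

def rate_control(frame_complexities, target_bits_per_frame):
    qp, used = 30, 0.0  # mid-range starting QP (VP9 QPs span roughly 0-63)
    for i, c in enumerate(frame_complexities, start=1):
        used += fake_encode(c, qp)
        budget = target_bits_per_frame * i
        # Over budget -> raise QP (compress harder); under -> lower QP.
        if used > budget and qp < 63:
            qp += 1
        elif used < budget and qp > 1:
            qp -= 1
    return used

total = rate_control([0.8, 1.2, 1.0, 2.5, 0.6] * 20, target_bits_per_frame=3_000)
print(f"total bits used: {total:,.0f}")
```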

"Since its launch to the production stage in part of YouTube's direct traffic, we have shown an average 4 percent reduction in bit speed in a large and diverse collection of videos," said Google DeepMind team.

"Bitrate helps determine the computing and bandwidth capabilities needed to play and store video affect everything from how long the video is loaded to the resolution, buffering, and use of data," he added.

Since 2018, Google DeepMind has collaborated with YouTube to better educate creators about which types of videos can earn advertising revenue, and to ensure ads appear alongside content that complies with YouTube's advertiser-friendly guidelines.

"Together with the YouTube team, we developed a label quality model (LQM) that helps label videos more accurately in line with YouTube's advertiser-friendly guidelines. The model improves the accuracy of the ads shown on videos under those guidelines," the Google DeepMind team explained.
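Details of the LQM are not public, but at its core it is a classifier that decides whether content complies with advertiser-friendly guidelines. The sketch below is a heavily simplified stand-in, a bag-of-words logistic regression trained on invented examples; the real model operates on video, audio, and text signals at an entirely different scale.

```python
# Heavily simplified stand-in for a "label quality model": a text classifier
# predicting whether content is ad-friendly. Training data is invented; the
# real LQM is multimodal and not public.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

titles = [
    "family picnic vlog", "kids science experiment",
    "graphic accident footage", "extreme violence compilation",
]
ad_friendly = [1, 1, 0, 0]  # 1 = suitable for ads, 0 = not

clf = make_pipeline(CountVectorizer(), LogisticRegression())
clf.fit(titles, ad_friendly)

print(clf.predict(["weekend family vlog"]))       # likely [1]
print(clf.predict(["violence caught on camera"])) # likely [0]
```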

Lastly, to improve the creator and viewer experience, the Google DeepMind team developed an AI system that automatically processes video transcripts along with audio and visual features, and suggests chapter segments and titles to YouTube creators.
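The chaptering idea can be approximated on the transcript alone by cutting where the topic shifts most between adjacent sentences. The sketch below uses TF-IDF cosine similarity as a crude stand-in; Google DeepMind's system also draws on audio and visual features and a trained multimodal model.

```python
# Crude transcript-only chaptering sketch: place one chapter break at the
# biggest topic shift between adjacent transcript sentences. A toy stand-in;
# AutoChapters also uses audio/visual features and a trained model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def best_chapter_break(sentences):
    """Return the index where a new chapter most likely starts."""
    vecs = TfidfVectorizer().fit_transform(sentences)
    sims = [cosine_similarity(vecs[i], vecs[i + 1])[0, 0]
            for i in range(len(sentences) - 1)]
    return sims.index(min(sims)) + 1  # lowest overlap -> sharpest topic change

transcript = [
    "welcome back to the channel today we are baking bread",
    "first mix the flour yeast and water for the bread dough",
    "now let's switch gears and review my new camera",
    "the camera sensor handles low light surprisingly well",
]
print(best_chapter_break(transcript))  # 2 -> break before the camera review
```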

As Sundar Pichai announced at Google I/O 2022, automatically generated chapters are now available for 8 million videos, and the team plans to scale the feature to more than 80 million videos over the next year.

Using AutoChapters, viewers spend less time searching for specific content, and creators save time making chapters for their videos.

