JAKARTA - A recent report reveals that Apple used YouTube videos to train its AI model, Apple Intelligence. This is said to violate the platform's content policy.

An investigation by Proof News, co-published with Wired, states that Apple and several other technology companies, including Nvidia and Anthropic, used publicly available user-generated data to train their AI models.

According to the investigation, Apple used a dataset called YouTube Subtitles, which contains transcripts from 173,536 YouTube videos drawn from more than 48,000 channels. The videos in this dataset span many types of content, from educational channels such as Khan Academy and MIT, to news outlets such as The Wall Street Journal, to leading creators on the platform such as MrBeast and Marques Brownlee.

Marques Brownlee said that Apple technically avoids fault because it sources its data from companies that use transcripts of YouTube videos, rather than collecting that data directly. However, the transcripts still feed AI models, even though creators invested their own time and money in producing the videos. Brownlee concluded that this will be a growing problem for a long time.

Proof News has also built a tool that lets creators search for their content in the dataset. The YouTube Subtitles dataset contains no images from the videos, but it does include some subtitles translated into various languages. The dataset was reportedly created by a nonprofit research lab called EleutherAI, which focuses on promoting open science norms.

None of the companies mentioned in the report immediately commented on the issue. YouTube CEO Neal Mohan has stated in an interview that using YouTube videos to train AI models would be a "clear violation" of the platform's policies.

