Meta CEO Mark Zuckerberg pointed to YouTube’s struggle with piracy to justify the use of copyrighted material in an AI copyright case, according to a summary of a deposition from last year.
The deposition was part of Kadrey v. Meta, one of numerous copyright lawsuits currently pitting creators against AI companies. AI companies usually defend their training on protected content as ‘fair use’, a claim many creators dispute.
Zuckerberg invoked YouTube to highlight a potential area of subjectivity in the fair-use debate. He argued that while YouTube, like Meta, might unintentionally host some pirated content, it intends to remove such material, and he suggested that most of the content it hosts is legitimate.
Zuckerberg illustrated Meta’s ‘fair use’ stance with the example of LibGen, a data set of e-books that Meta used to train its Llama AI models, which compete with models from the likes of OpenAI. Meta internally approved the use of LibGen even though it has previously faced legal trouble for hosting copyrighted content from multiple publishers.
Counsel for the plaintiffs, who include popular authors Sarah Silverman and Ta-Nehisi Coates, presented filings claiming Meta was aware of LibGen’s ‘pirated’ status. Zuckerberg professed limited knowledge of LibGen, but argued that a broad prohibition on such data sets would be unreasonable.
The latest version of the complaint in Kadrey v. Meta alleges that Meta cross-referenced pirated content in LibGen against legally available books, presumably to decide whether a licensing agreement with a publisher would be worthwhile. It also claims that Meta’s latest and upcoming models were trained on the ‘pirated’ data sets.
Original source: Read the full article on TechCrunch