Lawsuit Alleges Meta CEO Mark Zuckerberg Approved Training AI on Pirated Materials

In an ongoing copyright case, lawyers for the plaintiffs accuse Meta’s CEO, Mark Zuckerberg, of authorizing the use of pirated e-books and papers to train the company’s Llama AI models.

The case, Kadrey v. Meta, is one of many against large tech companies accused of training AI models on copyrighted content without permission. Defendants such as Meta have typically relied on ‘fair use’ as a defense, a doctrine that permits new works to be made from copyrighted material if they are sufficiently transformative. Many authors dispute that the doctrine applies here.

Recently unsealed court filings indicate that Zuckerberg approved the use of LibGen, a dataset notorious for providing access to unlicensed works from major publishers. The approval allegedly came despite disapproval within Meta’s AI leadership.

Earlier reporting by The New York Times is consistent with these details and pointed to questionable practices in how Meta sourced its AI training data. The new filings further allege that Meta may have tried to conceal the alleged infringement by anonymizing the LibGen data.

The latest filing also alleges that Meta obtained LibGen by torrenting. Because torrenting involves sharing files with other users while downloading them, Meta may itself have distributed pirated content in the process. Meta’s Head of Generative AI, Ahmad Al-Dahle, reportedly authorized the torrenting despite legal reservations expressed within the team.

The case has not yet been decided and concerns only Meta’s earliest Llama models, but the allegations could damage Meta’s reputation. Judge Vince Chhabria, who is presiding over the case, expressed a similar view in an order denying Meta’s request to redact large portions of the filing.

Original source: Read the full article on TechCrunch