While OpenAI CEO Sam Altman has openly admitted that it’s virtually impossible to develop advanced AI models like ChatGPT without copyrighted content, he argues that copyright law doesn’t categorically prohibit the practice. AI firms ultimately lean on the fair use doctrine, which critics say they are stretching to violate copyright law and destroy the open internet.
More recently, following backlash from critics in a Hacker News thread, Microsoft was forced to delete a blog post it had published in November 2024 that seemingly encouraged developers to pirate Harry Potter books for AI model training.
The dataset was deleted late last week after Ars Technica reached out to Shubham Maindola, a data scientist in India with no known affiliation with Microsoft. “The dataset was marked as Public Domain by mistake,” Maindola told the outlet. “There was no intention to misrepresent the licensing status of the works.”
Developing generative AI is no easy feat. Top AI research labs, such as OpenAI, are quickly burning through substantial funds to maintain the hype amid rising concerns among investors about returns on their investments. The ChatGPT maker is reportedly on course to post a $14 billion loss in 2026, prompting speculation that it could face bankruptcy by mid-next year.
Money aside, AI models heavily rely on information from the internet for training. However, reports suggest that Google, OpenAI, and Anthropic are running short of high-quality training data, slowing advances in AI development.
Do you think AI model training constitutes copyright infringement?
AI model training has always been a complex issue, largely because there are no clear laws preventing tech companies from using copyrighted material in the process. Many firms lean on the concept of fair use as a legal shield, arguing that their practices fall within its protections.