OpenAI is being sued again - why is generative AI such a legal conundrum?
Legal woes continue to bedevil AI technology amid privacy and copyright concerns


OpenAI and Microsoft have both been named in a complaint filed in the US District Court for the Northern District of California, which alleges that the ChatGPT creator has been using personal data to train its models without first obtaining permission.
It marks the second major lawsuit launched against the company in the space of a month, following claims that its chatbot defamed a radio host by generating untrue statements about him.
The most recent legal accusations relate to the alleged non-consensual use of personal data. The plaintiffs are anonymous in the complaint, which is seeking class-action status and $3 billion in damages on account of the “millions of class members” potentially involved.
The lawsuit accuses the defendants of “unlawful and harmful conduct in developing, marketing, and operating their AI products, including ChatGPT-3.5, ChatGPT-4.0, Dall-E, and Vall-E”.
It goes on to say that the products listed “use stolen private information, including personally identifiable information, from hundreds of millions of internet users, including children of all ages, without their informed consent or knowledge”.
OpenAI’s models were trained by scraping data from the internet, according to the complaint.
“Despite established protocols for the purchase and use of personal information, defendants took a different approach: Theft,” it read.
While scraping data from webpages is not a new technique, the failure to seek consent, coupled with the difficulty of removing personal information from the resulting models, has raised legal concerns.
“Through their AI products, integrated into every industry,” the complaint reads, “defendants collect, store, track, share, and disclose private information of millions of users”. This includes:
- All details entered into the products
- Account information users enter when signing up
- Name
- Contact details
- Login credentials
- Emails
- Payment information for paid users
- Transaction records
- Identifying data pulled from users’ devices and browsers, such as IP addresses and geolocation
- Social media information
- Chat log data
- Usage data
- Analytics
- Cookies
- Keystrokes
- Typed searches, as well as other online activity data
In addition to this exhaustive list of data collected from the plaintiffs, the complaint also cites data collected through integrations with tools such as Slack and Microsoft Teams.
Is AI facing other legal issues?
Several other lawsuits have been filed over how AI technology is trained and how it performs.
One challenging the legality of GitHub Copilot is currently working its way through the courts. At issue is how Copilot, an AI pair programmer capable of serving up coding suggestions, has been trained.
It is alleged that the use of code found in public GitHub repositories violates the rights of developers or the licences under which the repositories were made public. An attempt to have the case dismissed was denied in part in May, meaning the lawsuit has longer to run.
“We firmly believe AI will transform the way the world builds software, leading to increased productivity and most importantly, happier developers,” a GitHub spokesperson told ITPro.
“We are confident that Copilot adheres to applicable laws and we’ve been committed to innovating responsibly with Copilot from the start. We will continue to invest in and advocate for the AI-powered developer experience of the future.”
The training of models is not the only aspect of generative AI to attract the attention of the legal profession.
OpenAI is also currently being sued for defamation after ChatGPT provided an inaccurate legal summary in response to a query.
The case centers on a complaint filed by Georgia radio host Mark Walters, who alleges that ChatGPT defamed him by erroneously claiming he had been accused of financial crimes.
The technology has also come under the scrutiny of regulators. The Italian data protection regulator recently ordered OpenAI to implement a ‘right to be forgotten’ option amid concerns that the collection and processing of personal data is in violation of GDPR.
Why is AI having these problems?
Generative AI faces two major challenges.
The first is sourcing training data without infringing privacy or violating usage restrictions. OpenAI’s past approach has been to scrape whatever data was available from webpages, including data entered into its own tools, and feed it into its models.
While this has enabled the company to build up a large dataset, questions remain over what rights it has to gather and use that data. It is this sourcing, or scraping, of training data that forms the basis of many of the complaints.
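To illustrate why the practice is so broad in scope, below is a minimal sketch of the kind of web scraping at issue, written in Python with the widely used requests and BeautifulSoup libraries. It is a generic example rather than OpenAI's actual tooling, and the URL and function name are placeholders:

```python
# Illustrative only: a minimal page scraper of the kind used to build
# training corpora. This is NOT OpenAI's actual pipeline; the URL below
# is a placeholder. Requires the requests and beautifulsoup4 packages.
import requests
from bs4 import BeautifulSoup

def scrape_page_text(url: str) -> str:
    """Fetch a webpage and return its visible text."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    # Drop script and style tags so only human-readable text remains
    for tag in soup(["script", "style"]):
        tag.decompose()
    return soup.get_text(separator=" ", strip=True)

# Any publicly reachable page can be harvested this way, whether or not
# it contains personal data - which is the crux of the complaint
text = scrape_page_text("https://example.com")
```

Scraping like this collects whatever happens to be on the page, with no built-in way to identify or exclude personal information, which is why consent and removal are so hard to retrofit.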
The second is the trust placed in the output of generative AI platforms, particularly where a ‘hallucination’ has occurred. This is where an AI model, lacking sufficient data, generates output that is plausible but inaccurate.
This type of hallucination is what triggered the defamation lawsuit, and underlines the importance of checking the output of tools such as ChatGPT.
Generative AI is a relatively new technology that has only become practical in recent years with improvements in processing power and storage. Lawmakers are therefore working to catch up and understand its implications.
Amid widespread fears regarding the risks posed by AI are genuine concerns around privacy and the use of personal data.
As an example, staffers in the US House of Representatives were recently instructed to use only ChatGPT Plus, which features privacy controls, and not to paste sensitive data into the system.
Use of the technology has also been forbidden by a number of financial institutions and enterprises amid fears that sensitive financial information or company secrets might find their way into the chatbot’s models.
OpenAI recently updated its data usage and retention policies to permit customers to opt out of data sharing, although the policy does not apply to data submitted to the API before 1 March 2023, nor to OpenAI’s non-API consumer services, including ChatGPT.

Richard Speed is an expert in databases, DevOps and IT regulations and governance. He was previously a Staff Writer for ITPro, CloudPro and ChannelPro, before going freelance. He first joined Future in 2023 having worked as a reporter for The Register. He has also attended numerous domestic and international events, including Microsoft's Build and Ignite conferences and both US and EU KubeCons.
Prior to joining The Register, he spent a number of years working in IT in the pharmaceutical and financial sectors.