We analyzed over 9,000 Generative AI Papers…
How Corporate AI Research on Reliability and Safety Ignores Real-World Risks
New data shows that only 4% of generative AI papers address real-world deployment impacts, while lawsuits mount in precisely these neglected areas.
This post summarizes our latest working paper at the AI Disclosures Project, by myself, Isobel Moure, Tim O'Reilly, and Sruly Rosenblat, entitled "Real-World Gaps in AI Governance Research: AI safety and reliability in everyday deployments." We reviewed 9,439 generative AI research papers, covering both the five leading AI corporations and six prominent academic research institutions in the U.S., and found that fewer than one in twenty corporate AI papers examines what happens after an AI system meets real users and business contexts.
Corporate AI labs now set the agenda
Our new working paper analyzes corporate labs' research agendas, highlighting both their relative priorities and the gaps they leave unaddressed. Corporate labs increasingly set the AI research agenda, we found.1 Anthropic, OpenAI, and Google DeepMind each have more citations for their AI safety & reliability work than any U.S. university in our sample (Figure 1), underscoring their dominant impact on the field.
Figure 1. Total Citations for Safety & Reliability AI Research

DeepMind alone – with 69,453 adjusted citations for its general generative AI research – surpasses the top four universities combined (Table 1).
Table 1. Academic vs. Corporate Generative AI Research

This concentration of AI research in corporate labs is concerning, we argued, for two reasons:
1. Corporate AI research is fairly narrowly focused on what makes the company's own products more usable. Fully 96% of corporate "safety" outputs stop at pre-deployment, pre-market concerns such as alignment, benchmarking, and interpretability (Table 2).
2. Commercial activity is a major source of risk in post-deployment AI systems, yet those best positioned to monitor and understand those risks have commercial incentives to underplay them rather than conduct transparent research on emerging problems. Only the remaining 4% of their research tackles persuasion, misinformation, medical or financial harms, copyright liability, or other risks that may only surface once the product ships and the model interacts with the real world (Table 2).
Table 2. Generative AI Research Papers by Risk Area and Context

OpenAI’s sycophancy debacle shows the gap
In December 2024, Sam Altman called algorithmic feeds "the first at-scale misaligned AIs," yet all of OpenAI's alignment research ignores its models' integration into real-world commercial environments, replete with recommender systems, data, and other tools. OpenAI's published studies rarely leave the sandbox – they perfect crash-test dummies without examining real-world collisions. The consequences were on display last week when, after its GPT-4o model update, users started complaining that the model sycophantically agreed with them.
OpenAI's own post-mortem of the incident emphasized that, in pre-deployment testing, it set aside expert testers' qualitative concerns ("vibes") and prioritized A/B testing, which found positive impacts. This is what the media also highlighted. But the second half of the post-mortem mentions a more glaring mistake: OpenAI didn't think post-deployment monitoring for these behavioral risks was necessary (!):
“We also didn’t have specific deployment evaluations tracking sycophancy. While we have research workstreams around issues such as mirroring and emotional reliance, those efforts haven’t yet become part of the deployment process”.
Our study shows this is a systematic, industry-wide pattern, and it speaks to our second point (2) above: AI risks are not just "x-risks" (terminators) or malicious-use risks (e.g., from scammers); they also stem from market incentives pushing corporations to take their technology too far in search of profits, thereby exposing society to excessive risk.
Nathan Lambert argues instead that it was an honest technical mistake by OpenAI, not one driven by commercial incentives of any kind, since OpenAI's "ideology is quite culturally aligned with providing user value in terms of output rather than engagement farming, even if this is imposing a ceiling on their business relative to the potential of ads."
Even if this is true, there's a reason this model update was pushed out so quickly despite pre-deployment testing showing conflicting results: OpenAI is competing for market share. This "move fast and break things" dynamic means OpenAI increasingly lets competitive pressures dictate the pace of AI testing, often rushing past safety checks and exposing users – and society – to "commercialization risks".
Anthropic risks falling into a similar trap – with its speculative and pseudo-scientific talk of Claude's "values" and "consciousness" – but it also has the tools to lead in ongoing post-deployment behavioral evaluations, namely Clio, its impressive privacy-preserving conversation-auditing tool.
Why outside scrutiny matters
AI companies should not be solely entrusted to evaluate and research their deployed applications for risks and harms, since they have economic and reputational incentives to downplay them. Consider how Facebook withheld internal research findings on how its engagement-driven algorithms negatively affected users, shielding the evidence from public scrutiny. Now Mark Zuckerberg is championing "AI-powered personalization loops" to further boost engagement on Meta's platforms – even among teens, and in erotic contexts.
How, then, should outside researchers, auditors, and policymakers be kept in the loop? Companies own the 'telemetry' – prompts, outputs, reranked results, plugin calls, guard-rail triggers – that would allow outsiders to research and audit AI's impact once deployed. This data is closely held. The resulting knowledge asymmetry leaves courts and regulators litigating in the dark while real-world harms escalate. The public remains reliant on piecemeal AI incident databases, old or overly aggregated user-LLM chat data, and public "vibes" testing of models. Real-world visibility into the effects of deployed AI systems is negligible.
This is why our working paper recommends structured external access to deployed AI systems' telemetry data and artifacts, so that real-world risks and harms can be analyzed systematically. Monitoring and evaluation of LLMs in real-world environments is now essential to quality assurance (QA), as in 'LLMOps'. But the data used for this remains the preserve of corporate practice, so society loses essential insight into AI's ongoing risks and harms. Disclosing AI system telemetry data (logs, traces, & business metrics) and LLM model data artifacts (e.g., training/fine-tuning datasets) may expose corporations to liability. But emerging LLM monitoring frameworks – such as those from LangSmith, Langfuse, OpenTelemetry, & Weights & Biases – make structured & standardized external API access for researchers increasingly feasible. Best practice from other industries offers relevant guidance here too, as do new tools, such as those from OpenMined, for privacy-preserving external access.
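To make this concrete, here is a minimal sketch – our own illustration, not a specification from the paper – of how an LLM application could emit this kind of telemetry using the OpenTelemetry Python SDK. The span and attribute names (llm.model_version, llm.guardrail_triggered, and so on) are assumptions for illustration rather than official OpenTelemetry conventions, and the model call is stubbed out.

```python
# Illustrative sketch only: emits LLM deployment telemetry via OpenTelemetry.
# Requires: pip install opentelemetry-api opentelemetry-sdk
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

# In production an OTLP exporter would ship spans to a collector; a console
# exporter keeps this example self-contained.
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("llm-app")

def answer_user(prompt: str) -> str:
    """Wrap a (stubbed) model call in a span recording deployment telemetry."""
    with tracer.start_as_current_span("llm.chat_completion") as span:
        # Illustrative attributes covering the 'telemetry' discussed above:
        # model version, input/output metadata, and guard-rail triggers.
        span.set_attribute("llm.model_version", "example-model-2025-04-30")
        span.set_attribute("llm.prompt_chars", len(prompt))

        completion = "stubbed model output"   # stand-in for a real model call
        guardrail_triggered = False           # stand-in for a safety filter check

        span.set_attribute("llm.completion_chars", len(completion))
        span.set_attribute("llm.guardrail_triggered", guardrail_triggered)
        return completion

if __name__ == "__main__":
    answer_user("What are the side effects of ibuprofen?")
```

Because these spans already flow through standardized exporters, the same pipeline could, in principle, feed a tiered external-access endpoint rather than only an internal dashboard.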
More specifically, we propose a "tiered disclosure" system that builds on existing observability standards in the AI stack, such as the OpenTelemetry Protocol (OTLP) for sharing telemetry data. For high-risk AI applications, we recommend sharing three secure data streams (including relevant business metrics), covering:
Differentially private event logs summarizing system activity
Detailed system traces of inputs, outputs, and safety interventions
Model documentation listing version IDs, training information, and known limitations ("Model-artifact manifests") – a hypothetical sketch of such a manifest follows this list.
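To illustrate the third stream, here is a hypothetical model-artifact manifest. This is our own sketch: the field names and example values are assumptions for illustration, not a schema proposed in the working paper.

```python
# Hypothetical model-artifact manifest (illustrative fields and values only).
from dataclasses import dataclass, field, asdict
import json

@dataclass
class ModelArtifactManifest:
    model_version: str                       # exact deployed version ID
    base_model: str                          # upstream model it was fine-tuned from
    training_data_summary: str               # description of training/fine-tuning data
    evaluation_reports: list[str] = field(default_factory=list)  # links to published evals
    known_limitations: list[str] = field(default_factory=list)   # documented failure modes

manifest = ModelArtifactManifest(
    model_version="example-model-2025-04-30",
    base_model="example-base-model",
    training_data_summary="Licensed and public web text; details summarized for disclosure.",
    known_limitations=["sycophancy under preference-optimized tuning",
                       "weak refusals on medical and financial advice"],
)
print(json.dumps(asdict(manifest), indent=2))
```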
Within this observability layer, public researchers could access limited samples, verified auditors could run comprehensive queries, and regulators could securely obtain more detailed information (a sketch of such tiering follows below). To encourage adoption, we suggest liability safe harbors for researchers and for companies that implement these transparency measures in good faith. We are just at the beginning of what needs to be a comprehensive and practical approach to facilitating tiered external access to selected AI observability features.
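For concreteness, below is a minimal sketch of how such access tiers might be encoded. The tier names and field lists are our own assumptions, not a specification from the paper.

```python
# Illustrative tiered-access policy over disclosed telemetry fields.
from enum import Enum

class AccessTier(Enum):
    PUBLIC_RESEARCHER = "public_researcher"   # limited, privacy-protected samples
    VERIFIED_AUDITOR = "verified_auditor"     # comprehensive query access
    REGULATOR = "regulator"                   # detailed, securely shared records

# Which telemetry fields each tier may query (field names are illustrative).
TIER_FIELDS = {
    AccessTier.PUBLIC_RESEARCHER: {"event_counts", "guardrail_trigger_rates"},
    AccessTier.VERIFIED_AUDITOR: {"event_counts", "guardrail_trigger_rates", "traces"},
    AccessTier.REGULATOR: {"event_counts", "guardrail_trigger_rates",
                           "traces", "business_metrics"},
}

def filter_record(record: dict, tier: AccessTier) -> dict:
    """Return only the fields the requesting tier is entitled to see."""
    allowed = TIER_FIELDS[tier]
    return {key: value for key, value in record.items() if key in allowed}
```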
Thank you to Changbai Li (Oregon State University) for spotting an error in the initial working paper. This is a first pass at a blueprint for disclosures grounded in existing observability structures in the market. We are hoping for feedback on this work in progress, so please let us know where you think the ideas can be improved and sharpened.
The full working paper by Ilan Strauss, Isobel Moure, Tim O’Reilly, and Sruly Rosenblat can be found here:
https://ssrc-static.s3.us-east-1.amazonaws.com/Real-World-Gaps-in-AI-Governance-Research-Strauss-Moure-OReilly-Rosenblat_SSRC_04302025.pdf
Replication code and supplemental materials live on GitHub:
https://github.com/AI-Disclosures-Project/Real-World-Gaps-in-AI-Governance-Research
1. At least by way of applied AI work. Our study does not distinguish between theoretical (fundamental) AI research and applied research.