Is AI Safety a Potemkin Village?
Why disclosures ought to include the funding level and operational metrics of AI safety
Our thinking about AI safety is shaped both by examples of effective regulation from the past and from other fields, and by examples of what went wrong. One of our big takeaways from history is that some of the most effective regulatory regimes do not put in place a host of detailed rules, or an army of government inspectors breathing down everyone's back. Instead, they identify a problem and put the onus for dealing with it on those closest to it.
Take, for example, the Fair Credit Billing Act of 1974, which limited consumer liability for credit card fraud and thereby gave banks and credit card companies an incentive to invest in robust fraud detection, reporting, and enforcement.
This notion came back to mind this week as I read Matt Levine's Bloomberg column on the $1.8 billion fine levied against TD Bank for the failure of its money-laundering controls: "The Justice Department would like banks to spend more money catching criminals, and it can't quite make them. Except obviously it can. The Justice Department can't directly set banks' AML [Anti Money Laundering] budgets, but it can do it indirectly, and it just did."
Levine cites Deputy Attorney General Lisa Monaco's remarks in the press release announcing the fine:
“The Bank Secrecy Act includes a unique penalty provision: the ability to fine a financial institution up to $500,000 for each day it lacks a functional anti-money laundering program. The daily fine provision is rarely used. In fact, the Justice Department has never before sought this maximum daily penalty against any financial institution. Until now….

We are putting down a clear marker on what we expect from financial institutions — and the consequences for failure. When it comes to compliance, there are really only two options: invest now – or face severe consequences later. As I’ve said before, a corporate strategy that pursues profits at the expense of compliance isn’t a path to riches; it’s a path to federal prosecution.”
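Levine's point is ultimately arithmetic: the daily penalty compounds until the exposure dwarfs what a functioning compliance program would have cost. Here is a minimal sketch of that arithmetic, using the statutory $500,000-per-day maximum quoted above; the lapse duration and compliance budget are hypothetical numbers chosen only for illustration.

```python
# Back-of-the-envelope sketch of the incentive a daily penalty creates.
# The $500,000/day figure is the statutory maximum quoted above; the lapse
# duration and compliance budget below are hypothetical, chosen only to
# illustrate the "invest now or face severe consequences later" trade-off.

MAX_DAILY_PENALTY = 500_000  # USD, statutory maximum per day

def accrued_exposure(days_without_program: int) -> int:
    """Maximum penalty exposure, in USD, for a given compliance lapse."""
    return days_without_program * MAX_DAILY_PENALTY

hypothetical_lapse_days = 3 * 365        # assume a three-year lapse
hypothetical_annual_budget = 50_000_000  # assume $50M/year of compliance spend

exposure = accrued_exposure(hypothetical_lapse_days)
print(f"Penalty exposure: ${exposure:,}")                                  # $547,500,000
print(f"Three years of compliance spend: ${3 * hypothetical_annual_budget:,}")  # $150,000,000
```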
This approach has interesting implications for AI governance. One of the questions regulators ought to be asking is whether sufficient resources have been committed to each axis of AI safety a company has identified, so that the capability to address and remediate the risks actually exists. In some ways, the easiest way to figure this out is to follow the money. If AI safety is only a shell rather than a well-funded part of the ongoing operation, it is likely a Potemkin village.
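To make "follow the money" concrete, here is a minimal sketch of the kind of check a disclosure regime would enable. Every field name and threshold below is hypothetical; no such reporting requirement exists today, and the point is only to show how simple the test becomes once a few basic numbers are disclosed.

```python
# Illustrative sketch only: none of these fields or thresholds come from any
# existing disclosure regime. The point is that "follow the money" becomes
# checkable once a company reports a handful of concrete numbers.

from dataclasses import dataclass

@dataclass
class SafetyDisclosure:
    total_rd_spend: float   # annual R&D spend, USD (hypothetical field)
    safety_spend: float     # annual AI-safety spend, USD (hypothetical field)
    safety_headcount: int   # full-time staff on safety work (hypothetical field)
    total_headcount: int    # total technical staff (hypothetical field)

    def spend_share(self) -> float:
        return self.safety_spend / self.total_rd_spend

    def headcount_share(self) -> float:
        return self.safety_headcount / self.total_headcount

def looks_like_a_shell(d: SafetyDisclosure,
                       min_spend_share: float = 0.05,
                       min_headcount_share: float = 0.05) -> bool:
    """Flag disclosures where safety is a rounding error; thresholds are arbitrary."""
    return d.spend_share() < min_spend_share or d.headcount_share() < min_headcount_share

# Example with made-up numbers:
d = SafetyDisclosure(total_rd_spend=2_000_000_000, safety_spend=40_000_000,
                     safety_headcount=30, total_headcount=2_000)
print(looks_like_a_shell(d))  # True: 2% of spend, 1.5% of headcount
```

The exact thresholds matter far less than the fact that, once the numbers are disclosed, anyone can run this kind of check.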
Once real money is spent on AI safety, we’ll know the companies are taking it seriously. We know that the fintech wild west has ended because, as The Information put it just the other day, “Compliance Pros in Hot Demand as Crypto and Fintech Firms Staff Up.” The corresponding stories about AI companies today are all about the dissolution of AI safety teams and the departure of key members, or their firing when they raise issues. That is a pretty clear tell that AI safety is not a priority.
Many of the AI safety disasters that we may be lamenting in years to come will look just like the failures at TD Bank and the crypto wild west that came to an end in 2022: a failure to invest in the infrastructure and controls that the companies know are required.
This kind of failure isn’t caused only by short-sighted penny-pinching, as appears to have been the case at TD Bank. It can also come from the kind of under-investment that happens when a company “moves fast and breaks things,” or when it simply fails to build out the infrastructure required in practice to provide the safety controls it has identified in theory. As an internet wag once wrote, “the difference between theory and practice is always greater in practice than it was in theory.”
This is where the lessons of recent history in social media governance can be helpful. Facebook’s role in the mass violence in Myanmar, for example, was not a failure to build controls on hate speech; it was an inability to apply those controls in a country where the company lacked local-language competence, in both its software and its staffing.
“AI safety” will remain meaningless until we begin to identify and measure the potential detrimental impacts across the full range of human life. For example, does AI-generated content, served up by social media attention algorithms, increase teen suicide rates?