
October 15, 2024

4 Ways to Promote Transparency in Frontier AI Development


The debate over California’s AI regulation bill SB 1047 is over: after months of heated argument, Governor Gavin Newsom vetoed it at the end of September. This was America’s first major public discussion of AI policy, but it won’t be our last. So now seems like the right time to take stock, reflect on what we have learned, and suggest a way forward.

As an AI policy analyst and a former OpenAI governance and alignment researcher, respectively, we disagreed, and still disagree, about the merits of SB 1047. But when we talked privately, we found that we largely agree in our assessment of the current state of AI. Those points of agreement are the foundation for a productive path forward.

Fundamentally, we both share a deep uncertainty about how AI capabilities will develop. The range of changes, both good and bad, that forecasters consider possible within the next decade is a reminder of how little we know about the endeavor we have undertaken. Some believe AI will have only a moderate impact for at least the next decade, perhaps comparable to the invention of word processing. Others believe it will be more akin to the introduction of smartphones or the Internet. Still others believe that AI will culminate in artificial general intelligence (AGI): the rise of a new species more intelligent and more generally capable than humans.

Deep learning is a discovery, not an invention. Artificial neural networks are trained, not programmed. We are in uncharted territory, and we push further into it every day. Surprises are possible, perhaps even likely.

Unfortunately, knowledge of what is being discovered is not evenly distributed. Cutting-edge AI research is increasingly conducted inside the largest AI companies: OpenAI, Anthropic, DeepMind, and Meta. Unlike in the past, this research is rarely published. Each of the leading AI companies has said that it believes AI policy should be made deliberately and democratically. But such deliberation is simply impossible when the public, and even many subject-matter experts, have no idea what is being built, and therefore do not know what we are trying to regulate, or what we should deliberately leave unregulated.

Many negative consequences of this information asymmetry are foreseeable. A misinformed public could pass clueless laws that end up doing more harm than good. Or we could do nothing at all, even when a policy response is actually warranted. It is even conceivable that systems with world-changing capabilities will be built more or less entirely in secret; in fact, this is much closer to the status quo than we would like. And once built, these powerful future systems could be known only within AI companies and perhaps the upper echelons of the American government. That would be an imbalance of power fundamentally inconsistent with our republican form of government.

Transparency is the key: it carries fewer downsides than other potential regulations and is more likely to gain widespread support. We propose a framework that gives the public insight into the capabilities, both useful and potentially harmful, that we can expect from future AI systems.

Our proposal consists of four measures. Each could be implemented as public policy or as a voluntary commitment by companies, and each on its own would meaningfully advance the goal of greater public insight into frontier AI development. Taken together, they would create a robust transparency framework.

Disclosure of in-development capabilities

Rather than forcing companies to disclose proprietary details about how a capability was achieved, we propose disclosing only the fact that it was achieved. The idea is simple: when a frontier lab first determines that a novel and important capability (as defined by the lab’s own safety plan) has been reached, the public should be informed. Disclosure could be facilitated by the federal government’s AI Safety Institute, in which case the identity of the lab in question could remain anonymous.

Disclosure of training goals and model specifications

Anthropic has released the constitution it uses to train its models and the system prompts it gives to Claude. OpenAI has released its model specification, a detailed document describing the intended behavior of ChatGPT. We support regulations, voluntary commitments, or industry standards that create the expectation that documents like these are produced and made available to the public.

There are three reasons for this. First, these documents let users identify whether a given behavior is intentional or unintentional. Instances of unintended behavior can then be collected and used to advance the basic science of alignment. Second, publishing these documents increases the likelihood that problems will be identified and fixed quickly, much as open-source code tends to be more secure because more people can inspect it. Third, and most importantly, as AI systems become far more powerful and integrated into our economy and government, people deserve to know what goals and principles these systems are being trained to pursue, as a matter of basic democratic rights.

Public discussion of safety cases and potential risks

This proposal would require frontier AI companies to publish their safety cases: their explanations of how their models pursue intended goals and obey intended principles, or at least their justifications for why their models will not cause disaster. Because these safety cases concern internal models and potentially involve sensitive intellectual property, the public versions of these documents could be released with redactions. The safety cases would not be subject to regulatory approval; instead, the purpose of sharing them is to let the public and the scientific community evaluate them and potentially contribute feedback.

Whistleblower protections

Protecting whistleblowers is essential to holding AI companies accountable to the law and their own obligations. After much public debate and feedback, we believe that SB 1047’s whistleblower protections, particularly protections for employees of AI companies who report violations of the law or extreme risks, are fairly close to ideal. We propose to use the whistleblower protections in the latest version of the bill as a starting point for further discussion.

We believe the AI community agrees on more than it disagrees on. Almost all of us are excited by the prospect of positive technological change. Almost all of us see serious problems with the status quo, whether in our politics, our governance, or the state of scientific and technological progress. We share hope for a future made possible by radical AI advances. In short, we believe we are likely to be long-term allies, even if many of us disagree bitterly about the merits of SB 1047.

If AI proves to be as transformative as we believe, it will challenge many aspects of the American status quo. To meet that challenge, we will need to build new institutions, perhaps even a new conception of statecraft itself. This is a monumental intellectual task, and our community will not be up to it if our debates are reduced to petty partisan disputes. In this sense, AI is more than a technological opportunity; it is an opportunity to refine American governance. Let’s seize it.

Dean W. Ball is a research associate at the Mercatus Center and author of the newsletter Hyperdimensional. Daniel Kokotajlo is Executive Director of the AI Futures Project.
