Who's Tackling Classified AI?
Innovative tools could collect the views of U.S. national security officials about what kinds of defense and intelligence AI we should use.
On April 8, the Washington Post reported that members of Congress have “vowed to tackle AI.” The article describes the anxiety that is growing among lawmakers as they try to get a handle on what recent advances in artificial intelligence portend. “Something is coming. We aren’t ready,” Sen. Chris Murphy (D-Conn.) tweeted.
It will be good news if members of Congress are able to get smarter about what AI tools are, what they can and can't do, and what good and bad uses of AI will realistically look like. But the work shouldn't stop with unclassified AI systems. Classified AI tools that U.S. national security agencies build and use should also be consistent with the basic values we expect of the U.S. government: legality, competence, effectiveness, and accountability. Innovations in the unclassified setting can provide ideas about how to ensure that they are.
Regulating in the Unclassified Space
Almost all of the discussion about what Congress—and the Biden administration—might do to regulate and de-risk AI is focused on domestic and unclassified manifestations of AI tools. Government actors are concerned about ways in which AI tools such as ChatGPT will help malicious actors commit fraud and spread propaganda and misinformation. They are also worried about the prospect that AI will replace people’s jobs. One set of responses to AI development could be legislative, informed by congressional hearings and lobbying on Capitol Hill by companies that produce AI systems. Another could be technocratic, led by actors like the Justice Department, the Federal Trade Commission, and the National Institute of Standards and Technology. A third—and ambitious—approach, one proposed by Rep. Ted Lieu (D-Calif.), could be a combination of the first two: a congressionally created government commission to assess AI risks and potentially a new federal agency to oversee AI.
A fourth response, which could inform the substance of any of the first three approaches, would be to ask the public about how it wants the government to protect it against AI’s various risks and misuses. The Department of Commerce just took a first step in that direction, putting out a “request for comment” on how Commerce could regulate AI systems to assure users that the systems are legal, effective, and safe. The idea of opening up the conversation to the public more broadly is appealing. These systems already affect all of us, and the public (which includes computer scientists, ethicists, lawyers, and policymakers, as well as victims of AI fraud or abuse) could usefully contribute examples of amazing or terrible uses of AI systems, insights about what types of regulations have worked well in comparable technologies or societies, and broader reflections about the kind of society we do and don’t want to live in. Indeed, a recent opinion piece in the New York Times advocated for a broader public conversation about AI policy, arguing that “AI can fix democracy and democracy can fix AI.”
The United States would not be the first government to try this. Taiwan has developed systems called vTaiwan and Join that facilitate participatory governance. vTaiwan is an “online-offline” consultation process that brings together experts, government officials, scholars, business leaders, civil society groups, and citizens to deliberate, reach consensus, and craft legislation. And the Taiwanese government has agreed that it will use the opinions gathered through the process to shape legislation on the digital economy. The motivating idea behind vTaiwan is that it is possible to develop consensus on deadlocked issues by breaking topics down into discrete propositions and identifying areas where different sides can find agreement. Using a system called Pol.is (which deploys machine learning), vTaiwan was able to reach consensus about how UberX should be allowed to provide services in Taiwan (in the face of strong opposition from local taxi drivers). The process also produced a proposed bill to regulate online alcohol sales.
The UberX process started with a Pol.is poll that helped identify areas of disagreement and consensus statements about a proposal. It then progressed to professionally facilitated face-to-face stakeholder conversations, with the goal of producing a formalized consensus for presentation to the legislature. Pol.is allows people to post comments and up- or downvote others’ comments but not reply to them, which minimizes users’ ability to troll. The system then uses those votes to generate a map of participants based on how they voted, showing where there is consensus and where there are divides. People can then draft and refine comments designed to bridge those divides; if successful, those propositions garner more upvotes and, ideally, eventual consensus.
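To make the mechanism concrete, here is a minimal, illustrative sketch of that kind of vote-based clustering. It is not Pol.is’s actual implementation; the tiny vote matrix, the two-group clustering, and the simple consensus rule below are all assumptions introduced for illustration.

```python
# Illustrative sketch only (not Pol.is's code): cluster a small vote matrix
# and flag statements that every opinion group leans the same way on.
# Rows are participants, columns are statements;
# entries are +1 (agree), -1 (disagree), 0 (pass/abstain).
import numpy as np
from sklearn.cluster import KMeans

votes = np.array([
    [ 1,  1, -1,  1,  0],
    [ 1,  1, -1,  1, -1],
    [ 1,  0, -1,  1,  1],
    [-1,  1,  1, -1,  1],
    [-1,  1,  1,  0,  1],
    [-1,  1,  1, -1,  0],
])

# Group participants into opinion clusters based on how similarly they voted.
# (Two clusters is a simplifying assumption for this toy example.)
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(votes)

# Treat a statement as a consensus candidate if every cluster, on average,
# leans the same way on it; otherwise it marks a divide to be bridged.
for statement in range(votes.shape[1]):
    group_means = [float(votes[clusters == c, statement].mean())
                   for c in np.unique(clusters)]
    if all(m > 0 for m in group_means) or all(m < 0 for m in group_means):
        print(f"Statement {statement}: consensus candidate {group_means}")
    else:
        print(f"Statement {statement}: divides the groups {group_means}")
```

In a real deployment, the useful output is less the printout than the map itself: participants can see which statements already bridge the opinion groups and then draft new ones aimed at the remaining gaps.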
Although Taiwan is comparatively small and homogeneous, and so seems like a particularly conducive setting in which to deploy this process, France, Belgium, Mexico, Spain, and Iceland have also tested the use of “computational democracy” to improve the quality and legitimacy of legislation. A process that collects, parses, and clusters the views of many thousands of people about discrete propositions, and then moderates discussions to work toward consensus on specific topics, is worth paying attention to as Congress and the executive branch move forward on AI.
Regulating in the Classified Space
But collecting a wide range of perspectives about the costs and benefits of using various AI systems and working to identify consensus propositions should not happen only in the public setting. To date, Congress and the White House have focused largely on public uses of AI. But government leaders must also consider what is happening inside the national security agencies, behind the veil of classification. The departments of Defense and Homeland Security, as well as agencies within the intelligence community, are undoubtedly asking themselves, just as the U.S. public is, how large language models will change their jobs, from both an offensive and a defensive perspective. It’s possible that national security agencies are several steps ahead of the general public in their use of AI, already developing their own versions of these tools and integrating AI into command-and-control systems, supply chain tracking, intelligence analysis and collection, and weapons.
But of course we don’t have much detail about how these agencies are approaching AI, because U.S. national security operations are generally a black box. That’s often by necessity: The leaks that came to light over the past week show why the government often must operate in secret to protect its military advantage and preserve trust among allies. But it also means that we lack a nuanced understanding of where the Defense Department and the intelligence community are heading. To the credit of those agencies, they have released several AI policies that reflect basic values consistent with good government: reliability, safety, accountability, lack of bias, and so on. But those principles and policies are written at a high level of generality. They won’t answer many of the hard questions that will arise (or perhaps already have arisen) as these agencies consider which AI tools to use to defend the United States. Should the CIA be willing to use deepfakes to affect foreign elections? If the Defense Department decides to use ChatGPT-like tools to conduct deception operations against a set of foreign users, how will it ensure that those tools won’t spread and “blow back” into the United States? What level of large language model “hallucination” should U.S. national security agencies be willing to tolerate? Are there some uses of AI that these agencies should take off the table, even if we know that our adversaries won’t?
Many of these questions are easier to ask than to answer. Congress will find this true as well, especially because classification makes it impossible to discuss many potential uses of AI publicly. One place to start, though, is to obtain the views of the wide range of national security experts who spend their working hours behind the veil of secrecy. Using tools that operate like Pol.is, the executive branch could first give all participants basic training in the different types of AI, what AI systems can and cannot do, and how they have been used in the real world. The executive could then pose a range of questions about possible uses of national security AI, including realistic hypotheticals, for relevant officials within the departments of Defense, Justice, Homeland Security, and State, as well as the intelligence community, to wrestle with. Moderators could then try to work toward consensus positions on various potential uses (or non-uses) of AI and share their findings with senior national security policymakers.
One initial objection may be that this exercise would let the fox guard the henhouse. But, first, the national security bureaucracy is surprisingly diverse in terms of experience, perspective, political persuasion, and training. A career foreign service officer and a CIA field agent will not see issues identically. Nor will a Defense Department cyber specialist and a Justice Department lawyer in the National Security Division. Second, this would be a way to assemble a large number and wide range of views about classified uses of AI—input that is otherwise difficult to gather. Third, the views of executive branch officials are not the only input worth obtaining: Congress and the executive will surely hear from technology companies, foreign allies, and nongovernmental organizations. But the views of national security professionals—whether intelligence analysts, diplomats, computer scientists, or military operators—would be invaluable as we shape a collective national approach to AI.
Finally, as I have argued recently on these pages, the chance of obtaining consensus among leading AI states about particular uses of national security AI is very slim. That makes it all the more necessary for the United States to think clearly and critically—both in front of the curtain of classification and behind it—about what uses of these tools we want our government to make in our name.