AI and Constitutional Interpretation: The Law of Conservation of Judgment
Modern artificial intelligence (AI) systems like OpenAI’s ChatGPT have improved at a dizzying pace, leading some judges, lawyers, and scholars to ask whether AI could finally deliver on a centuries-old dream: a machine capable of interpreting the Constitution objectively, without the messy human bias and subjectivity that have long bedeviled constitutional law. The allure of such an “interpretation machine” is apparent. Rather than relying on ideologically divided human judges to resolve our most contentious constitutional disputes, we might instead turn to neutral, data-driven AI systems that simply analyze the text, history, and precedent to reach the correct legal answer.
But this seductive vision rests on a fundamental misunderstanding of both AI and constitutional interpretation. While large language models (LLMs) like ChatGPT and Anthropic’s Claude are remarkably powerful tools—indeed, we used Claude to help distill a 70-page academic article into this shorter piece—they cannot eliminate the need for human judgment in constitutional law. Our research shows that judges using AI to interpret the Constitution will face substantially the same moral and political questions that confront human interpreters. AI does not make the challenging decisions inherent to judging suddenly disappear. Rather, these decisions are simply shifted around, dispersed or concentrated, made implicitly rather than explicitly, or transferred from one stage in the decision-making process to another. We call this the “law of conservation of judgment.”
To understand both the potential and the limitations of AI in constitutional interpretation, we need to first understand how modern AI systems actually work. For decades, AI struggled with basic language tasks. Early systems relied on manually coded rules that proved too brittle to handle the complexity and nuance of human language. A breakthrough came in 2017 with the development of the “transformer” architecture, which allowed AI systems to better understand context by analyzing the relationships between words in a text. This innovation, combined with training on massive datasets and increasing computational power, paved the way for today’s LLMs.
The real watershed moment came in late 2022 with the release of ChatGPT, initially powered by the GPT-3.5 model and the first AI system that could reliably engage with virtually any text-based task. Through training on billions of documents—including legal texts, academic articles, and historical materials—LLMs like ChatGPT “learned” to detect intricate patterns in human language and reasoning. While earlier AI systems often produced nonsensical responses to complex queries, today’s models can engage in sophisticated analysis across a wide range of domains. Over the past year, even more advanced models, such as OpenAI’s GPT-4o and o1 and Anthropic’s Claude 3.5 Sonnet, have emerged, offering stronger reasoning, problem-solving, and analytical abilities.
But these impressive technical capabilities do not eliminate the need for human judgment in legal decision-making. Consider a simple example. The Third Amendment states that “No Soldier shall, in time of peace be quartered in any house, without the consent of the Owner.” When we asked ChatGPT whether this amendment bars a state governor from quartering themselves in a private home without permission, it confidently answered no, explaining that governors are not “soldiers.” But when we posed the exact same question to Claude, another leading AI system, it reached the opposite conclusion—that the amendment applies to any compelled housing of government officials, not just military personnel.
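For readers curious how such a side-by-side comparison can be run, the sketch below poses a single question to two different models through the OpenAI and Anthropic Python SDKs. It is a minimal illustration rather than a record of our actual queries: the model names and the prompt wording are assumptions, and running it requires valid API keys for both services.

```python
# A minimal sketch of posing the same constitutional question to two
# different LLMs and comparing their answers. The model names and prompt
# wording are illustrative assumptions, not the exact queries we ran.
from openai import OpenAI
from anthropic import Anthropic

QUESTION = (
    "The Third Amendment provides: 'No Soldier shall, in time of peace be "
    "quartered in any house, without the consent of the Owner.' Does this "
    "amendment bar a state governor from quartering themselves in a private "
    "home without the owner's permission? Answer yes or no, then explain briefly."
)

openai_client = OpenAI()        # reads OPENAI_API_KEY from the environment
anthropic_client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

gpt_answer = openai_client.chat.completions.create(
    model="gpt-4o",  # hypothetical choice of model
    messages=[{"role": "user", "content": QUESTION}],
).choices[0].message.content

claude_answer = anthropic_client.messages.create(
    model="claude-3-5-sonnet-20241022",  # hypothetical choice of model
    max_tokens=500,
    messages=[{"role": "user", "content": QUESTION}],
).content[0].text

print("GPT:", gpt_answer)
print("Claude:", claude_answer)
```

If the two answers diverge, as they did in our Third Amendment example, that divergence is itself a signal that the question involves a genuine interpretive choice rather than a settled matter of fact.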
Neither answer was “wrong” in any straightforward sense. Rather, each AI system made different implicit choices about how to interpret the constitutional text. ChatGPT took a more literal approach, reading “soldier” to mean only members of the armed forces. Claude adopted a more purposive interpretation focused on the amendment’s broader goal of protecting private homes from government intrusion. These are precisely the kinds of interpretive choices that human judges routinely make, often after careful deliberation weighing various legal and policy considerations.
The key difference is that when AI systems make such choices, they do so invisibly through complex statistical computations that even AI experts don’t fully understand. A judge using AI to interpret the Constitution might think they’re getting an objective answer. But in reality, the AI is making numerous value-laden decisions behind the scenes based on patterns in its training data and the specific way the question was posed.
This is not an argument against using AI in legal decision-making. Human judges obviously have their own biases and cognitive limitations, and the true motives for their decisions are often opaque even to themselves. The choice between AI and human decision-makers is always comparative. For some tasks, humans will be better at achieving particular goals. For others, AI may be superior. Which tasks fall into which category will likely change as the technology improves. The key insight is that AI cannot eliminate the need for moral and political judgment in constitutional interpretation—a limitation that reflects the nature of the task itself rather than any technical shortcoming.
This becomes especially clear in high-stakes constitutional cases. To investigate further, we conducted a simple simulation asking both ChatGPT and Claude to decide two of the Supreme Court’s recent landmark constitutional cases: Dobbs v. Jackson Women’s Health Organization (concerning abortion rights) and Students for Fair Admissions v. Harvard (concerning affirmative action in college admissions). We posed the questions in different ways, sometimes asking the AI to simply decide the cases, other times instructing it to follow specific interpretive approaches like originalism or living constitutionalism.
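Concretely, a simulation of this kind can be set up by holding the question constant and swapping the interpretive instructions given to the model. The sketch below, written against the OpenAI Python SDK, is illustrative only; the system prompts, case framing, and model name are our assumptions rather than the exact wording of our experiment.

```python
# A sketch of probing how an LLM's answer shifts with the interpretive
# instructions it is given. The system prompts, question, and model name
# are illustrative assumptions, not our exact experimental wording.
from openai import OpenAI

client = OpenAI()

QUESTION = (
    "You are deciding Dobbs v. Jackson Women's Health Organization. Should "
    "the Court overrule Roe v. Wade and Planned Parenthood v. Casey? State "
    "a holding and explain your reasoning."
)

FRAMINGS = {
    "no method specified": "You are a Supreme Court justice.",
    "living constitutionalism": (
        "You are a Supreme Court justice in the tradition of Justices "
        "William Brennan and Thurgood Marshall. Decide the case as a "
        "liberal living constitutionalist."
    ),
    "originalism": (
        "You are a Supreme Court justice committed to originalism. Decide "
        "the case according to the Constitution's original public meaning."
    ),
}

for label, system_prompt in FRAMINGS.items():
    response = client.chat.completions.create(
        model="gpt-4o",  # hypothetical choice of model
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": QUESTION},
        ],
    )
    print(f"--- {label} ---")
    print(response.choices[0].message.content)
    # To probe sycophancy, one could append the model's answer and a
    # follow-up user message presenting standard counterarguments, then
    # ask whether the model stands by its original holding.
```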
The results were illuminating. When we didn’t specify an interpretive method, both AI systems adhered to existing Supreme Court precedent, upholding both abortion rights and affirmative action. When instructed to decide as “liberal living constitutionalists” in the tradition of Justices William Brennan and Thurgood Marshall, they reached the same results. But when told to apply originalism, both systems reversed course and overruled those same precedents. Most remarkably, when we presented standard counterarguments to these initial responses, both AIs consistently changed their minds.
This “AI sycophancy”—the tendency of AI systems to tell users what they seem to want to hear—raises serious questions about using them to decide constitutional cases. If an AI will adopt whatever interpretive approach it is instructed to use and then reverse itself when presented with counterarguments, how can judges rely on its answers? More fundamentally, the choice of how to frame questions for the AI and what interpretive instructions to give it requires the same kind of moral and political judgment that constitutional interpretation has always demanded.
So how should courts approach these powerful but imperfect tools? Our research suggests several promising use cases, along with important cautionary principles.
First, AI systems can be valuable research and drafting assistants, especially for resource-constrained lower courts. They excel at quickly synthesizing large amounts of legal information, identifying relevant precedents, and summarizing complex arguments. A judge facing dozens of pending motions might reasonably use AI to get up to speed on unfamiliar legal issues or to process hundreds of pages of briefs.
Second, AI can serve as a useful sounding board, helping judges pressure-test or “steel-man” their reasoning and identify potential blind spots. By presenting multiple perspectives on legal questions and surfacing counterarguments, AI systems can promote more thorough and balanced decision-making.
Third, for routine constitutional questions with relatively clear answers under existing precedent, AI might eventually help courts process cases more efficiently. But this use case requires careful oversight and robust frameworks for determining which cases qualify as “routine.”
To use LLMs responsibly for any of these purposes, judges and lawyers need to develop “AI literacy.” At a bare minimum, they should understand that:
- Different AI models may reach different conclusions on the same legal question based on variations in their training data and technical architecture.
- AI responses are highly sensitive to how questions are framed and what additional context is provided.
- The same AI model may give different answers to the same question due to randomness built into these systems.
- AI systems make numerous implicit interpretive choices that may not be obvious to users.
- For difficult and high-stakes questions, it is crucial to test AI responses by posing questions in multiple ways and considering counterarguments (a simple illustration of this kind of stress test follows this list).
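One way to put the last two points into practice is to re-run the same question several times, and in more than one phrasing, before relying on any single answer. The sketch below, again assuming the OpenAI Python SDK with an illustrative model name and prompts, shows what that kind of stress test might look like.

```python
# A sketch of stress-testing an LLM's answer: ask the same question several
# times (built-in sampling randomness means runs can differ) and in several
# phrasings (answers are sensitive to framing). Model name and prompts are
# illustrative assumptions.
from openai import OpenAI

client = OpenAI()

PHRASINGS = [
    "Does the Third Amendment apply to state governors? Answer yes or no.",
    "May a state governor lodge in a private home over the owner's "
    "objection, consistent with the Third Amendment? Answer yes or no.",
]

for prompt in PHRASINGS:
    answers = [
        client.chat.completions.create(
            model="gpt-4o",      # hypothetical choice of model
            temperature=1.0,     # default-style sampling, so repeated runs can differ
            messages=[{"role": "user", "content": prompt}],
        ).choices[0].message.content
        for _ in range(3)        # repeat the identical query three times
    ]
    print(prompt)
    for answer in answers:
        print("  -", answer)
```

If the answers shift across runs or phrasings, that instability is a warning that the underlying question calls for the kind of judgment a model cannot supply on its own.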
Perhaps most important, judges need to recognize that delegating constitutional decisions to AI does not eliminate the need for moral and political judgment. It merely shifts those judgments to different stages of the process, like choosing which AI system to use, how to frame constitutional questions, and what interpretive instructions, if any, to give. When it comes to resolving fundamental questions about constitutional meaning and the allocation of government power, there is simply no escaping the burdens of human judgment.
This limitation is not peculiar to AI—it reflects the inherent nature of constitutional interpretation itself. Many constitutional disputes involve competing values and interests that must be weighed and balanced by human decision-makers. Perhaps judges should avoid making these judgments themselves, deferring to the legislature or following original meaning. But those, too, are consequential choices that require justification. No technological advance, however sophisticated, can transform such questions into purely objective inquiries with demonstrably correct answers.
As AI capabilities continue to advance, the challenge for courts and judges will be developing frameworks for thoughtful integration of these tools while maintaining appropriate human oversight of fundamental value choices. This likely means starting with modest applications focused on enhancing judicial efficiency and analytical thoroughness, while carefully refining best practices before any expansion to more consequential uses.
The future of constitutional interpretation—and judicial decision-making more generally—will likely involve a complex interplay between human and artificial intelligence. The key is learning to harness AI’s impressive capabilities while being clear-eyed about its limitations and the persistent necessity of human judgment in interpreting our foundational legal document.