
Cybersecurity & Tech

Lawfare Daily: Jonathan Zittrain on Controlling AI Agents

Jonathan Zittrain, Kevin Frazier, Jen Patja
Thursday, October 17, 2024, 8:00 AM
What are AI agents?

Published by The Lawfare Institute
in Cooperation With
Brookings

Jonathan Zittrain, Faculty Director of the Berkman Klein Center at Harvard Law, joins Kevin Frazier, Assistant Professor at St. Thomas University College of Law and a Tarbell Fellow at Lawfare, to dive into his recent Atlantic article, “We Need to Control AI Agents Now.” The pair discuss what distinguishes AI agents from current generative AI tools and explore the sources of Jonathan’s concerns. They also talk about potential ways of realizing the control desired by Zittrain. For those eager to dive further into the AI agent weeds, Zittrain mentioned this CSET report, which provides a thorough exploration of the promises and perils of this new step in AI’s development. You may also want to explore “Visibility into AI Agents,” penned by Alan Chan et al.

To receive ad-free podcasts, become a Lawfare Material Supporter at www.patreon.com/lawfare. You can also support Lawfare by making a one-time donation at https://givebutter.com/c/trumptrials.

Please note that the transcript was auto-generated and may contain errors.

Transcript

[Intro]

Jonathan Zittrain: You shouldn't have those machines able to do things in the world unattended, including disgorging their private contents without going back to the user for an affirmation.

Kevin Frazier: It's the Lawfare Podcast. I'm Kevin Frazier, Senior Research Fellow in the Constitutional Studies Program at the University of Texas at Austin, and a Tarbell Fellow at Lawfare, joined by Jonathan Zittrain, director of the Berkman Klein Center at Harvard Law.

Jonathan Zittrain: They seem to have two phases. The first is too early to tell, and the second is too late to do anything about it, and, you know, where do we devote our energies? And that is a score for the too early to tell column, given how much harder it is to remediate later.

Kevin Frazier: Today, we're talking about AI agents, and why Jonathan so firmly believes this next wave of AI technology warrants a proactive and far reaching regulatory response.

[Main Podcast]

So, 2010 seems like eons ago. Yet, as you pointed out in your recent Atlantic article, algorithms were already capable of causing widespread and rapid societal disruption even more than a decade ago. Can you remind our younger listeners, or perhaps less historically-inclined listeners, about this Flash Crash that occurred in 2010 and why it's so relevant today?

Jonathan Zittrain: Well, sure, and thanks so much for having me on, and yes, let's talk to the young'uns, and instead of telling them to get off my lawn, gather round, pull up a lawn chair, and we'll talk, and before even talking about Flash Crash, as long as we're going back in time, it might be worth talking about the 1988 Morris worm, when Robert Tappan Morris Jr., or the third, I forget his suffix, created something on the internet that it turned out lodged itself in different Unix compatible hosts and then propagated further. And before he knew it, unintended, it was sort of everywhere.

And that was in 1988, a kind of wake-up call about, first, unintended consequences, Mickey Mouse and the broomsticks from Fantasia, but now we're going back to like the late 40s and early 50s, and about the ways in which you could have set-it-and-forget-it systems. I mean, he set it and he forgot it until he, you know, was reminded that his children had many descendants and they were all cluttering up the landscape. Yeah, and it was easy enough to mitigate once people understood what was happening, and of course it was a small enough community, certainly compared to today's, and one whose equipment was run by network experts, almost by definition, that they had a way of getting the, dare we call it, agent out of the picture. And the biggest upshot policy-wise was like, wow, these systems are so open, they're so ready for reprogramming or handling traffic from other destinations. And like the basic idea of the internet then and now is that anybody ought to be able to communicate with anybody else without intermediaries getting in the way, which is not the way many other networks are configured, and that could lead to problems.

One of the other sort of observations coming out of it was the need for ethics training for people coming on the network. Again, sort of assuming it was a professional community rather than just open at-large. But of course, we know how that story turned out: the network became open to everybody. And we celebrate that fact, even as the underlying protocols weren't meaningfully evolved from very different expectations of who would be using the network, what level of expertise they would have and what they'd be doing.

And as the Morris worm incident shows, even if you've got a ton of expertise, this guy, you know, has been an MIT professor, things can go awry. Fast forward to the Flash Crash in 2010, and it shows that within even a limited domain, you have some set of if this then that rules for trading in an electronic market, which, you know, that's what electronic markets are for. They're not just expecting people to be sitting there waiting to click the minute they're like, ah, all right, I just read the Fed report and now I'm going to click buy because I have a judgment about what will happen next. But you kind of set it so that these things will operate on their own.

It's just that they tend to anticipate a world in which everybody else is not evolving or changing. And with the static world on which you base your if-this-then-that judgments, or, as we start to introduce the phenomenon of machine learning, essentially forms of pattern recognition, the static world on which you train your machine learning model, the very introduction of that model, and others introducing their own models, could result in very strange transactions that no model could anticipate, because all of them anticipated a world without other models.

Now that's— I should just give a quick example, and the Flash Crash may well have been a source of a good example of this. There are multiple sort of post-mortems of it that are somewhat equivocal about what was happening there. But a guy named Michael Eisen discovered at one point, around that time, that on Amazon there was a used book that was up for sale, as is often the case on Amazon, nothing shocking there. But he noticed that it was selling for something like $2.4 million. Talk about the dangers of one-click purchasing if you're not really paying attention.

Kevin Frazier: That's an expensive used book. I'm not sure I'd splurge for that, yeah

Jonathan Zittrain: Exactly. And like, what's the return policy and who pays for shipping? And indeed, it was like $2.4 million plus $3.99 shipping. And he was curious because that seemed expensive. And the only other copy of the book being offered, apparently, was in the high $1 millions. So both of them were very expensive. And he started tracking it day by day, and each seller was slightly raising their price every day in a way that was, dare we say, algorithmic, which means it seemed to follow a very straightforward pattern.

Seller number two, the cheaper one, was taking whatever the next highest price was and trying to undercut it by 1%. That appeared to be the rule that was enforced. This is what we call explainability, not interpretability. We're looking back and just trying to surmise what appears to be going on. And the other seller, who was more expensive, appeared to be taking the next highest price, I should say next lowest price, and adding just a little bit on top, like 30%, to whatever it was. And what that means is it was jumping by 30% every day, and then the next day the other one was jumping to be 99% of the 30%. And what do we surmise? This is now trying to explain the explanation.

What Eisen surmised was going on was that one was just doing classical economics. They had a copy of the book and they just wanted it to be the cheapest one, but not too cheap. So like a corner gas station, they were just doing 99% of the other one. Classic race to the bottom that we hope would happen to give consumer surplus between two sellers. And he thought that the other seller probably didn't have the book, and was just going through and quoting 30% higher prices on tons of books, and if anybody should just happen to click to buy it, they would turn around and go to the other seller, order the book, have it delivered to the buyer, and collect 30 percent as a kind of vig. Both are totally rational approaches that, when you put them together, lead to an escalating price spiral into the millions of dollars for a book that should be 20 bucks.
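A minimal sketch, not from the episode, of how the two surmised pricing rules interact: one seller undercuts the competing price by 1%, the other marks the competing price up by about 30%. The starting prices and the loop itself are illustrative assumptions.

```python
# Hypothetical simulation of the two surmised repricing rules. The specific
# starting prices are invented; the 0.99 and 1.30 factors are the rough figures
# from the discussion.

UNDERCUT = 0.99  # seller A prices at 99% of the other listing
MARKUP = 1.30    # seller B prices at 130% of the other listing

price_a, price_b = 20.00, 25.00  # made-up starting prices for a ~$20 book

for day in range(1, 46):
    price_a = round(price_b * UNDERCUT, 2)  # A reprices against B
    price_b = round(price_a * MARKUP, 2)    # B reprices against A

print(f"after {day} days: A=${price_a:,.2f}, B=${price_b:,.2f}")

# Each daily cycle multiplies both prices by roughly 0.99 * 1.30 ≈ 1.287, about
# 29% compound growth per day, so a $20 book crosses $1 million in ~6 weeks.
```

Neither rule is unreasonable on its own; the runaway comes entirely from the two rules re-reading each other's output every day.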

So sorry to take so long to explain it, but this is long before machine learning algorithms might have been deployed for this kind of thing, very simple human-level algorithms. And yet the unexpected leads to a systemic kind of surprise that might be, when applied to financial markets or other circumstances, undesirable. I see this systemically as an analogy to so-called technical debt. The technical debt being: you patch stuff, it works well enough. You patch it a little more and people start to forget exactly what the elegant idea was behind the whole system. Like back in the day, when you had multiple audio-video components in your home entertainment system, you start to forget what all the different wires do, and they're all different connectors. That's a lot of technical debt.

At some point you just unplug everything and try to just do it all over again with newer components, rather than trying to reverse engineer a theory of what the hell your TV is showing and why. And if that again is done systemically, for all sorts of supply chains like that of the Amazon books or for financial markets, unexpected things can happen. And as I imagine, that's what's pointing us in the direction of talking about what agents are and in today's argot and therefore what might be different.

Kevin Frazier: And what's wild to me, too, is that you're mentioning this socio-technical debt. Morris, we didn't learn our lesson. This book example, arguably we haven't learned our lesson from our incredible Amazon $2 million used book. And if we look at the Flash Crash, if you hear the latest statements from the SEC Chair Gary Gensler, we also haven't quite yet learned our lesson about how even well-intentioned over reliance on these set-it-and-forget-it approaches can cause those systemic risks.

Jonathan Zittrain: I think I agree with that. And if we're just going to carry the aphorism a little further, it's not just, we didn't learn our lesson. It's that maybe we learned our lesson. But nobody owned putting the lesson into practice. People can look back and say, yeah, that was probably bad. Maybe in the case of the Amazon books, it's like, well, just buyer beware. And like, you know, whatever. There's a market of markets and that will fix itself and maybe Amazon would notice, blah, blah, blah. But when we start thinking about agents acting at the borders of different spheres of operation or responsibility, in between different organizations, different firms, different marketplaces, nobody is asked to internalize the risk, and maybe nobody does.

And I am among those. The book I wrote now almost 15 years ago, The Future of the Internet and How to Stop It, for which I'm madly working on the sequel right now, called Well, We Tried, celebrated a universe in which people didn't have to be accredited to anybody, to the government, to some platform operator, to introduce something new online. You could just do it and see if you could build an audience around it. And that book acknowledged some of the problems, including security ones and ones like the Flash Crash that could come about. And this was part of how to stop the future: you know, it'd be nice to have some wise restraints or standards to prevent the worst obvious abuses, the ones we've all learned our lesson about and that nobody really wants, from coming about, and yet it's really hard to make anybody own stuff. And that was the basic ethos, regulatorily speaking, of the mainstreaming of the internet versus much more controllable alternatives like, back in the day, AOL and Prodigy and Delphi and CompuServe and MCI Mail.

The basic idea of the internet was anything not explicitly prohibited is permitted. I call that the Venn diagram cocktail olive of digital regulation 'cause it's a big green oval of all the stuff permitted with a tiny pimento in the middle, which is the handful of stuff you're not allowed to do.

And that's the foundation for all of the benefit and much of the headache we've seen, and a tradeoff that is hard to quantify, but which most of us thought was a pretty good tradeoff for a free and open society. It's just that without addressing these problems, and allowing more and more not only decisions to be made but decision rules themselves to evolve without any oversight or comparison against, wait a minute, is this really a great idea, and who's getting the bird's-eye view of the system, that starts to really show the pain points of everything not prohibited is permitted.

Kevin Frazier: And continuing on with our intoxication with just ease or convenience or using the latest technology, we're seeing, in addition to technical debt and socio-technical debt, the introduction of these AI agents.

So can you briefly explain, just for folks who perhaps have missed the AI wave so far, they've just been drinking too many cocktails, perhaps: what exactly are AI agents? How is that different from something like ChatGPT? And what is it about AI agents that makes them scarier than previous instances of, let's just boil it down to, set-it-and-forget-it technologies?

Jonathan Zittrain: Yes, and I'll say up front, you're going to get different definitions from different people, which is entirely fair. And there's some cool readings that I imagine we could, you know, include on the page, such as a great roundup paper from Helen Toner, formerly on the OpenAI board, now at CSET, talking about AI agents, and Alan Chan and others have done a ton of work on this, so we can provide all that. And then I can give you my own sort of tripartite definition of an agent.

And for that, basically, I think of them as dials. And the more the dials are turned towards 11 from 0, the more we're talking about— this is the weird adjective that makes it sound very esoteric— agentic AI. So, this is at least my definition, understanding that it's fuzzy. So, the first is the idea of being able to be autonomous. And that is not on or off. I found myself in the mid-2010s starting to talk about autono-mish agents instead of autonomous agents. I'd be interested to see what whatever automatic transcript-generating AI makes of the word autono-mish.

Kevin Frazier: Good luck with that.

Jonathan Zittrain: Yeah. But by that I mean that instead of specifying exactly what you want to have happen, and therefore, at least as the human, instructing a computer what to do, where you are naturally in a position to take responsibility for what it does because you gave it the orders, you are just giving it some basic goals and asking it to figure out how to advance them.

And that is something that comes from playing with large language models, which of course, by their name and nature, have language at their core: you give it some language and ask it to spit out other language that you hope will resemble a series of steps of greater specificity than what you asked it to do, that will have it advance the goal. They might even be steps that in turn instruct the model itself, so it can feed back into itself what it does. But there's this idea of a general goal or high-level plan, and then let it do the rest.
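A minimal sketch of the loop being described, under assumed names: a hypothetical propose_next_step function stands in for a call to some large language model, a high-level goal goes in, the model proposes more specific steps, and its own output is fed back to it as context for the next step.

```python
# Sketch of the goal-to-steps loop described here: the human supplies only a
# high-level goal; the model proposes increasingly specific steps, and each
# result is fed back into the model as context for the next step.
# propose_next_step() stands in for a real LLM call and is hypothetical.

def propose_next_step(goal: str, history: list) -> str:
    """Placeholder for an LLM call that turns a goal plus prior steps into the next step."""
    return f"step {len(history) + 1} toward: {goal}"

def run_agent(goal: str, max_steps: int = 5) -> list:
    history = []
    for _ in range(max_steps):          # bounded loop, just for the sketch
        step = propose_next_step(goal, history)
        print(f"agent proposes: {step}")
        history.append(step)            # the agent's own output becomes its next input
    return history

run_agent("book a reasonable dinner for Thursday")
```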

And if it turns out to want to do something by rather unconventional means, and has the means to then instruct itself and possibly pursue it, you can end up with surprises, in how it chooses to do things, of the sort that a Flash Crash or an Amazon book purchase can accomplish. And you know, that possibility has been in the realm of science fiction and popular literature for a long time. Even before you're talking science fiction, you're talking about monkey’s-paw-like, be-careful-what-you-ask-for things.

You know, or this will date me, you know, Homer Simpson: you want doughnuts? I'll give you doughnuts! And they start stuffing Homer full of doughnuts as a way of rendering an ironic punishment. Of course, in that case, Homer's just like, this is great.

Kevin Frazier: The more donuts the better. Let's go.

Jonathan Zittrain: Exactly. Exactly. And of course, that even calls to mind Nick Bostrom's 2014 Superintelligence example of a paperclip optimizer that ends up destroying the world just so it can make more paperclips. And I think choosing paperclips is designed to get us to do a record scratch as we think about a very modest but still high-level goal, accomplished through any means necessary and to the most kind of maximal extent, in a way that just wouldn't have been on our minds.

And again, that captures the accidental nature rather than intentionally doing things, but, you know, intentionally doing things can be bad too. It may be that a whole category of bad things in the world would have required, before we got to this stage, a bunch of expertise, which would greatly limit already the number of people capable of doing the bad thing. They'd have to be experts to do it, or experts consulted to do it, who then might be in a position to say, why are you asking me these questions?

And even if you figure, well, there's a textbook somewhere that tells you how, you've got to go to the trouble of reading it, et cetera, et cetera. Whereas here, you could conceive of agents that, if they are cured of hallucinations but not otherwise guardrail-restricted, you can ask them to do pretty terrible things and they will come up with non-hallucinatory ways to do it. So that's the first thing: being autono-mish in the sense of independently coming up with their own solutions.

Kevin Frazier: And before we go to two, I think what's important to point out about that as well is that we don't even have to imagine the bad actor getting a handle on AI agents to have some bad outcomes. As we learned from the Morris worm, as we learned from just our savvy, I guess, or lack of savvy or lazy Amazon booksellers, just rational uses of these tools by well-intentioned or neglectful folks can lead to really bad outcomes. And I think what's important to call to the attention of regulators, or whomever is listening, is that we don't have to imagine going to the full paperclip scenario to imagine some regulatory headaches that warrant addressing in this case.

Jonathan Zittrain: I think you're right. And if we're being really analytic here, we've come up with at least two distinct structural scenarios, one of which has to do with some model embedded in some system being prompted to do something and given great latitude in determining the means by which to do it. And it picks means that the ends do not justify and that are surprising because-

Kevin Frazier: Buy that book! Buy that Amazon book for as cheap as possible!

Jonathan Zittrain: Right. Whatever it takes, buy the book, right? Yeah. Well, $1.9 million is as cheap as possible. And it's really hard to specify up front. It might be more effort to say all of the limiting features to make sure it does it right, than to just not have it try to come up with the means but instead give it the means, at which point what are we even doing here?

And being able to have done it successfully for a while and then have it bonk in weird and surprising ways is one of the flowers in the bouquet of large language models and of other sorts of generative AI that just comes with the territory. It can fail weirdly, even if it has done very well up to that point, because these things, you never quite know what they're going to give you. They are the Forrest Gump of technology, and it's a box of chocolates, and you just might get pralines when you hate nuts.

Kevin Frazier: Or one filled with poison!

Jonathan Zittrain: Right, exactly, I was going to say, it doesn't quite capture the range of things that can go wrong.

The other piece within the zone that we're dwelling on is that it might still be basically doing what somebody without an even bigger picture of humanity might do, that turns out to do unpredictable things because the world has changed. And often what's changing its world is the presence of other agents from other people or sources and that can lead to unusual behaviors. Basically, whatever the model might have been trained on for a world in which to operate is not able to anticipate new conditions. So that's what I've been calling autonomous. Autonomous itself can mean different things and I am open to some critique that says use the word for something else.

And in fact, I think that brings us to a second area of agentic AI, which is that it can tend to operate outside its sandbox. And the first way in which many listeners might have encountered large language models, something like ChatGPT, was on the so-called playground. The playground of GPT, and you just go and you type at it and it types back at you. Or maybe there's some product, maybe you're accessing some airline's website, and there's some helpful assistant, and it turns out it's, you know, GPT-powered. Fine. And then you can maybe try to get it to do weird things in the chat with you, but it's still just chatting. And after all, it's just words.

But part of what has shocked me about how quickly things are moving, not just in degree, but in kind, was what used to be called ChatGPT plugins. They go by a different name now. But ways in which you can have the model connect to the world at large automatically and utter words that will make things happen in the world. So maybe Domino's Pizza or DoorDash or Instacart has some API, either intentionally meant to connect to something like GPT or not. Something like GPT beat a path to its door and is acting like it's a human, maybe acting on behalf of whoever stood up this chat window, and then it's ordering pizzas, or ordering something else, or placing trades on a market.

This is what I call a very casual traversing and breaking of the blood-brain barrier between just words, where it's up to whoever's hearing them what they do on that basis, and just making it so, making it happen in the world, busting out of the sandbox. And if the difficulties of securing the Internet, and the generative machines, in the old use of the word generative, that are connected to it, like reprogrammable personal computers or devices like that, have taught us anything, it is that you shouldn't have those machines able to do things in the world unattended, including disgorging their private contents without going back to the user for an affirmation.
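A minimal sketch of that "go back to the user for an affirmation" gate, assuming a hypothetical order-placing tool: the agent can only propose an action that would cross from words into the world, and nothing executes without an explicit human yes.

```python
# Sketch of a human-in-the-loop gate between an agent's proposed action and the
# outside world. The agent can only propose; nothing executes without an explicit
# affirmation from the user. The tool name and arguments are hypothetical.

from dataclasses import dataclass

@dataclass
class ProposedAction:
    tool: str        # e.g. "pizza_api.place_order" (invented for illustration)
    arguments: dict  # what the agent wants to send across the "blood-brain barrier"
    rationale: str   # the agent's stated reason, shown to the user

def confirm_with_user(action: ProposedAction) -> bool:
    print(f"Agent wants to call {action.tool} with {action.arguments}")
    print(f"Stated reason: {action.rationale}")
    return input("Allow this? [y/N] ").strip().lower() == "y"

def execute(action: ProposedAction) -> None:
    # In a real system this would dispatch to the external API, i.e. the world.
    print(f"Executing {action.tool}({action.arguments})")

def run_gated(action: ProposedAction) -> None:
    if confirm_with_user(action):
        execute(action)
    else:
        print("Action declined; the agent stays in the sandbox.")

run_gated(ProposedAction(
    tool="pizza_api.place_order",
    arguments={"item": "large pepperoni", "quantity": 1},
    rationale="User asked for dinner to be arranged.",
))
```

As the next exchange notes, a gate like this only helps if users do not rubber-stamp every prompt that appears.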

And even that, of course, is not a solution because users get these prompts all the time. Grant this permission, you know, such and such. This new hamster dancing app where an image of a hamster will dance on your screen would like the following permissions, yes or no? And you're just like, yeah, whatever. I want to see the hamster dance.

Kevin Frazier: Have all my cookies. Take the cookies. Just give me more hamster!

Jonathan Zittrain: Exactly. And that's an unresolved problem, usually addressed in the critical infrastructure context with, you know, an old and perhaps no longer so right method: air gapping, where you're just like, you've got to keep it away from the internet at large, both to be instructed by it and maybe to instruct it, but rather have a human run that last inch over the gap so that you know what your inputs and outputs are. And whatever lessons we might learn from that are not being applied in the rush to make these models able to be part of a very complicated daisy chain of stuff happening in the world.

And again, it's like, this is the cocktail, all of it work. It's like, why not? Try it out. But it's really hard to get anybody to internalize the negative prospects of what can happen unpredictably, including, you know, end users who aren't aware of those edge cases or just impatient to see the hamster dance.

And that is a real problem. And I'd say it's compounded by the fact that there is no inventory of where these models are embedded, who set them up, how they work, what the initial expectations were, so you can see if they're being used in ways that are unpredictable, the classic like using a screwdriver to open a paint can. So you know, oh gosh, that's what people are using screwdrivers for, I think we maybe need a safer alternative, whatever it might be. There's none of that, and that's a phenomenon going back to technical debt and what you were calling socio-technical or other debt.

I call it intellectual debt: that you don't know what these things are going to do, and yet we're just madly building them into the cinder blocks while we pour concrete, and then it's anybody's guess where they are later. And that has caused me for many years to analogize machine learning models, deployed this way, to asbestos. It's like they're really useful. They're wholesale, not retail. You don't know that a machine learning model was part of bringing you whatever experience you just had. It's not always going to be a chatbot that you know you're talking to. It just might be somewhere in the middle. And it's great until it's not. We may discover problems later, at which point it's really hard to know where the models are, how to remediate, that kind of thing.

And it feels like it would be a lot easier to demarcate them as they're getting added, or to have a standard for how they identify themselves now, rather than waiting for the problem and trying to retrospectively figure it out. And that is then an answer to the eternal question for technologies up and down the scale like these, which is they seem to have two phases. The first is too early to tell, and the second is too late to do anything about it, and you know, where do we devote our energies? And that is a score for the too early to tell column, given how much harder it is to remediate later.

Kevin Frazier: Well, and I think your example of AI agents being akin to space junk is right on the nose. Because it's just this instance of let's launch a whole bunch of stuff into space, see what happens, see what new innovations we can come up with, and damn the consequences. And now the irony is that this space junk itself is hindering our ability to innovate in space and to try new ideas and to explore further. And so this failure to anticipate, to get ahead of this tragedy of the commons, tragedy of the space, is actually depleting our ability to come up with new and better solutions down the road.

Jonathan Zittrain: Well, and what's more, to me, the reason the space junk metaphor just felt very on target to me was that space junk begets more space junk. Because if it collides with other space junk, it produces more fragments, more shrapnel, that then ultimately could lead to a little sphere— you think Starlink is cluttering the skies— a little sphere at classic low Earth orbit altitude that might make it really hard to leave the planet because there's all this junk around. And talk about too early to tell versus too late to do anything about it where we might want to do it.

But yes, also a great example of nobody being responsible for the problem. I remember it was a big deal in the late 1990s when Space, now Space Command, just started tracking junk, you know, more than X number of feet wide, at least so we know where it is. That's the inventory point, which is, you know, got to start somewhere, but again, nobody really owns doing anything about it.

Kevin Frazier: No, I mean, we keep creating these massive agglomerations of junk. If we look at the Great Pacific Trash Heap or whatever, and now we may have our own little moon of space junk. And soon we may have a body of AI agents that are just acting in some weird fashion that we can't predict. I think to your point about the interaction between these AI agents being a really big concern from a regulatory standpoint, Professor Ashley Deeks came on the pod and discussed how, especially in a national security context, we just don't understand how some of these weapon systems that are agentic in some fashion may interact with one another, and what sort of chaos we might see in that regard.

But you're just pointing out even ordering Domino's via an agent could lead to unintended and severe consequences. Maybe, you know, we'll say they're ordering too many toppings and then it gets off the rails, but-

Jonathan Zittrain: No, I think it's very fitting to have a new Domino theory in national security law.

Kevin Frazier: We've coined it. We've coined it. Here it is!

Jonathan Zittrain: Yeah, Domino theory part two. It's not, you know, related to the spread of communism,

Kevin Frazier: It's the spread of pepperoni and tomato sauce. Yes.

Jonathan Zittrain: Yes. And that also both hints at some solutions and, before we just go there, gets to the third quality that I'm thinking tends to adhere in this agentic area, which is set-it-and-forget-it: that the motion of these agents can be Newtonian rather than Aristotelian.

Instead of you having to push it and keep it going, the way that for many services, and things that you do, you have to keep putting a quarter in the machine, to use an old metaphor, when you sign up for the subscription or something like that, it's, you know, the top doesn't just spin forever. And that provides itself a natural checkpoint, because then you also have to stay connected to it enough. And there's a subscription trail that leads back to you if there's a problem.

These don't have to be that way, and that is distinct from escaping the sandbox. It is distinct from what I've been calling autonomy-ish, about going from high-level goals to particular plans and implementations, and sadly, this also might be described as a form of autonomy, but I think of it more as an autopilot kind of thing. It's not about making discrete decisions. That's part of the first aspect. But rather that the momentum is inertial, and whoever got it started might be long gone. And their disappearance doesn't affect the path of the program.

And there are any number of ways that could be happening. I realize a duck might come down and I might have $50 taken from me if I dare use the word blockchain. But when you talk about computational blockchains or distributed autonomous organizations, things like that, that have been kicking around as solutions looking for problems, and many would say maybe they have found it, others would say they haven't. But these are tools by which you can set something up, endow it financially enough to run, like a cemetery plot, for a long time, and then you peace out. And then you've just got these vehicles running around doing stuff, and turning off an entire blockchain in order to stop a bad propagation both seems excessive and possibly isn't possible, given the distributed, unowned nature of things like that.

You don't even have to get in the blockchains before you can see services that will pop up. Just like you can reserve a domain name for the next 10 years and just front the money. You could say, all right, great, I'll buy 10 years’ worth of computing to make such and such happen. And it may be very hard to get through the tangle, especially if somebody wants something to persist as against anybody trying to stop it, to figure out even where it's emanating from. And that set-it-and-forget-it nature could create massive headaches when trying to remediate obviously bad, harmful individual instances of behaviors online that are out of control. And that could include like, you know, another example would just be to really use an ancient narrative formulation: levying a curse upon somebody. Just ask a bot.

Kevin Frazier: You thought the Simpsons were gonna age you, and now you're going to curses.

Jonathan Zittrain: Yeah, I mean, this is like, you know, animating a golem or something. But, you know, you levy a curse, and you're just like, I'm willing to put $500 to making such and such person's life miserable online, wherever they pop up with their name. And this is their identifying characteristics. You know, great if you can join that social networking service or dating service or whatever it is, and just make sure they're miserable, you know, with whatever it takes. And the more you can appear multiplicitous, like you're not just one person but many, the better. Let them just, you know, catch what will feel like the full fervor of discontent across hundreds of people just swarming them every time they dare to utter something online. That sounds bad.

And you know, this is me not being all that creative in coming up with this one particular fact pattern, but think about exactly what you would do if that happened and the person who got it started had long wandered away or forgotten their grudge. And you know, I think a lot of people saw this just in the past week, if we're going to date this podcast as it persists online through the ages. There were some students who were playing with the new, I forget which company's it was, might have been Meta’s, or somebody's new attempt at Google Glass, you know, eyeglasses that tell you what's going on. And they just hooked it up to PimEyes or one of these regrettable facial identification services.

And as you walk down the street, it's just identifying people for you and running and getting a short dossier on them; the semi-anonymity that we tend to count on in environments among strangers is just gone. And then you take moments of conflict or road rage, which already completely lose their context when people whip out the phone, start filming it, it goes viral, et cetera, et cetera. Alright, well, what if people start cursing, in the way we're describing cursing? It's just making more efficient and persistent the extremely regrettable dynamics that we've already identified with social media and the pile-ons that it occasions, as each person wants to express moral disapproval of somebody for one of their less noble moments in interacting with another human.

Kevin Frazier: And that's what's so scary to me: the scale of reliance on AI agents for a minimal amount of money, especially as over time we'll see these AI agents become cheaper and cheaper. You just set off an army, and that army exists forever, potentially, or exists for a very long time. And to your point, having that army follow you wherever you go is certainly not a spot where I want to be. And so, before we let you go, we have to at least learn what's one solution. Give us some hope about how we at least pursue this innovative agenda while also not pursuing a world full of curses and hexes and whatever bad ailments we can imagine.

Jonathan Zittrain: Well, everything comes in threes. So let me try giving three rough areas to sketch out here.

I think the first is to take seriously the word agent in its legal and social sense, not just in its technical sense or the definition we've been slowly spinning out here on the podcast. Which is to say, an agent is meant to represent a person. And when they do, they often owe special duties under the law and in our moral expectation that they will place the lawful expectations and interests of their principal over their own. If they don't, they have a conflict of interest and they are resolving the conflict in a way that they should not, which is to their own advantage.

And that's why, ideally, if somebody is your agent in picking stocks for you for your retirement fund, so you'll have a decent amount to retire on later, you would not want them picking them on the basis of what commissions they get. You know, there's some jackalope ranch in Florida, which is not a great investment, but they get paid a little kickback for it. That is an agent-principal problem, princi-pal. And I think it is utterly understudied, undertheorized, what the duties of these sorts of agents should be to be attentive to the genuine needs of their principals, respectively, and how to hold them to that.

And especially when you see that a lot of these agents might be free of charge, just like social media networks are free and email is free. The way to monetize it may be through having a separate set of interests and this agent, which is now quite literally whispering in your ear as you go about the world and rendering advice, it's acting like your friend. It might not be.

And so, with Jack Balkin, who coined the term information fiduciaries, I've been doing work as well saying, all right, what would that look like, what would it look like to be a fiduciary? And we're looking for solutions and even have a website up, which we could add in the links below the podcast, to actually just see what people's expectations are when they're online right now: what are your expectations of an agent? Nobody has thought about it, including the consumers, but they slip into these, these are my friends kind of thing, because the thing is anthropomorphically designed to be very friendly and attentive. And how can I help? And it has infinite patience. Alright, so that's one cluster of solutions, having to do with not letting agents be duplicitous in their design and incentives as they are offered to people in the world.

A second area that I outline is modeled after old network traffic and routing, which is highly decentralized on the internet. And that means you run into a problem where packets might get set loose to be routed one hop at a time through all sorts of different technical jurisdictions. And if it's misconfigured a certain way, it's possible it could just go around and around forever. Like the old Charlie on the MTA, just circling around forever.

And for internet routing, there's a technical solution called TTL, time to live. And packets have a default number of hops that they expect to make. And if they are continuing to hop and not getting to their destination after 256 hops or whatever, it's understood that they will die: the router that catches them on their 256th hop is just like, you know what, it's not you, it's me, it's not working out. The packet will not get forwarded, which prevents the space junk problem of packets in an environment that is highly distributed and that anybody's permitted to launch packets into. TTL is a cool standard that's just part of the furniture now, and it headed off massive problems. And there should be, by analogy, a TTL for agentic behavior after it's done a certain number of steps.

It ought to, you know, chill out until it is reanimated through human intervention or something. And maybe there'd be more steps for some kind of bots than for others. But then as it's going around, if you see something that has a label on it, just like a TTL label, that's like: I have a million steps. That might be the kind of thing where you're like, hmm, wonder what you're up to.
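A minimal sketch of that TTL analogy applied to an agent loop, under assumed names and numbers (nothing here is a real standard): every step decrements a budget carried with the task, and at zero the agent stops until a human deliberately tops it back up.

```python
# Sketch of a TTL-like budget for agentic behavior, by analogy with the IP
# time-to-live field: each step decrements a counter, and at zero the work
# stops until a human intervenes. The class and numbers are illustrative.

class StepBudgetExhausted(Exception):
    """Raised when the agent has used up its allotted steps."""

class BudgetedAgent:
    def __init__(self, ttl: int = 256):
        self.ttl = ttl  # analogous to a packet's remaining hop count

    def step(self, action: str) -> None:
        if self.ttl <= 0:
            raise StepBudgetExhausted("TTL hit 0; human reanimation required")
        self.ttl -= 1
        print(f"[ttl={self.ttl:3d}] performing: {action}")

    def reanimate(self, new_ttl: int) -> None:
        # Only an explicit human decision tops the budget back up.
        self.ttl = new_ttl

agent = BudgetedAgent(ttl=3)
try:
    for task in ["check prices", "compare sellers", "draft order", "place order"]:
        agent.step(task)
except StepBudgetExhausted as exc:
    print(f"Agent paused: {exc}")
```

A bot observed carrying an unusually large remaining budget would then be exactly the kind of label Zittrain suggests should raise eyebrows.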

And that gets to the third solution, which is having a means of identifying agentic behavior in the digital environment as distinct from human behavior, and that's ultimately a socio-technical judgment rather than just an easy cut-and-dried categorization for everything. And the papers and people I mentioned at the beginning of the podcast, folks like Helen Toner and Alan Chan, both include in their sorts of evaluations of what would be good here some way of identifying agentic processes where they live. This is where the model is running and this is, you know, what it's doing. This is a particular instance of ChatGPT, and it has a little license plate on it, kind of thing.

I'm thinking of a complementary mode of identification that might come through old-fashioned network protocols, because a lot of the harms we're talking about happen over the network. It's when something is communicating to something else, and there ought to be an easy wrapper around a given packet of data or of instruction (which instruction is also a form of data) online that says: by the way, I'm an agent, or I was emanated by an agent, or I am destined for an agent. And this is a means of reaching the parent process or person. And that might be behind layers of indirection. I don't think this necessarily entails having everybody have to identify themselves or something. License plates themselves provide a legally protected layer of indirection, so you could tell the authorities or other people: this is the license plate of the car that cut me off, but it doesn't immediately let you know who the person in the car is.
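A minimal sketch of that license-plate wrapper, with every field name and the registry invented for illustration: the message carries a visible, opaque plate declaring agent involvement, and only a gated registry can map the plate back to the responsible party.

```python
# Sketch of an agent "license plate": a wrapper around outbound data that
# declares agent involvement and carries an opaque handle, resolvable only
# through a gated registry, for reaching the responsible parent process or
# person. All field names and the registry itself are hypothetical.

import json
import uuid
from typing import Optional

REGISTRY = {}  # stand-in for a legally protected lookup service

def wrap_agent_message(payload: dict, parent_contact: str) -> str:
    plate = str(uuid.uuid4())         # the visible "license plate"
    REGISTRY[plate] = parent_contact  # indirection: only the registry maps plate to owner
    envelope = {
        "x-agent": True,              # "by the way, I'm an agent"
        "x-agent-plate": plate,       # identifier others can report or query
        "payload": payload,
    }
    return json.dumps(envelope)

def resolve_plate(plate: str, authorized: bool) -> Optional[str]:
    # Like a real license plate, resolution is gated: the public sees only the plate.
    return REGISTRY.get(plate) if authorized else None

message = wrap_agent_message({"order": "1,000 pizzas"}, parent_contact="ops@example.com")
plate = json.loads(message)["x-agent-plate"]
print(message)
print("public lookup:", resolve_plate(plate, authorized=False))
print("authorized lookup:", resolve_plate(plate, authorized=True))
```

A recipient like the pizza API in the exchange below could then refuse, or treat with skepticism, requests that carry no plate at all.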

All sorts of things you can do, but this is the time, before these agents are everywhere and you're trying to retrofit how they identify themselves, to come up with incentives and a standards structure for how they identify themselves online. You could say something like: under whatever the evolving tort regime turns out to be for things going awry and who will be held accountable, hey, if you've got this label in place, then you, some player in this multifaceted ecosystem who is somehow contributing to the way it works, will face a cap on just how much liability you bear. That alone could provide for opting in to labels, especially if we see a world in which most agents are coming from concentrated platforms that are consumer-facing.

If you're talking about just some weird bespoke agent that got spun up some other way, the fact that it doesn't have a license plate could make it a focal point of skepticism, especially, I don't mean just by the authorities, by Domino's Pizza. Hey, I'd like a thousand pizzas delivered here. Well, wait, who are you? Not your business. I don't know if we're going to start-

Kevin Frazier: You have to get a license before you order pizza. That's been the hill I die on. You need a license.

Jonathan Zittrain: Yeah, it'd be nice to know if, you know, the person ordering the pizza is the person that's going to be enjoying the pizza, that kind of thing.

Kevin Frazier: Well, this is excellent. I mean, we've got our inventory. We've got making sure we get rid of zombie AI agents and making sure AI agents have their license. I think our listeners have a lot of homework to do and are going to be on pins and needles waiting for your next paper so that we can have you back and dive into the weeds there. But unfortunately, we're going to have to leave it there, but thank you again for coming on. This was a hoot.

Jonathan Zittrain: Thanks, Kevin. Delighted to talk about this stuff and eager for other examples that might be brewing out there that people are trying to work through.

Kevin Frazier: Of course. And we'll keep it coming in a steady supply to a Domino's near you.

Jonathan Zittrain: Very good.

Kevin Frazier: The Lawfare Podcast is produced in cooperation with the Brookings Institution. You can get ad-free versions of this and other Lawfare podcasts by becoming a Lawfare material supporter through our website, lawfaremedia.org/support. You'll also get access to special events and other content available only to our supporters.

Please rate and review us wherever you get your podcasts. Look out for our other podcasts including Rational Security, Chatter, Allies, and The Aftermath, our latest Lawfare Presents podcast series on the government's response to January 6th. Check out our written work at lawfaremedia.org. The podcast is edited by Jen Patja and your audio engineer this episode was Noam Osband of Goat Rodeo. Our theme song is from Alibi Music. As always, thank you for listening.



Jonathan Zittrain is the George Bemis Professor of International Law at Harvard Law School and the Harvard Kennedy School of Government, Professor of Computer Science at the Harvard School of Engineering and Applied Sciences, Director of the Harvard Law School Library, and Co-Founder of the Berkman Klein Center for Internet & Society. His research interests include the ethics and governance of artificial intelligence; battles for control of digital property; the regulation of cryptography; new privacy frameworks for loyalty to users of online services; the roles of intermediaries within Internet architecture; and the useful and unobtrusive deployment of technology in education. He is currently focused on the ethics and governance of artificial intelligence and teaches a course on the topic. His book, "The Future of the Internet -- And How to Stop It", predicted the end of general purpose client computing and the corresponding rise of new gatekeepers.

His writing here and elsewhere represents his individual, independent views.
Kevin Frazier is an Assistant Professor at St. Thomas University College of Law and Senior Research Fellow in the Constitutional Studies Program at the University of Texas at Austin. He is writing for Lawfare as a Tarbell Fellow.
Jen Patja is the editor and producer of the Lawfare Podcast and Rational Security. She currently serves as the Co-Executive Director of Virginia Civics, a nonprofit organization that empowers the next generation of leaders in Virginia by promoting constitutional literacy, critical thinking, and civic engagement. She is the former Deputy Director of the Robert H. Smith Center for the Constitution at James Madison's Montpelier and has been a freelance editor for over 20 years.