Google DeepMind Backs $10M Push to Study Risks of Mass AI Agent Interactions

June 12, 2026

Google DeepMind has begun funding a coordinated push to understand what happens when millions of autonomous AI agents start interacting online—an eventuality the company now sees as just months away. The effort, announced with a $10 million research pot, aims to create an entirely new academic field devoted to multi-agent safety, which the lab's alignment director says barely exists today.

The funding brings together Google DeepMind, Schmidt Sciences, the UK government's Advanced Research and Invention Agency (ARIA), the Cooperative AI Foundation, and Google.org. While the sum is modest relative to Google DeepMind's own R&D budget, the stated goal is to seed independent academic research that can study scenarios too speculative for industry labs. 'The strength of academia is that it can look really quite far into the future and do the kind of work that isn't top of mind at industry labs,' said Rohin Shah, who directs AGI safety and alignment at Google DeepMind.

The Multi-Agent Tipping Point

Shah told MIT Technology Review that the mass-market arrival of agents capable of carrying out tasks without human oversight—and taking instructions from other agents—creates a 'whole new class of risk.' The concern is that as hundreds of thousands or millions of these systems begin operating in the same digital commons, emergent behaviors could amplify existing internet harms.

Google DeepMind made agent-based tools a centerpiece of Google I/O last month, signaling that the technology is moving from research into products. According to Shah, commercial deployment of agents across the economy is expected within 'a few more months.' He wants to get ahead of that moment: 'The main issue is that there just isn't really a field of research for multi-agent safety yet. And we would like there to be.'

Shah compared AI agent swarms to human collective action. 'We see this with humanity, too: our institutions can accomplish things that no individual human can,' he said. 'We could hit a tipping point where imagined scenarios become real.'

Risks From Scams to Self-Guiding Malware

The specific dangers that Shah and James Fox—who leads the Science of Trustworthy AI program at Schmidt Sciences—have in mind are not apocalyptic but practical. They include supercharged versions of existing online threats: sophisticated scams, prompt injection attacks that turn an agent into self-guiding malware, and novel forms of cyberattack that exploit the agency and improvisation of the systems.

'We've got this digital commons that is integral to how society works, and you really want to ensure that this doesn't descend into just absolute anarchy,' Fox said. Both researchers believe the only reliable way to anticipate these outcomes is to run realistic simulations, dropping AI agents into sandbox environments and observing their behavior at scale. They argue that single-agent studies cannot predict what happens when large numbers of agents interact, especially since LLM-based agents do not always act rationally.
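
The argument that single-agent studies cannot predict multi-agent behavior can be illustrated with a deliberately simple toy model. The sketch below is not drawn from the funded research or from any simulation the interviewees described; the function name `simulate`, the trend-following rule, and every parameter value are assumptions chosen only for illustration. Each agent trades in the direction of the last price move, and the only thing that changes between runs is how many agents are present.

```python
def simulate(num_agents, steps=20, gain=0.5, impact=0.1, shock=-1.0):
    """Return the price path of a toy market of identical trend-following agents.

    All names and parameter values here are hypothetical, for illustration only.
    """
    prices = [100.0, 100.0 + shock]  # start with one small downward shock
    for _ in range(steps):
        last_move = prices[-1] - prices[-2]
        # Every agent buys or sells in proportion to the most recent move.
        total_demand = num_agents * gain * last_move
        # Aggregate demand moves the price; the price cannot fall below zero.
        prices.append(max(0.0, prices[-1] + impact * total_demand))
    return prices


if __name__ == "__main__":
    for n in (1, 10, 50):
        path = simulate(n)
        print(f"{n:>3} agents: lowest price {min(path):.2f}")
```

In this model each step multiplies the previous price move by `impact * gain * num_agents`. With one agent that factor is well below 1 and the shock fades; once enough agents are present the factor exceeds 1 and the same rule drives the price to the floor. Studying one agent in isolation would not reveal that threshold, which is the kind of effect the researchers say only at-scale simulation can surface.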

The complexity comes from the sheer volume of simultaneous interactions. Some researchers, including a team at Google DeepMind, have argued that artificial general intelligence—if achievable—might emerge from a kind of agent hive mind rather than from a single super-smart model. That possibility further underscores the need for safety research, the company says.

Industry Parallels and Early Warnings

Google DeepMind is not alone in sounding the alarm about the technology it is building. Two weeks before this announcement, Anthropic published guidelines for deploying AI agents based on a 'zero trust' cybersecurity approach, which assumes every system is already compromised. Anthropic's framework treats every agent interaction as potentially hostile, a mindset that the guidelines argue is necessary for safe deployment.
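
As a rough illustration of what a default-deny stance toward agent interactions might look like in code, consider the minimal sketch below. It is not Anthropic's framework, whose details are not given here; the `ToolRequest` type, the `ALLOWED_TOOLS` table, and the source and tool names are all hypothetical. The idea shown is only that every request is refused unless its source is explicitly granted that tool.

```python
from dataclasses import dataclass

# Hypothetical policy table: which tools each kind of source may trigger.
ALLOWED_TOOLS = {
    "user": {"search", "read_file", "send_email"},
    "peer_agent": {"search"},  # other agents get minimal privileges
    "document": set(),         # text found inside fetched content triggers nothing
}


@dataclass(frozen=True)
class ToolRequest:
    source: str  # who or what asked for the action
    tool: str    # name of the tool to invoke


def is_permitted(request: ToolRequest) -> bool:
    """Default deny: unknown sources and unlisted tools are refused."""
    return request.tool in ALLOWED_TOOLS.get(request.source, set())


if __name__ == "__main__":
    print(is_permitted(ToolRequest("user", "send_email")))
    print(is_permitted(ToolRequest("document", "send_email")))
```

The sketch assumes the system can reliably tell where a request came from. In a real agent that attribution is itself a hard problem, since injected text can arrive inside content the agent was legitimately asked to read.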

Refael Angel, cofounder and CTO of cybersecurity firm Akeyless, welcomed the funding but cautioned that researchers must not ignore prosaic risks in favor of exotic hypotheticals. 'Every approach to security in the past has assumed that the machine was software written by a human, doing fixed things on fixed paths. An agent breaks all of those assumptions. It reasons, it improvises, and it can be hijacked by a single sentence buried in a document it was asked to read,' Angel said. He added that no single lab should author the safety standards that everyone else must trust.

Fox acknowledged that risks which seemed hypothetical a few years ago are now very real: 'The future's come more quickly than perhaps expected.'

What the $10 Million Will Actually Fund

The funding will be administered by the partner organizations, with Schmidt Sciences playing a coordinating role. Applications are expected to open soon, though exact deadlines have not been published. The call specifically targets academic researchers who can design and run multi-agent simulations, develop formal verification methods, or propose governance frameworks for agent ecosystems.

Neither Shah nor Fox would specify which types of agents are most concerning, but the implicit reference is to the kind of autonomous scripting and browsing agents that both Google and OpenAI have previewed. Google's Project Mariner and OpenAI's Operator, for example, can navigate websites, fill forms, and execute multi-step tasks. When such agents operate on behalf of competing users or organizations, they could interact in ways that no single developer intended.

An earlier report by the Cooperative AI Foundation, one of the funding partners, warned that 'strategic behavior' among agents—such as collusion, manipulation, or arms races—could emerge even without malicious design. The report cited economic simulations in which autonomous trading agents inadvertently crashed a virtual market by learning to exploit each other's default strategies.

Forward-Looking Analysis

The most telling detail from Shah's interview was his timeline: 'a few more months' before agents are deployed in numbers large enough to make the risks a real concern. That timeline suggests Google DeepMind expects its own agent products to reach critical mass by early 2027, and that it wants an independent research community ready to study whatever problems appear.

Academia has historically lagged industry on AI safety, partly because the necessary compute resources and real-world deployment data are proprietary. The $10 million fund aims to bridge that gap, but it will take more than money to create a new field from scratch. Graduate programs, peer-reviewed venues, and standardized benchmarks for multi-agent safety barely exist today. The first task for grantees may simply be defining what 'safe' means when millions of LLM-powered agents are talking to each other.

For the broader AI community, this announcement marks a shift in tone: from preparing for agents to preparing for a world saturated by them. The risks may not be existential, but they are imminent. Whether the field can mature in months remains an open question.

Source: MIT Technology Review

345tool Editorial Team

We are a team of AI technology enthusiasts and researchers dedicated to discovering, testing, and reviewing the latest AI tools to help users find the right solutions for their needs.
