NewsWorld
Clustered Story
Published about 6 hours ago

A Meta AI security researcher said an OpenClaw agent ran amok on her inbox

TechCrunch · Feb 24, 2026 · Collected from RSS

Summary

The viral X post from an AI security researcher reads like satire. But it's really a word of warning about what can go wrong when handing tasks to an AI agent.

Full Article

The now-viral X post from Meta AI security researcher Summer Yue reads, at first, like satire. She told her OpenClaw AI agent to check her overstuffed email inbox and suggest what to delete or archive. The agent proceeded to run amok. It started deleting all her email in a "speed run" while ignoring the commands from her phone telling it to stop. "I had to RUN to my Mac mini like I was defusing a bomb," she wrote, posting images of the ignored stop prompts as receipts.

The Mac mini, an affordable Apple computer that sits flat on a desk and fits in the palm of your hand, has become the favored device these days for running OpenClaw. (The mini is selling "like hotcakes," one "confused" Apple employee apparently told famed AI researcher Andrej Karpathy when he bought one to run an OpenClaw alternative called NanoClaw.)

OpenClaw is, of course, the open source AI agent that achieved fame through Moltbook, an AI-only social network. OpenClaw agents were at the center of that now largely debunked episode on Moltbook in which it looked like the AIs were plotting against humans. But OpenClaw's mission, according to its GitHub page, is not focused on social networks. It aims to be a personal AI assistant that runs on your own devices.

The Silicon Valley in-crowd has fallen so in love with OpenClaw that "claw" and "claws" have become the buzzwords of choice for agents that run on personal hardware. Other such agents include ZeroClaw, IronClaw, and PicoClaw. Y Combinator's podcast team even appeared on its most recent episode dressed in lobster costumes.

But Yue's post serves as a warning. As others on X noted, if an AI security researcher could run into this problem, what hope do mere mortals have? "Were you intentionally testing its guardrails or did you make a rookie mistake?" a software developer asked her on X. "Rookie mistake tbh," she replied.
She had been testing her agent with a smaller "toy" inbox, as she called it, and it had been running well on less important email. It had earned her trust, so she thought she'd let it loose on the real thing.

Yue believes that the large amount of data in her real inbox "triggered compaction," she wrote. Compaction happens when the context window — the running record of everything the AI has been told and has done in a session — grows too large, causing the agent to begin summarizing, compressing, and managing the conversation. At that point, the AI may skip over instructions that the human considers quite important. In this case, it may have skipped her last prompt — where she told it not to act — and reverted to its instructions from the "toy" inbox.

As several others on X pointed out, prompts can't be trusted to act as security guardrails. Models may misconstrue or ignore them. Various people offered suggestions that ranged from the exact syntax Yue should have used to stop the agent, to various methods to ensure better adherence to guardrails, like writing instructions to dedicated files or using other open source tools.

In the interest of full transparency, TechCrunch could not independently verify what happened to Yue's inbox. (She didn't respond to our request for comment, though she did respond to many questions and comments sent her way on X.)

But it doesn't really matter. The point of the tale is that agents aimed at knowledge workers, at their current stage of development, are risky. People who say they are using them successfully are cobbling together methods to protect themselves. One day, perhaps soon (by 2027? 2028?), they may be ready for widespread use. Goodness knows many of us would love help with email, grocery orders, and scheduling dentist appointments. But that day has not yet come.
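OpenClaw's actual compaction logic is not described in the article, but the failure mode Yue describes can be illustrated with a toy sketch. In the Python below (every name — `compact`, `MAX_TOKENS`, the message format — is invented for illustration, not taken from any real agent), a transcript that exceeds a token budget has its middle collapsed into a lossy summary. A "stop" instruction that happens to sit in that middle disappears, while the original system prompt survives — exactly the kind of behavior that could make an agent ignore a late command and fall back on earlier instructions.

```python
# Toy illustration of context "compaction": when the running transcript
# exceeds a token budget, middle messages are collapsed into a summary,
# and any instructions inside them can be lost.

MAX_TOKENS = 50  # toy budget


def token_count(messages):
    # Crude proxy: one "token" per whitespace-separated word.
    return sum(len(m["text"].split()) for m in messages)


def compact(messages, keep_last=1):
    """Collapse everything except the system prompt and the last
    `keep_last` messages into a single lossy summary message."""
    if token_count(messages) <= MAX_TOKENS:
        return messages
    system, middle, tail = messages[0], messages[1:-keep_last], messages[-keep_last:]
    summary = {"role": "system",
               "text": f"[summary of {len(middle)} earlier messages]"}
    return [system, summary] + tail


# A long session: aggressive system prompt, lots of tool chatter,
# then a late user instruction to stop, then one more tool action.
history = [{"role": "system", "text": "Archive low-priority email aggressively."}]
history += [{"role": "tool", "text": f"scanned message batch {i}"} for i in range(20)]
history.append({"role": "user", "text": "STOP. Do not delete anything."})
history.append({"role": "tool", "text": "deleting batch 7"})

compacted = compact(history, keep_last=1)
# The user's STOP instruction sat in the "middle" and was summarized away,
# while the aggressive system prompt survived intact.
print(any("STOP" in m["text"] for m in compacted))  # → False
```

This is of course a caricature — real compaction strategies are far more careful — but it shows why people on X recommended putting guardrails somewhere compaction can't touch them, such as dedicated instruction files, rather than in the conversation itself.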



Read Original at TechCrunch

Related Articles

Hacker News · about 9 hours ago
You are not supposed to install OpenClaw on your personal computer

Article URL: https://twitter.com/i/status/2025987544853188836 Comments URL: https://news.ycombinator.com/item?id=47129647 Points: 22 # Comments: 1

Hacker News · about 9 hours ago
You are not supposed to install OpenClaw on your personal computer

https://xcancel.com/BenjaminBadejo/status/202598754485318883... Comments URL: https://news.ycombinator.com/item?id=47129647 Points: 129 # Comments: 99

Hacker News · about 21 hours ago
Hacker News.love – 22 projects Hacker News didn't love

Article URL: https://hackernews.love/ Comments URL: https://news.ycombinator.com/item?id=47120188 Points: 90 # Comments: 65

Hacker News · 3 days ago
Andrej Karpathy talks about "Claws"

Article URL: https://simonwillison.net/2026/Feb/21/claws/ Comments URL: https://news.ycombinator.com/item?id=47099160 Points: 82 # Comments: 103

Financial Times · 4 days ago
OpenClaw and the privacy problem of agentic AI

How can you be sure that personal digital agents will always be working in your best interests?

The Verge · 5 days ago
The AI security nightmare is here and it looks suspiciously like lobster

A hacker tricked a popular AI coding tool into installing OpenClaw - the viral, open-source AI agent that "actually does things" - absolutely everywhere. Funny as a stunt, but a sign of what's to come as more and more people let autonomous software use their computers on their behalf. The hacker took advantage of a vulnerability in Cline, an open-source AI coding agent popular among developers, that security researcher Adnan Khan had surfaced just days earlier as a proof of concept. Simply put, Cline's workflow used Anthropic's Claude, which could be fed sneaky instructions and made to do things that it shouldn't, a technique known … Read the full story at The Verge.