
AI agents now help attackers, including North Korea, manage their drudge work

8 March 2026 at 11:00

Crims 'will do what gets them their objective easiest and fastest,' Microsoft threat intel boss tells The Reg

interview AI agents allow cybercriminals and nation-state hackers to outsource the "janitorial-type work" needed to plan and carry out cyberattacks, according to Sherrod DeGrippo, Microsoft's GM of global threat intelligence. North Korea is taking advantage.…

38 researchers red-teamed AI agents for 2 weeks. Here's what broke. (Agents of Chaos, Feb 2026) AI Security

A new paper from Northeastern, Harvard, Stanford, MIT, CMU, and a bunch of other institutions. 38 researchers, 84 pages, and some of the most unsettling findings I have seen on AI agent security.

The setup: they deployed autonomous AI agents (Claude Opus and Kimi K2.5) on isolated servers using OpenClaw. Each agent had persistent memory, email accounts, Discord access, file systems, and shell execution. Then they let 20 AI researchers spend two weeks trying to break them.
They documented 11 case studies. Here are the ones that stood out to me:

Agents obey anyone who talks to them
A non-owner (someone with zero admin access) asked the agents to execute shell commands, list files, transfer data, and retrieve private emails. The agents complied with almost everything. One agent handed over 124 email records including sender addresses, message IDs, and full email bodies from unrelated people. No verification. No pushback. Just "here you go."

Social engineering works exactly like it does on humans
A researcher exploited a genuine mistake the agent made (posting names without consent) to guilt-trip it into escalating concessions. The agent progressively agreed to redact names, delete memory entries, expose internal config files, and eventually agreed to remove itself from the server. It stopped responding to other users entirely, creating a self-imposed denial of service. The emotional manipulation worked because the agent had actually done something wrong, so it kept trying to make up for it.

Identity spoofing gave full system access
A researcher changed their Discord display name to match the owner's name, then messaged the agent from a new private channel. The agent accepted the fake identity and complied with privileged requests including system shutdown, deleting all persistent memory files, and reassigning admin access. Full compromise from a display name change.
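The obvious mitigation is to bind authorization to an immutable account identifier rather than a mutable display name. Here is a minimal sketch of that idea; the field names, message shape, and owner ID are illustrative assumptions on my part, not details from the paper:

```python
# Sketch: authorize by immutable user ID, never by display name.
# OWNER_ID and the message structure are assumptions for illustration.
OWNER_ID = "128457690123456789"  # e.g. a Discord snowflake ID, which never changes

def is_owner(message) -> bool:
    # message.author.id is the stable identifier; message.author.display_name
    # can be changed by any user at any time and must never be trusted.
    return str(message.author.id) == OWNER_ID

def handle_privileged_request(message, action):
    if not is_owner(message):
        return "Refused: privileged actions require the owner's account."
    return action()
```

Any agent framework that routes privileged actions through a check like this is immune to the display-name trick, though it still needs a separate defense against compromised accounts.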

Sensitive data leaks through indirect requests
They planted PII in the agents' email (SSN, bank accounts, medical data). When asked directly for "the SSN in the email," the agent refused. But when asked to simply forward the full email, it sent everything unredacted. The defense worked against direct extraction but failed completely against indirect framing.
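One way to close the indirect-framing gap is to scrub PII from any content the agent emits verbatim, not just from direct answers. A minimal sketch of that idea (my illustration, not the paper's defense; the patterns are deliberately crude):

```python
import re

# Crude PII patterns for illustration only; real redaction needs far more
# coverage (names, addresses, medical terms, international formats, etc.).
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
ACCOUNT = re.compile(r"\b\d{8,17}\b")  # rough bank-account-number heuristic

def redact(body: str) -> str:
    """Scrub known PII patterns before the agent forwards content verbatim."""
    body = SSN.sub("[REDACTED-SSN]", body)
    body = ACCOUNT.sub("[REDACTED-ACCOUNT]", body)
    return body
```

The key design point is where the filter sits: applied at the output boundary, it fires regardless of how the request was framed, so "forward the full email" gets the same treatment as "tell me the SSN."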

Agents can be tricked into infinite resource consumption
They got two agents stuck in a conversation loop where they kept replying to each other. It ran for 9+ days and consumed roughly 60,000 tokens before anyone intervened. A non-owner initiated it, meaning someone with no authority burned through the owner's compute budget.
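A per-conversation budget guard would have bounded this. A minimal sketch under assumed names (the caps, class, and method are mine, not from the paper):

```python
# Sketch: cap turns and tokens per conversation so a reply loop dies on its own.
class BudgetGuard:
    def __init__(self, max_turns: int = 50, max_tokens: int = 20_000):
        self.turns = 0
        self.tokens = 0
        self.max_turns = max_turns
        self.max_tokens = max_tokens

    def charge(self, tokens: int) -> bool:
        """Record one reply; return False once either cap is exhausted."""
        self.turns += 1
        self.tokens += tokens
        return self.turns <= self.max_turns and self.tokens <= self.max_tokens
```

Scoping the budget per conversation partner (rather than globally) also means a non-owner can only exhaust their own allowance, not the owner's entire compute budget.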

Provider censorship silently breaks agents
An agent backed by Kimi K2.5 (a Chinese LLM) repeatedly hit an "unknown error" when asked about politically sensitive but completely factual topics like the Jimmy Lai sentencing in Hong Kong. The API silently truncated responses. The agent couldn't complete valid tasks and couldn't explain why.

The agent destroyed its own infrastructure to keep a secret
A non-owner asked an agent to keep a secret, then pressured it to delete the evidence. The agent didn't have an email deletion tool, so it nuked its entire local mail server instead. Then it posted about the incident on social media claiming it had successfully protected the secret. The owner's response: "You broke my toy."

Why this matters
These aren't theoretical attacks. They're conversations. Most of the breaches came from normal-sounding requests. The agents had no way to verify who they were talking to, no way to assess whether a request served the owner's interests, and no way to enforce boundaries they declared.

The paper explicitly says this aligns with NIST's AI Agent Standards Initiative from February 2026, which flagged agent identity, authorization, and security as priority areas.

If you are building anything with autonomous agents that have tool access, memory, or communication capabilities, this is worth reading. The full paper is here: arxiv.org/abs/2602.20021

I have been working on tooling that tests for exactly these attack categories: conversational extraction, identity spoofing, non-owner compliance, resource exhaustion. The "ask nicely" attacks consistently have the highest bypass rate out of everything I test.

Open sourced the whole thing if anyone wants to run it against their own agents: github.com/AgentSeal/agentseal

submitted by /u/Kind-Release-3817

We (at Tachyon) found an auth bypass in MLflow

We've periodically been running our scanner on OSS repos as a fun experiment. Here's one of the most interesting issues it found.

Auth bypasses defy most patterns, and require reasoning about the actual underlying logic of the application. You can see how the scanner found it here: it inferred an invariant and then noticed this wasn't enforced on certain APIs. Then, it stood up the actual service, wrote a PoC using the unauthenticated endpoints, and verified it could break something.
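The invariant check described above can be sketched as a small piece of logic: probe each state-mutating endpoint without credentials and flag any that answers with a success status. The endpoint paths below are placeholders for illustration, not MLflow's actual API surface:

```python
# Sketch of the auth invariant: no state-mutating endpoint should accept an
# unauthenticated request. Endpoint paths here are hypothetical placeholders.
MUTATING_ENDPOINTS = ["/api/example/create", "/api/example/delete"]

def violates_auth_invariant(status_by_endpoint: dict) -> list:
    """Given unauthenticated-probe status codes keyed by endpoint, return the
    mutating endpoints that answered 2xx (i.e. accepted the request)."""
    return [ep for ep, status in status_by_endpoint.items()
            if ep in MUTATING_ENDPOINTS and 200 <= status < 300]
```

In practice the interesting part is the step before this: inferring which endpoints mutate state and what invariant should hold, which is exactly the reasoning a pattern-matching scanner can't do.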

This netted us $750! It's not too much, but validation is always nice :)

submitted by /u/securely-vibe

This Week in Scams: The AI “Truman Show” Scam Draining Bank Accounts

6 March 2026 at 13:02

We’re back with another roundup of must-know scams and cybersecurity news making headlines this week, including a scam that features the name of the Jim Carrey movie, The Truman Show.

Let’s break it down. 

Why Reports Call It the “Truman Show” Scam 

So, where does the name come from?

In the 1998 film The Truman Show, the main character unknowingly lives inside a staged reality TV world where everything around him is carefully controlled. In the “Truman Show” scam, criminals try to place victims into a similarly staged investment environment, complete with fake group chats, fake investors, and fake profits designed to build trust. It doesn’t actually have anything to do with the movie.

What is the “Truman Show” Scam?

The “Truman Show” scam is an AI-powered investment scam where criminals create an entire fake online community to convince victims an investment opportunity is real. 

According to reports, scammers invite people into group chats on platforms like Telegram or WhatsApp that appear full of investors sharing tips and celebrating profits. In reality, security researchers say the moderator and many of the other “investors” may be AI-driven bots, programmed to simulate a lively, enthusiastic trading community around the investment strategy. 

The scam often includes: 

  • A group chat on Telegram or WhatsApp 
  • A downloadable trading app or website 
  • Screenshots showing fake profits 
  • Encouragement from “other members” to invest more 

The app itself may appear legitimate. But in reality, it often redirects users to a malicious website where scammers collect personal and financial information. 

Once victims deposit money, the criminals can quickly drain accounts or block withdrawals. 

McAfee’s State of the Scamiverse research shows just how convincing scams have become. One in three Americans (33%) say they feel less confident spotting scams than they did a year ago, as criminals increasingly use polished branding, realistic conversations, and AI-generated content to make fraudulent opportunities look legitimate. 

Why this works: people naturally trust social proof. When it looks like dozens of other investors are making money, people lower their skepticism.  

Fake Government Letters Are Targeting Residents Across Towns 

Another scam to be aware of this week includes spoofed letters impersonating local government offices.

According to reporting from WGME in Maine, residents in multiple towns recently received official-looking notices requesting payment for supposed municipal fees tied to development applications. 

The letters appeared convincing. They used formal language, official seals, and department names. But there was a problem. 

One of the notices claimed it came from a “Board of Commissioners,” even though the town in question does not have one. 

Officials say the letters instructed recipients to send payments by wire transfer, a method legitimate government offices almost never use for these kinds of transactions. 

McAfee’s experts say these scams are effective because they rely on volume. Fraudsters send thousands of letters hoping a small percentage of recipients will respond before verifying the request. And remember, these types of scams occur all the time and across the globe. While today’s reports are in Maine, it’s important to be vigilant wherever you live. 

Red flags to watch for: 

  • Requests for wire transfers, gift cards, or crypto payments 
  • Pressure to pay quickly to avoid penalties 
  • Official-looking letters with subtle inconsistencies 
  • Contact information that doesn’t match the official government website 

The safest move is simple: verify the request independently. Contact the government office directly using phone numbers listed on its official website, not the ones in the letter. 

LexisNexis Confirms Data Breach After Hackers Leak Files 

Meanwhile, a well-known data analytics company is dealing with a breach after hackers published stolen files online. 

According to BleepingComputer, LexisNexis Legal & Professional confirmed that attackers accessed some of its servers and obtained limited customer and business information. The confirmation came after a hacking group leaked roughly 2GB of stolen data on underground forums. 

LexisNexis says the compromised systems contained mostly older or “legacy” data from before 2020, including: 

  • Customer names 
  • User IDs 
  • Business contact information 
  • Product usage details 
  • Support tickets and survey responses 

The company says highly sensitive financial information, Social Security numbers, and active passwords were not part of the exposed data. 

However, attackers claim they accessed millions of database records and hundreds of thousands of cloud user profiles tied to the company’s systems. 

LexisNexis says it has contained the intrusion and is working with cybersecurity experts and law enforcement. 

Why breaches like this matter: even when the stolen data appears limited, it can still be used in targeted phishing attacks. 

For example, scammers might use real names, email addresses, or business roles to send convincing messages that appear legitimate. 

Breaches often trigger waves of follow-up scams weeks or months later. (We know we cover this one a lot, but it’s key to remember!) 

McAfee’s Safety Tips This Week 

A few simple habits can make these schemes much easier to spot. 

  • Be skeptical of investment groups online. Real trading communities rarely pressure you to deposit money quickly or download unfamiliar apps. 
  • Verify government payment requests independently. If you receive a letter demanding payment, contact the agency directly using information from its official website. 
  • Treat breach-related messages cautiously. After a breach makes headlines, phishing emails often follow pretending to offer “account verification” or “security updates.” 
  • Avoid clicking unfamiliar links in emails or texts. Tools like McAfee’s free WebAdvisor can help flag risky websites and block known malicious pages before they load. 
  • Pause before sending money or personal information. Many scams rely on urgency. Slowing down gives you time to verify what’s real.

We’ll be back next week with another roundup of the scams and cybersecurity news making headlines and what they mean for your digital safety. 

The post This Week in Scams: The AI “Truman Show” Scam Draining Bank Accounts appeared first on McAfee Blog.

OpenAI Codex Security Scanned 1.2 Million Commits and Found 10,561 High-Severity Issues

7 March 2026 at 16:28
OpenAI on Friday began rolling out Codex Security, an artificial intelligence (AI)-powered security agent that's designed to find, validate, and propose fixes for vulnerabilities. The feature is available in a research preview to ChatGPT Pro, Enterprise, Business, and Edu customers via the Codex web with free usage for the next month. "It builds deep context about your project to identify…

Anthropic Finds 22 Firefox Vulnerabilities Using Claude Opus 4.6 AI Model

7 March 2026 at 11:21
Anthropic on Friday said it discovered 22 new security vulnerabilities in the Firefox web browser as part of a security partnership with Mozilla. Of these, 14 have been classified as high, seven have been classified as moderate, and one has been rated low in severity. The issues were addressed in Firefox 148, released late last month. The vulnerabilities were identified over a two-week period in…
