
Received β€” 7 May 2026 ⏭ /r/netsec - Information Security News & Discussion

An AI security auditor that red-teams PRs to find exploits, not just patterns (open-source + Ollama support)

Hey everyone,

I’ve been working on an experiment in AI-driven application security called SentinAI. I’m a backend engineer in fintech, and I spent part of my recent leave exploring a simple observation:

Most SAST tools are basically metal detectors:
they’re great at catching obvious patterns like unsafe functions or missing headers.

But they struggle with the stuff that actually matters in real systems:

  • IDORs
  • authorization drift
  • multi-tenant isolation issues
  • broken middleware assumptions
  • cross-file logic flaws

Attackers don’t think in patterns.

They think in systems.

So I built something experimental to explore that gap.

🧠 The Architecture (3-Agent Loop)

Instead of a single LLM prompt (which tends to hallucinate easily), SentinAI uses a structured multi-agent flow:

1. The Architect

Maps the system:

  • routes
  • auth boundaries
  • data flows
  • trust assumptions

2. The Adversary πŸ₯·

Tries to break it:

  • generates exploit paths
  • builds step-by-step attack chains
  • simulates real-world abuse scenarios

3. The Guardian πŸ›‘οΈ

Validates everything:

  • checks exploits against actual code context
  • verifies whether attacks are truly possible
  • filters hallucinated or low-confidence outputs

Anything below a confidence threshold (~40%) is dropped.

The goal is not to β€œfind everything.”

It’s to only surface things that are actually exploitable.
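The Guardian's thresholding step can be sketched in a few lines. Everything below (the shape of a finding, the function names) is my illustration of the idea, not SentinAI's actual code:

```javascript
// Hypothetical Guardian-style filter: keep only findings the validator
// scored at or above a confidence threshold (~40%, per the post),
// highest-confidence first.
const CONFIDENCE_THRESHOLD = 0.4;

function filterFindings(findings) {
  return findings
    .filter((f) => f.confidence >= CONFIDENCE_THRESHOLD)
    .sort((a, b) => b.confidence - a.confidence);
}

const findings = [
  { id: 'IDOR-1', confidence: 0.82 },
  { id: 'XSS-3', confidence: 0.35 }, // dropped as likely hallucinated
  { id: 'AUTHZ-2', confidence: 0.61 },
];

console.log(filterFindings(findings).map((f) => f.id));
// → [ 'IDOR-1', 'AUTHZ-2' ]
```

The interesting design question is where the confidence number comes from; a self-reported LLM score is itself noisy, which is why the Guardian re-checks exploits against actual code context before scoring.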

πŸ’‘ What surprised me

A few things stood out while building this:

  • Most real vulnerabilities only appear at interaction points between files, not within a single file
  • LLMs are surprisingly good at generating attack paths, but unreliable without a validation layer
  • The hardest problem wasn’t detection β€” it was noise control
  • Without a β€œGuardian” layer, the system becomes mostly hallucinated security reports very quickly

πŸ”’ Privacy / Local-first design

Coming from fintech, I know that sending proprietary code to external APIs is not acceptable.

So SentinAI is built to run:

  • fully local via Ollama
  • or inside a private VPC
  • with no code leaving the environment

🌐 Web3 expansion (experimental)

I expanded it beyond Web2 into smart contract security:

  • Solana: missing signer checks, PDA misuse
  • EVM: reentrancy, tx.origin issues
  • Move: resource lifecycle bugs

Total coverage: ~45 vulnerability patterns.
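As a toy illustration of what a single EVM pattern check might look like, here is a naive text-level scan for `tx.origin` authorization, one of the patterns listed above. This regex sketch is mine, not SentinAI's implementation, which presumably reasons over context rather than raw text:

```javascript
// Naive detector: flag lines using tx.origin, a classic
// phishing-via-intermediate-contract vector when used for auth.
function findTxOrigin(soliditySource) {
  return soliditySource
    .split('\n')
    .map((line, i) => ({ line: i + 1, text: line.trim() }))
    .filter((l) => /\btx\.origin\b/.test(l.text));
}

const src = `
contract Wallet {
  address owner;
  function withdraw() public {
    require(tx.origin == owner); // vulnerable: use msg.sender
  }
}`;

console.log(findTxOrigin(src).map((l) => l.line));
// → [ 5 ]
```

A pattern scan like this is exactly the "metal detector" the post criticizes; the multi-agent loop's job is to decide whether a flagged line is actually reachable and exploitable.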

🚧 Open questions (honest part)

I’m still actively figuring out:

  • how to reduce hallucinated exploit paths at scale
  • whether multi-agent reasoning actually holds up on large, messy codebases
  • where the boundary is between β€œuseful security reasoning” and β€œLLM storytelling”
  • whether this can realistically outperform hybrid static analysis + human review

One thing I’ve already noticed: noise control is by far the hardest part, and that’s still an open problem.

πŸ§ͺ Why I’m sharing this

This started as a “leave experiment” and somehow picked up 200+ organic npm installs without any promotion.

I cleaned it up and open-sourced it mainly to:

  • get feedback from people deeper in security engineering
  • understand where this approach fails in real-world systems
  • see if β€œAI attacker reasoning” is actually useful in practice

πŸ”— If you want to poke at it

Curious to hear honest thoughts from people here:

  • Where would this completely break in real codebases?
  • Is multi-agent security reasoning actually useful, or just a fancy abstraction over static + LLM prompts?
  • Has anyone tried something similar in production security pipelines?
submitted by /u/itzdeeni
Received β€” 6 May 2026 ⏭ /r/netsec - Information Security News & Discussion

Binance fixed the IP whitelist gap β€” but the disclosure process is still broken

I recently re-tested an old Binance API finding I had reported through Bugcrowd.

The original issue was about Binance API IP whitelisting and derived listenKey stream credentials.

At the time, a listenKey could be created from a whitelisted IP and then used from a non-whitelisted IP to consume private user data streams.

That did not allow trading, withdrawals, or account takeover.

But it did allow real-time access to sensitive private stream data such as balances, orders, executions, positions, timing, and strategy behavior.

The core security argument was:

A derived credential should not be more portable than the credential that created it.

The report was rejected as β€œSocial Engineering” / β€œNot Applicable”.

I disagreed, because the relevant threat model was not β€œconvince the user to send a token”.

The realistic threat model was supply-chain compromise: malicious code running inside a trusted bot server, CI job, dependency, IDE workspace, or trading environment where API keys already exist.

I re-tested the behavior on May 5, 2026.

Result:

The old behavior appears to be gone.

Spot and Margin no longer use the old listenKey model. Futures still uses listenKey, but now appears to enforce the API key IP whitelist correctly. From a whitelisted IP the calls worked; from non-whitelisted Mullvad exits they failed with the expected IP restriction error.
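The enforcement behavior observed in the re-test amounts to a server-side check along these lines. All names and fields here are hypothetical illustrations of the principle ("a derived credential should not be more portable than the credential that created it"), not Binance's implementation:

```javascript
// Hypothetical check: a derived listenKey inherits the IP whitelist
// of the API key that created it.
function canConsumeStream(listenKey, requestIp, apiKeys) {
  const parent = apiKeys.get(listenKey.parentApiKeyId);
  if (!parent) return false;
  // No whitelist configured: allow from anywhere, matching API-key semantics.
  if (parent.ipWhitelist.length === 0) return true;
  return parent.ipWhitelist.includes(requestIp);
}

const apiKeys = new Map([
  ['key-1', { ipWhitelist: ['203.0.113.10'] }],
]);
const listenKey = { parentApiKeyId: 'key-1' };

console.log(canConsumeStream(listenKey, '203.0.113.10', apiKeys)); // → true
console.log(canConsumeStream(listenKey, '198.51.100.7', apiKeys)); // → false
```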

That is good for users.

But it raises an uncomfortable disclosure-process question:

If a finding is β€œnot applicable” enough to reject, not acknowledge, and not reward β€” but technical enough to later fix β€” what should a healthy disclosure process do?

Full technical write-up, timeline, re-test setup, and raw outputs:

https://blog.technopathy.club/binance-fixed-the-ip-whitelist-gap-the-disclosure-process-is-still-broken

I am mainly interested in the process question here:

When a rejected report later disappears from production, should the program re-open it, acknowledge it, partially reward it, or leave it closed unless the researcher can prove direct causality?

submitted by /u/oliver-zehentleitner

Salesforce pentesting novel techniques - how to be an apex predator

In this blog post I introduced several novel techniques:

  1. How to get all routes, with no need to authenticate.

  2. How to get methods to fuzz from the pages themselves, not just the bootstrap JS files; the vast majority of methods are in those pages, not in the JS files that existing tools and guides point to.

  3. How to parse "LWC" components, not just legacy components.

submitted by /u/lowlandsmarch
Received β€” 5 May 2026 ⏭ /r/netsec - Information Security News & Discussion

Major AI Clients Shipping With Broken OAuth Implementations

The majority of widely used AI clients like:

  • Claude Code
  • Claude Desktop
  • Cursor
  • LibreChat
  • Amazon Q CLI

have not implemented the critical refresh-token flow of the OAuth standard.

This forces developers to issue long-lived tokens, creating a serious security regression in an already-solved problem.

This write-up includes a matrix of 14 major clients, with notes linking to feature requests, pull requests, and multiple forum discussions.

It is not all gloom and doom, though!

A stop-gap workaround that security-conscious users are adopting is also discussed, along with a best-practices guide for developers implementing their own MCP OAuth solution.

The plan is to update this reference monthly to track whether there is any movement on these open requests.
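For context, the refresh flow the post says is missing is small: when the access token nears expiry, the client exchanges its refresh token at the token endpoint (RFC 6749 §6) instead of holding a long-lived access token. A minimal client-side sketch, with all names and values being placeholders of my own:

```javascript
// Returns true when the access token should be refreshed, using a
// safety margin so we refresh before hard expiry.
function needsRefresh(token, nowMs, marginMs = 60_000) {
  return nowMs >= token.expiresAtMs - marginMs;
}

// Builds the form body for an RFC 6749 refresh_token grant.
function buildRefreshRequest(refreshToken, clientId) {
  return new URLSearchParams({
    grant_type: 'refresh_token',
    refresh_token: refreshToken,
    client_id: clientId,
  }).toString();
}

const token = { accessToken: 'abc', expiresAtMs: 1_000_000 };
console.log(needsRefresh(token, 950_000)); // → true (inside the 60 s margin)
console.log(buildRefreshRequest('rt-123', 'my-client'));
// grant_type=refresh_token&refresh_token=rt-123&client_id=my-client
```

A real client would POST that body to the token endpoint with `Content-Type: application/x-www-form-urlencoded` and store the rotated refresh token it gets back.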

submitted by /u/mhat

We probed 6,000 web apps for Stripe webhook signature checks. 1,542 don't bother

Quick note from a scanning project I've been running. We hit 6,000 web apps with a payment-bypass probe last week, sending a minimal fake `checkout.session.completed` event to common webhook paths (`/api/webhook/stripe`, `/api/payments/webhook`, etc.) without a `Stripe-Signature` header.

1,542 returned 200.

That means anyone with curl can fire a forged Stripe event at those endpoints and the server processes it as legitimate. Depending on what the handler does with it, the consequences range from "logs a fake event" to "marks attacker's account as paid" to "creates a confirmed order with no payment".
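The probe described above is easy to reproduce in a few lines; the endpoint path and field values below are illustrative, not the exact payload used in the scan:

```javascript
// Minimal forged Stripe event: sent with no Stripe-Signature header,
// so any endpoint that skips verification will treat it as real.
function buildForgedEvent() {
  return {
    id: 'evt_test_forged',
    type: 'checkout.session.completed',
    data: { object: { id: 'cs_test_forged', payment_status: 'paid' } },
  };
}

// Sending it is a single request (not executed here):
async function probe(url) {
  const res = await fetch(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' }, // note: no Stripe-Signature
    body: JSON.stringify(buildForgedEvent()),
  });
  return res.status; // 200 from a handler that never checked the signature
}

console.log(buildForgedEvent().type); // → checkout.session.completed
```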

The split was roughly:

  • Custom domains (real production SaaS): ~720
  • Render: 198
  • Vercel: 142
  • Replit: 121
  • Railway, Fly, Heroku, others: ~360

Why so many?

The Stripe library makes signature verification a one-liner. Every framework has the canonical example. But the dev journey usually goes: build the route locally with a stub handler that just `console.log`s the event body, get the upgrade-the-user logic working, leave signature verification on the TODO, ship. Six months later nobody remembers it was ever a TODO.

The fix in Express:

```js
app.post('/api/webhook/stripe', express.raw({type: 'application/json'}), (req, res) => {
  const sig = req.headers['stripe-signature'];
  let event;
  try {
    event = stripe.webhooks.constructEvent(
      req.body, sig, process.env.STRIPE_WEBHOOK_SECRET);
  } catch (err) {
    return res.status(400).send(`Webhook Error: ${err.message}`);
  }
  // proceed with event
  res.json({received: true});
});
```

The trap: `express.json()` globally parses the body before your handler sees it, leaving Stripe's library to compute the signature against parsed-then-stringified JSON, which never matches. Use `express.raw()` specifically on the webhook route, before any global JSON parser.

FastAPI / Python: read `await request.body()` directly, not `request.json()`. Same idea.

Caveats: a 200 response doesn't prove the app actually grants the attacker something. Some endpoints log every webhook for analytics and return 200 regardless. The 1,542 number is "endpoints accepting unsigned events", not "definitely exploitable". But the misconfiguration is real on its own.

Full writeup with the methodology and platform-by-platform breakdown: https://securityscanner.dev/blog/stripe-webhook-signature-bypass-1500-apps

Curious if anyone here has shipped a Stripe webhook recently and can double-check theirs.

submitted by /u/Most_Ad_394
Received β€” 4 May 2026 ⏭ /r/netsec - Information Security News & Discussion