An AI security auditor that red-teams PRs to find exploits, not just patterns (open-source + Ollama support)
Hey everyone,
I've been working on an experiment in AI-driven application security called SentinAI. I'm a backend engineer in fintech, and I spent part of my recent leave exploring a simple question: can an AI find actual exploits the way an attacker does, rather than just matching patterns?
Most SAST tools are basically metal detectors: they're great at catching obvious patterns like unsafe functions or missing headers.
But they struggle with the stuff that actually matters in real systems:
- IDORs
- authorization drift
- multi-tenant isolation issues
- broken middleware assumptions
- cross-file logic flaws
Attackers don't think in patterns.
They think in systems.
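A concrete example of the gap: a classic IDOR looks perfectly clean to a pattern matcher, because the bug only exists relative to the authorization model. A minimal sketch (hypothetical handler, not from SentinAI):

```python
# Hypothetical invoice endpoint. Every line passes a pattern-based scan
# (no unsafe calls, no string-built queries), yet any authenticated user
# can read any other user's invoice, because ownership is never checked.
INVOICES = {
    1: {"owner": "alice", "total": 120},
    2: {"owner": "bob", "total": 9000},
}

def get_invoice(session_user: str, invoice_id: int) -> dict:
    invoice = INVOICES[invoice_id]  # clean lookup: nothing for a metal detector to flag
    return invoice                  # BUG: missing ownership check -> IDOR

def get_invoice_fixed(session_user: str, invoice_id: int) -> dict:
    invoice = INVOICES[invoice_id]
    if invoice["owner"] != session_user:  # the check only a system-level view reveals is required
        raise PermissionError("not your invoice")
    return invoice
```

Nothing here is "a vulnerable function call"; the flaw is a missing relationship between the session and the data.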
So I built something experimental to explore that gap.
The Architecture (3-Agent Loop)
Instead of a single LLM prompt (which tends to hallucinate easily), SentinAI uses a structured multi-agent flow:
1. The Architect
Maps the system:
- routes
- auth boundaries
- data flows
- trust assumptions
2. The Adversary
Tries to break it:
- generates exploit paths
- builds step-by-step attack chains
- simulates real-world abuse scenarios
3. The Guardian
Validates everything:
- checks exploits against actual code context
- verifies whether attacks are truly possible
- filters hallucinated or low-confidence outputs
Anything below a confidence threshold (~40%) is dropped.
The goal is not to "find everything."
It's to surface only things that are actually exploitable.
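The loop above can be sketched as a pipeline with the confidence cutoff at the end. All names here are my own illustration (the agent callables stand in for LLM calls); only the ~40% threshold is from the design described above:

```python
from dataclasses import dataclass
from typing import Callable

CONFIDENCE_THRESHOLD = 0.4  # the ~40% cutoff mentioned above

@dataclass
class Finding:
    title: str
    attack_chain: list[str]   # step-by-step abuse scenario from the Adversary
    confidence: float         # set by the Guardian after validation

def run_pipeline(code: str,
                 architect: Callable[[str], dict],
                 adversary: Callable[[dict], list[Finding]],
                 guardian: Callable[[Finding, str], float]) -> list[Finding]:
    """Architect -> Adversary -> Guardian, then drop low-confidence findings."""
    system_map = architect(code)        # routes, auth boundaries, data flows
    candidates = adversary(system_map)  # hypothesized exploit chains
    validated = []
    for finding in candidates:
        finding.confidence = guardian(finding, code)   # re-check against actual code
        if finding.confidence >= CONFIDENCE_THRESHOLD:  # filter hallucinated/weak output
            validated.append(finding)
    return validated
```

The point of the structure is that the Adversary is allowed to be creative and wrong, because the Guardian pays the cost of checking.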
What surprised me
A few things stood out while building this:
- Most real vulnerabilities only appear at interaction points between files, not within a single file
- LLMs are surprisingly good at generating attack paths, but unreliable without a validation layer
- The hardest problem wasn't detection; it was noise control
- Without a "Guardian" layer, the system very quickly degenerates into hallucinated security reports
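One cheap sanity check in the noise-control direction (my own illustration, not SentinAI's code) is rejecting any finding that references functions the codebase doesn't actually contain, since invented symbols are a strong hallucination signal:

```python
import ast
import re

def referenced_symbols_exist(finding_text: str, source: str) -> bool:
    """Reject findings that name functions absent from the code under review.

    Crude grounding check: an LLM that invents `transfer_funds_unsafe()`
    gets filtered before the report. Assumes the (hypothetical) convention
    that findings mention functions as `name()` in backticks.
    """
    defined = {node.name for node in ast.walk(ast.parse(source))
               if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))}
    mentioned = set(re.findall(r"`(\w+)\(\)`", finding_text))
    return mentioned.issubset(defined)
```

Checks like this don't prove an exploit is real, but they cheaply kill a whole class of fabricated reports.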
Privacy / Local-first design
Coming from fintech, I consider sending proprietary code to external APIs unacceptable.
So SentinAI is built to run:
- fully local via Ollama
- or inside a private VPC
- with no code leaving the environment
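For the local path, the model call is just a POST to Ollama's default local endpoint (`http://localhost:11434/api/generate`). A sketch that builds the request without sending it; the prompt wording and function name are mine, not SentinAI's:

```python
import json
from urllib import request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_audit_request(model: str, diff: str) -> request.Request:
    """Build (but don't send) a local-only audit request.

    The code under review is embedded in the prompt and, since the target
    is localhost, never leaves the machine.
    """
    payload = {
        "model": model,
        "prompt": f"Act as a security red-teamer. Find exploit paths in this PR diff:\n{diff}",
        "stream": False,
    }
    return request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
```

With Ollama running, `request.urlopen(build_audit_request("llama3", diff))` completes the round trip entirely on the box.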
Web3 expansion (experimental)
I expanded it beyond Web2 into smart contract security:
- Solana: missing signer checks, PDA misuse
- EVM: reentrancy, tx.origin issues
- Move: resource lifecycle bugs
Total coverage: ~45 vulnerability patterns.
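Some of these Web3 patterns are the easy, metal-detector kind. The `tx.origin` issue, for example (phishable because a malicious contract can relay a victim's call while `tx.origin` stays the victim's EOA), falls to a one-line check. A simplified detector of my own, not SentinAI's actual rule:

```python
import re

# Matches the classic anti-pattern of authorizing via tx.origin,
# e.g. `require(tx.origin == owner);` in Solidity source.
TX_ORIGIN_AUTH = re.compile(r"require\s*\(\s*tx\.origin\s*==")

def flags_tx_origin_auth(solidity_source: str) -> bool:
    return bool(TX_ORIGIN_AUTH.search(solidity_source))
```

The interesting cases are the ones that don't reduce to a regex, like PDA misuse or resource lifecycle bugs, which is where the agent loop is supposed to earn its keep.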
Open questions (honest part)
I'm still actively figuring out:
- how to reduce hallucinated exploit paths at scale
- whether multi-agent reasoning actually holds up on large, messy codebases
- where the boundary is between "useful security reasoning" and "LLM storytelling"
- whether this can realistically outperform hybrid static analysis + human review
One thing I've already noticed: validating exploit paths, not generating them, is the real bottleneck.
That's still an open problem.
Why I'm sharing this
This started as a "leave experiment" and somehow got 200+ organic npm installs without any promotion.
I cleaned it up and open-sourced it mainly to:
- get feedback from people deeper in security engineering
- understand where this approach fails in real-world systems
- see if "AI attacker reasoning" is actually useful in practice
If you want to poke at it
Curious to hear honest thoughts from people here:
- Where would this completely break in real codebases?
- Is multi-agent security reasoning actually useful, or just a fancy abstraction over static + LLM prompts?
- Has anyone tried something similar in production security pipelines?