Honey Tokens: Bait Credentials That Catch Breaches

I’ve been working on an experiment in AI-driven application security called SentinAI. I’m a backend engineer in fintech, and I spent part of my recent leave trying to explore a simple question:

Most SAST tools are basically metal detectors:
they’re great at catching obvious patterns like unsafe functions or missing headers.

But they struggle with the stuff that actually matters in real systems:

IDORs
authorization drift
multi-tenant isolation issues
broken middleware assumptions
cross-file logic flaws
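To make the first item on that list concrete, here is a minimal sketch of an IDOR. Every name in it (`Invoice`, `getInvoiceUnsafe`, `getInvoiceSafe`) is hypothetical and not taken from SentinAI; the point is only that the vulnerable version calls no "dangerous" function, so a pattern-based scanner has very little to match on.

```typescript
// Hypothetical in-memory data; a real app would query a database.
interface Invoice {
  id: string;
  ownerId: string;
  amountCents: number;
}

const invoices: Invoice[] = [
  { id: "inv-1", ownerId: "alice", amountCents: 1200 },
  { id: "inv-2", ownerId: "bob", amountCents: 9900 },
];

// Vulnerable: the caller is authenticated, but nothing ties the invoice to them.
// The caller id is accepted and then never used in the lookup.
function getInvoiceUnsafe(_callerId: string, invoiceId: string): Invoice | undefined {
  return invoices.find((inv) => inv.id === invoiceId);
}

// Fixed: the ownership check is part of the lookup itself.
function getInvoiceSafe(callerId: string, invoiceId: string): Invoice | undefined {
  return invoices.find((inv) => inv.id === invoiceId && inv.ownerId === callerId);
}
```

With this data, `getInvoiceUnsafe("alice", "inv-2")` hands Alice Bob's invoice, while `getInvoiceSafe("alice", "inv-2")` returns `undefined`. In a real codebase the missing check usually lives in a different file from the route, which is what makes it hard to spot.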

Attackers don’t think in patterns.

They think in systems.

So I built something experimental to explore that gap.

🧠 The Architecture (3-Agent Loop)

Instead of a single LLM prompt (which tends to hallucinate easily), SentinAI uses a structured multi-agent flow:

1. The Architect

Maps the system:

routes
auth boundaries
data flows
trust assumptions

2. The Adversary 🥷

Tries to break it:

generates exploit paths
builds step-by-step attack chains
simulates real-world abuse scenarios

3. The Guardian 🛡️

Validates everything:

checks exploits against actual code context
verifies whether attacks are truly possible
filters hallucinated or low-confidence outputs

Anything below a confidence threshold (~40%) is dropped.

The goal is not to “find everything.”

It’s to surface only things that are actually exploitable.
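The three-stage flow and the confidence cutoff can be sketched roughly like this. This is not SentinAI's actual API: the types, the stage signatures, and `runPipeline` are all assumptions made for illustration, and the stages are plain synchronous functions here, whereas real LLM-backed stages would be async.

```typescript
// All types and stage signatures below are hypothetical stand-ins.
interface SystemMap {
  routes: string[];
  authBoundaries: string[];
}

interface Candidate {
  title: string;
  attackSteps: string[];
}

interface Finding extends Candidate {
  confidence: number; // 0..1, assigned by the validation stage
}

type Architect = (source: string) => SystemMap;
type Adversary = (map: SystemMap) => Candidate[];
type Guardian = (candidate: Candidate, source: string) => number;

const CONFIDENCE_THRESHOLD = 0.4; // the ~40% cutoff described above

function runPipeline(
  source: string,
  architect: Architect,
  adversary: Adversary,
  guardian: Guardian,
  threshold: number = CONFIDENCE_THRESHOLD,
): Finding[] {
  const map = architect(source); // 1. map the system
  const candidates = adversary(map); // 2. propose attack chains
  return candidates
    .map((c) => ({ ...c, confidence: guardian(c, source) })) // 3. score against the code
    .filter((f) => f.confidence >= threshold); // drop anything below the cutoff
}
```

Keeping the scoring stage separate from the generating stage is the design choice that matters: the Adversary can be as creative as it likes, because nothing it proposes reaches the report without passing the Guardian's check.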

💡 What surprised me

A few things stood out while building this:

Most real vulnerabilities only appear at interaction points between files, not within a single file
LLMs are surprisingly good at generating attack paths, but unreliable without a validation layer
The hardest problem wasn’t detection; it was noise control
Without a “Guardian” layer, the output very quickly degrades into mostly hallucinated security reports

🔒 Privacy / Local-first design

In fintech, sending proprietary code to external APIs is not acceptable.

So SentinAI is built to run:

fully local via Ollama
or inside a private VPC
with no code leaving the environment
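For the fully local case, a minimal sketch of what a call to Ollama could look like is below. The endpoint and the `model` / `prompt` / `stream` fields follow Ollama's `/api/generate` HTTP API on its default port; the helper names (`assertLoopback`, `buildRequest`, `reviewLocally`) and the prompt wording are made up for this example and are not SentinAI's code.

```typescript
// Ollama listens on this port by default.
const OLLAMA_URL = "http://localhost:11434/api/generate";

// Refuse any endpoint that is not on the loopback interface.
function assertLoopback(endpoint: string): void {
  const host = new URL(endpoint).hostname;
  if (host !== "localhost" && host !== "127.0.0.1" && host !== "[::1]") {
    throw new Error(`Refusing to send code to non-local host: ${host}`);
  }
}

interface GenerateRequest {
  model: string;
  prompt: string;
  stream: boolean;
}

// Build the request body; stream=false asks for a single JSON response.
function buildRequest(model: string, code: string): GenerateRequest {
  return {
    model,
    prompt: `Review this code for authorization flaws:\n\n${code}`,
    stream: false,
  };
}

// Send the code to the local model and return its text answer.
async function reviewLocally(
  model: string,
  code: string,
  endpoint: string = OLLAMA_URL,
): Promise<string> {
  assertLoopback(endpoint);
  const res = await fetch(endpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildRequest(model, code)),
  });
  if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`);
  const data = (await res.json()) as { response: string };
  return data.response;
}
```

The loopback guard only covers the single-machine setup. For the private VPC option it would have to become an allowlist of internal hostnames instead.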

🌐 Web3 expansion (experimental)

I expanded it beyond Web2 into smart contract security:

Solana: missing signer checks, PDA misuse
EVM: reentrancy, tx.origin issues
Move: resource lifecycle bugs

Total coverage: ~45 vulnerability patterns.
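As a rough idea of what a single entry in a pattern list like that might look like, here is a deliberately naive sketch for the tx.origin case. The function name, the `PatternHit` shape, and the regex are all assumptions for illustration, not how SentinAI implements it. A text match like this only produces a candidate; whether the hit is actually exploitable still needs the kind of context checking described earlier.

```typescript
interface PatternHit {
  pattern: string;
  line: number; // 1-based line number in the source
  text: string;
}

// Flags lines where tx.origin is compared against something,
// a common way to get authorization wrong in Solidity.
function findTxOriginAuth(soliditySource: string): PatternHit[] {
  const comparison = /tx\.origin\s*(==|!=)|(==|!=)\s*tx\.origin/;
  return soliditySource
    .split("\n")
    .map((text, i) => ({ pattern: "tx-origin-auth", line: i + 1, text: text.trim() }))
    .filter((hit) => comparison.test(hit.text));
}
```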

🚧 Open questions (honest part)

I’m still actively figuring out:

how to reduce hallucinated exploit paths at scale
whether multi-agent reasoning actually holds up on large, messy codebases
where the boundary is between “useful security reasoning” and “LLM storytelling”
whether this can realistically outperform hybrid static analysis + human review

One thing I’ve already noticed:

That’s still an open problem.

🧪 Why I’m sharing this

This started as a “leave experiment” and somehow got 200+ organic npm installs without any promotion.

I cleaned it up and open-sourced it mainly to:

get feedback from people deeper in security engineering
understand where this approach fails in real-world systems
see if “AI attacker reasoning” is actually useful in practice

🔗 If you want to poke at it

npm: https://www.npmjs.com/package/sentinai-core
GitHub: https://github.com/itxDeeni/SentinAI-Core

Curious to hear honest thoughts from people here:

Where would this completely break in real codebases?
Is multi-agent security reasoning actually useful, or just a fancy abstraction over static analysis + LLM prompts?
Has anyone tried something similar in production security pipelines?

submitted by /u/itzdeeni