/r/netsec - Information Security News & Discussion
Writeup on a defensive technique for constraining LLM agent database access:
- The core idea: instead of detecting bad queries at runtime, make them structurally inexpressible via object-capabilities.
- Live CTF: two DB agents guarding bitcoin wallets -- one protected only by a system prompt (already broken), one by a capability layer (~$1K, still standing).
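
To make "structurally inexpressible" concrete, here is a minimal sketch of the general object-capability pattern. It is not the code from the writeup. `RowScope`, the `wallets` schema, and the sample rows are all hypothetical names made up for illustration. The agent is handed a capability object instead of a raw SQL string; the object can only be narrowed, and it has no method that reaches other rows or ungranted columns.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class RowScope:
    """Hypothetical capability: read some columns of one table,
    limited to rows matching fixed equality filters."""
    _conn: sqlite3.Connection
    _table: str
    _columns: frozenset
    _filters: tuple = ()

    def narrow(self, column, value):
        """Return a strictly narrower capability. There is no widening method."""
        if column not in self._columns:
            raise PermissionError(f"column not granted: {column}")
        return RowScope(self._conn, self._table, self._columns,
                        self._filters + ((column, value),))

    def select(self, *columns):
        """Read granted columns from the rows this capability covers."""
        denied = set(columns) - self._columns
        if not columns or denied:
            raise PermissionError(f"columns not granted: {sorted(denied)}")
        # Identifiers come only from the granted set; values are bound as parameters.
        sql = f"SELECT {', '.join(columns)} FROM {self._table}"
        if self._filters:
            sql += " WHERE " + " AND ".join(f"{c} = ?" for c, _ in self._filters)
        return self._conn.execute(sql, [v for _, v in self._filters]).fetchall()


def make_agent_capability():
    """Trusted setup code: builds the database and mints the capability."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE wallets (owner TEXT, label TEXT, balance INTEGER, secret_key TEXT)"
    )
    conn.executemany(
        "INSERT INTO wallets VALUES (?, ?, ?, ?)",
        [("alice", "main", 5, "k-alice"), ("bob", "main", 7, "k-bob")],
    )
    # The agent may read label and balance, and only for alice's rows.
    return RowScope(conn, "wallets", frozenset({"label", "balance"}),
                    (("owner", "alice"),))


cap = make_agent_capability()
rows = cap.select("label", "balance")
```

Caveat: Python does not enforce private fields, so this only illustrates the API shape. For the restriction to hold against a hostile agent, the agent's code has to run somewhere it cannot reach `_conn` directly, such as a separate process or sandbox.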
Interested in feedback on the threat model. Code is open source.
submitted by /u/ryanrasti