I built this as a small demonstration to explore prompt-injection and instruction-override failure modes in help-desk-style LLM deployments.
The setup mirrors common production patterns (role instructions, refusal logic, bounded data access) and is intended to show how those controls can be bypassed through context manipulation and instruction override.
Iβm interested in feedback on realism, missing attack paths, and whether these failure modes align with what others are seeing in deployed systems.
This isnβt intended as marketing - just a concrete artefact to support discussion.