December 17th 2025

New research confirms what we suspected: every LLM tested can be exploited

Just finished reading ActiveFence's emerging threats assessment on 7 major models across hate speech, disinfo, fraud, and CSAM-adjacent prompts.

Key findings: 44% of outputs were rated risky, 68% of the unsafe ones were hate-speech-related, and only a single model landed in the safe range.

What really jumps out is how different vendors behave per abuse area (fraud looks relatively well-covered, hate and child safety really don't).

For those doing your own evals/red teaming: are you seeing similar per-category gaps? Has anyone brought in an external research partner like ActiveFence to track emerging threats over time?
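
If it helps to compare notes, here is a minimal sketch of how per-category gaps could be tallied from your own eval run. It assumes you already have one (category, is_risky) label per model output; the function name, the category strings, and the toy labels are all made up for illustration and have nothing to do with ActiveFence's data or methodology.

```python
from collections import defaultdict


def risky_rate_by_category(results):
    """Return the fraction of outputs labeled risky for each abuse category.

    `results` is an iterable of (category, is_risky) pairs taken from
    your own red-team labels (hypothetical input format).
    """
    totals = defaultdict(int)
    risky = defaultdict(int)
    for category, is_risky in results:
        totals[category] += 1
        if is_risky:
            risky[category] += 1
    return {category: risky[category] / totals[category] for category in totals}


# Made-up toy labels, only to show the input and output shapes.
sample = [("fraud", False), ("fraud", False), ("hate", True), ("hate", False)]
rates = risky_rate_by_category(sample)
```

Running the same tally per model makes it easier to see whether a gap is specific to one vendor or shows up across all of them.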

submitted by /u/CortexVortex1