AI adversarial poetry (20251124)
Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models
this is one of the most interesting things i’ve read in a while. it sent me down a bit of a rabbit hole involving plato’s republic, and it was a harsh reminder that all sorts of hacks can make us suspend rational review and short-circuit our more reasonable processes.
it kind of feels like we’re tickling some interesting parallel behaviors in our own noggins.