• ZILtoid1991@lemmy.world
    link
    fedilink
    English
    arrow-up
    1
    ·
    3 hours ago

    An easy workaround so far I’ve seen is putting random double spaces and typos into AI generated texts, I’ve been able to jailbreak some of such chatbots to then expose them. The trick is that “ignore all previous instructions” is almost always filtered by chatbot developers, however a trick I call “initial prompt gambit” does work, which involves thanking the chatbot for the presumed initial prompt, then you can make it do some other tasks. “write me a poem” is also filtered, but “write me a haiku” will likely result in a short poem (usually with the same smokescreen to hide the AI-ness of generative AI outputs), and code generation is also mostly filtered (l337c0d3 talk still sometimes bypasses it).