3 comments
  • By "attack" they mean "jailbreak". It's also nothing like a buffer overflow.

    The article is interesting though and the approach to generating these jailbreak prompts is creative. It looks a bit similar to the unspeakable tokens thing: https://www.vice.com/en/article/epzyva/ai-chatgpt-tokens-words-break-reddit

  • That seems like they left debugging code enabled/accessible.

• > That seems like they left debugging code enabled/accessible.

      No, this is actually a completely different kind of problem. LLMs aren't code, and they aren't manually configured or written by humans. In fact, we don't really know what's going on internally when an LLM performs inference.

      The actual software side of it is more like a video player that "plays" the LLM.
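      To make the "video player" analogy concrete, here's a toy sketch (all names and numbers invented for illustration, not any real inference stack): the "player" is a small, generic loop of matrix math that never changes, and the "model" is just a blob of numbers fed into it. Swapping the weight blob swaps the model, the same way a video player plays whatever file you hand it.

      ```python
      import math

      def matmul(vec, mat):
          # Multiply a row vector by a matrix (plain lists of lists).
          return [sum(v * m for v, m in zip(vec, col)) for col in zip(*mat)]

      def play(weights, activations):
          """Generic 'player': the same fixed loop runs any set of weights.

          Nothing here is specific to one model; the behavior lives
          entirely in the numbers passed in as `weights`.
          """
          x = activations
          for layer in weights:              # each layer is just a matrix
              x = matmul(x, layer)
              x = [math.tanh(v) for v in x]  # fixed nonlinearity
          return x

      # Two different "models" (weight blobs) run through the same player.
      model_a = [[[1.0, 0.0], [0.0, 1.0]]]  # identity-ish layer
      model_b = [[[0.0, 1.0], [1.0, 0.0]]]  # swaps the two inputs
      out_a = play(model_a, [0.5, -0.5])
      out_b = play(model_b, [0.5, -0.5])
      ```

      The point of the sketch: you can't point at a line of `play` and call it a bug in the model, because the model's behavior isn't written anywhere in the code. That's why "they left debugging code enabled" doesn't map onto jailbreaks.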