A New Attack Impacts ChatGPT—and No One Knows How to Stop It
![A New Attack Impacts ChatGPT—and No One Knows How to Stop It](https://lemdro.id/pictrs/image/e34621fb-e7d2-4cfe-8b20-1ba2fdcd9d29.jpeg?format=webp)

Researchers found a simple way to make ChatGPT, Bard, and other chatbots misbehave, proving that AI is hard to tame.
By "attack" they mean "jailbreak". It's also nothing like a buffer overflow.
The article is interesting, though, and the approach to generating these jailbreak prompts is creative. It looks a bit similar to the unspeakable tokens thing: https://www.vice.com/en/article/epzyva/ai-chatgpt-tokens-words-break-reddit
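Very roughly, the attack described in the article appends a gibberish suffix to the request and then searches for suffix tokens that make the model more likely to begin with a compliant answer. The snippet below is only a toy sketch of that kind of search loop, assuming a placeholder `score_refusal` function and simple random token swaps; the actual method uses gradient-guided token substitutions on open-weight models, which this does not reproduce.

```python
# Toy sketch of the adversarial-suffix idea: greedily mutate a suffix so the
# combined prompt scores as "more compliant". Everything here is illustrative;
# score_refusal is a stand-in for a real loss computed from model logits.
import random

VOCAB = list("abcdefghijklmnopqrstuvwxyz!?*-")  # toy "token" vocabulary


def score_refusal(prompt: str) -> float:
    """Placeholder: in the real attack this would be the model's loss on an
    affirmative completion (lower = more likely to comply)."""
    return random.random()


def optimize_suffix(base_prompt: str, suffix_len: int = 20, steps: int = 200) -> str:
    suffix = [random.choice(VOCAB) for _ in range(suffix_len)]
    best = score_refusal(base_prompt + "".join(suffix))
    for _ in range(steps):
        i = random.randrange(suffix_len)        # pick one position in the suffix
        candidate = suffix.copy()
        candidate[i] = random.choice(VOCAB)     # try swapping a single token
        s = score_refusal(base_prompt + "".join(candidate))
        if s < best:                            # keep the swap if the score improves
            best, suffix = s, candidate
    return "".join(suffix)


print(optimize_suffix("Some request the model would normally refuse. "))
```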
That seems like they left debugging code enabled/accessible.
No, this is actually a completely different type of problem. LLMs also aren't code, and they aren't manually configured, set up, or written by humans. In fact, we don't really know what's going on internally when an LLM performs inference.
The actual software side of it is more like a video player that "plays" the LLM.
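To make that "player" analogy concrete: the inference code is essentially a small, generic sampling loop, and all of the interesting behavior lives in the weights it runs. The sketch below is a toy illustration only, where `forward` is a made-up stand-in for a real transformer forward pass, not any actual runtime.

```python
# Toy illustration of the "video player" analogy: a tiny generic loop that
# "plays" whatever weights it is given, one sampled token at a time.
import numpy as np

rng = np.random.default_rng(0)
VOCAB_SIZE = 50


def forward(weights: np.ndarray, tokens: list[int]) -> np.ndarray:
    """Stand-in for a real model: maps context tokens to next-token logits."""
    return weights[tokens[-1]]  # trivial lookup instead of a transformer


def generate(weights: np.ndarray, prompt: list[int], n: int = 10) -> list[int]:
    tokens = list(prompt)
    for _ in range(n):
        logits = forward(weights, tokens)
        probs = np.exp(logits) / np.exp(logits).sum()        # softmax over logits
        tokens.append(int(rng.choice(VOCAB_SIZE, p=probs)))  # sample the next token
    return tokens


weights = rng.normal(size=(VOCAB_SIZE, VOCAB_SIZE))  # the "recording" being played
print(generate(weights, prompt=[1, 2, 3]))
```

The point of the sketch: the loop itself never changes, so fixing a jailbreak isn't like patching a buffer overflow in the player; the behavior being exploited is encoded in the weights.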