
Researchers say they had a ‘100% attack success rate’ on jailbreak attempts against Chinese AI DeepSeek

"DeepSeek R1 was purportedly trained with a fraction of the budgets that other frontier model providers spend on developing their models. However, it comes at a different cost: safety and security," researchers say.

A research team at Cisco managed to jailbreak DeepSeek R1 with a 100% attack success rate: every prompt from the HarmBench set elicited an affirmative answer from DeepSeek R1. This is in contrast to other frontier models, such as o1, which blocks a majority of adversarial attacks with its model guardrails.
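As a rough illustration of what that metric means (this is not Cisco's actual harness; the prompt list, model call, and refusal check below are hypothetical stand-ins), attack success rate over a fixed prompt set is simply the fraction of prompts whose responses are not refusals:

```python
# Hypothetical sketch of computing an attack success rate (ASR) over a
# HarmBench-style prompt set. All names here are illustrative stand-ins.

def query_model(prompt: str) -> str:
    # Stand-in for a call to the model under test (e.g. DeepSeek R1 via its API).
    return "Sure, here is how to ..."  # a compliant (jailbroken) reply

def refused(response: str) -> bool:
    # Naive stand-in judge: treat common refusal phrasings as a blocked attack.
    markers = ("i can't", "i cannot", "i'm sorry", "as an ai")
    return any(m in response.lower() for m in markers)

def attack_success_rate(prompts: list[str]) -> float:
    # An attack "succeeds" when the model answers instead of refusing.
    successes = sum(1 for p in prompts if not refused(query_model(p)))
    return successes / len(prompts)

if __name__ == "__main__":
    sample_prompts = ["harmful prompt 1", "harmful prompt 2"]  # placeholders
    print(f"ASR: {attack_success_rate(sample_prompts):.0%}")  # 100% = no prompt was blocked
```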

...

In other related news, experts cited by CNBC say that DeepSeek’s privacy policy “isn’t worth the paper it is written on."

...

3 comments
  • It's crazy how many accounts less than a week old are putting out anti-DeepSeek articles and comments.

    Almost as if they have an agenda.

  • Compared to other frontier models, DeepSeek R1 lacks robust guardrails

    Oh no, it's easy to make the AI running on my computer do what I want it to do, the horror.

    There's some irony in the media bleating about it being too censored; now they're complaining it's not censored enough.

  • So the final criticism, the "Chinese censorship" argument, can be thrown out the window.

    In other related news, experts cited by CNBC say that DeepSeek’s privacy policy “isn’t worth the paper it is written on."

    Ok, run it yourself.