ChatGPT's o3 Model Found Remote Zeroday in Linux Kernel Code
I hate AI. Why?
However, I also took the time to read the original blog post, and it is a fascinating story.
The author starts out with using an existing vulnerability as a benchmark for ChatGPT testing. They describe how they took the code specific to the vulnerability and packaged it for ChatGPT, how they formatted the query and what their results were. In 100 runs only 8 correctly identify the targeted vulnerability, the rest are false positives or claim that there are no vulnerabilities in the given code.
Then they take their test a step further and increase the amount of code shared with ChatGPT so that it also includes stuff of the module that had nothing to do with the original vulnerability. As expected, this larger input decreases performance and also reduces the vulnerability detection rate for the targeted vulnerability. However, in those 100 runs, another vulnerability was described that wasn't a false positive. An actual new vulnerability that the author didn't know about was discovered. Again, the signal to noise ratio is very low, and one has to sift through a lot of wrong reports to get a realistic one, but this proved that it could be used as a useful tool for helping to detect vulnerabilities.
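To put the numbers in that summary in perspective, here is a rough sketch of the arithmetic. Only the figures of 100 runs and 8 correct identifications come from the summary above; the labels "hit" and "miss" and the function name are made up for illustration, and the split between false positives and "no vulnerability" answers is not given, so both are lumped together as misses.

```python
from collections import Counter

# Hypothetical outcome labels: 8 correct identifications out of 100 runs,
# as reported in the summary. Everything else counts as a "miss" because
# the breakdown of false positives vs. "nothing found" is not stated.
runs = ["hit"] * 8 + ["miss"] * 92


def detection_rate(outcomes):
    """Return the fraction of runs that correctly found the target bug."""
    counts = Counter(outcomes)
    return counts["hit"] / len(outcomes)


rate = detection_rate(runs)

# Average number of runs a reviewer would have to read per correct hit,
# assuming every run is equally likely to be a hit.
runs_per_hit = 1 / rate

print(f"detection rate: {rate:.0%}")
print(f"runs to review per correct hit: {runs_per_hit:.1f}")
```

On those assumptions, a reviewer would read roughly a dozen outputs for every correct one, which is the sifting cost the summary is talking about.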
I highly recommend reading the blog post.
As much as I like to be critical about AI, it doesn't help if we put our heads in the sand and act as if it never does something cool.
Interesting. I feel like the headline is still bad though. I get why they ran with it, at least — "ChatGPT finds kernel exploit" is more interesting and gets more clicks than "Monkey finally writes Shakespeare."
but this proved that it could be used as a useful tool for helping to detect vulnerabilities.
I think "could" is doing some heavy lifting there.
In 100 runs only 8 correctly identify the targeted vulnerability, the rest are false positives or claim that there are no vulnerabilities in the given code. ... [The] signal to noise ratio is very low, and one has to sift through a lot of wrong reports to get a realistic one.
It was right 8% of the time when presented with the least amount of input to find a known bug. Then, when they opened it up to more of the codebase, its performance decreased.
I'm not going to use something that's wrong over 92% of the time. That's insane. That's like saying my Magic 8 Ball "could be used as a useful tool for helping to detect vulnerabilities." The fucking rubber ducky on my desk has a more reliable clearance rate.
This is literally the very first experiment in this use case, done by a single person on a model that wasn't specifically designed for this. The fact that it is able to formulate a correct response at all in this situation impresses me.
It would be easy to criticize this if it were the endpoint and this was being advertised as a tool for vulnerability research, but as discussed at the end of the post, this "quick little test" shows both initial promising results and had the fortunate byproduct of actually revealing a new vulnerability. By no means is it implied that it is now ready for use in this field.
The issue with hallucinations is one that in my opinion is never going to be totally fixed. That is why I hate the use of AI as a final arbiter of truth, which is sadly how a lot of people use it (I'll quickly ask ChatGPT) and companies advertise it. What it is good at however, is coming up with plausible ideas, and in this case having an indication for things to check in code can be a great tool to discover new stuff, as is literally the case for this security researcher finding a new vulnerability after auditing the module themselves.
@Kissaki In another thread, people are mocking AI because the free language models they are using are bad at drawing accurate maps. "AI can't even do geography". Anything an AI says can't be trusted, and AI is vastly inferior to human ability.
These same people haven't figured out the difference between using a language AI to draw a map, and simply asking it a geography question.
Searching for answers and creating maps are both completely unrelated to scanning source code for vulnerabilities. What is the point of this comment?
Your comment made me very curious, and I dunno if this is hilarious or disappointing.
@callouscomic I lean towards disappointing. We are literally surrounded at all times by amazing technology, but the default position is still "technology bad" 🙄
It reminds me of the concerns people had when trains were being invented. People refused to ride on them because "God never meant for us to travel faster than 20 km/h" or because such breakneck speeds would somehow cause harm to a woman's uterus or ovaries.
Daniel Stenberg has banned AI-edited bug reports from cURL because they were exclusively nonsense and just wasted their time. Just because it gets a hit once doesn’t mean it’s good at this either.
It does show that it can be a useful tool, though.
Here, the security researcher was evaluating it and stumbled upon a previously undiscovered security bug. Obviously, they didn't let the AI create the bug report without understanding it. They verified the answer and took action themselves, presumably analyzing, verifying, and reporting in a professional and respectful way.
The cURL AI spam is an issue on the opposite side of that, but it doesn't really tell us anything about capabilities. It tells us more about people. In my eyes, at least.
@2xsaiko That is a poorly made AI model, then. Whoever put that system in place didn't train the model properly. In fact, I'm going to guess that you chose a random model like ChatGPT or llama or Gemini.
Or you might not even realize that you need a model specifically trained to handle the kind of thing you are asking.
That isn't a limitation of AI, that is human error. Do you think people are just pretending it works or something?
Like, I get that there's people who are mocking AI for the wrong reasons, and they're silly for that, but there are very real reasons to dislike AI in many applications.
Would chatgpt be able to do this if their dataset had consisted only of ethically obtained data where the authors had provided consent? My money is on no, at least not yet. The technology is in its infancy and has powerful potential, but is having its progress boosted through highly unethical means.
I'm so very much for the concept of AI, it's a monumental technology space at its core. But it needs to be done right, and I fear that it never will be, and we will have to live with the sins of the existing models forever. I hope I will be wrong.
If we can reach a future where models are trained on entirely consensual data and the environmental impact of their training and usage isn't as dire, I'd be so happy.
@apotheotic As for things like creating images in the style of a specific artist, that is not plagiarism unless you are asking for a perfect replica of a specific art piece and claiming it as your own original work.
All artists imitate the styles they find appealing. If you paint a Van Gogh style painting, it isn't plagiarism of Van Gogh. Likewise, if I were to imitate Van Gogh's style using an AI, the resulting image would be my original work and not Van Gogh's creation.
@apotheotic The issue with copyright is an inevitable misstep that was bound to happen while figuring out this technology. However, some of the criticisms aren't about ethical issues surrounding copyright, they are about the marketability of skills (such as painting) that you either had to learn yourself or otherwise needed to pay someone to do for you.
Now you can do that with an AI. Great for disabled people who can create freely now, bad for the artists who exploited that for financial gain.
There are 10 kinds of people: those who think they understand neural networks, those who try to understand neural networks, and those whose neural networks can't spot the difference.
It's no coincidence how many people are bad at languages, communication, learning, or teaching. On the bright side, new generations are likely to be forced to get better.
@jarfil I think it's an unavoidable instinct. In our ancestral environment, it was basic survival sense to fear the unknown and assume it could be dangerous. Caution just makes sense in that scenario.
There hasn't been enough time for our genes to adapt to our new, radically different environment. So people will continue to react to technological advances as if a tiger could leap out at any moment and maul them to death. Even I experience a vague unease, and I love technology.