4 comments
  • It's risky, but the risk falls on the users while the profits go to the companies.

    I asked OpenAI, Google, and Meta what they are doing to defend against prompt injection attacks and hallucinations. Meta did not reply in time for publication, and OpenAI did not comment on the record.

    Discourse analysis tip: what is not said is sometimes more important than what is said. The fact that they refused to reply hints that the reply would be against their best interests, either lying in a liable way or telling the truth and potentially ruining their investment.

    The reason why Google actually answered it ("Google confirmed it [prompt injection] is not a solved problem[...]") is likely related to saying "it's an experiment" -

    Regarding AI’s propensity to make things up, a spokesperson for Google did say the company was releasing Bard as an “experiment,” and that it lets users fact-check Bard’s answers using Google Search. “If users see a hallucination or something that isn’t accurate, we encourage them to click the thumbs-down button and provide feedback. That’s one way Bard will learn and improve,” the spokesperson said.

    Can we [people in general] stop pretending that those models "learn"? Giving it feedback is like telling my cat "don't scratch it!" - it might work for that specific case, but it won't solve the underlying issue, so the model/cat will keep hallucinating/scratching something else. The hallucinations are not individual flaws, they're issues surfacing from the underlying tech: language associates morphemes (tokens) with meaning, not just a token with another! Linguists have been talking about this for at least a century, but those "tech bros" are still trying to model language without it. (Microsoft is apparently making some progress in this regard though. I can look for the quote if anyone wants.)

    • I agree with your point overall in terms of AI not actually learning (I'd describe it as optimizing).

      However, I will say that inferring from what is not said is a tricky one to apply generally, which you do in your reply by jumping to conclusions as follows:

      The fact that they refused to reply hints that the reply would be against their best interests, either lying in a liable way or telling the truth and potentially ruining their investment.

      This is dangerous and can be used disingenuously, so I discourage using it in our discourse.

      • I do agree with you that it's tricky to apply, but it's still useful regardless; and while the danger that you're talking about is real, it has more to do with the certainty assigned to the inference than with the inference itself.

        That's why I said it "hints that the reply..." instead of "means", or that the reason that Google answered is "likely related" - both words are there for a good reason, to highlight that this is not a conclusion. As in: it might be wrong, and both words acknowledge it.

        Even though it's not solid info, just an inference, I still felt it was worth sharing, for two reasons that make the lack of reply noteworthy:

        • Google, OpenAI and Meta/Facebook are roughly in the same situation (contacted by the author due to LLM development), and yet only one answered. Why?
        • Politicians and corporations are generally eager to advertise their stuff, but extra careful with what they say on-record.
  • TL;DR? Tech companies shouldn’t be so complacent about the purported “inevitability” of AI tools. Ordinary people don’t tend to adopt technologies that keep failing in annoying and unpredictable ways, and it’s only a matter of time until we see the hackers using these new AI assistants maliciously. Right now, we are all sitting ducks.

    I don’t know about you, but I intend to wait a little longer before letting this generation of AI systems snoop around in my email.

    This is probably the longest TL;DR I've ever read, and that says something about my impression of this text.