You can try this yourself with GPT-4. I have, and it fails every time. Earlier GPT-4 versions, via the API, also fail every time. Claude reasons before it answers, but if you ask it to say yes or no only, it fails. Bard is the only one that gets it right, right off the bat