I'm not really sure, and you can find plenty of funny examples where various models have progressed and regressed on such mundane, simple math.
As recently as August, "11.10 or 11.9, which is bigger?" got the wrong answer on ChatGPT, followed by lots of wrong justification for that wrong answer. Even the follow-up math question "what is 11.10 - 11.9" gave me the answer "11.10 - 11.9 equals 0.2".
We can quibble about what model I was using, or what edge case I hit, or how quickly they fixed it... but this is two years into the very public LLM hype wave, so at some point I expect better.
It gives me pause about asking more complex math questions whose results I cannot immediately verify. In which case, again, why would I pay for a tool to ask questions I already know the answer to?
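For what it's worth, treating those inputs as plain decimal numbers (not version strings, which is one plausible way the model could be misreading them), a couple of lines of Python show what any calculator would say:

```python
from decimal import Decimal

# As decimal numbers, 11.10 == 11.1, so 11.9 is the larger value.
a, b = Decimal("11.10"), Decimal("11.9")
print(max(a, b))   # 11.9
print(a - b)       # -0.80, not 0.2
```

Under the version-string reading, 11.10 would come after 11.9, which may be where the model's "0.2" came from, but its stated justification didn't say that.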
He did say "sometimes Einstein is on the other end, and sometimes it's a drunken child. You have no idea when you pick up the phone which way it's going to go.", so I think that's still a valid thing for him to complain about.
LLMs totally violate our expectations for computers, by being a bit forgetful and bad at maths.