Hacker News

I'm not really sure, and you can pull up lots of funny examples where various models show progress and regressions on such mundane, simple math.

As recently as August, "11.10 or 11.9, which is bigger?" came up with the wrong answer on ChatGPT, followed by lots of wrong justification for the wrong answer. Even the follow-up math question "what is 11.10 - 11.9?" gave me the answer "11.10 - 11.9 equals 0.2".
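That "0.2" looks like the model compared the fractional parts as integers (10 vs. 9), i.e., version-number ordering rather than decimal arithmetic. A minimal sketch contrasting the two interpretations (not a claim about how any particular model actually works):

```python
from decimal import Decimal

# Decimal interpretation: 11.10 is the same number as 11.1
a, b = Decimal("11.10"), Decimal("11.9")
print(b > a)   # True: 11.9 is the bigger number
print(a - b)   # -0.80, not 0.2

# Version-number interpretation: compare dot-separated parts
# as integer tuples, so "11.10" means (11, 10) and outranks (11, 9)
va = tuple(int(x) for x in "11.10".split("."))
vb = tuple(int(x) for x in "11.9".split("."))
print(va > vb)  # True: as versions, 11.10 comes after 11.9
```

Both orderings are internally consistent; the failure is picking the version-number one for a plain arithmetic question.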

We can quibble about what model I was using, or what edge case I hit, or how quickly they fixed it... but this is two years into the very public LLM hype wave, so at some point I expect better.

It gives me pause about asking more complex math questions whose results I can't immediately verify. In which case, again, why would I pay for a tool to ask questions I already know the answer to?



This error isn't nonsensical, though; ordinary elementary-school kids make similar errors, and with good episodic memory the agent will correct itself.


He did say "sometimes Einstein is on the other end, and sometimes it's a drunken child. You have no idea when you pick up the phone which way it's going to go", so I think that's still a valid thing for him to complain about.

LLMs totally violate our expectations of computers by being a bit forgetful and bad at maths.


Yes, to put a finer point on it:

How many dollars per month would someone be willing to spend for a chatbot with a 3rd grader's ability at math? Personally, $0 for me.

But what if it's a math PhD's ability at math? Tons; in some applications it could be worth $100s or $1000s in an enterprise-license setting.

But what if it's unpredictably and imperceptibly, question to question, 95% PhD and 5% 3rd grader? Again, for me: $0. (Not 95% of $1000s, but truly $0.)



