I've never understood the obsession with token/s. I'm fine with asking a questio...

ibeckermayer · 2026-03-01T20:53:50 1772398430

Your workflow is unusual. Often there's a vigorous back-and-forth, or the desired output is something like generated code, where a low tk/s drastically affects UX and user productivity.

But the real kicker here is the 90s TTFT: you ask a question and don't see anything for a full minute and a half.
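
To put rough numbers on how the two metrics combine, here is a back-of-the-envelope sketch. It assumes a simple model where the total wait is the time to first token plus the output length divided by a steady generation rate. Only the 90 s TTFT comes from this thread; the 500-token reply length, the three rates, and the function name `response_time_s` are made up for illustration.

```python
def response_time_s(ttft_s: float, output_tokens: int, tokens_per_s: float) -> float:
    """Seconds until the last token arrives under a simple model:
    wait ttft_s for the first token, then stream at a constant rate."""
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    return ttft_s + output_tokens / tokens_per_s


if __name__ == "__main__":
    TTFT_S = 90          # from the thread: nothing on screen for 90 seconds
    REPLY_TOKENS = 500   # hypothetical reply length
    for rate in (5, 20, 80):  # hypothetical generation rates in tokens/s
        total = response_time_s(TTFT_S, REPLY_TOKENS, rate)
        print(f"{rate:>3} tk/s: {total:.1f} s total, {TTFT_S} s of it before any output")
```

Under these assumptions, raising the rate shrinks only the streaming part of the wait. The 90 s before the first token stays the same at every rate, which is why it dominates for short replies.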

nitinreddy88 · 2026-03-01T13:59:21 1772373561

You are fine with it, but maybe the rest of the world is not. Anyway, to compare performance in benchmarks we need metrics, and this is one of the most basic ones to measure.