Can someone ELI5 why the safetensor file is 23.8 GB, given the 12B parameter mod...

sangwulee · 2025-07-31T21:00:27 1753995627

Quick napkin math, assuming bfloat16: 1B parameters * 16 bits = 16B bits = 2GB. Since it's a 12B parameter model, you get ~24GB. Downcasting from float32 to bfloat16 comes with pretty minimal performance degradation, so we uploaded the weights in bfloat16 format.
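
If anyone wants to play with the numbers, here is a minimal sketch of the same napkin math. The function and table names are made up for illustration, it uses decimal gigabytes (1 GB = 1e9 bytes), and it ignores file headers and any tensors stored in a different precision, so treat the output as a ballpark rather than an exact file size.

```python
# Back-of-the-envelope checkpoint size: parameters * bits per parameter.
# Names here are illustrative, not taken from any library.
BITS_PER_PARAM = {"fp32": 32, "bf16": 16, "fp16": 16, "fp8": 8, "fp4": 4}


def checkpoint_size_gb(num_params, dtype):
    """Approximate size in decimal GB (1 GB = 1e9 bytes), ignoring metadata."""
    total_bits = num_params * BITS_PER_PARAM[dtype]
    return total_bits / 8 / 1e9


if __name__ == "__main__":
    # A nominal 12B-parameter model at each precision.
    for dtype in BITS_PER_PARAM:
        print(f"{dtype}: {checkpoint_size_gb(12e9, dtype):.1f} GB")
```

For a nominal 12B parameters at 16 bits this lands on 24 GB, which is in line with the 23.8 GB file; the small gap is presumably because "12B" is a rounded parameter count.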

piperswe · 2025-07-31T20:55:08 1753995308

A parameter can be a float of any size. Lots of downloadable models are FP8 (8 bits per parameter), but it appears this model is FP16 (16 bits per parameter).

Often, the training is done in FP16 and the weights are then quantized down to FP8 or FP4 for distribution.

dragonwriter · 2025-08-01T01:10:23 1754010623

I think they are bfloat16, not FP16, but both are 16-bits-per-weight (16bpw) formats, so it doesn't make a size difference.

iyn · 2025-08-01T08:38:44 1754037524

Wiki article on bfloat16 for reference, since it was new to me: https://en.wikipedia.org/wiki/Bfloat16_floating-point_format

Tokumei-no-hito · 2025-08-01T06:13:27 1754028807

Pardon the ignorance, but it's the first time I've heard of bfloat16.

I asked chat for an explanation and it said bfloat16 has a higher range (like FP32) but less precision.

What does that mean for image generation, and why was bfloat16 chosen over FP16?

dragonwriter · 2025-08-01T09:25:03 1754040303

My fuzzy understanding, and I'm not at all an expert on this, is that the main benefit is that bf16 is less prone to overflow/underflow during calculation, which is a source of bigger problems in both training and inference than the simple loss of precision. So once it became widely supported, it became a commonly preferred format for models (whether image gen or otherwise) over FP16.
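
To put rough numbers on the range-versus-precision trade-off, here is a small sketch that derives the largest finite value, the smallest normal value, and the spacing just above 1.0 (epsilon) from the bit layout alone. The bit counts are the standard ones (FP16: 5 exponent / 10 mantissa bits, bf16: 8 / 7, FP32: 8 / 23); the function name is just for illustration, and it assumes the usual IEEE-style encoding where the all-ones exponent is reserved for inf/NaN.

```python
def float_format_stats(exponent_bits, mantissa_bits):
    """Largest finite value, smallest normal value, and epsilon for an
    IEEE-style binary float with the given field widths."""
    bias = 2 ** (exponent_bits - 1) - 1
    max_exponent = bias  # the all-ones exponent field is reserved for inf/NaN
    largest = (2 - 2.0 ** -mantissa_bits) * 2.0 ** max_exponent
    smallest_normal = 2.0 ** (1 - bias)
    epsilon = 2.0 ** -mantissa_bits  # gap between 1.0 and the next value up
    return largest, smallest_normal, epsilon


# (exponent bits, mantissa bits) for each format
FORMATS = {"fp16": (5, 10), "bf16": (8, 7), "fp32": (8, 23)}

if __name__ == "__main__":
    for name, (e_bits, m_bits) in FORMATS.items():
        largest, smallest_normal, eps = float_format_stats(e_bits, m_bits)
        print(f"{name}: max={largest:.4g}  "
              f"min_normal={smallest_normal:.4g}  eps={eps:.4g}")
```

FP16 tops out at 65504, while bf16 reaches roughly the same ~3.4e38 ceiling as FP32, at the cost of a coarser step size (2^-7 versus 2^-10 near 1.0). That matches the "same range as FP32, less precision" summary above.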

petercooper · 2025-07-31T20:54:10 1753995250

That's a good ballpark for something quantized to 8 bits per parameter. But you can 2x/4x that for 16 and 32 bit.

7734128 · 2025-07-31T21:02:09 1753995729

I've never seen a 32 bit model. There's bound to be a few of them, but it's hardly a normal precision.

zamadatix · 2025-07-31T21:21:09 1753996869

Some of the most famous models were distributed as F32, e.g. GPT-2. As things have shifted more towards mass consumption of model weights it's become less and less common to see.

nodja · 2025-07-31T22:35:10 1754001310

> As things have shifted more towards mass consumption of model weights it's become less and less common to see.

Not the real reason. The real reason is that training has moved to FP16/BF16 over the years as NVIDIA made that more efficient in their hardware, which is the same reason you're starting to see some models being released in 8-bit formats (DeepSeek).

Of course people can always quantize the weights to smaller sizes, but the master versions of the weights are usually 16-bit.

petercooper · 2025-07-31T22:09:11 1753999751

And on the topic of image generation models, I think all the Stable Diffusion 1.x models were distributed in f32.