
What's important is VRAM, not system RAM. The 4090 has 16GB of VRAM, so you'll be limited to smaller models at decent speeds. Of course, you can run models from system memory, but your tokens/second will be roughly an order of magnitude slower. ARM Macs are the exception since they have unified memory, allowing high bandwidth between the GPU and the system's RAM.


Yes and no. The 4090 has 24GB, not 16; but with a big MoE you're not getting everything in there anyway. In that case you really want all the weights in RAM so that swapping experts in isn't a load from disk.

It's not as good as unified RAM, but it's also workable.
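To put rough numbers on the speed difference: single-stream LLM decoding is usually memory-bandwidth-bound, so a ceiling on tokens/second is just bandwidth divided by the bytes of weights read per token. A quick sketch (the bandwidth figures are approximate spec-sheet values, and the helper function is just illustrative):

```python
# Back-of-envelope: memory-bound decode speed <= bandwidth / bytes read per token.
# Bandwidth numbers below are approximate spec-sheet figures, not benchmarks.

def max_tokens_per_s(active_params_billions: float,
                     bytes_per_param: float,
                     bandwidth_gb_s: float) -> float:
    """Upper bound on decode tokens/s for a bandwidth-bound model.

    For a dense model, active params = all params; for an MoE,
    only the activated experts' weights are read per token.
    """
    bytes_per_token = active_params_billions * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token

# A 13B dense model at 4-bit quantization (~0.5 bytes/param):
for name, bw in [("RTX 4090 GDDR6X (~1008 GB/s)", 1008),
                 ("M2 Ultra unified memory (~800 GB/s)", 800),
                 ("dual-channel DDR5-5600 (~90 GB/s)", 90)]:
    print(f"{name}: ~{max_tokens_per_s(13, 0.5, bw):.0f} tok/s ceiling")
```

The ratio between GDDR6X and ordinary dual-channel DDR5 (roughly 10x) is what drives the slowdown when weights spill into system RAM; it also shows why unified-memory Macs sit much closer to the discrete-GPU ceiling.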


iirc 4090s have 24GB




