
Hmmm so you're ignoring the crux of my argument because it's convenient for you (h264 is comfortably small, AV1 is maybe too big, so between them might work). So anything that's related to why AV1 won't fit is pointless. They know that and are improving on it.

Your argument about your large number of flops is odd. You would only store data that way if you needed everything on the same cycle. You say there's a multiplexer after that. Data storage + multiplexer is just memory. You could use a BRAM or LUTRAM, which would cut down on that dramatically, big if there's a need based on later processing which you haven't defined. And even then, that's for AV1, which isn't AV2 and may change.
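To make the flops-versus-RAM trade-off concrete, here is a minimal sketch of the access-time side of it. Both the function name `cycles_to_read` and the numbers (64 values, 2 read ports) are illustrative assumptions, not figures from either comment or from any real decoder.

```python
def cycles_to_read(num_values: int, read_ports: int) -> int:
    """Cycles needed to fetch num_values from storage with read_ports read ports.

    Hypothetical model: each port delivers one value per clock cycle.
    """
    return -(-num_values // read_ports)  # ceiling division


# Assumed example: 64 stored values needed by the next stage.
# A flop array exposes every value at once, so it acts like 64 read ports.
flops_cycles = cycles_to_read(64, read_ports=64)

# An assumed dual-port block RAM delivers at most 2 values per cycle.
bram_cycles = cycles_to_read(64, read_ports=2)

print(flops_cycles, bram_cycles)
```

Under these assumptions the RAM stores the same data in far less area, but only pays off if the consumer does not need all values in the same cycle.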



I’m ignoring h264 because it’s irrelevant in a discussion about AV2, for the reasons that I already brought up in my earlier reply. It’s like having a discussion about a Zen CPU and bringing up the 8088 architecture.

Let’s cut to the chase. AV2 will not be smaller than AV1 at all. The linked article doesn’t say that. The slides don’t say that either.

The only thing that could make somebody think that it’s smaller is the claim that all tools have been validated for hardware efficiency. The goal of this process is to make sure that none of the new tools make the HW unreasonably explode in size, not to make the codec smaller than before, because everyone knows that this is impossible if you want to increase compression ratio.

Let’s look at 2 of those new tools. MRLS: this adds multiple reference lines, just like I expected there would be. Boom! Much more complexity for neighbor handling. I also see more directions (more angles.) That also adds HW. The article mentions improved chroma from luma. Not unexpected because h266 already has that, and AV2 needs to compete against that. AV1 has a basic 2x2 block filter. I expect AV2 to have a more complex FIR filter, which makes things significantly harder for a HW implementation.

You are delusional if you think AV2 will be smaller than AV1.

The reason I brought up neighbor handling is because it’s so easy to estimate its resource requirements from first principles, not because it’s a huge part of a decoder. But if neighbors alone already make a smaller FPGA nearly impossible, it should be obvious that the whole decoder is ridiculous.

So… as for storing neighbors in RAM: if I brought this up at work, they'd probably send me home to take a mental health break or something.

Neighbor processing lives right inside the critical latency loop. Every clock cycle that you add in that loop impacts performance. You need to update these neighbors after predicting every coding unit. Oh, and the article mentions that the CTB size (“super block” in AV2 parlance) has been increased from 128x128 to 256x256. Good luck area reducing that. :-)
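A rough first-principles estimate of neighbor storage can be sketched as below. The model is an assumption, not taken from the AV1 or AV2 specifications: it counts 2N samples above, 2N to the left, and one corner sample per reference line for a block of edge N, at an assumed 10-bit depth. The function names are hypothetical.

```python
def neighbor_samples(block_edge: int, ref_lines: int = 1) -> int:
    """Neighbor samples kept for one block (assumed model, not from a spec).

    Per reference line: 2*N above (top + top-right), 2*N to the left
    (left + bottom-left), plus 1 corner sample.
    """
    per_line = 4 * block_edge + 1
    return per_line * ref_lines


def neighbor_bits(block_edge: int, bit_depth: int = 10, ref_lines: int = 1) -> int:
    """Storage bits (roughly, flops) if every neighbor sample sits in registers."""
    return neighbor_samples(block_edge, ref_lines) * bit_depth


# Assumed comparison: block edges of 128 and 256, 10-bit samples.
bits_128 = neighbor_bits(128)
bits_256 = neighbor_bits(256)
bits_256_multi = neighbor_bits(256, ref_lines=4)  # 4 reference lines is a made-up count

print(bits_128, bits_256, bits_256_multi)
```

In this model the register count grows linearly with both the block edge and the number of reference lines, so doubling the edge and adding reference lines compound.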



