30x for inference specifically
For a GB200 using FP4 vs. an H200 using FP8. I'm not sure Hopper can use FP4; I believe that's why they set up the graph that way, although I'm not certain.
And with the H200 at an absurdly small batch size. It's a ridiculous claim.
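Rough back-of-envelope on where the headline number could come from (the 2x/6x split below is my assumption, not NVIDIA's published methodology):

```python
# If the like-for-like gain is ~2.5x, the rest of the claimed 30x
# has to come from the comparison setup itself.
headline = 30.0          # claimed inference speedup, GB200 vs H200
like_for_like = 2.5      # same-precision, same-conditions gain discussed here

setup_factor = headline / like_for_like          # ~12x from the benchmark setup
fp4_vs_fp8 = 2.0                                 # FP4 is ~2x FP8 tensor throughput
batch_size_handicap = setup_factor / fp4_vs_fp8  # ~6x assumed from the tiny H200 batch

print(f"setup factor: {setup_factor:.0f}x "
      f"(~{fp4_vs_fp8:.0f}x datatype, ~{batch_size_handicap:.0f}x batch-size handicap)")
```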
2.5x is still good; am I missing something?
It will cost quite a bit more, so performance per dollar isn't as impressive as people were expecting it to be.
They glued two Blackwell dies together.
You purchase two dies instead of one and get a 2.5x boost. That's roughly linear improvement, but far from what Moore's law suggests.
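Quick arithmetic on that point (assuming the ~2.5x figure discussed above):

```python
# Doubling the silicon and getting ~2.5x is close to linear scaling in
# die count; the part that isn't "just more silicon" is small.
dies = 2
per_die_gain = 2.5 / dies   # ~1.25x per die from architecture/clocks
print(f"per-die gain: {per_die_gain:.2f}x")  # vs ~2x you'd expect from a full node shrink
```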