image upload test
All checks were successful
Deploy Website / build-and-deploy (push) Successful in 26s
@@ -19,5 +19,5 @@ you'll see that the new tensor core gen 5 instructions are only compatible with
|
||||
Blackwell tensor cores add support for lower-precision formats, namely FP6 and FP4, which the previous Hopper generation lacked. This enables much faster low-precision matrix multiplications.
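To get a feel for how coarse FP4 is, here's a rough Python sketch of rounding values onto the FP4 (E2M1) grid with one scale shared per block. It is not taken from CUTLASS or any NVIDIA library. The names `FP4_E2M1_VALUES`, `BLOCK_SIZE`, and `quantize_block` are made up for illustration, the block size of 16 is my assumption about NVFP4-style scaling, and the scale is kept as a plain Python float rather than a narrow hardware type.

```python
# The eight non-negative magnitudes representable in E2M1
# (1 sign bit, 2 exponent bits, 1 mantissa bit).
FP4_E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

# Assumption: NVFP4-style blocks of 16 elements that share one scale factor.
BLOCK_SIZE = 16


def quantize_block(block):
    """Round a block of floats to the FP4 grid using one shared scale.

    Returns (scale, dequantized_values). Ties go to the smaller magnitude,
    which may differ from what the hardware does.
    """
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return 1.0, [0.0] * len(block)

    # Map the largest magnitude in the block onto 6.0, the largest FP4 value.
    scale = amax / FP4_E2M1_VALUES[-1]

    out = []
    for x in block:
        # Pick the nearest representable magnitude, then restore sign and scale.
        mag = min(FP4_E2M1_VALUES, key=lambda v: abs(abs(x) / scale - v))
        out.append((mag if x >= 0 else -mag) * scale)
    return scale, out
```

For example, `quantize_block([6.0, 3.0, -1.5, 0.4])` returns a scale of `1.0` and the values `[6.0, 3.0, -1.5, 0.5]`: everything already on the grid survives, while `0.4` snaps to `0.5`.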
|
||||
To test out the NVFP4 <SideNote title="NVFP4">"NVIDIA's low-precision format."</SideNote> support, I downloaded the CUTLASS repo and ran the NVFP4 matrix multiply example. Here's what I got:
|
||||
![A screenshot of a CUTLASS NVFP4 matmul benchmark](public/images/1_blackwell_dc_vs_gf/5090_65536.png)
|
||||