@@ -6,4 +6,9 @@ tags: ['Nvidia', 'GPU', 'GPU Kernel']
---

I'm a proud owner of an RTX 5090 FE. I occasionally play games on it, but it's mostly used for ML workloads.
I jumped on the 50-series specifically for the FP4 support in its 5th-generation Blackwell tensor cores, because I'm actively working on some pretty exciting low-precision computing.
Imagine my surprise when I was perusing the GPU MODE Discord and found people calling the GeForce Blackwell cards "fake Blackwell"?!
Looking online, I found next to no resources on the difference. I foolishly assumed that my GeForce card (arch=sm_120) would contain all the features of the datacenter cards (arch=sm_100), since
it seemed to be a later arch. No, Nvidia just made it more confusing and obscured the technical details extremely well. Going through the [CUDA PTX documentation](https://docs.nvidia.com/cuda/parallel-thread-execution/),
you'll see that the new tensor core gen 5 instructions (`tcgen05`) are only compatible with `sm_100[a-f]` (Datacenter Blackwell) and `sm_101` (Jetson Thor). What does this mean in practice? Finding out involved a lot more digging.
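
To make the compatibility claim above concrete, here is a minimal sketch that checks an arch string against the two targets named in this post. The helper name `has_gen5_tensor_core_instructions` and the pattern are hypothetical and purely illustrative: the pattern reads the post's `sm_100[a-f]` notation literally (a suffix letter is required) and is not an official or exhaustive support table.

```python
import re

# Assumption: this pattern encodes only what the post states, read literally:
# sm_100 followed by one suffix letter a-f, or sm_101. It is not taken from
# an authoritative list of supported targets.
_GEN5_TARGETS = re.compile(r"sm_100[a-f]|sm_101")


def has_gen5_tensor_core_instructions(arch: str) -> bool:
    """Return True if `arch` is one of the targets named in the post."""
    return _GEN5_TARGETS.fullmatch(arch.strip().lower()) is not None


if __name__ == "__main__":
    # sm_120 is the GeForce arch mentioned above (e.g. the RTX 5090).
    for arch in ("sm_100a", "sm_101", "sm_120"):
        print(arch, has_gen5_tensor_core_instructions(arch))
```

Under these assumptions the GeForce arch `sm_120` falls outside the pattern even though its number is higher than `sm_100`, which is exactly the trap described above.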
BIN
public/images/1_blackwell_dc_vs_gf/Pasted Image (Copy 1).png
Normal file
Binary file not shown.
Size: 23 KiB
BIN
public/images/1_blackwell_dc_vs_gf/Pasted Image (Copy 2).png
Normal file
Binary file not shown.
Size: 23 KiB
BIN
public/images/1_blackwell_dc_vs_gf/Pasted Image.png
Normal file
Binary file not shown.
Size: 41 KiB
BIN
public/images/1_blackwell_dc_vs_gf/nvtop_b200.png
Normal file
Binary file not shown.
Size: 114 KiB