diff --git a/content/posts/blackwell_datacenter_vs_geforce.mdx b/content/posts/blackwell_datacenter_vs_geforce.mdx
index 8993d22..44467fb 100644
--- a/content/posts/blackwell_datacenter_vs_geforce.mdx
+++ b/content/posts/blackwell_datacenter_vs_geforce.mdx
@@ -6,4 +6,9 @@ tags: ['Nvidia', 'GPU', 'GPU Kernel']
 ---
 
 
-I'm a proud owner for an RTX 5090 FE.
\ No newline at end of file
+I'm a proud owner for an RTX 5090 FE. I occasionally play games on it, but it's mostly used for ML workloads.
+I jumped on the 50-series especially for the fp4 support on their 5th generation blackwell tensor cores, cause I'm actively working on some pretty exciting low precision computing.
+Imagine my surprise when I was perusing the GPU mode discord and find people calling the GeForce blackwell cards "Fake blackwell"?!!
+Looking online, I found next to no resources on the difference. I foolishly assumed that my GeForce card (arch=sm_120) would contain all the features from the datacenter cards (arch=sm_100), as
+it seemed to be a later arch. No, Nvidia just made it more confusing, and obscured the technical details extremely well. Going through the [cuda documentation](https://docs.nvidia.com/cuda/parallel-thread-execution/),
+you'll see that the new tensor core gen 5 instructions are only compatible with `sm_100[a-f]` (Datacenter Blackwell) and `sm_101` (Jetson Thor). What does this mean? That involved a lot more digging.
\ No newline at end of file
diff --git a/public/images/1_blackwell_dc_vs_gf/Pasted Image (Copy 1).png b/public/images/1_blackwell_dc_vs_gf/Pasted Image (Copy 1).png
new file mode 100644
index 0000000..bf32190
Binary files /dev/null and b/public/images/1_blackwell_dc_vs_gf/Pasted Image (Copy 1).png differ
diff --git a/public/images/1_blackwell_dc_vs_gf/Pasted Image (Copy 2).png b/public/images/1_blackwell_dc_vs_gf/Pasted Image (Copy 2).png
new file mode 100644
index 0000000..0c843a7
Binary files /dev/null and b/public/images/1_blackwell_dc_vs_gf/Pasted Image (Copy 2).png differ
diff --git a/public/images/1_blackwell_dc_vs_gf/Pasted Image.png b/public/images/1_blackwell_dc_vs_gf/Pasted Image.png
new file mode 100644
index 0000000..7fa0352
Binary files /dev/null and b/public/images/1_blackwell_dc_vs_gf/Pasted Image.png differ
diff --git a/public/images/1_blackwell_dc_vs_gf/nvtop_b200.png b/public/images/1_blackwell_dc_vs_gf/nvtop_b200.png
new file mode 100644
index 0000000..5a6fae8
Binary files /dev/null and b/public/images/1_blackwell_dc_vs_gf/nvtop_b200.png differ