Optimized B200 Matrix Multiplication
January 28, 2025
A deep dive into building a high-performance matmul kernel for NVIDIA's B200 GPU using persistent kernels, pipelining, TMA, and TMEM.