
Vizuara AI Labs
Vizuara Kernel Engineering
From silicon to speculative decoding: write GPU kernels that actually run modern LLMs.
A worklog from the silicon up to FlashAttention, NVFP4, DeepSeek's DSpark, and AI-generated kernels, with every step measured, profiled, and drawn by hand.
01
The Book
The full knowledge base — a 72-chapter illustrated worklog from the silicon up to FlashAttention, NVFP4 and DeepSeek's DSpark. Free to read, forever.
02
The Workshop
Vizuara's live Kernel Engineering cohort: 8 foundational lectures + 6 deep-dive workshops on modern kernel-inference topics, with the full book included.
03
Projects
Build real kernels with your hands — the GPU-Puzzles track, a GEMM you take to 94% of cuBLAS, FlashAttention from scratch, and the You-vs-the-machine capstone.
04
Interactive
Practice, don't just read: per-section quizzes, the guided GPU-Puzzles track, and a growing set of hands-on kernel challenges.
Built around what you're actually hired to do
Matmul from scratch to 94% of cuBLAS · the same ladder on tensor cores · reading SASS & Nsight Compute · TMA/WGMMA on Hopper · NVFP4 & TMEM on Blackwell · Triton and real CUTLASS · FlashAttention · the vLLM debugging workflow · and LLM-driven kernel search — knowing where it wins and where it still fails.
The kernel engineer's skill map →