PyTorch

A thumbnail image

Understanding Triton Kernels from First Principles

A deep dive into how Triton kernels work, explained from absolute basics to complete understanding. …

A thumbnail image

Under the Hood: How PyTorch Chooses Attention Kernels and Why It Matters for Performance

A deep dive into PyTorch’s attention kernel selection and what each choice means for your …

A thumbnail image

From Theory to Practice: Quantization and Dequantization Made Simple

Quantization transforms floating-point values (‘float32’) into lower-precision formats, such as …

A thumbnail image

Breaking Down Vision Transformers: A Code-Driven Explanation

In this article, I’ll break down the layers of a ViT step by step with code snippets, and a …

A thumbnail image

The Simple Path to PyTorch Graphs: Dynamo and AOT Autograd Explained

Graph acquisition in PyTorch refers to the process of creating and managing the computational graph …

A thumbnail image

Profiling ResNet Models with PyTorch Profiler for Performance Optimization

In the realm of deep learning, model performance is paramount. Whether you’re working on image …

A thumbnail image

Warmup Wisdom: Accurate PyTorch Benchmarking Made Simple!

In the realm of PyTorch model benchmarking, achieving accurate results is paramount for gauging …