The Understanding Series

MCP is just a fancy API

Understand MCP as an Engineer. Build and debug one step-by-step in Python using FastMCP.

Jun 7 •

How to add structure to your LLM Applications using SGLang

Unpacking SGLang's technical details, RadixAttention, and fast decoding for Structured Output.

Apr 3 •

How does vLLM serve LLMs at scale?

The Online/Offline API modes, PagedAttention and distributed inference with Ray.

Mar 27 •

Unpacking NVIDIA Dynamo LLM Inference Framework

Everything you need to know about Dynamo: code, components, and concepts, with diagrams and details.

Mar 20 •

Understanding LLM Optimization Techniques

Weight quantization using GPTQ and BitsAndBytes. Parallelism techniques, KV-caching, Flash Attention and Speculative Decoding.

Mar 1 •

Understanding LLM Inference

Explaining LLM pre-fill and generation phases, unpacking model configuration files from HuggingFace.

Feb 20 •

The AI/ML Engineer's starter guide to GPU Programming

#1 Programming on GPUs from scratch by implementing CUDA Kernels in C++, CuPy Python and OpenAI Triton.

Jan 30 •

Guide to understanding Concurrency & Parallelism in Python

A practical deep dive into which concurrency method to use for each type of workload in ML scenarios.

Oct 5, 2024 •

Stop using Python Dataclasses - start using Pydantic Models

Add data schemas and sanity to your data models. See how easily Pydantic can streamline your data validation and serialization workflows.

Sep 17, 2024 •

Let's build Andrej Karpathy's BPETokenizer in Rust and use it from Python

Learn how to build a custom Rust library and generate Python Bindings. See how popular frameworks like Bytewax and Polars use the same workflow.

Sep 3, 2024 •

Python flexibility and C++ performance in one language — Mojo

The new Mojo Programming Language. LLVM and MLIR as core compiler frameworks. How to test Llama 2 in pure Mojo.

Aug 27, 2024 •
