Blog
Posts, notes, and articles.

Inside Vector Databases: Building Retrieval-Augmented Systems that Scale
2025-10-26: How modern vector databases ingest, index, and serve embeddings for production retrieval-augmented generation systems without falling over.

Learned Indexes: When Models Replace B‑Trees
2025-10-04: A practitioner's guide to learned indexes: how they work, when they beat classic data structures, and what it takes to ship them without getting paged.

The 100‑Microsecond Rule: Why Tail Latency Eats Your Throughput (and How to Fight Back)
2025-10-04: A field guide to taming P99 in modern systems, from queueing math to NIC interrupts and from hedged requests to adaptive concurrency. Practical patterns, pitfalls, and a blueprint you can apply this week.

The Quiet Calculus of Probabilistic Commutativity
2025-09-27: A practical calculus for quantifying when non-commutative operations in distributed systems can be safely executed without heavyweight coordination.

The Hidden Backbone of Parallelism: How Prefix Sums Power Distributed Computation
2025-09-21: Discover how the humble prefix sum (scan) quietly powers GPUs, distributed clusters, and big data frameworks, and why this obscure primitive is an essential building block of parallel computation.

GPUDirect Storage in 2025: Optimizing the End-to-End Data Path
2025-09-16: How modern systems move data from NVMe and object storage into GPU kernels with minimal CPU overhead and maximal throughput.

MPI vs. OpenMP in 2025: Where Each Wins
2025-07-04: A practical guide to choosing between message passing and shared-memory parallelism for modern HPC and hybrid nodes.

From MapReduce to Spark: The Arc of Data-Parallel Systems
2025-05-19: MapReduce taught fault-tolerant batch at scale; Spark generalized it with resilient distributed datasets (RDDs) and DAG scheduling.

Auditing the Algorithm: Building a Responsible AI Pipeline That Scales
2025-04-05: How we operationalized responsible AI with automated audits, governance rituals, and transparent reporting.

Scheduling: Trading Latency for Throughput (and Back Again)
2025-02-12: Queue disciplines, work stealing, and CPU affinity: how scheduler choices shape p50/p99, and when to bias for one over the other.

Exactly-Once in Streaming: What It Means and How Systems Achieve It
2025-01-22: Disentangle marketing from mechanisms: idempotence, transactions, and state snapshots behind ‘exactly-once’.

Tuning CUDA with the GPU Memory Hierarchy
2024-11-27: Global, shared, and register memory each have distinct latency and bandwidth. Performance comes from the right access pattern.

Write-Ahead Logging: The Unsung Hero of Database Durability
2024-09-10: Dive deep into write-ahead logging (WAL), the technique that lets databases promise durability without sacrificing performance. Learn how WAL works, why it matters, and how modern systems push its limits.

Bloom Filters and Probabilistic Data Structures: Trading Certainty for Speed
2024-08-22: Explore how Bloom filters, Count-Min sketches, and HyperLogLog sacrifice perfect accuracy for dramatic space and time savings, and learn when that trade-off makes sense.

Seeing in the Dark: Observability for Edge AI Fleets
2024-08-16: A practitioner's guide to instrumenting, monitoring, and debugging machine learning models running at the edge.

Adaptive Feature Flag Frameworks for Hyper-Growth SaaS
2024-08-15: A comprehensive field guide to building resilient, data-driven feature flag platforms that keep hyper-growth SaaS releases safe, fast, and customer-centric.

Lock-Free Data Structures: Concurrency Without the Wait
2024-07-18: Explore how lock-free algorithms achieve thread-safe data access without traditional locks. Learn the theory behind compare-and-swap, the ABA problem, memory ordering, and practical implementations that power high-performance systems.

Amdahl’s Law vs. Gustafson’s Law: What They Really Predict
2024-06-15: When does parallelism pay off? Compare Amdahl’s and Gustafson’s models, see where each applies, and learn how to reason about speedups in practice.

Unicode and Character Encoding: From ASCII to UTF-8 and Beyond
2024-03-15: A comprehensive guide to how computers represent text. Understand the evolution from ASCII through Unicode, the mechanics of UTF-8 encoding, and how to handle text correctly in modern software.

Countdown to Quantum: Migrating an Enterprise to Post-Quantum Cryptography
2024-01-29: Practical lessons from a multi-year effort to adopt quantum-safe cryptography without breaking production.

Sealing the Supply Chain: Zero-Trust Build Pipelines That Scale
2023-10-08: An engineer’s map for rebuilding the software supply chain around zero-trust principles without stopping delivery.

Memory Allocators: From malloc to Modern Arena Allocators
2023-09-14: A deep dive into memory allocation strategies, from classic malloc implementations to modern arena allocators, jemalloc, tcmalloc, and the custom allocators that power high-performance systems.

Reverse Indexing and Inverted Files: How Search Engines Fly
2023-07-19: Tokenization, postings lists, skip pointers, and WAND: a tour of the data structures that make full‑text search fast.

Latency-Aware Edge Inference Platforms: Engineering Consistent AI Experiences
2023-03-12: A full-stack guide to designing, deploying, and operating low-latency edge inference systems that stay predictable under real-world constraints.