Blog
Posts, notes, and articles.
Learned Indexes: When Models Replace B‑Trees
2025-10-04 A practitioner's guide to learned indexes: how they work, when they beat classic data structures, and what it takes to ship them without getting paged.
The 100‑Microsecond Rule: Why Tail Latency Eats Your Throughput (and How to Fight Back)
2025-10-04 A field guide to taming P99 in modern systems, from queueing math to NIC interrupts, from hedged requests to adaptive concurrency. Practical patterns, pitfalls, and a blueprint you can apply this week.
The Quiet Calculus of Probabilistic Commutativity
2025-09-27 A practical calculus for quantifying when non-commutative operations in distributed systems can be safely executed without heavyweight coordination.
The Hidden Backbone of Parallelism: How Prefix Sums Power Distributed Computation
2025-09-21 Discover how the humble prefix sum (scan) quietly powers GPUs, distributed clusters, and big data frameworks: an obscure but essential building block of parallel and distributed computation.
GPUDirect Storage in 2025: Optimizing the End-to-End Data Path
2025-09-16 How modern systems move data from NVMe and object storage into GPU kernels with minimal CPU overhead and maximal throughput.
MPI vs. OpenMP in 2025: Where Each Wins
2025-07-04 A practical guide to choosing between message passing and shared-memory parallelism for modern HPC and hybrid nodes.
From MapReduce to Spark: The Arc of Data-Parallel Systems
2025-05-19 MapReduce taught fault-tolerant batch at scale; Spark generalized it with resilient distributed datasets (RDDs) and DAG scheduling.
Auditing the Algorithm: Building a Responsible AI Pipeline That Scales
2025-04-05 How we operationalized responsible AI with automated audits, governance rituals, and transparent reporting.
Scheduling: Trading Latency for Throughput (and Back Again)
2025-02-12 Queue disciplines, work stealing, and CPU affinity: how scheduler choices shape p50/p99, and when to bias for one over the other.
Exactly-Once in Streaming: What It Means and How Systems Achieve It
2025-01-22 Disentangle marketing from mechanisms: idempotence, transactions, and state snapshots behind ‘exactly-once’.
Tuning CUDA with the GPU Memory Hierarchy
2024-11-27 Global, shared, and register memory each have distinct latency and bandwidth. Performance comes from the right access pattern.
Seeing in the Dark: Observability for Edge AI Fleets
2024-08-16 A practitioner's guide to instrumenting, monitoring, and debugging machine learning models running at the edge.
Amdahl’s Law vs. Gustafson’s Law: What They Really Predict
2024-06-15 When does parallelism pay off? Compare Amdahl’s and Gustafson’s models, see where each applies, and learn how to reason about speedups in practice.
Countdown to Quantum: Migrating an Enterprise to Post-Quantum Cryptography
2024-01-29 Practical lessons from a multi-year effort to adopt quantum-safe cryptography without breaking production.
Sealing the Supply Chain: Zero-Trust Build Pipelines That Scale
2023-10-08 An engineer’s map for rebuilding the software supply chain around zero-trust principles without stopping delivery.
Reverse Indexing and Inverted Files: How Search Engines Fly
2023-07-19 Tokenization, postings lists, skip pointers, and WAND: a tour of the data structures that make full‑text search fast.
Keeping the Model Awake: Building a Self-Healing ML Inference Platform
2023-02-14 A field report on taming production machine learning inference with proactive healing, adaptive scaling, and human empathy.
Timeouts, Retries, and Idempotency Keys: A Practical Guide
2022-09-08 Make your distributed calls safe under partial failure. How to budget timeouts, avoid retry storms, and use idempotency keys without shooting yourself in the foot.
Teaching GraphQL to Cache at the Edge
2022-09-03 A deep dive into making GraphQL play nicely with edge caches without breaking declarative APIs.
Instrumenting Without Spying: Privacy-Preserving Telemetry at Scale
2021-05-27 How we rebuilt our telemetry pipeline to respect user privacy without sacrificing insight.
Cache‑Friendly Data Layouts: AoS vs. SoA (and the Hybrid In‑Between)
2021-03-18 How memory layout choices shape the performance of your hot loops. A practical guide to arrays‑of‑structs, struct‑of‑arrays, and hybrid layouts across CPUs and GPUs.
Raft Fast‑Commit and PreVote in Practice
2020-11-09 What fast‑commit and PreVote actually change in Raft, how they affect availability during leader changes, and where the footguns are.
Merkle Trees and Content‑Addressable Storage
2020-08-17 From Git to distributed object stores: how Merkle DAGs enable integrity, deduplication, and efficient sync.
Tuning the Dial: Adaptive Consistency at Planet Scale
2020-03-11 Inside the engineering of databases that adjust consistency on the fly without breaking user trust.