Blog
Posts, notes, and articles.

The Performance of Database Caching Strategies: LRU, CLOCK, ARC, and 2Q Under Real-World Workloads
2021-07-06 A comprehensive technical exploration of the performance of database caching strategies (LRU, CLOCK, ARC, and 2Q) under real-world workloads, covering key concepts, practical implementations, and real-world applications.

Columnar Storage: Parquet Encoding, ORC Stripe Format, Apache Arrow In-Memory Columnar Format, Predicate Pushdown, and SIMD Scans
2021-07-03 A deep exploration of columnar data formats: how Parquet and ORC organize data column by column for efficient analytics, Apache Arrow's in-memory representation for zero-copy data interchange, and the vectorized execution that makes modern query engines fast.

CPU Caches and Memory Hierarchy: The Hidden Architecture Behind Performance
2021-06-22 A deep exploration of CPU cache architecture, covering L1 through L3 caches, cache lines, associativity, replacement policies, and cache coherence. Learn how the memory hierarchy shapes modern software performance.

Object Storage: RADOS/Ceph Architecture, the CRUSH Placement Algorithm, S3 API Semantics, and Erasure Coding at Scale
2021-06-21 A deep exploration of object storage: how Ceph's RADOS and CRUSH algorithm enable scalable, self-managing storage clusters, the S3 API's influence on cloud storage, and how erasure coding reduces storage overhead.

Building a Distributed Log-Structured Storage Engine: WiredTiger’s B-Tree and Concurrency Control
2021-06-19 A comprehensive technical exploration of building a distributed log-structured storage engine, focusing on WiredTiger’s B-tree and concurrency control, covering key concepts, practical implementations, and real-world applications.

Distributed File Systems: GFS Design, HDFS Architecture, the Colossus Evolution, and Single-Master Metadata Bottlenecks
2021-06-18 A deep exploration of distributed file systems: how Google's GFS pioneered the single-master model, how HDFS adapted it for the Hadoop ecosystem, and how modern systems have evolved beyond the single-master bottleneck.

Persistent Memory Programming: DAX Mappings, PMDK Libraries, Crash Consistency Without Write-Ahead Logging, and the Optane Legacy
2021-06-14 A deep exploration of persistent memory: how DAX enables direct byte-addressable access to non-volatile memory, how the PMDK libraries solve the crash consistency problem at the instruction level, and the lessons of Intel Optane.

A Deep Dive Into the R-Tree Spatial Index: Guttman’s Algorithm, Node Splitting, and R*-Tree Variants
2021-06-13 A comprehensive technical exploration of the R-tree spatial index, including Guttman’s algorithm, node splitting, and R*-tree variants, covering key concepts, practical implementations, and real-world applications.

The Implementation of a Columnar Storage Format: Parquet Compression, Dictionary Encoding, and Row Groups
2021-06-13 A comprehensive technical exploration of the implementation of a columnar storage format, including Parquet compression, dictionary encoding, and row groups, covering key concepts, practical implementations, and real-world applications.

Designing a Graph Database With Native Storage: Adjacency Lists, Property Graphs, and Traversal Optimization
2021-06-02 A comprehensive technical exploration of designing a graph database with native storage, including adjacency lists, property graphs, and traversal optimization, covering key concepts, practical implementations, and real-world applications.

NVMe and the Storage Stack: The NVMe Command Set, Submission/Completion Queues, SPDK, and the Death of the SCSI/SATA Bottleneck
2021-05-31 A deep exploration of NVMe technology: how the command set and queue model eliminate the SCSI bottleneck, and why user-space storage via SPDK achieves microsecond-latency I/O on commodity flash.

Instrumenting Without Spying: Privacy-Preserving Telemetry at Scale
2021-05-27 How we rebuilt our telemetry pipeline to respect user privacy without sacrificing insight.

A Detailed Analysis of the PageRank Algorithm: Power Iteration, Damping Factor, and Personalization
2021-05-25 A comprehensive technical exploration of the PageRank algorithm, including power iteration, the damping factor, and personalization, covering key concepts, practical implementations, and real-world applications.

Implementing a k-d Tree for Nearest Neighbor Search With Balanced Construction and Bounded Box Test
2021-05-18 A comprehensive technical exploration of implementing a k-d tree for nearest neighbor search with balanced construction and bounded box test, covering key concepts, practical implementations, and real-world applications.

The Theory of Generalization Error in Support Vector Machines: VC Dimension and Maximal Margin Classifiers
2021-05-17 A comprehensive technical exploration of the theory of generalization error in support vector machines, including VC dimension and maximal margin classifiers, covering key concepts, practical implementations, and real-world applications.

User-Space Networking: Snabb Switch, FD.io VPP (Vector Packet Processing), AF_XDP, and the Philosophy of Kernel Bypass
2021-05-14 A deep exploration of user-space networking: how Snabb, VPP, and AF_XDP achieve line-rate packet processing by bypassing the kernel, and the architectural trade-offs of moving the network data plane into user space.

eBPF Internals: The In-Kernel Verifier, Safety Proofs, JIT Compilation to Native Code, Map Types, and XDP/TC Hooks
2021-05-08 A deep exploration of eBPF internals: how the Linux kernel verifier proves safety, the JIT compilers that turn BPF bytecode into native instructions, the map infrastructure that enables stateful processing, and the XDP/TC hooks that make programmable networking possible.

Building a Distributed Machine Learning System: Parameter Server Architecture With Asynchronous Stochastic Gradient Descent
2021-05-05 A comprehensive technical exploration of building a distributed machine learning system using a parameter server architecture with asynchronous stochastic gradient descent, covering key concepts, practical implementations, and real-world applications.

The Performance of Attention Mechanisms in Transformers: Self-Attention vs. Multi-Headed With FlashAttention Optimization
2021-05-01 A comprehensive technical exploration of the performance of attention mechanisms in transformers, comparing self-attention and multi-headed attention with FlashAttention optimization, covering key concepts, practical implementations, and real-world applications.

A Comprehensive Guide to Quantization-Aware Training: Simulated Quantization, Straight-Through Estimator, and Calibration
2021-04-30 A comprehensive technical exploration of quantization-aware training, including simulated quantization, the straight-through estimator, and calibration, covering key concepts, practical implementations, and real-world applications.

Implementing a Neural Network Training Framework With Automatic Differentiation Using Wengert Lists
2021-04-29 A comprehensive technical exploration of implementing a neural network training framework with automatic differentiation using Wengert lists, covering key concepts, practical implementations, and real-world applications.

Deterministic Monorepo CI Platforms: Engineering Consistency at Scale
2021-04-23 A deep guide to building, operating, and evolving reproducible CI/CD systems for large monorepos without sacrificing developer velocity or safety.

System Calls: The Gateway Between User Space and Kernel
2021-04-18 An in-depth exploration of how applications communicate with the operating system kernel through system calls. Learn about the syscall interface, context switching, and how modern OSes balance security with performance.

The Mathematics of Gaussian Processes for Bayesian Optimization: Kernel Selection and Cholesky Factorization
2021-03-26 A comprehensive technical exploration of the mathematics of Gaussian processes for Bayesian optimization, including kernel selection and Cholesky factorization, covering key concepts, practical implementations, and real-world applications.