Memory
- Memory Allocation and Garbage Collection: How Programs Manage Memory
· 2025-02-20
A deep dive into how programming languages allocate, track, and reclaim memory. Understand malloc internals, garbage collection algorithms, and the trade-offs that shape runtime performance.
- Tuning CUDA with the GPU Memory Hierarchy
· 2024-11-27
Global, shared, and register memory each have distinct latency and bandwidth characteristics. Performance comes from matching access patterns to each level of the hierarchy.
- Memory Allocators: From malloc to Modern Arena Allocators
· 2023-09-14
A deep dive into memory allocation strategies, from the classic malloc implementations to modern arena allocators, jemalloc, tcmalloc, and custom allocators that power high-performance systems.
- Garbage Collection Algorithms: From Mark-and-Sweep to ZGC
· 2022-11-22
A comprehensive exploration of garbage collection algorithms, from classic mark-and-sweep to modern concurrent collectors like G1, Shenandoah, and ZGC. Learn how automatic memory management works and the trade-offs that shape collector design.
- CPU Caches and Cache Coherence: The Memory Hierarchy That Makes Modern Computing Fast
· 2022-07-12
A comprehensive exploration of how CPU caches bridge the processor-memory speed gap. Learn about cache architecture, replacement policies, coherence protocols, and how to write cache-friendly code for maximum performance.
- Virtual Memory and Page Tables: How Operating Systems Manage Memory
· 2021-08-12
A comprehensive exploration of virtual memory systems, page tables, address translation, and the hardware-software collaboration that enables modern multitasking. Understand TLBs, page faults, and memory protection.
- Cache-Friendly Data Layouts: AoS vs. SoA (and the Hybrid In-Between)
· 2021-03-18
How memory layout choices shape the performance of your hot loops. A practical guide to array-of-structs (AoS), struct-of-arrays (SoA), and hybrid layouts across CPUs and GPUs.
- Speculative Prefetchers: Designing Memory Systems That Read the Future
· 2019-02-14
A field guide to building and validating speculative memory prefetchers that anticipate demand in modern CPUs and data platforms.