Apache-Arrow
- Columnar Storage: Parquet Encoding, ORC Stripe Format, Apache Arrow In-Memory Columnar Format, Predicate Pushdown, and SIMD Scans
· 2021-07-03
A deep dive into columnar data formats: how Parquet and ORC lay out data column by column for efficient analytics, how Apache Arrow's in-memory representation enables zero-copy data interchange, and how predicate pushdown and vectorized SIMD scans make modern query engines fast.