Writing

Notes on systems & speed

Deep dives into Rust experiments, Apache DataFusion internals, and the mechanics behind real speedups — with benchmarks and flamegraphs, not vibes.

{ } DataFusion

Jun 20, 2026

Inside a DataFusion Parquet Scan: Skipping Page Index I/O When Statistics Already Decide

How a Parquet scan reads a file end to end — footer, row groups, page index, bloom filters — and why PR #22857 stops loading page index metadata when row-group statistics already prove the filter.
Read →
⌗ Trino

May 29, 2026 External ↗

How a Deadlock Froze Blinkit's Supply Chain

A silent deadlock in our query engine stalled inventory replenishment with no error, no crash — just infinite waiting. How we traced it to a shared thread pool in Trino's Hudi connector and fixed it upstream.
Read on Lambda by Blinkit →
{ } DataFusion

Apr 19, 2026

Zero-Copy Strings in Apache DataFusion: How StringViewArray Boosted Performance by 8%

How StringViewArray cut the copy tax on string operations and lifted ClickBench performance by 8%.
Read →
~ Async

Mar 10, 2026

Async Runtimes vs Threads in Rust: Which Is Better, and When?

Tokio wins on tiny and waiting-heavy workloads; threads catch up on pure CPU. A measured guide to when each model fits.
Read →
⇄ Concurrency

Mar 04, 2026

Atomics vs Mutex in Rust: Why Mutex Won Under Heavy Contention

Why a mutex beat atomics under heavy contention — with flamegraphs and a counterintuitive takeaway.
Read →
⧉ CPU

Feb 28, 2026

The Hidden Performance Killer: How 56 Bytes of Padding Made My Rust Code 4.6x Faster

How 56 bytes of padding turned a 749ms benchmark into 163ms — the hidden cost of cache-line false sharing.
Read →
