Engineering Insights
Sharing insights, lessons, and experiences from software engineering, distributed systems, and AI. These articles are shaped by the systems I've built, the challenges I've faced, and the research I've conducted.
Lessons Learned Building Distributed Systems for 600TB+ Analytical Workloads
Practical lessons from building and operating 600TB+ analytics pipelines: the failures, the trade-offs, and what I would design differently now.
distributed-systems
data-infrastructure
dask
Operating Kubernetes Workloads Across 300+ Production Nodes: What Nobody Tells You
The operational realities I faced running Kubernetes across 300+ nodes: rollout risk, scheduling pressure, workload isolation, and recovery discipline.
kubernetes
platform-engineering
sre
Why Observability Becomes More Important Than Code at Scale
Why I now treat observability as a core engineering capability: faster diagnosis, better reliability, and much higher confidence during incidents.
observability
prometheus
grafana