2026 | Lucas Sangdae Nam

Mar 09, 2026	Cross-Instance KV Cache Sharing for Disaggregated LLM Serving: Cutting TTFT with Mooncake and LMCache
Mar 04, 2026	NIXL for KV Cache in Disaggregated Serving
Feb 28, 2026	CUDA Graph in vLLM: Eliminating CPU Overhead in LLM Inference
Feb 22, 2026	Multi-Node P/D Disagg vLLM Serving: How EFA Works Compared to InfiniBand?
Jan 30, 2026	MoE Expert FFN Backend: experts_implementation