Announcement_25
Paper "MoE-APEX: An Efficient MoE Inference System with Adaptive Precision Expert Offloading" accepted to ASPLOS 2026 (the ACM International Conference on Architectural Support for Programming Languages and Operating Systems)! Preprint coming soon.