Xiaofeng Hou

I am an assistant professor in the Department of Computer Science and Engineering at Shanghai Jiao Tong University (SJTU). At SJTU, I’m a member of the Emerging Parallel Computing Center (EPCC) and the Sustainable Architecture and Intelligence Laboratory (SAIL). Prior to joining SJTU, I worked with Prof. Kwang-Ting CHENG at the AI Chip Center for Emerging Smart Systems (ACCESS), Hong Kong University of Science and Technology (HKUST). I earned my PhD from Shanghai Jiao Tong University under the joint supervision of Prof. Chao Li and Prof. Minyi Guo, and my BS from Dalian University of Technology.

My research addresses the critical computing challenges in the era of AI. I specialize in architecture-system co-design for highly efficient intelligent computing. My work ranges from autonomous edge devices to hyperscale datacenters, with a focus on:

Efficient Multi-Modal LLM Serving: Developing novel hardware and software solutions to reduce the latency and energy costs of large model inference.

Automated Architecture/System Designs: Using automated methods to discover and implement optimal computer architectures and systems for emerging applications.

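[Openings] PhD students, master's students, and undergraduate interns: join us in building the next generation of efficient intelligent computing systems

My research group focuses on high-efficiency, sustainable AI computing, using hardware-software co-design to tackle the compute and energy challenges of the large-model era. The group recruits PhD students, master's students, and undergraduate interns on an ongoing basis. Current research topics include:

  1. Multi-modal model inference acceleration: making text-to-image/video models run faster and with less energy;
  2. System optimization for sparse Mixture-of-Experts (MoE) large models: breaking through the GPU memory wall to efficiently serve sparse models with trillions of parameters;
  3. Edge and on-device inference acceleration: enabling low-power local AI in edge scenarios such as automobiles and satellites;
  4. Microservice-based large-model systems: building flexible, highly available distributed AI serving systems.

Interested students are welcome to email me (hou-xf at cs.sjtu.edu.cn).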

News

Nov 9, 2025 Two papers are accepted by AAAI 2026 (The 40th Annual AAAI Conference on Artificial Intelligence)! Preprint coming soon.
Oct 27, 2025 Paper MoE-APEX: An Efficient MoE Inference System with Adaptive Precision Expert Offloading accepted to ASPLOS 2026 (The ACM International Conference on Architectural Support for Programming Languages and Operating Systems)! Preprint coming soon.
Apr 27, 2025 Paper SpaceExit: Enabling Efficient Adaptive Computing in Space with Early Exits accepted to USENIX ATC 2025 (The 2025 USENIX Annual Technical Conference)! Preprint coming soon.
Jan 29, 2025 Paper EXIST: Enabling Extremely Efficient Intra-Service Tracing Observability in Datacenters accepted to ASPLOS 2025 (The 2025 International Conference on Architectural Support for Programming Languages and Operating Systems)! Preprint coming soon.
Oct 21, 2024 Two papers were selected as Best Paper Nominees at ICCD 2024 (International Conference on Computer Design)! Preprint coming soon.
Aug 21, 2024 TACO and RTSS have accepted our work on system optimizations for heterogeneous autonomous driving platforms! Preprint coming soon.
Jul 11, 2024 Paper A Tale of Two Domains: Exploring Efficient Architecture Design for Truly Autonomous Things was selected for the Best Paper Session in ISCA 2024 (International Symposium on Computer Architecture)! Preprint coming soon.
Jun 22, 2024 Paper CPM: A Cross-layer Power Management Facility to Enable Highly-efficient Real-time AIoT Systems received the Best Paper Honorable Mention Award at IWQoS 2024 (IEEE/ACM International Symposium on Quality of Service)! Preprint coming soon.
Mar 23, 2024 We will host the First International Workshop on Acceleration and Optimization of Multi-modal Computing (AOMC 2024), co-located with ISCA 2024 in Buenos Aires, Argentina, June 2024.
Mar 21, 2024 Paper A Tale of Two Domains: Exploring Efficient Architecture Design for Truly Autonomous Things accepted to ISCA (The 51st International Symposium on Computer Architecture)! Preprint coming soon.
Sep 29, 2023 Paper SMG: A System-level Modality Gating Facility for Fast and Energy-Efficient Multimodal Computing accepted to RTSS (IEEE Real-Time Systems Symposium)! Preprint coming soon.
Sep 11, 2023 The project code and tutorials for MMBench are now available!
Aug 22, 2023 Paper MMBench: Benchmarking End-to-End Multi-modal DNNs and Understanding Their Hardware-Software Implications accepted to IISWC (IEEE International Symposium on Workload Characterization), the top conference on workload characterization.
May 4, 2023 Paper MMExit: Enabling Fast and Efficient Multi-modal DNN Inference with Adaptive Network Exits selected as one of the Best Paper Nominees (4/164)!
May 1, 2023 Paper MMExit: Enabling Fast and Efficient Multi-modal DNN Inference with Adaptive Network Exits accepted to Euro-Par (The International European Conference on Parallel and Distributed Computing)! Preprint coming soon.
Mar 11, 2023 Paper Architecting Efficient Multi-modal AIoT Systems accepted to ISCA (The 50th International Symposium on Computer Architecture)! Preprint coming soon.
Oct 19, 2022 Paper Characterizing and Understanding End-to-End Multi-modal Neural Networks on GPUs accepted to IEEE CAL (The IEEE Computer Architecture Letters)! Preprint coming soon.
Mar 12, 2022 Paper Enabling Efficient Request Management through Microservice Level Parallelism accepted to IPDPS (The IEEE International Parallel and Distributed Processing Symposium)! Preprint coming soon.
Nov 26, 2021 DataCLUE was covered by AINLP!
Nov 18, 2021 Preprint of DataCLUE, a benchmark suite for data-centric NLP, is now available on arXiv! Project code coming soon!
Jan 19, 2021 I have joined ACCESS as a Post-doctoral Fellow, working with Prof. CHENG!
Aug 25, 2020 I passed my PhD defense! Preprint of the dissertation coming soon.
Aug 21, 2020 Paper ANT-Man: Towards Agile Power Management in the Microservice Era accepted to SC (The International Conference for High Performance Computing, Networking, Storage, and Analysis)! Preprint coming soon.
Oct 2, 2019 I am a Visiting Scholar (Junior Specialist) at the University of California, Riverside, advised by Prof. Shaolei Ren.
Aug 9, 2018 Paper Power Grab in Aggressively Provisioned Data Centers: What is the Risk and What Can Be Done About It received the Best Paper Award (4/264) in ICCD (International Conference on Computer Design)! Preprint coming soon.

Selected Publications

  1. ASPLOS 2026
    MoE-APEX: An Efficient MoE Inference System with Adaptive Precision Expert Offloading
    Peng Tang, Jiacheng Liu, Xiaofeng Hou, Yifei Pu, Jing Wang, Pheng-Ann Heng, Chao Li, and Guo.
    In International Conference on Architectural Support for Programming Languages and Operating Systems 2026
  2. USENIX ATC 2025
    SpaceExit: Enabling Efficient Adaptive Computing in Space with Early Exits
    Jiacheng Liu, Xiaozhi Zhu, Tongqiao Xu, Xiaofeng Hou, and Chao Li.
    In USENIX Annual Technical Conference 2025
  3. ASPLOS 2025
    EXIST: Enabling Extremely Efficient Intra-Service Tracing Observability in Datacenters
    Xinkai Wang, Xiaofeng Hou, Chao Li, Yuancheng Li, Du Liu, Guoyao Xu, Guodong Yang, Liping Zhang, Yuemin Wu, Xiaopeng Yuan, Quan Chen, and Minyi Guo.
    In International Conference on Architectural Support for Programming Languages and Operating Systems 2025
  4. ICCD 2024 Best Paper Nominee
    AutoVCoder: A Systematic Framework for Automated Verilog Code Generation using LLMs
    Mingzhe Gao, Jieru Zhao, Zhe Lin, Wenchao Ding, Xiaofeng Hou, Yu Feng, Chao Li, and Minyi Guo
    In IEEE International Conference on Computer Design 2024
  5. ICCD 2024 Best Paper Nominee
    Continuous Energy Efficiency Optimization for Autonomous Embedded Systems Using Shadow Cycles
    Xinkai Wang, Chao Li, Lingyu Sun, Qizheng Lv, Xiaofeng Hou, Jingwen Leng, and Minyi Guo
    In IEEE International Conference on Computer Design 2024
  6. TC 2024
    Improving Efficiency in Multi-modal Autonomous Embedded Systems through Adaptive Gating
    Xiaofeng Hou, Cheng Xu, Chao Li, Jiacheng Liu, Xuehan Tang, Kwang-ting Cheng, and Minyi Guo
    In IEEE Transactions on Computers 2024
  7. RTSS 2024
    Jigsaw: Taming BEV-centric Perception on Dual-SoC for Autonomous Driving
    Lingyu Sun, Chao Li, Xiaofeng Hou, Tianhao Huang, Cheng Xu, Xinkai Wang, and Guangjun Bao
    In IEEE Real-Time Systems Symposium 2024
  8. TACO 2024
    A2: Towards Accelerator Level Parallelism for Autonomous Micromobility Systems
    Lingyu Sun, Xiaofeng Hou, Chao Li, Jiacheng Liu, Xinkai Wang, Quan Chen, and Minyi Guo
    In ACM Transactions on Architecture and Code Optimization 2024
  9. TPDS 2024
    WASP: Efficient Power Management Enabling Workload-Aware, Self-Powered AIoT Devices
    Xiaofeng Hou, Xuehan Tang, Jiacheng Liu, Chao Li, Luhong Liang, and Kwang-ting Cheng
    In IEEE Transactions on Parallel and Distributed Systems 2024
  10. IWQoS 2024 Best Paper Nominee
    CPM: A Cross-layer Power Management Facility to Enable Highly-efficient Real-time AIoT Systems
    Xiaofeng Hou, Peng Tang, Tongqiao Xu, Cheng Xu, Chao Li, and Minyi Guo
    In IEEE/ACM International Symposium on Quality of Service 2024
  11. ISCA 2024 Best Paper Nominee
    A Tale of Two Domains: Exploring Efficient Architecture Design for Truly Autonomous Things
    Xiaofeng Hou*, Tongqiao Xu*, Chao Li, Cheng Xu, Jiacheng Liu, Yang Hu, Jieru Zhao, Jingwen Leng, Kwang-ting Cheng, and Minyi Guo
    In Proceedings of the 51st Annual International Symposium on Computer Architecture 2024
  12. RTSS 2023
    SMG: A System-level Modality Gating Facility for Fast and Energy-Efficient Multimodal Computing
    Xiaofeng Hou, Peng Tang, Chao Li, Jiacheng Liu, Cheng Xu, Kwang-Ting Cheng, and Minyi Guo
    In IEEE Real-Time Systems Symposium 2023
  13. Euro-Par 2023 Best Paper Nominee
    MMExit: Enabling Fast and Efficient Multi-modal DNN Inference with Adaptive Network Exits
    Xiaofeng Hou, Jiacheng Liu, Xuehan Tang, Chao Li, Kwang-Ting Cheng, Li Li, and Minyi Guo
    In European Conference on Parallel Processing, 2023
  14. ISCA 2023
    Architecting Efficient Multi-modal AIoT Systems
    Xiaofeng Hou, Jiacheng Liu, Xuehan Tang, Chao Li, Jia Chen, Luhong Liang, Kwang-Ting Cheng, and Minyi Guo
    In Proceedings of the 50th Annual International Symposium on Computer Architecture 2023
  15. SC 2020
    ANT-Man: Towards Agile Power Management in the Microservice Era
    Xiaofeng Hou, Chao Li, Jiacheng Liu, Lu Zhang, Yang Hu, and Minyi Guo
    In International Conference for High Performance Computing, Networking, Storage and Analysis, 2020
  16. ICCD 2018 Best Paper Award
    Power Grab in Aggressively Provisioned Data Centers: What is the Risk and What Can Be Done About It
    Xiaofeng Hou, Luoyao Hao, Chao Li, Quan Chen, Wenli Zheng, and Minyi Guo
    In International Conference on Computer Design, 2018
  17. ISCA 2016
    Power Attack Defense: Securing Battery-Backed Data Centers
    Chao Li, Zhenhua Wang*, Xiaofeng Hou*, Haopeng Chen, Xiaoyao Liang, and Minyi Guo
    In International Symposium on Computer Architecture, 2016