Publications
Underlined names indicate students/researchers advised/co-advised by me.
2026
- ASPLOS 2026. MoE-APEX: An Efficient MoE Inference System with Adaptive Precision Expert Offloading. In International Conference on Architectural Support for Programming Languages and Operating Systems 2026
- AAAI 2026. AdaReason: Progressive Training of Multi-LoRA Adapters for Budget-Adaptive Language Reasoning Models. In AAAI Conference on Artificial Intelligence 2026
- AAAI 2026. DesireKV: Decoupling Sensitivity and Importance for Reasoning-Aware KV Cache Compression. In AAAI Conference on Artificial Intelligence 2026
2025
- USENIX ATC 2025. SpaceExit: Enabling Efficient Adaptive Computing in Space with Early Exits. In USENIX Annual Technical Conference 2025
- TMC 2025. BAT: A Versatile Bipartite Attention-based Approach for Comprehensive Truth Inference in Mobile Crowdsourcing. In IEEE Transactions on Mobile Computing 2025
- ASPLOS 2025. EXIST: Enabling Extremely Efficient Intra-Service Tracing Observability in Datacenters. In International Conference on Architectural Support for Programming Languages and Operating Systems 2025
- JPDC 2025. MMBypass: Towards Efficient Multi-modal AI Computing with Adaptive Bypass Network. In Journal of Parallel and Distributed Computing 2025
- FCS 2025. FLAPS: fluctuation-aware power auction strategy for reducing the power overload probability. In Frontiers of Computer Science 2025
- TACO 2025. Enhancing High-Throughput GPU Random Walks Through Multi-Task Concurrency Orchestration. In ACM Transactions on Architecture and Code Optimization 2025
2024
- TACO 2024. Potamoi: Accelerating Neural Rendering via a Unified Streaming Architecture. In ACM Transactions on Architecture and Code Optimization 2024
- ICCD 2024 (Best Paper Nominee). AutoVCoder: A Systematic Framework for Automated Verilog Code Generation using LLMs. In IEEE International Conference on Computer Design 2024
- ICCD 2024 (Best Paper Nominee). Continuous Energy Efficiency Optimization for Autonomous Embedded Systems Using Shadow Cycles. In IEEE International Conference on Computer Design 2024
- TC 2024. Improving Efficiency in Multi-modal Autonomous Embedded Systems through Adaptive Gating. In IEEE Transactions on Computers 2024
- RTSS 2024. Jigsaw: Taming BEV-centric Perception on Dual-SoC for Autonomous Driving. In IEEE Real-Time Systems Symposium 2024
- TACO 2024. A2: Towards Accelerator Level Parallelism for Autonomous Micromobility Systems. In ACM Transactions on Architecture and Code Optimization 2024
- SC 2024. Boosting Data Center Performance via Intelligently Managed Multi-backend Disaggregated Memory. In International Conference for High Performance Computing, Networking, Storage, and Analysis 2024
- TPDS 2024. WASP: Efficient Power Management Enabling Workload-Aware, Self-Powered AIoT Devices. In IEEE Transactions on Parallel and Distributed Systems 2024
- IWQoS 2024 (Best Paper Nominee). CPM: A Cross-layer Power Management Facility to Enable Highly-efficient Real-time AIoT Systems. In IEEE/ACM International Symposium on Quality of Service 2024
- VLDB 2024. FlowWalker: A Memory-efficient and High-performance GPU-based Dynamic Graph Random Walk Framework. In Proceedings of the 50th International Conference on Very Large Data Bases 2024
- ISCA 2024 (Best Paper Nominee). A Tale of Two Domains: Exploring Efficient Architecture Design for Truly Autonomous Things. In Proceedings of the 51st Annual International Symposium on Computer Architecture 2024
- ICME 2024. M2SN: Adaptive and Dynamic Multi-modal Shortcut Network Architecture for Latency-aware Applications. In IEEE International Conference on Multimedia and Expo 2024
- IPDPS 2024. CoCG: Fine-grained Cloud Game Co-location on Heterogeneous Platform. In Proceedings of the 38th IEEE International Parallel and Distributed Processing Symposium 2024
- ICDE 2023. Graph Contrastive Learning for Truth Inference. In IEEE International Conference on Data Engineering 2024
2023
- SCIS 2023. Power Synchronization: Taming Massive Diversified Serverless Functions under Power Constraints. In Science China Information Sciences 2023
- RTSS 2023. SMG: A System-level Modality Gating Facility for Fast and Energy-Efficient Multimodal Computing. In IEEE Real-Time Systems Symposium 2023
- JSAC 2023. Practical Network Modeling Using Weak Supervision Signals for Human-Centric Networking in Metaverse. IEEE Journal on Selected Areas in Communications, 2023
- SoCC 2023. Not All Resources are Visible: Exploiting Fragmented Shadow Resources in Shared-State Scheduler Architecture. ACM Symposium on Cloud Computing, 2023
- ECAI 2023. Label Aggregation with Self-Supervision Enhanced Graph Transformer. European Conference on Artificial Intelligence, 2023
- IISWC 2022. MMBench: Benchmarking End-to-End Multi-modal DNNs and Understanding Their Hardware-Software Implications. IEEE International Symposium on Workload Characterization, 2023
- Euro-Par 2023 (Best Paper Nominee). MMExit: Enabling Fast and Efficient Multi-modal DNN Inference with Adaptive Network Exits. In European Conference on Parallel Processing, 2023
- ISCA 2023. Architecting Efficient Multi-modal AIoT Systems. In Proceedings of the 50th Annual International Symposium on Computer Architecture 2023
- TC 2023. Optimizing GPU-based Graph Sampling and Random Walk for Efficiency and Scalability. IEEE Transactions on Computers, 2023
- PPoPP 2023. CoWalker: High-Throughput GPU Random Walk with Fine-tuned Concurrent Query Processing (Poster). In ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2023
2022
- IEEE CAL 2022. Characterizing and Understanding End-to-End Multi-modal Neural Networks on GPUs. IEEE Computer Architecture Letters, 2022
- IPDPS 2022. Enabling Efficient Request Management through Microservice Level Parallelism. In IEEE International Parallel and Distributed Processing Symposium, 2022
- FCS 2022. Performance Optimization for Cloud Computing Systems in the Microservice Era: State-of-the-Art and Research Opportunities. Frontiers of Computer Science 2022
2021
- TC 2021. Tapping into NFV Environment for Opportunistic Serverless Edge Function Deployment. IEEE Transactions on Computers, 2021
- ArXiv
- IPDPS 2021. AlphaR: Learning-Powered Resource Management for Irregular, Dynamic Microservice Graph. In IEEE International Parallel and Distributed Processing Symposium (IPDPS) 2021
2020
- SC 2020. ANT-man: Towards Agile Power Management in the Microservice Era. In International Conference for High Performance Computing, Networking, Storage and Analysis, 2020
- AAAI 2020. Fine-Grained Machine Teaching with Attention Modeling. In AAAI Conference on Artificial Intelligence, 2020
- TCC 2020. Integrated Power Anomaly Defense: Towards Oversubscription-Safe Data Centers. IEEE Transactions on Cloud Computing 2020
2019
- ICPP 2019. When Power Oversubscription Meets Traffic Flood Attack: Re-Thinking Data Center Peak Load Management. In International Conference on Parallel Processing, 2019
- ICPP 2019. Unleashing the Scalability Potential of Power-Constrained Data Center in the Microservice Era. In International Conference on Parallel Processing, 2019
2018
- ICCD 2018 (Best Paper Award). Power Grab in Aggressively Provisioned Data Centers: What is the Risk and What Can Be Done About It. In International Conference on Computer Design, 2018
2016
- ISCA 2016. Power Attack Defense: Securing Battery-Backed Data Centers. International Symposium on Computer Architecture, 2016
- JCRD 2016. Green Hierarchical Management for Distributed Datacenter Containers. Journal of Computer Research and Development, 2016