HCW 2025
The thirty-fourth Heterogeneity in Computing Workshop (HCW) was held at the Politecnico di Milano, Milan, Italy, on June 4, 2025. HCW is annually organized in conjunction with the International Parallel and Distributed Processing Symposium (IPDPS).
Heterogeneous computing systems comprise growing numbers of increasingly diverse computing resources that can be local to one another or geographically distributed. The opportunity and need for effectively utilizing heterogeneous computing resources have given rise to the notions of cluster computing, grid computing, and cloud computing. HCW encourages paper submissions from both the research and industry communities presenting novel ideas on theoretical and practical aspects of computing in heterogeneous computing environments.

HCW 2025 Program
Session 1: Introductions and Keynote Presentation (9:15-10:30 am)
Session Chairs: Ali Akoglu (University of Arizona, US) and Kamesh Madduri (Pennsylvania State University, US)
9:15 am
Welcome and Introductions, HCW 2025 Overview
9:30 am
Jeffrey Vetter, Section Head - Advanced Computing Systems Research at Oak Ridge National Laboratory, delivered the HCW 2025 keynote.
Title: Navigating the Post-Exascale Computing Era: GPUs, Analog Computing, and AI
Abstract: DOE has deployed its first three Exascale systems, so now is an appropriate time to think about post-Exascale challenges and opportunities. GPUs were just the beginning of architectural disruption. Focusing on both performance and energy efficiency, we are seeing a wide array of new technologies emerge during this ‘golden age of architectures,’ making the choices of architectures, software, and applications existential. In this talk, I will survey post-Exascale technologies and discuss their implications for both system design and software. As an example, I will delve into analog computing as one alternative to dramatically improve energy efficiency. Meanwhile, the extraordinary disruption offered by AI may provide ways to mitigate these software and algorithmic challenges. Our team is exploring how AI can be used to develop software for new architectures and transform legacy programming languages to contemporary programming systems.
Break (10:30-11 am)
Session 2: Research Papers (11 am-12:30 pm)
Session Chair: DK Panda (The Ohio State University, US)
11 am
Improving energy efficiency of HPC applications using unbalanced GPU power capping
Albert d’Aviau de Piolant (University of Bordeaux, FR), Hayfa Tayeb (University of Bordeaux and University of Strasbourg, FR), Berenger Bramas (University of Strasbourg, FR), Mathieu Faverge (University of Bordeaux, FR), Abdou Guermouche (University of Bordeaux, FR), Amina Guermouche (University of Bordeaux, FR)
11:20 am
Methodology for GPU Frequency Switching Latency Measurement
Daniel Velicka (Technical University of Ostrava, CZ), Ondrej Vysocky (Technical University of Ostrava, CZ), Lubomir Riha (Technical University of Ostrava, CZ)
11:40 am
LM-Offload: Performance Model-Guided Generative Inference of Large Language Models with Parallelism Control
Jianbo Wu (University of California, Merced, US), Jie Ren (William & Mary, US), Shuangyan Yang (University of California, Merced, US), Konstantinos Parasyris (Lawrence Livermore National Laboratory, US), Giorgis Georgakoudis (Lawrence Livermore National Laboratory, US), Ignacio Laguna (Lawrence Livermore National Laboratory, US), Dong Li (University of California, Merced, US)
12 pm
Millions of Matrix-Multiplications: GEMM Variations on Aurora
Colleen Bertoni (Argonne National Laboratory, US), Thomas Applencourt (Argonne National Laboratory, US), Longfei Gao (Argonne National Laboratory, US), Ti Leggett (Argonne National Laboratory, US)
12:10 pm
Leveraging Interaction Between Memory Footprint and Parallelism Degree for efficient GPU Portings
Michael Boichot (Institut Polytechnique de Paris, FR), Adrien Roussel (CEA, FR), Elisabeth Brunet (Institut Polytechnique de Paris, FR), Patrick Carribault (CEA, FR)
12:20 pm
HaaS - A Platform for Password Cracking in Distributed Heterogeneous Systems
Carlos Lima (Polytechnic Institute of Bragança, PT), Rui Alves (Polytechnic Institute of Bragança, PT), José Rufino (Polytechnic Institute of Bragança, PT)
Lunch break (12:30-2 pm)
Session 3: Research Papers (2-3:30 pm)
Session Chair: Kamesh Madduri (Pennsylvania State University, US)
2 pm
Static task mapping for heterogeneous systems based on series-parallel decompositions
Martin Wilhelm (Otto-von-Guericke University, DE), Thilo Pionteck (Otto-von-Guericke University, DE)
2:20 pm
On the Usability and Energy Efficiency of High-Level Synthesis for FPGA-based Network-Attached Accelerators
Steffen Christgau (Zuse Institute Berlin, DE), Dylan Everingham (Zuse Institute Berlin, DE), Max Lubke (University of Potsdam, DE), Marco De Lucia (GFZ Helmholtz Centre for Geosciences, DE), Danny Puhan (PERFACCT GmbH, DE), Niklas Schelten (Fraunhofer Heinrich Hertz Institute, DE), Bettina Schnor (University of Potsdam, DE), Hannes Signer (University of Potsdam, DE), Johannes Spazier (PERFACCT GmbH, DE), Benno Stabernack (Fraunhofer Heinrich Hertz Institute and University of Potsdam, DE), Fritjof Steinert (Fraunhofer Heinrich Hertz Institute and University of Potsdam, DE), Serhii Yahdzhyiev (PERFACCT GmbH, DE)
2:40 pm
Scheduling Strategies for Partially-Replicable Task Chains on Two Types of Resources
Diane Orhan (University of Bordeaux, FR), Yacine Idouar (Sorbonne University, FR), Laércio Lima Pilla (University of Bordeaux, FR), Adrien Cassagne (Sorbonne University, FR), Denis Barthou (Bordeaux INP, FR), Christophe Jego (University of Bordeaux, FR)
3 pm
Heterogeneous Memory Pool Tuning
Filip Vaverka (Technical University of Ostrava, CZ), Ondrej Vysocky (Technical University of Ostrava, CZ), Lubomir Riha (Technical University of Ostrava, CZ)
3:10 pm
On the Singularity of SYCL
Ami Marowka (Parallel Research Labs, IL)
3:20 pm
Proactive Endpoint Congestion Avoidance in UCC
Ferrol Aderholdt (NVIDIA, US), Aamir Shafi (NVIDIA, US), Manjunath Gorentla Venkata (NVIDIA, US)
Break (3:30-3:40 pm)
Session 4: Best Paper Award (3:40-4 pm)
Session Chairs: DK Panda (The Ohio State University, US) and Ali Akoglu (University of Arizona, US)
Best Paper Award
Congratulations to Diane Orhan, Yacine Idouar, Laércio Lima Pilla, Adrien Cassagne, Denis Barthou, and Christophe Jego! Their paper titled Scheduling Strategies for Partially-Replicable Task Chains on Two Types of Resources received the HCW 2025 Best Paper Award!
Author Discussions and Closing Remarks
Based on their outstanding and insightful reviews, we are pleased to acknowledge Abdou Guermouche (University of Bordeaux, FR), Sahil Hassan (University of Arizona, US), Joshua Mack (Praetorian, US), and José Rufino (Polytechnic Institute of Bragança, PT) as recipients of the Top Reviewer Recognition for HCW 2025. Their thoughtful and constructive reviews not only upheld the workshop’s high standards but also provided authors with valuable feedback to improve their work. We truly appreciate their dedication and commitment to HCW.
HCW 2025 Call for Papers
June 4, 2025
Milan, Italy
In conjunction with the 39th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2025)
Sponsored by the IEEE Computer Society
through the Technical Committee on Parallel Processing (TCPP)
Most modern computing systems are heterogeneous, either for organic reasons because components grew independently, as is the case in desktop grids, or by design to leverage the strength of specific hardware, as is the case in accelerated systems. In any case, all computing systems have some form of hardware or software heterogeneity that must be managed, leveraged, understood, and exploited. The Heterogeneity in Computing Workshop (HCW) is a venue to discuss and innovate in all theoretical and practical aspects of heterogeneous computing: design, programmability, efficient utilization, algorithms, modeling, applications, etc. HCW 2025 will be the thirty-fourth annual gathering of this workshop.
Topics
Topics of interest include but are not limited to the following areas:
Heterogeneous multicore systems and architectures: Design, exploration, and experimental analysis of heterogeneous computing systems such as Graphics Processing Units, heterogeneous systems-on-chip, Artificial Intelligence chips, Field Programmable Gate Arrays, big.LITTLE, and application-specific architectures.
Heterogeneous parallel and distributed systems: Design and analysis of computing grids, cloud systems, hybrid clusters, datacenters, geo-distributed computing systems, and supercomputers.
Deep memory hierarchies: Design and analysis of memory hierarchies with SRAM, DRAM, Flash/SSD, and HDD technologies; NUMA architectures; cache coherence strategies; novel memory systems such as phase-change RAM, magnetic (e.g., STT) RAM, 3D XPoint/crossbars, and memristors.
On-chip, off-chip, and heterogeneous network architectures: Network-on-chip (NoC) architectures and protocols for heterogeneous multicore applications; energy, latency, reliability, and security optimizations for NoCs; off-chip (chip-to-chip) network architectures and optimizations; heterogeneous networks (combination of NoC and off-chip) design, evaluation, and optimizations; large-scale parallel and distributed heterogeneous network design, evaluation, and optimizations.
Programming models and tools: Programming paradigms and tools for heterogeneous systems; middleware and runtime systems; performance-abstraction tradeoff; interoperability of heterogeneous software environments; workflows; dataflows.
Resource management and algorithms for heterogeneous systems: Parallel algorithms for solving problems on heterogeneous systems (e.g., multicores, hybrid clusters, grids, or clouds); strategies for scheduling and allocation on heterogeneous 2D and 3D multicore architectures; static and dynamic scheduling and resource management for large-scale and parallel heterogeneous systems.
Modeling, characterization, and optimizations: Performance models and their use in the design of parallel and distributed algorithms for heterogeneous platforms; characterizations and optimizations for improving the time to solve a problem (e.g., throughput, latency, runtime); modeling and optimizing electricity consumption (e.g., power, energy); modeling for failure management (e.g., fault tolerance, recovery, reliability); modeling for security in heterogeneous platforms.
Applications on heterogeneous systems: Case studies; confluence of Big Data systems and heterogeneous systems; data-intensive computing; scientific computing.
This year we wish to expand submissions and presentations in “hot topics” areas; we therefore especially invite submissions in the following four areas:
Heterogeneous Integration of Quantum Computing: Design, exploration, and analysis of architectures and software frameworks enabling heterogeneous integration of classical computing and quantum computing (e.g., heterogeneous quantum computers, error correction, heterogeneous applications that use both classical and quantum logic, benchmarks for heterogeneous quantum computers).
Heterogeneity and Interoperability in Software & Data Systems: Design, exploration, and analysis of architectures and software frameworks for interoperability in software and data systems (e.g., semantic frameworks, interoperability for heterogeneous Internet-of-Things systems, model-driven frameworks).
Heterogeneous Computing for Machine Learning (ML) and Deep Learning (DL): Design, exploration, benchmarking, and analysis of accelerators and software frameworks for ML and DL applications on heterogeneous computing systems.
Closing the loop on the design of heterogeneous compilers, runtimes, and hardware: As the needs of heterogeneous hardware apply pressure on runtime designers to adjust for the complexities of heterogeneous resource management, runtimes are now applying pressure back towards compiler designers to include all relevant information – such as data flow and dependency analysis or hardware-specific representations of application tasks – in their binaries to enable resource management policies to arbitrate effectively. Advancements in machine understanding of code are critical in enabling progress here with a holistic view of compilers, runtimes and heterogeneous hardware.
Important Dates
- Paper submission: February 9, 2025
- Author notification: February 24, 2025
- Camera-ready submission: March 6, 2025
Paper Submissions
Manuscripts submitted to HCW 2025 should not have been previously published or be under review for a different workshop, conference, or journal.
Submissions must use the latest IEEE manuscript templates for conference proceedings. Submissions may not exceed a total of ten single-spaced double-column pages using 10-point size font on 8.5x11 inch pages. The page limit includes figures, tables, and references. A single-blind review process will be followed.
Files should be submitted by following the instructions at the IPDPS 2025 submission site.
New this year, we plan to recognize an outstanding HCW 2025 publication with a Best Paper Award. The Best Paper Award will be determined by taking into account the recommendations provided by the Technical Program Committee, along with detailed evaluations of the paper’s originality, significance, and overall quality.
Workshop Organization
General Co-Chairs: DK Panda and Hari Subramoni, The Ohio State University, USA
Technical Program Committee Chair: Ali Akoglu, University of Arizona, USA
Questions may be sent to the HCW 2025 General Co-Chairs (DK Panda: panda.1 at osu dot edu, Hari Subramoni: subramoni.1 at osu dot edu) or the Technical Program Committee Chair (Ali Akoglu: akoglu at arizona dot edu).
Technical Program Committee
Shashank Adavally, Micron Technology, USA
Mohsen Amini Salehi, University of North Texas, USA
Mehmet Belviranli, Colorado School of Mines, USA
Gonzalo Brito Gadeschi, NVIDIA Corporation, Germany
Nick Brown, University of Edinburgh, Scotland
Daniel Cordeiro, University of São Paulo, Brazil
Matthias Diener, University of Illinois, USA
Murali Emani, Argonne National Laboratory, USA
Jiří Filipovič, Masaryk University, Czech Republic
Abdou Guermouche, University of Bordeaux, France
Yanfei Guo, Argonne National Laboratory, USA
Diana Göhringer, Technische Universität Dresden, Germany
Sahil Hassan, University of Arizona, USA
Emmanuel Jeannot, INRIA, France
Krishna Kavi, University of North Texas, USA
Georgios Keramidas, Aristotle University, Greece
Joongheon Kim, Korea University, Korea
Joanna Kolodziej, Cracow University of Technology, Poland
Alexey Lastovetsky, University College Dublin, Ireland
Seyong Lee, Oak Ridge National Laboratory, USA
Laércio Lima Pilla, CNRS, France
Hatem Ltaief, King Abdullah University of Science and Technology, Saudi Arabia
Joshua Mack, Praetorian, USA
Joseph Manzano, Pacific Northwest National Laboratory, USA
Matthias Mueller, Aachen University, Germany
Sridhar Radhakrishnan, University of Oklahoma, USA
José Rufino, Polytechnic Institute of Bragança, Portugal
Marco Domenico Santambrogio, Politecnico di Milano, Italy
Aamir Shafi, The Ohio State University, USA
Sameer Shende, University of Oregon; ParaTools, Inc., USA
Achim Streit, Karlsruhe Institute of Technology, Germany
Shubbhi Taneja, Worcester Polytechnic Institute, USA
Samuel Thibault, University of Bordeaux, France
Claire Vishik, Intel, USA
Logan Ward, Argonne National Laboratory, USA
Steering Committee
Kamesh Madduri, Pennsylvania State University, USA (Chair)
Behrooz Shirazi, National Science Foundation, USA (Immediate Past Chair)
H. J. Siegel, Colorado State University, USA (Past Chair)
John Antonio, University of Oklahoma, USA
David Bader, New Jersey Institute of Technology, USA
Anne Benoit, École Normale Supérieure de Lyon, France
Jack Dongarra, University of Tennessee, USA
Alexey Lastovetsky, University College Dublin, Ireland
Sudeep Pasricha, Colorado State University, USA
Viktor K. Prasanna, University of Southern California, USA
Yves Robert, École Normale Supérieure de Lyon, France
Erik Saule, University of North Carolina at Charlotte, USA
Uwe Schwiegelshohn, TU Dortmund University, Germany
Sponsors
IEEE IPDPS 2025 is sponsored by the IEEE Computer Society, through the Technical Committee on Parallel Processing (TCPP), and is held in cooperation with the IEEE Computer Society Technical Committees on Computer Architecture (TCCA) and Distributed Processing (TCDP).
HCW 2025 is sponsored by the U.S. Office of Naval Research and IEEE IPDPS 2025.
