Band Parallelization Strategies and Optimization of Real Space Projectors in Quantum ESPRESSO (2026 Mar 27).

Abstract

Quantum ESPRESSO is a widely used open-source package for Density Functional Theory (DFT) calculations. For large systems, its performance is strongly affected by parallel efficiency and load imbalance, particularly in band-parallel and linear-response calculations. This thesis investigates performance bottlenecks in the cgsolve_all solver of the PH.x module, which uses a band-parallel preconditioned conjugate gradient method. In the standard implementation, bands are distributed evenly among band groups using a static scheme. However, the computational cost per band is not uniform, so groups that finish early sit idle and scalability suffers. To mitigate this issue, a dynamic band-balancing strategy is proposed. The method measures the runtime imbalance and adaptively redistributes bands between groups to reduce waiting time. Performance tests on CPU and GPU architectures demonstrate improved load balancing and stronger scaling behavior, particularly for heterogeneous systems, where gains of 20–27% in CPU time were observed under band-parallel scaling. More homogeneous systems show smaller but consistent improvements. In addition, this work analyses the role of real-space projectors in the overall computational cost. The section on Optimization of Real Space Projectors identifies key sources of imbalance and communication overhead in projector-related routines. Although a full algorithmic restructuring was beyond the available time frame, the problem, its main bottleneck, and possible optimization strategies are discussed, including improved data locality, better band–projector mapping, and alternative communication patterns. Overall, this thesis demonstrates that dynamic workload redistribution and careful analysis of real-space projector routines are effective approaches to improving scalability in band-parallel electronic-structure calculations.
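The redistribution idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation and does not use any Quantum ESPRESSO routine: the function names `imbalance` and `rebalance_bands`, the `min_bands` parameter, and the proportional-to-throughput rule are all assumptions made for illustration. The sketch also assumes that every band in a group costs the same, which is a simplification of the non-uniform per-band cost the abstract refers to.

```python
def imbalance(group_times):
    """Relative imbalance: slowest group time over the mean time, minus one.

    A value of 0.0 means all groups took equally long (hypothetical metric).
    """
    mean = sum(group_times) / len(group_times)
    return max(group_times) / mean - 1.0


def rebalance_bands(band_counts, group_times, min_bands=1):
    """Return new per-group band counts proportional to measured throughput.

    band_counts : bands currently owned by each band group
    group_times : wall time each group spent on its bands in the last step
    min_bands   : lower bound on bands per group (illustrative assumption)
    The total number of bands is preserved.
    """
    if len(band_counts) != len(group_times):
        raise ValueError("band_counts and group_times must have equal length")
    n_groups = len(band_counts)
    total = sum(band_counts)
    if total < min_bands * n_groups:
        raise ValueError("not enough bands to give each group min_bands")

    # Throughput in bands per second; a group with no timing gets rate 0.
    rates = [n / t if t > 0 else 0.0 for n, t in zip(band_counts, group_times)]
    rate_sum = sum(rates)
    if rate_sum == 0:
        return list(band_counts)  # nothing measured, keep the static layout

    # Reserve min_bands per group, share the rest in proportion to throughput.
    spare = total - min_bands * n_groups
    ideal = [spare * r / rate_sum for r in rates]
    counts = [min_bands + int(x) for x in ideal]

    # Hand out the bands lost to rounding, largest fractional part first.
    leftover = total - sum(counts)
    order = sorted(range(n_groups),
                   key=lambda g: ideal[g] - int(ideal[g]),
                   reverse=True)
    for g in order[:leftover]:
        counts[g] += 1
    return counts


if __name__ == "__main__":
    counts = [8, 8, 8, 8]          # static, even distribution
    times = [4.0, 2.0, 2.0, 1.0]   # made-up timings for illustration
    print(imbalance(times), rebalance_bands(counts, times))
```

In this sketch the slowest group gives up bands to the fastest one while the total stays fixed. A real implementation would also have to weigh the cost of moving band data between groups against the waiting time saved.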
27-mar-2026
Delugas, Pietro Davide
de Gironcoli, Stefano Maria
Files in this record:
thesis_ParedesTorres.pdf (open access; type: thesis; license: not specified; size: 3.85 MB; format: Adobe PDF)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11767/151832