ABSTRACT
We present an efficient and performance-portable implementation of the Simple Cloud-Resolving E3SM Atmosphere Model (SCREAM). SCREAM is a full-featured atmospheric global circulation model with a nonhydrostatic dynamical core and state-of-the-art parameterizations for microphysics, moist turbulence, and radiation. It has been written from scratch in C++, using the Kokkos library to abstract the on-node execution model for both CPUs and GPUs. SCREAM is one of only a few global atmosphere models to be ported to GPUs. To our knowledge, SCREAM is the first such model to run on both AMD and NVIDIA GPUs, and the first to run on nearly an entire exascale system (Frontier). On Frontier, we obtained a record-setting performance of 1.26 simulated years per day for a realistic cloud-resolving simulation.