
FAMOUS, Faster: Using Parallel Computing Techniques to Accelerate the FAMOUS/HadCM3 Climate Model with a Focus on the Radiative Transfer Algorithm: Volume 4, Issue 3 (27/09/2011)

By Hanappe, P.


Book Id: WPLBN0003987645
Format Type: PDF Article
File Size: 10 pages
Reproduction Date: 2015

Title: FAMOUS, Faster: Using Parallel Computing Techniques to Accelerate the FAMOUS/HadCM3 Climate Model with a Focus on the Radiative Transfer Algorithm: Volume 4, Issue 3 (27/09/2011)
Author: Hanappe, P.
Volume: Vol. 4, Issue 3
Language: English
Subject: Science, Geoscientific, Model
Collections: Periodicals: Journal and Magazine Collection (Contemporary), Copernicus GmbH
Publication Date: 27/09/2011
Publisher: Copernicus GmbH, Göttingen, Germany
Member Page: Copernicus Publications



Beurivé, A., Boucher, O., Steels, L., Laguzet, F., Hanappe, P., Bellouin, N.,...Aina, T. (2011). FAMOUS, Faster: Using Parallel Computing Techniques to Accelerate the FAMOUS/HadCM3 Climate Model with a Focus on the Radiative Transfer Algorithm: Volume 4, Issue 3 (27/09/2011). Retrieved from

Description: Sony Computer Science Laboratory, Paris, France.

We have optimised the atmospheric radiation algorithm of the FAMOUS climate model on several hardware platforms. The optimisation involved translating the Fortran code to C and restructuring the algorithm around the computation of a single air column. Instead of the existing MPI-based domain decomposition, we used a task queue and a thread pool to schedule the computation of individual columns on the available processors. Finally, four air columns are packed together in a single data structure and computed simultaneously using Single Instruction Multiple Data operations.
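The scheduling idea in the description (one task per air column, pulled from a shared queue by a fixed pool of worker threads) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' C implementation: `compute_column` is a hypothetical stand-in for the per-column radiation computation, and the names `run_columns` and `n_workers` are invented for the example. In CPython the global interpreter lock means this shows the scheduling pattern rather than any speed-up.

```python
import queue
import threading


def compute_column(column):
    """Hypothetical stand-in for the per-column radiation computation."""
    # Toy arithmetic only: accumulate a flux-like value down the layers.
    flux = 0.0
    for layer_value in column:
        flux = 0.5 * flux + layer_value
    return flux


def run_columns(columns, n_workers=4):
    """Schedule one task per air column on a fixed pool of worker threads."""
    tasks = queue.Queue()
    results = [None] * len(columns)

    def worker():
        while True:
            index = tasks.get()
            if index is None:  # sentinel: no more work for this worker
                return
            # Each task writes to its own slot, so no lock is needed here.
            results[index] = compute_column(columns[index])

    # Fill the queue with column indices, then one sentinel per worker.
    for index in range(len(columns)):
        tasks.put(index)
    for _ in range(n_workers):
        tasks.put(None)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Because every column is computed independently and stored at its own index, the result does not depend on which worker picks up which task, so the parallel run can be compared value for value against a plain sequential loop.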

The modified algorithm runs more than 50 times faster on the CELL's Synergistic Processing Element than on its main PowerPC processing element. On Intel-compatible processors, the new radiation code runs 4 times faster. On the tested graphics processor, using OpenCL, we find a speed-up of more than 2.5 times as compared to the original code on the main CPU. Because the radiation code takes more than 60 % of the total CPU time, FAMOUS executes more than twice as fast. Our version of the algorithm returns bit-wise identical results, which demonstrates the robustness of our approach. We estimate that this project required around two and a half man-years of work.
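The link between the kernel speed-up and the whole-model speed-up follows Amdahl's law. The sketch below works through that arithmetic under the assumption that exactly 60 % of the run time is radiation code; the abstract only gives "more than 60 %", so this is a lower bound, and the function names are invented for the example.

```python
def overall_speedup(fraction, kernel_speedup):
    """Amdahl's law: whole-program speed-up when `fraction` of the run time
    is accelerated by `kernel_speedup` and the remainder is unchanged."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / kernel_speedup)


def speedup_limit(fraction):
    """Upper bound on the whole-program speed-up as the accelerated
    part becomes arbitrarily fast."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    if fraction == 1.0:
        return float("inf")
    return 1.0 / (1.0 - fraction)
```

With a radiation fraction of 0.6, a 4-fold kernel speed-up gives an overall factor of about 1.8, a 50-fold kernel speed-up gives about 2.4, and the limit for an infinitely fast kernel is 2.5. A larger radiation fraction raises all three figures. The abstract does not state which platform the "more than twice as fast" figure refers to.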




