High-Order-Preserving Acceleration of a Shallow-Water Dynamical Core Using Tensor Units
ID: 291
Updated: 2026-03-27 17:16:41
Oral presentation
Abstract
Recent advances in low-precision computing units such as Tensor Cores offer enormous performance opportunities for scientific computing beyond machine learning workloads. High-order numerical methods feature rapid convergence and enhanced approximation capability, but applying low precision directly often degrades numerical accuracy and prevents these methods from achieving their expected order of convergence. This work addresses the challenge in HOPE, a high-order shallow-water dynamical core whose stencil computations are reformulated as convolutions. It applies the Ozaki scheme on FP16 Tensor Cores to restore an effective working precision that is tunable up to FP64, and combines it with an efficient memory layout to optimize Tensor Core utilization. The proposed approach achieves an end-to-end speedup of 1.82× on average and up to 4.34× while maintaining the required accuracy across numerical algorithms of different orders and at various resolutions, demonstrating the viability of leveraging low-precision hardware even in accuracy-sensitive high-order PDE solvers.
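To make the precision-recovery idea concrete, the following is a minimal sketch of an Ozaki-style splitting for a matrix product, emulated in NumPy. It is not the HOPE implementation: the function names `split_rows` and `ozaki_matmul`, the slice width `bits`, and the slice count `num_slices` are assumptions chosen for illustration, and the mapping from stencils to convolutions and the memory layout are not shown. Each operand is split into slices of small integers that FP16 represents exactly; every slice product is then exact when accumulated in FP32, and the partial products are combined in FP64.

```python
import numpy as np

def split_rows(a, num_slices, bits):
    """Split each row of `a` into `num_slices` slices of small integers.

    Returns (slices, scale) such that, up to truncation,
    a ~= scale * sum_i slices[i] * 2**(-bits * (i + 1)).
    """
    amax = np.max(np.abs(a), axis=1, keepdims=True)
    _, e = np.frexp(amax)            # amax = m * 2**e with 0.5 <= m < 1
    scale = np.ldexp(1.0, e)         # power-of-two row scale, so |a / scale| < 1
    r = a / scale                    # exact: division by a power of two
    slices = []
    for _ in range(num_slices):
        r = r * 2.0 ** bits          # shift the next `bits` bits above the point
        q = np.round(r)              # integer part, |q| <= 2**bits
        slices.append(q.astype(np.float16))  # exact for bits <= 10
        r = r - q                    # remainder, |r| <= 0.5
    return slices, scale

def ozaki_matmul(a, b, num_slices=9, bits=6):
    """Approximate a @ b in FP64 from FP16 slices with FP32 accumulation.

    Each slice product is exact as long as k * 2**(2 * bits) <= 2**24,
    where k is the inner dimension (k <= 4096 for bits = 6).
    """
    a_slices, a_scale = split_rows(a, num_slices, bits)
    bt_slices, b_scale = split_rows(b.T, num_slices, bits)  # split b by column
    c = np.zeros((a.shape[0], b.shape[1]), dtype=np.float64)
    for i, qa in enumerate(a_slices):
        for j, qb in enumerate(bt_slices):
            # Emulates a Tensor Core GEMM: FP16 inputs, FP32 accumulation.
            p = qa.astype(np.float32) @ qb.T.astype(np.float32)
            c += p.astype(np.float64) * 2.0 ** (-bits * (i + j + 2))
    return c * a_scale * b_scale.T
```

In this sketch `num_slices` plays the role of the tunable working precision: with `bits = 6`, nine slices cover about 54 bits of each operand relative to its row or column maximum, while fewer slices trade accuracy for fewer low-precision products.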
Keywords
Mixed-Precision Computing, Dynamical Core, High-Order Numerical Method, Ozaki Scheme
Authors
姚杰男
Tsinghua University
周立隆
Tsinghua University; CMA Earth System Modeling and Prediction Centre
李恒
Tsinghua University
薛巍
Tsinghua University