Accurate high-resolution retrieval of near-surface air pollutants from satellite observations remains a major challenge in complex coastal megaregions, where strong spatial heterogeneity, nonlinear meteorological influences, and dynamic transport processes limit the performance of conventional statistical and purely data-driven models. These limitations are further exacerbated during extreme weather events, when pollutant distributions deviate substantially from climatological patterns. The Guangdong–Hong Kong–Macao Greater Bay Area (GBA) provides a representative testbed for advancing remote sensing–based air quality retrieval under compound meteorological conditions.
We propose a physics-informed spatiotemporal graph convolutional network (PI-STGCN) for high-resolution retrieval of near-surface atmospheric components, including NO₂ and PM₂.₅, by integrating multi-source satellite remote sensing (Sentinel-5P/TROPOMI), reanalysis meteorological fields (ERA5), land-use information, and emission proxies. The study domain is discretized into 500 m × 500 m grid cells, each treated as a graph node, while edges are dynamically constructed based on spatial adjacency and wind-direction-dependent transport to explicitly represent pollutant advection and diffusion processes.
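As an illustration of the wind-dependent edge construction described above, the following minimal sketch builds a directed adjacency matrix on a small grid. It is not the implementation used in this study: the function name `wind_weighted_adjacency`, the parameters `base_weight` and `wind_gain`, the 8-neighbour stencil, and the rule "adjacency weight plus a bonus for the downwind wind component" are all assumptions chosen for clarity.

```python
import numpy as np

def wind_weighted_adjacency(n_rows, n_cols, u, v, base_weight=1.0, wind_gain=1.0):
    """Directed adjacency for an n_rows x n_cols grid (hypothetical weighting rule).

    u, v: eastward / northward wind components per cell, shape (n_rows, n_cols).
    Edge i -> j receives base_weight (spatial adjacency, a stand-in for diffusion)
    plus wind_gain times the wind component blowing from i toward j (advection).
    Upwind edges keep only the base weight.
    """
    n = n_rows * n_cols
    A = np.zeros((n, n))
    # 8-neighbour offsets as (d_row, d_col); assumption: row index increases northward
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    j = rr * n_cols + cc
                    d = np.array([dc, dr], dtype=float)  # unit vector (east, north)
                    d /= np.linalg.norm(d)
                    along = u[r, c] * d[0] + v[r, c] * d[1]
                    A[i, j] = base_weight + wind_gain * max(along, 0.0)
    return A

# Example: uniform westerly wind (blowing toward the east) on a 3 x 3 grid
u = np.full((3, 3), 2.0)
v = np.zeros((3, 3))
A = wind_weighted_adjacency(3, 3, u, v)
print(A[4, 5], A[4, 3])  # centre -> east neighbour, centre -> west neighbour
```

Because the wind fields change hourly, such a matrix would be rebuilt for each time step, which is what makes the graph dynamic. A dense matrix is used here only for readability; a domain at 500 m resolution would need a sparse edge list.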
A key innovation of PI-STGCN lies in embedding outputs from the WRF-Chem chemical transport model as soft physical constraints in the loss function, jointly optimizing observation fidelity and physical consistency. Model performance is systematically benchmarked against multiple linear regression (MLR) and long short-term memory (LSTM) networks using independent ground-based observations and evaluated with RMSE, MAE, and coefficient of determination (R²).
PI-STGCN demonstrates substantially improved retrieval accuracy and spatial coherence compared to baseline models, particularly under extreme meteorological conditions characterized by heatwaves and stagnant atmospheric circulation. The incorporation of physics-informed constraints effectively reduces physically implausible predictions and enhances model robustness across seasons. High-resolution hourly retrievals reveal pronounced spatial gradients associated with coastal circulation patterns and urban emission hotspots that are poorly captured by coarser-resolution approaches.
This study presents a transferable remote sensing and machine learning framework that bridges data-driven inference and atmospheric physical knowledge for air quality retrieval in heterogeneous environments. The proposed PI-STGCN provides a scalable solution for next-generation satellite observations and lays a methodological foundation for integrated environmental risk assessment under extreme and non-stationary climate conditions.
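The soft physical constraint mentioned above, in which WRF-Chem outputs enter the loss alongside the observation fit, can be sketched as a weighted sum of two terms. The exact form of the constraint is not given here, so the quadratic penalty, the weight `lam`, and the function name `physics_informed_loss` are assumptions made for illustration only.

```python
import numpy as np

def physics_informed_loss(pred, obs, obs_mask, ctm, lam=0.1):
    """Observation loss plus a soft physics penalty (hypothetical formulation).

    pred:     model retrievals at all graph nodes
    obs:      ground-based measurements (only valid where obs_mask is True)
    obs_mask: boolean array marking nodes that have a monitoring station
    ctm:      chemical transport model field (e.g. WRF-Chem) at the same nodes
    lam:      assumed weight balancing physical consistency against data fit
    """
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    ctm = np.asarray(ctm, dtype=float)
    obs_mask = np.asarray(obs_mask, dtype=bool)

    # Data-fidelity term: mean squared error at observed nodes only
    n_obs = max(int(obs_mask.sum()), 1)
    obs_term = float(np.sum((pred[obs_mask] - obs[obs_mask]) ** 2) / n_obs)

    # Soft constraint: penalise departure from the CTM field at every node
    phys_term = float(np.mean((pred - ctm) ** 2))

    return obs_term + lam * phys_term, obs_term, phys_term

# Example with four nodes, two of which have station data
total, obs_term, phys_term = physics_informed_loss(
    pred=[1.0, 2.0, 3.0, 4.0],
    obs=[1.0, 0.0, 5.0, 0.0],
    obs_mask=[True, False, True, False],
    ctm=[1.0, 2.0, 3.0, 6.0],
    lam=0.5,
)
print(total, obs_term, phys_term)
```

Because the penalty is soft rather than hard, the network can depart from the transport model where observations contradict it, while unobserved nodes are pulled toward physically plausible values. In training, the same expression would be written with a differentiable framework rather than NumPy.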