CW3E Publication Notice

Global Daily Discharge Estimation Based on Grid Long Short-Term Memory (LSTM) Model and River Routing

July 24, 2025

A paper titled “Global Daily Discharge Estimation Based on Grid Long Short-Term Memory (LSTM) Model and River Routing” was recently published in AGU’s Water Resources Research. This study was led by Yuan Yang (CW3E), with contributions from a wide network of collaborating institutions. This work supports CW3E’s Advanced Precipitation and Streamflow Prediction priority described in the CW3E 2025-2029 Strategic Plan. This work was sponsored by the NASA Energy and Water Cycle Study Program (NEWS) and the NOAA Cooperative Institute for Research to Operations in Hydrology (CIROH) project.

Accurate global river discharge estimation is critical for applications in water resources, climate change, natural hazards, biodiversity, and energy production. Recent advances in machine learning, particularly Long Short-Term Memory (LSTM) networks, have shown strong potential in this domain. However, existing LSTM-based approaches typically treat each basin as a hydrological response unit (HRU) and rely on basin-averaged inputs to estimate discharge only at basin outlets. This approach has several limitations: 1) high computational cost when scaling to the global river network, 2) loss of spatial heterogeneity within large basins, and 3) lack of physical consistency across HRUs, such as upstream-downstream mass balance and temporal coherence, due to the absence of explicit river routing.

To address these challenges, we developed a new modeling scheme, Grid LSTM‐RAPID, to estimate discharge for every river reach worldwide. The framework consists of three steps (Figure 1):

  • Step 1, LSTM training over small basins: Train a single LSTM model over selected training basins.
  • Step 2, LSTM application over 0.25° grids: Apply the trained LSTM obtained in Step 1 over global 0.25° grids, using the gridded inputs, to generate global gridded runoff.
  • Step 3, RAPID routing: Implement the RAPID model to calculate the discharge for all reaches globally on the MERIT-Basins river network based on the gridded runoff from Step 2. RAPID uses a vector-matrix version of the Muskingum method and is well parallelized for large-scale applications. RAPID setup details can be found in Yang et al. (2021).
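The routing in Step 3 can be illustrated with a minimal sketch of a matrix-form Muskingum update on a toy three-reach network. This is not RAPID's code or the configuration used in the study: the function names, the toy network, and the parameter values (k, x, dt) are all assumptions chosen for illustration, and lateral runoff is assumed constant within each time step.

```python
import numpy as np

def muskingum_coefficients(k, x, dt):
    """Classic Muskingum coefficients; c1 + c2 + c3 = 1 for each reach."""
    k = np.asarray(k, dtype=float)   # travel time per reach (s), assumed value
    x = np.asarray(x, dtype=float)   # weighting factor per reach, assumed value
    denom = k * (1.0 - x) + dt / 2.0
    c1 = (dt / 2.0 - k * x) / denom
    c2 = (dt / 2.0 + k * x) / denom
    c3 = (k * (1.0 - x) - dt / 2.0) / denom
    return c1, c2, c3

def route(runoff, downstream, k, x, dt):
    """Route lateral runoff through a river network.

    runoff:     array (n_steps, n_reaches), lateral inflow to each reach (m3/s)
    downstream: downstream[i] is the index of the reach that reach i drains
                into, or -1 if reach i is an outlet (hypothetical encoding)
    """
    runoff = np.asarray(runoff, dtype=float)
    n_steps, n_reaches = runoff.shape

    # Connectivity matrix: N[j, i] = 1 when reach i flows into reach j.
    N = np.zeros((n_reaches, n_reaches))
    for up, down in enumerate(downstream):
        if down >= 0:
            N[down, up] = 1.0

    c1, c2, c3 = muskingum_coefficients(k, x, dt)
    # Left-hand side (I - C1 N) couples each reach to its upstream neighbours.
    A = np.eye(n_reaches) - c1[:, None] * N

    q = np.zeros(n_reaches)          # start from an empty network
    out = np.empty_like(runoff)
    for t in range(n_steps):
        qe = runoff[t]
        rhs = c1 * qe + c2 * (N @ q + qe) + c3 * q
        q = np.linalg.solve(A, rhs)  # all reaches are updated in one solve
        out[t] = q
    return out

# Toy network: reaches 0 and 1 both drain into reach 2, which is the outlet.
runoff = np.tile([1.0, 2.0, 3.0], (200, 1))
discharge = route(runoff, downstream=[2, 2, -1],
                  k=[3600.0] * 3, x=[0.2] * 3, dt=3600.0)
```

With constant runoff, the discharge in each reach settles to the sum of its own runoff and that of every reach upstream, which is the upstream-downstream mass balance that explicit routing provides.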

Figure 1. Grid LSTM‐RAPID daily discharge modeling framework. The basins, grids, and river networks depicted are simplified illustrations and do not represent their actual size, shape, or quantity. From Figure 2 in Yang et al. (2025).

Extensive evaluations against daily flow records from about 30,000 gauges globally demonstrate that our global LSTM implementation and supporting data work reasonably well. The model’s generalizability across time and space, including unseen regions and periods, has been thoroughly assessed. Compared to the traditional Basin LSTM, the Grid LSTM-RAPID model shows only a slight reduction in performance, while enabling global, reach-level discharge estimation without heavy computational cost. Despite this tradeoff, Grid LSTM-RAPID significantly outperforms a well-calibrated process-based benchmark.

Based on this framework, we developed a global reach-level daily discharge dataset, named GRADES-hydroDL, covering 2.94 million river reaches globally from 1980 to near present. This dataset reproduces global discharge very well (Figure 2) and offers valuable support for applications such as flood assessment, water resources management, and ecological studies. To facilitate broader use, we also developed an interactive interface (Figure 3) that allows users of all experience levels to explore any river, view local features, and download data such as discharge, watershed boundaries, and river networks.

The dataset is openly available at https://www.reachhydro.org/home/records/grades-hydrodl. The interactive interface is accessible at https://cw3e.ucsd.edu/hydro/grades_hydrodl.

Figure 2. Skill metrics over non-training basins for GRADES-hydroDL during the period of 1980-2020: (a) Kling-Gupta efficiency (KGE, for overall performance), (b) correlation coefficient (CC, for temporal coherence), (c) relative variability (RV, for bias in variability), and (d) relative bias (RB, for bias in magnitude). From Figure 8 in Yang et al. (2025).
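For readers unfamiliar with these metrics, the sketch below computes them for one pair of simulated and observed series. It assumes the commonly used 2009 formulation of KGE, in which variability and bias are expressed as ratios of standard deviations and means; the exact definitions of RV and RB in Yang et al. (2025) may differ, and the function name is hypothetical.

```python
import numpy as np

def skill_metrics(sim, obs):
    """Return KGE and its three components for one gauge.

    Assumed definitions (ratio form):
      CC = Pearson correlation between sim and obs
      RV = std(sim) / std(obs)
      RB = mean(sim) / mean(obs)
      KGE = 1 - sqrt((CC - 1)^2 + (RV - 1)^2 + (RB - 1)^2)
    """
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)

    # Keep only time steps where both series have valid values.
    valid = np.isfinite(sim) & np.isfinite(obs)
    sim, obs = sim[valid], obs[valid]

    cc = np.corrcoef(sim, obs)[0, 1]
    rv = sim.std() / obs.std()
    rb = sim.mean() / obs.mean()
    kge = 1.0 - np.sqrt((cc - 1.0) ** 2 + (rv - 1.0) ** 2 + (rb - 1.0) ** 2)
    return {"KGE": kge, "CC": cc, "RV": rv, "RB": rb}

# Example: a simulation that doubles every observed value has perfect
# correlation but overestimates both variability and magnitude.
obs = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
metrics = skill_metrics(2.0 * obs, obs)
```

Under these assumed definitions a perfect simulation gives KGE = 1 with CC, RV, and RB all equal to 1, so each component shows which aspect of the simulation (timing, variability, or magnitude) pulls the overall score down.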

Figure 3. Selected functionalities of the GRADES-hydroDL interactive interface: (a) basic information for a selected river, (b) river networks and terrain visualization for a specific region, (c) interactive time series data of river discharge spanning the current two years and past 45 years, (d) delineation of watershed boundaries, upstream river network, and downstream flow path associated with a selected gauge location.

Yang, Y., Feng, D., Beck, H. E., Hu, W., Ather, A., Sengupta, A., Delle Monache, L., Hartman, R., Lin, P., Shen, C., & Pan, M. (2025). Global Daily Discharge Estimation Based on Grid Long Short-Term Memory (LSTM) Model and River Routing. Water Resources Research, 61(6), e2024WR039764. https://doi.org/10.1029/2024WR039764

Yang, Y., Pan, M., Lin, P., Beck, H. E., Zeng, Z., Yamazaki, D., David, C. H., Lu, H., Yang, K., Hong, Y., & Wood, E. F. (2021). Global Reach-Level 3-Hourly River Flood Reanalysis (1980–2019). Bulletin of the American Meteorological Society, 102(11), E2086-E2105. https://doi.org/10.1175/BAMS-D-20-0057.1