1. This dataset addresses the need for high-precision meteorological forcing data in the complex terrain of the Yangtze and Yellow River headwaters on the Qinghai-Tibet Plateau. It provides daily 2 m near-surface air temperature at the regional scale in NetCDF (.nc) format, with a spatial resolution of 4.0 km (approximately 0.03333°), covering January 1, 1960, to December 31, 1979.
The study area lies in the heart of the Qinghai-Tibet Plateau, known as the “Water Tower of China,” and is distributed mainly across the Tanggula and Bayan Har mountain ranges of Qinghai Province. The region has an average elevation of approximately 4,500 m and a cold, arid climate, with widespread glaciers and permafrost. The Yellow River source region lies upstream of the Longyangxia Reservoir on the main stem of the Yellow River. It is concentrated in the Yueguzonglie Basin on the northern foothills of the Bayan Har Mountains and around Zhaling Lake and Eling Lake, between 95°55′E–98°41′E and 33°56′N–35°31′N. It has a typical continental plateau climate, with an average annual temperature of approximately -4.1°C to -3°C and annual precipitation typically between 300 and 700 mm. The Yangtze River source region, delimited by the Zhimenda hydrological station, lies between the Tanggula and Kunlun Mountains and spans 90°43′E–97°45′E and 32°30′N–36°35′N. Its climate is dry and cold, with an average annual temperature of -5.5°C to -1.7°C and annual precipitation of approximately 270–410 mm.
The development process is as follows. First, a two-year (1960 and 1961) WRF simulation was run at a spatial resolution of 1/30°. Second, a downscaling model based on a convolutional neural network (CNN) was trained at the daily scale on the WRF simulation results. The model comprises four convolutional layers (for feature extraction) and one sub-pixel convolution layer (for constructing the high-resolution field). Its inputs are coarse-resolution air temperature, coarse-resolution terrain data (grid-cell elevation and elevation standard deviation), and high-resolution terrain data; its output is the high-resolution meteorological field. Third, the trained model was applied to the long-term ERA5 reanalysis to generate high-resolution (1/30° ≈ 4 km) gridded air temperature data (ERA5_CNN). Finally, cumulative distribution function (CDF) bias correction was applied using station observations from the source region, and the regional product was cropped to the boundaries of the Yangtze and Yellow River source area. The data follow the CF-1.8 and ACDD-1.3 metadata conventions and are provided as NetCDF-4 files, which can be read with Python scientific computing libraries and loaded directly in ArcGIS Pro.
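The sub-pixel convolution layer mentioned above builds a fine grid by rearranging several coarse-resolution channels into one higher-resolution image. The sketch below illustrates only that rearrangement step in plain Python; the function name `pixel_shuffle`, the upscaling factor `r`, and the channel ordering are assumptions for illustration and are not taken from the dataset's actual model.

```python
def pixel_shuffle(channels, r):
    """Rearrange r*r coarse channels of shape (H, W) into one (H*r, W*r) grid.

    channels: list of r*r 2-D lists, each with H rows and W columns.
    Channel index c = i*r + j supplies the fine pixel at row offset i and
    column offset j inside every coarse cell (assumed ordering).
    """
    if len(channels) != r * r:
        raise ValueError("expected r*r channels")
    h, w = len(channels[0]), len(channels[0][0])
    fine = [[0.0] * (w * r) for _ in range(h * r)]
    for c, chan in enumerate(channels):
        i, j = divmod(c, r)  # position of this channel inside each coarse cell
        for y in range(h):
            for x in range(w):
                fine[y * r + i][x * r + j] = chan[y][x]
    return fine


# Four 1x2 channels become one 2x4 grid when r = 2.
demo = pixel_shuffle([[[1, 2]], [[3, 4]], [[5, 6]], [[7, 8]]], r=2)
```

In a trained network the channels would come from the last convolutional layer, so each fine pixel is a learned function of the coarse temperature and terrain inputs rather than a fixed interpolation weight.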
2. Data Content and Elements: Variable: T_2m(time, lat, lon) — 2 m daily mean air temperature (°C), standard_name=air_temperature; associated scalar coordinate height=2.0 m; grid_mapping=crs(WGS84).
3. Spatial-Temporal Extent: Longitude 90.547953–103.413333°E, Latitude 32.148290–36.114560°N; Time 1960-01-01 to 1979-12-31 (including leap years).
4. Resolution: Spatial 0.03333° × 0.03333° (approx. 4 km), Temporal Daily.
5. Naming convention: Daily file ERA5_CNN_t2m_4km_daily_YYYYMMDD.nc (time=1).
6. Coordinates: lat/degN, lon/degE in ascending order; CRS: EPSG:4326.
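The resolution, time range, and naming convention listed above can be turned into a few lines of arithmetic. The following is a minimal sketch, assuming a spherical Earth with a mean radius of 6371 km; the helper names `cell_size_km`, `daily_filename`, and `daily_filenames` are hypothetical and not part of the dataset.

```python
import math
from datetime import date, timedelta

EARTH_RADIUS_KM = 6371.0   # assumed mean radius, spherical approximation
STEP_DEG = 1.0 / 30.0      # grid spacing stated in the description


def cell_size_km(lat_deg, step_deg=STEP_DEG):
    """Return the (north-south, east-west) size in km of one grid cell."""
    ns = math.radians(step_deg) * EARTH_RADIUS_KM
    ew = ns * math.cos(math.radians(lat_deg))  # meridians converge poleward
    return ns, ew


def daily_filename(day):
    """Build a file name following ERA5_CNN_t2m_4km_daily_YYYYMMDD.nc."""
    return f"ERA5_CNN_t2m_4km_daily_{day:%Y%m%d}.nc"


def daily_filenames(start=date(1960, 1, 1), end=date(1979, 12, 31)):
    """List one file name per calendar day, leap days included."""
    n_days = (end - start).days + 1
    return [daily_filename(start + timedelta(days=k)) for k in range(n_days)]


names = daily_filenames()
ns_km, ew_km = cell_size_km(34.0)
```

Under these assumptions a 1/30° cell measures roughly 3.7 km north-south and about 3.1 km east-west at 34°N, so "4 km" is best read as a nominal label, and the full period corresponds to 7305 daily files.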
7. Production Background and Method Overview: The Qinghai-Tibet Plateau has strong topographic relief and heterogeneous surface conditions, which makes it difficult for conventional 0.25° reanalysis or simple interpolation to capture local thermodynamic and topographic effects. This dataset uses short-term, high-resolution WRF simulations to learn the mapping from low resolution to high resolution, with a CNN capturing nonlinear spatial features, and applies the learned mapping to statistically downscale ERA5 over the full period. CDF corrections are then applied using station observations from the source region (a mapping is built at each station and interpolated in space with inverse distance weighting, IDW), which significantly reduces high-altitude cold biases and improves agreement with stations.
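The station-based CDF correction with IDW interpolation can be illustrated as follows. This is a minimal sketch under stated assumptions, not the procedure used to produce the dataset: the correction is taken to be the difference between the quantile-mapped and raw value at each station, distances are computed in degrees on a plane, values outside the sample range are clamped to the observed extremes, and all function names are hypothetical.

```python
import bisect
import math


def quantile_map(value, model_sample, obs_sample):
    """Map value through the empirical CDFs: F_obs^-1(F_model(value))."""
    m, o = sorted(model_sample), sorted(obs_sample)
    if value <= m[0]:
        p = 0.0                      # clamp below the sampled range
    elif value >= m[-1]:
        p = 1.0                      # clamp above the sampled range
    else:
        k = bisect.bisect_right(m, value) - 1   # m[k] <= value < m[k+1]
        p = (k + (value - m[k]) / (m[k + 1] - m[k])) / (len(m) - 1)
    pos = p * (len(o) - 1)           # fractional rank in the observations
    lo = int(pos)
    hi = min(lo + 1, len(o) - 1)
    return o[lo] + (pos - lo) * (o[hi] - o[lo])


def idw(target, stations, values, power=2.0):
    """Inverse-distance-weighted mean of station values at target (lon, lat)."""
    num = den = 0.0
    for (lon, lat), v in zip(stations, values):
        d = math.hypot(lon - target[0], lat - target[1])
        if d == 0.0:
            return v                 # target coincides with a station
        w = d ** -power
        num += w * v
        den += w
    return num / den


def corrected_value(value, target, stations, model_samples, obs_samples):
    """Add the IDW-interpolated station corrections to a grid-cell value."""
    deltas = [quantile_map(value, ms, os_) - value
              for ms, os_ in zip(model_samples, obs_samples)]
    return value + idw(target, stations, deltas)
```

For a grid cell midway between two stations whose mappings imply corrections of +2°C and +4°C, this sketch applies a correction of +3°C.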
8. Advantages and Features: ① Higher spatial resolution that preserves terrain detail (≈4 km vs 0.25°); ② The combination of a physical prior (dynamical downscaling) with deep learning (statistical downscaling) outperforms spatial sharpening by interpolation alone; ③ Improved agreement with stations after bias correction, giving a more reliable representation of extreme events and interannual variability.
9. Applications: Assessment of the thermal state of permafrost and the active layer, calculation of freeze-thaw indices and n-factors, watershed hydrological and ecological modeling, regional climate change detection, land surface process simulation, and disaster risk assessment.
| Collection time | 1960/01/01 - 1979/12/31 |
|---|---|
| Collection location | The Yangtze and Yellow River headwaters span approximately 32°N–36°N and 89°E–103°E. |
| Altitude | 2662.0 m - 6479.0 m |
| Data size | 754.1 MiB |
| Data format | NetCDF-4 (.nc) |
| Coordinate system | WGS84 |
| Projection | |
1. Reanalysis Data (Background Field): ERA5 2 m temperature (ECMWF/Copernicus) offers extensive spatiotemporal coverage and an advanced assimilation scheme, providing large-scale near-surface meteorological fields. Its native resolution is approximately 0.25° at hourly intervals; in this work it is aggregated to daily means and used as one of the CNN input features.
2. Short-term High-resolution Atmospheric Simulation (High-resolution Features): WRF (ARW) short-term simulations cover the full year of 1960 and the critical season of June–September 1961, providing high-resolution structural information on near-surface temperature and topography. These simulations supply the kilometer-scale temperature training samples from which the CNN learns the low-to-high resolution mapping and the spatial features of complex terrain.
3. Station Observations (Correction and Validation): Daily conventional meteorological observations from the China Meteorological Administration (CMA) at 29 stations within and adjacent to the source region (15 of which have records starting in 1960), used for independent validation and uncertainty assessment. Applications include: a) training/validation set partitioning and cross-validation; b) construction of the statistical mapping for CDF bias correction; c) independent evaluation (RMSE, bias, correlation coefficient, etc.).
4. Other Auxiliary Data: A 4 km DEM and derived fields (terrain standard deviation, etc.), usable as static feature inputs to the CNN model.
Preprocessing: The 0.25° ERA5 background field was clipped to a rectangular area encompassing the Yangtze and Yellow River source regions; all data were unified to the WGS84 grid; outliers were removed and basic quality control was applied.
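The aggregation of hourly ERA5 temperature to daily means, described in item 1 above, might look like the sketch below. It assumes that days are delimited by the timestamps as given (ERA5 timestamps are in UTC) and that the input is in kelvin, the native unit of ERA5 2 m temperature; the source does not state the day boundary used, and the function name `daily_means_celsius` is hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

KELVIN_OFFSET = 273.15  # 0 degrees Celsius expressed in kelvin


def daily_means_celsius(hourly):
    """Average (datetime, temperature_K) pairs per calendar day, in Celsius."""
    sums = defaultdict(lambda: [0.0, 0])   # date -> [running sum, count]
    for stamp, temp_k in hourly:
        acc = sums[stamp.date()]
        acc[0] += temp_k - KELVIN_OFFSET
        acc[1] += 1
    return {day: total / count for day, (total, count) in sorted(sums.items())}


# Two synthetic days: a ramp from 0 to 23 degrees C, then a constant -3 degrees C.
start = datetime(1960, 1, 1)
hourly = [(start + timedelta(hours=h), KELVIN_OFFSET + h) for h in range(24)]
hourly += [(start + timedelta(hours=24 + h), 270.15) for h in range(24)]
means = daily_means_celsius(hourly)
```

A production workflow would more likely operate on whole gridded arrays at once; the per-sample loop here only makes the averaging rule explicit.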
1. Structure and Specifications: The file contains complete coordinates, units, and global metadata.
2. Validation and Uncertainty: ① As a baseline for method comparison, the background field used to develop the dataset (0.25° ERA5 daily mean temperature) was interpolated to 4 km resolution with bilinear interpolation, the most common approach (ERA5_BLI). ② Both products were then evaluated against observations from 29 independent stations within the region, using RMSE, MAE, bias, and the correlation coefficient (CC). Compared with ERA5_BLI, the deep learning downscaling product ERA5_CNN substantially reduced the systematic cold bias and the root mean square error: the regional average bias changed from -2.74°C to -0.01°C, and RMSE fell by more than 50%, from 4.48°C to 2.04°C, while CC at each station generally remained above 0.9.
In summary, the CNN downscaling method significantly outperforms simple bilinear interpolation in reproducing daily temperatures over the complex terrain of the source regions, yielding results closer to observations. Some limitations remain: minor systematic biases may occur under extreme cold or high wind conditions. Sources of uncertainty include biases inherent in the reanalysis itself, the tendency of CNN regression to pull extreme values toward the mean, and limited local representativeness due to sparse station coverage.
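The evaluation metrics named above can be computed as in the following sketch. It assumes that bias is defined as product minus observation, so that a negative value indicates a cold bias, consistent with the reported figures; the function names `evaluate` and `relative_reduction` are hypothetical.

```python
import math


def evaluate(pred, obs):
    """Return bias, MAE, RMSE and Pearson correlation for paired daily series."""
    n = len(obs)
    errors = [p - o for p, o in zip(pred, obs)]   # product minus observation
    bias = sum(errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_p, mean_o = sum(pred) / n, sum(obs) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    cc = cov / math.sqrt(var_p * var_o)
    return {"bias": bias, "mae": mae, "rmse": rmse, "cc": cc}


def relative_reduction(before, after):
    """Fractional decrease of an error metric, e.g. RMSE."""
    return (before - after) / before


# RMSE values reported above for ERA5_BLI and ERA5_CNN.
rmse_drop = relative_reduction(4.48, 2.04)
```

With the reported RMSE values, the relative reduction works out to about 54%, in line with the "more than 50%" stated above.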
| # | number | name | type |
|---|---|---|---|
| 1 | 2023YFC3206300 | The evolution of cryosphere elements and their impact on the water resources of the Yangtze River and the Yellow River | National Key R&D Program |
This work is licensed under a Creative Commons Attribution 4.0 International License.
| # | title | file size |
|---|---|---|
| 1 | _ncdc_meta_.json | 13.0 KiB |
| 2 | ERA5_CNN(1960-1979) | |
Air temperature, Yangtze River source region, Yellow River source region, WRF mesoscale meteorological model, deep learning, CNN
©Copyright 2005-. Northwest Institute of Eco-Environment and Resources, CAS.
Donggang West Road 320, Lanzhou, Gansu, China (730000)

