前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >第二弹!基于openmeteo 下载历史ERA5数据

第二弹!基于openmeteo 下载历史ERA5数据

作者头像
用户11172986
发布2024-06-20 19:05:51
1180
发布2024-06-20 19:05:51
举报
文章被收录于专栏:气python风雨

项目简介

近来大家伙被欧洲中心的数据下载困扰了很久,那么想办法另辟蹊径吧

本文基于旧文基于openmeteo 下载最新EC aifs预报数据进行简单修改

本项目旨在利用 OpenMeteo 平台提供的era5历史数据,进行气象数据分析和可视化。我们将使用 Python 编程语言和相关的气象数据处理工具来实现这一目标。

项目背景

气象预报数据对于天气预测、气候研究以及环境监测等方面具有重要意义,包含了全球范围内的多种气象要素预报,如温度、湿度、风速等。

项目目标

  1. 下载并获取era5历史数据。
  2. 实现循环多格点下载
  3. 对下载的预报表格数据进行处理,转为更加适合处理的xarray格式

结束语

通过本项目,我们希望能够深入探索气象数据的价值,并为气象爱好者、科研人员以及气象行业工作者提供有益的工具和资源。

更多数据下载可从 OPEN-METEO API (https://open-meteo.com/en%EF%BC%89 下载气象数据

那么我们开始吧

温馨提示

由于代码过长隐藏,可前往基于open-meteo下载历史ERA5数据多种气象要素点击运行查看 🔜🔜若没有成功加载可视化图,点击运行可以查看 ps:隐藏代码在【代码已被隐藏】所在行,点击所在行,可以看到该行的最右角,会出现个三角形,点击查看即可

安装库

代码语言:javascript
复制
代码语言:javascript
复制
!pip install --user openmeteo-requests
!pip install requests-cache retry-requests

单格点多种气象要素数据下载(官方示例)

代码语言:javascript
复制
代码语言:javascript
复制
import openmeteo_requests

import requests_cache
import pandas as pd
from retry_requests import retry

# Setup the Open-Meteo API client with cache and retry on error
cache_session = requests_cache.CachedSession('.cache', expire_after = -1)
retry_session = retry(cache_session, retries = 5, backoff_factor = 0.2)
openmeteo = openmeteo_requests.Client(session = retry_session)

# Make sure all required weather variables are listed here
# The order of variables in hourly or daily is important to assign them correctly below
url = "https://archive-api.open-meteo.com/v1/archive"
params = {
  "latitude": 52.52,
  "longitude": 13.41,
  "start_date": "2023-03-29",
  "end_date": "2023-04-01",
  "hourly":["temperature_2m", "relative_humidity_2m", "dew_point_2m", "surface_pressure", "wind_speed_10m"]
}
responses = openmeteo.weather_api(url, params=params)

# Process first location. Add a for-loop for multiple locations or weather models
response = responses[0]
print(f"Coordinates {response.Latitude()}°N {response.Longitude()}°E")
print(f"Elevation {response.Elevation()} m asl")
print(f"Timezone {response.Timezone()} {response.TimezoneAbbreviation()}")
print(f"Timezone difference to GMT+0 {response.UtcOffsetSeconds()} s")

# Process hourly data. The order of variables needs to be the same as requested.
hourly = response.Hourly()
hourly_temperature_2m = hourly.Variables(0).ValuesAsNumpy()
hourly_relative_humidity_2m = hourly.Variables(1).ValuesAsNumpy()
hourly_dew_point_2m = hourly.Variables(2).ValuesAsNumpy()
hourly_surface_pressure = hourly.Variables(3).ValuesAsNumpy()
hourly_wind_speed_10m = hourly.Variables(4).ValuesAsNumpy()

hourly_data = {"date": pd.date_range(
  start = pd.to_datetime(hourly.Time(), unit = "s", utc = True),
  end = pd.to_datetime(hourly.TimeEnd(), unit = "s", utc = True),
  freq = pd.Timedelta(seconds = hourly.Interval()),
  inclusive = "left"
)}
hourly_data["temperature_2m"] = hourly_temperature_2m
hourly_data["relative_humidity_2m"] = hourly_relative_humidity_2m
hourly_data["dew_point_2m"] = hourly_dew_point_2m
hourly_data["surface_pressure"] = hourly_surface_pressure
hourly_data["wind_speed_10m"] = hourly_wind_speed_10m

hourly_dataframe = pd.DataFrame(data = hourly_data)
print(hourly_dataframe.head(10))
代码语言:javascript
复制
代码语言:javascript
复制
Coordinates 52.5483283996582°N 13.407821655273438°E
Elevation 38.0 m asl
Timezone None None
Timezone difference to GMT+0 0 s
                       date  temperature_2m  relative_humidity_2m  \
0 2023-03-29 00:00:00+00:00          1.1085             81.883591   
1 2023-03-29 01:00:00+00:00          0.8085             83.363243   
2 2023-03-29 02:00:00+00:00          1.6085             78.999466   
3 2023-03-29 03:00:00+00:00          2.1085             75.114677   
4 2023-03-29 04:00:00+00:00          1.9085             74.802864   
5 2023-03-29 05:00:00+00:00          1.0085             77.748978   
6 2023-03-29 06:00:00+00:00          1.9585             75.365379   
7 2023-03-29 07:00:00+00:00          3.9085             71.423698   
8 2023-03-29 08:00:00+00:00          5.9585             67.313850   
9 2023-03-29 09:00:00+00:00          7.7585             65.560349   

   dew_point_2m  surface_pressure  wind_speed_10m  
0       -1.6415       1016.877441        9.007196  
1       -1.6915       1016.175476        9.885262  
2       -1.6415       1015.393250       11.525623  
3       -1.8415       1014.605896       13.009903  
4       -2.0915       1014.204285       12.646200  
5       -2.4415       1014.088867       11.520000  
6       -1.9415       1013.707214       13.138765  
7       -0.7915       1013.840393       12.224107  
8        0.3585       1013.477295       14.007655  
9        1.7085       1013.009827       14.168641

循环下载多格点多种气象要素

In [25]:

代码语言:javascript
复制
代码语言:javascript
复制
import openmeteo_requests
import requests_cache
import pandas as pd
from retry_requests import retry
import numpy as np

# 设置Open-Meteo API客户端并进行缓存和错误重试
cache_session = requests_cache.CachedSession('.cache', expire_after=3600)
retry_session = retry(cache_session, retries=5, backoff_factor=0.2)
openmeteo = openmeteo_requests.Client(session=retry_session)

# 生成要下载数据的格点经纬度范围
latitude_points = np.arange(52.0, 53.1, 0.25)
longitude_points = np.arange(13.0, 14.1, 0.25)

# 定义要查询的变量和模型
variables = ["temperature_2m", "relative_humidity_2m", "dew_point_2m", "surface_pressure", "wind_speed_10m"]

# 创建一个空列表,用于存储每个格点的数据
dataframes = []

# 循环遍历每个格点并下载数据
for lat in latitude_points:
    for lon in longitude_points:
        url = "https://archive-api.open-meteo.com/v1/archive"
        params = {
            "latitude": lat,
            "longitude": lon,
            "start_date": "2023-03-29",
          "end_date": "2023-04-01",
            "hourly": ",".join(variables),
        }
        responses = openmeteo.weather_api(url, params=params)

        # 处理响应数据
        for response in responses:
            hourly = response.Hourly()
            hourly_data = {
                "date": pd.date_range(
                    start=pd.to_datetime(hourly.Time(), unit="s", utc=True),
                    end=pd.to_datetime(hourly.TimeEnd(), unit="s", utc=True),
                    freq=pd.Timedelta(seconds=hourly.Interval()),
                    inclusive="left"
                )
            }

            for i, variable in enumerate(variables):
                variable_data = hourly.Variables(i).ValuesAsNumpy()
                hourly_data[variable] = variable_data

            hourly_dataframe = pd.DataFrame(data=hourly_data)
            dataframes.append(hourly_dataframe)

# 将所有数据拼接成一个表格
final_dataframe = pd.concat(dataframes, ignore_index=True)

# 打印最终的数据表格
print(final_dataframe)
代码语言:javascript
复制
代码语言:javascript
复制
                          date  temperature_2m  relative_humidity_2m  \
0    2023-03-29 00:00:00+00:00          1.6220             76.708954   
1    2023-03-29 01:00:00+00:00          1.6720             76.153656   
2    2023-03-29 02:00:00+00:00          1.6220             76.145111   
3    2023-03-29 03:00:00+00:00          1.6220             74.200531   
4    2023-03-29 04:00:00+00:00          1.2720             72.501663   
...                        ...             ...                   ...   
2395 2023-04-01 19:00:00+00:00          3.7545             77.622520   
2396 2023-04-01 20:00:00+00:00          3.0045             79.500145   
2397 2023-04-01 21:00:00+00:00          2.6545             79.158913   
2398 2023-04-01 22:00:00+00:00          2.3545             78.824844   
2399 2023-04-01 23:00:00+00:00          2.2045             77.940964   

      dew_point_2m  surface_pressure  wind_speed_10m  
0          -2.0280       1008.537415       11.013882  
1          -2.0780       1007.848816       12.074766  
2          -2.1280       1007.155701       13.339445  
3          -2.4780       1006.563538       13.004921  
4          -3.1280       1006.151733       12.682018  
...            ...               ...             ...  
2395        0.2045       1002.548828       22.267679  
2396       -0.1955       1003.427124       21.385939  
2397       -0.5955       1004.413757       21.758419  
2398       -0.9455       1005.302063       22.264771  
2399       -1.2455       1006.094055       20.929594  

[2400 rows x 6 columns]

增加经纬度

In [26]:

代码语言:javascript
复制
代码语言:javascript
复制
%%time
import openmeteo_requests
import requests_cache
import pandas as pd
from retry_requests import retry
import numpy as np

# 设置Open-Meteo API客户端并进行缓存和错误重试
cache_session = requests_cache.CachedSession('.cache', expire_after=3600)
retry_session = retry(cache_session, retries=5, backoff_factor=0.2)
openmeteo = openmeteo_requests.Client(session=retry_session)

# 生成要下载数据的格点经纬度范围
latitude_points = np.arange(45.0, 50.1, 0.25)
longitude_points = np.arange(115.0, 120.1, 0.25)

# 定义要查询的变量和模型
variables = ["temperature_2m", "relative_humidity_2m", "dew_point_2m", "surface_pressure", "wind_speed_10m"]
#model = "ecmwf_aifs025"

# 创建一个空列表,用于存储每个格点的数据
dataframes = []

# 循环遍历每个格点并下载数据
for lat in latitude_points:
    for lon in longitude_points:
        url = "https://archive-api.open-meteo.com/v1/archive"
        params = {
            "latitude": lat,
            "longitude": lon,
            "start_date": "2023-03-29",
          "end_date": "2023-04-01",
            "hourly": ",".join(variables),
        }
        responses = openmeteo.weather_api(url, params=params)

        # 处理响应数据
        for response in responses:
            hourly = response.Hourly()
            hourly_data = {
                "date": pd.date_range(
                    start=pd.to_datetime(hourly.Time(), unit="s", utc=True),
                    end=pd.to_datetime(hourly.TimeEnd(), unit="s", utc=True),
                    freq=pd.Timedelta(seconds=hourly.Interval()),
                    inclusive="left"
                ),
                "latitude": lat,
                "longitude": lon
            }

            for i, variable in enumerate(variables):
                variable_data = hourly.Variables(i).ValuesAsNumpy()
                hourly_data[variable] = variable_data

            hourly_dataframe = pd.DataFrame(data=hourly_data)
            dataframes.append(hourly_dataframe)

# 将所有数据拼接成一个表格
final_dataframe = pd.concat(dataframes, ignore_index=True)

# 打印最终的数据表格
print(final_dataframe)
代码语言:javascript
复制
代码语言:javascript
复制
                           date  latitude  longitude  temperature_2m  \
0     2023-03-29 00:00:00+00:00      45.0      115.0          3.3635   
1     2023-03-29 01:00:00+00:00      45.0      115.0          6.2635   
2     2023-03-29 02:00:00+00:00      45.0      115.0          8.6135   
3     2023-03-29 03:00:00+00:00      45.0      115.0         10.1635   
4     2023-03-29 04:00:00+00:00      45.0      115.0         11.1135   
...                         ...       ...        ...             ...   
42331 2023-04-01 19:00:00+00:00      50.0      120.0         -1.1890   
42332 2023-04-01 20:00:00+00:00      50.0      120.0         -1.7390   
42333 2023-04-01 21:00:00+00:00      50.0      120.0         -2.3890   
42334 2023-04-01 22:00:00+00:00      50.0      120.0         -2.5390   
42335 2023-04-01 23:00:00+00:00      50.0      120.0         -1.0890   

       relative_humidity_2m  dew_point_2m  surface_pressure  wind_speed_10m  
0                 25.305822    -14.686501        860.471863       16.075521  
1                 20.003267    -15.086500        862.864014       14.529915  
2                 16.967871    -15.136500        863.092651       13.183080  
3                 16.461998    -14.236501        863.084045        9.565437  
4                 15.966474    -13.836500        863.203491        7.895416  
...                     ...           ...               ...             ...  
42331             80.960815     -4.039000        921.916138        9.346143  
42332             74.988991     -5.589000        922.102844        9.826088  
42333             71.234772     -6.889000        922.800964        9.422101  
42334             69.578217     -7.339000        923.115784       10.594036  
42335             64.489380     -6.939000        924.129761       11.525623  

[42336 rows x 8 columns]
CPU times: user 2.44 s, sys: 248 ms, total: 2.69 s
Wall time: 2min 17s

以上是5°×5°的空间范围,时间范围为4天的数据,耗时两分多钟

并行下载优化版(代码已隐藏)

并行下载大大加快了下载速度

但是

请求格点太多,会被警告下个小时才可使用

大家可摸索一下较为合适的范围

转为xarray格式并简单可视化

In [28]:

代码语言:javascript
复制
代码语言:javascript
复制
# 修正时间数据类型为datetime
final_dataframe['date'] = pd.to_datetime(final_dataframe['date']).dt.tz_localize(None)

# 将DataFrame转换为xarray Dataset
ds = final_dataframe.set_index(['date', 'latitude', 'longitude']).to_xarray()

# 打印xarray Dataset
print(ds)
代码语言:javascript
复制
代码语言:javascript
复制
<xarray.Dataset>
Dimensions:               (date: 96, latitude: 21, longitude: 21)
Coordinates:
  * date                  (date) datetime64[ns] 2023-03-29 ... 2023-04-01T23:...
  * latitude              (latitude) float64 45.0 45.25 45.5 ... 49.5 49.75 50.0
  * longitude             (longitude) float64 115.0 115.2 115.5 ... 119.8 120.0
Data variables:
    temperature_2m        (date, latitude, longitude) float32 3.364 ... -1.089
    relative_humidity_2m  (date, latitude, longitude) float32 25.31 ... 64.49
    dew_point_2m          (date, latitude, longitude) float32 -14.69 ... -6.939
    surface_pressure      (date, latitude, longitude) float32 860.5 ... 924.1
    wind_speed_10m        (date, latitude, longitude) float32 16.08 ... 11.53

In [37]:

代码语言:javascript
复制
代码语言:javascript
复制
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import cartopy.feature as cfeature
from meteva.base.tool.plot_tools import add_china_map_2basemap

# 选择要绘制的时间步长(这里为第10个)
rh_data =ds.relative_humidity_2m[10]

# 创建一个新的 figure,并设置地图投影
fig, ax = plt.subplots(figsize=(10, 6), subplot_kw={"projection": ccrs.PlateCarree()})



# 绘制相对湿度数据
rh_data.plot(ax=ax, transform=ccrs.PlateCarree(), cmap='YlGnBu_r', vmin=0, vmax=100, zorder=3)
add_china_map_2basemap(ax, name="province", edgecolor='k', lw=0.5, encoding='gbk', grid0=None)  # "
# 设置坐标轴标签和标题
ax.set_xlabel("Longitude")
ax.set_ylabel("Latitude")
ax.set_title(f"Relative Humidity at 2m (Time Step {time_index})")

plt.show()
代码语言:javascript
复制
代码语言:javascript
复制

学习了以上方法,想必可以解大家获取ERA5数据的燃眉之急 还需注意的是大家不能一次性获取大量数据,不然会收到api限制的警告, open-meteo对每分钟内的请求次数进行了限制。

本文参与 腾讯云自媒体同步曝光计划,分享自微信公众号。
原始发表:2024-04-15,如有侵权请联系 cloudcommunity@tencent.com 删除

本文分享自 气python风雨 微信公众号,前往查看

如有侵权,请联系 cloudcommunity@tencent.com 删除。

本文参与 腾讯云自媒体同步曝光计划  ,欢迎热爱写作的你一起参与!

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
目录
  • 项目简介
  • 项目背景
  • 项目目标
  • 结束语
  • 温馨提示
  • 安装库
  • 单格点多种气象要素数据下载(官方示例)
  • 循环下载多格点多种气象要素
  • 增加经纬度
  • 并行下载优化版(代码已隐藏)
  • 转为xarray格式并简单可视化
相关产品与服务
腾讯云 BI
腾讯云 BI(Business Intelligence,BI)提供从数据源接入、数据建模到数据可视化分析全流程的BI能力,帮助经营者快速获取决策数据依据。系统采用敏捷自助式设计,使用者仅需通过简单拖拽即可完成原本复杂的报表开发过程,并支持报表的分享、推送等企业协作场景。
领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档