在气象与气候研究领域,WRF(Weather Research and Forecasting)模型生成的数据集因其高分辨率和丰富的气象变量而被广泛应用于科研与业务预报中。然而,面对这些庞大数据集时,高效且优雅地进行数据索引与提取往往成为数据分析流程中的关键一环。这不仅关乎研究效率,更直接影响到我们对气象现象理解的深度与广度。
本篇学习笔记,旨在探讨如何利用Python中的where
函数这一强大工具,实现对WRF输出数据的高效索引与筛选。where
函数作为一个条件索引神器,它允许我们在不修改原数据结构的前提下,灵活地根据预设条件定位到数据集中的特定部分,这对于处理多维度、大规模的WRF数据尤为重要。
我们将从以下几个方面展开:
where
函数基础:简要回顾where
函数的基本用法,理解其在条件筛选中的核心作用。xarray
或netCDF4
等库来便捷地加载与操作这些数据。where
函数针对WRF数据中的特定时间切片、空间区域、气象变量阈值等进行精确索引。这包括但不限于选取特定天气事件、分析特定高度或层次的大气参数等场景。where
函数在更复杂数据分析任务中的应用,比如结合绘图库进行条件可视化。无论您是气象学领域的研究人员,还是对WRF数据处理感兴趣的开发者,希望通过这篇笔记,能够让您掌握基于where
函数的高效数据索引技能,使您的WRF数据探索之旅变得更加流畅与高效。
首先假设我们需要索引文件中3km到11km的垂直速度
where函数是Python数据处理中的一个多功能工具,特别是在处理数组和数据集时。它允许用户根据条件选择性地保留或替换数组中的元素。在numpy, pandas, 以及我们讨论重点——xarray库中,where函数的核心作用是根据布尔数组(或条件表达式)来过滤数据,类似于SQL中的WHERE子句。
基本语法:
result = xarray_object.where(condition, other=value, drop=False)
condition: 一个布尔数组或表达式,决定哪些元素被保留或替换。 other: 当条件为False时,用于替换的值,默认为NaN。 drop: 特别在xarray中,决定是否删除变为全NaN的坐标维度。
In [1]:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.cm import get_cmap
import cartopy.crs as crs
from cartopy.feature import NaturalEarthFeature
from netCDF4 import Dataset
from wrf import to_np, getvar, CoordPair, vertcross
In [2]:
# Open the NetCDF file
filename = "/home/mw/input/typhoon9537/wrfout_d01_2019-08-09_03_00_00"
ncfile = Dataset(filename)
# z是海拔高度,wa是垂直速度
z = getvar(ncfile, "z") # 确保这是海拔高度或可以转换为海拔高度的数据
wa = getvar(ncfile, "wa")
# 定义高度范围
height_min = 3000 # 3km
height_max = 11000 # 11km
# 找到满足条件的高度索引
idx = np.where((z.data >= height_min) & (z.data <= height_max))
# 提取该高度范围内的垂直速度
wa_3km_to_11km = wa.data[idx]
# 记得关闭文件
ncfile.close()
print(wa_3km_to_11km.shape)
(2200531,)
In [16]:
wa_3km_to_11km
Out[16]:
array([ 0.24370739, -0.81917095, -1.931387 , ..., 0.16037607,
0.11253764, 0.11774951], dtype=float32)
傻眼了,出来的一维数组怎么处理
既然整层不行那就拆分处理吧
In [5]:
z.data[:,0,0] <= 11000
Out[5]:
array([ True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, False, False, False, False,
False, False, False, False, False, False, False])
In [8]:
(z.data[:,0,0] >= 3000)
Out[8]:
array([False, False, False, False, False, False, False, False, False,
False, False, False, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True])
In [9]:
w_above_11km = wa.data[(z.data[:,0,0] >= height_min)&(z.data[:,0,0] <= 11000), :, :]
w_above_11km.shape
Out[9]:
(11, 437, 447)
In [17]:
w_above_11km
Out[17]:
array([[[ 2.57072374e-02, 2.57084221e-02, 1.39461048e-02, ...,
5.86005747e-02, 7.42593408e-02, 7.42520988e-02],
[ 2.57065967e-02, 2.57090535e-02, 1.39477691e-02, ...,
5.85796982e-02, 7.42296576e-02, 7.42505118e-02],
[ 2.24057492e-02, 2.24061050e-02, 1.48067437e-02, ...,
2.87155751e-02, 4.62430194e-02, 4.62519228e-02],
...,
[-2.59022545e-02, -2.59996206e-02, -1.42764896e-02, ...,
2.08282083e-01, 1.42168552e-01, 1.64558977e-01],
[-3.33420001e-02, -3.32461372e-02, -2.85161454e-02, ...,
1.78012952e-01, 1.28333122e-01, 1.75490454e-01],
[-3.32230292e-02, -3.31570022e-02, -2.85653044e-02, ...,
1.72306970e-01, 1.39109194e-01, 1.96686685e-01]],
[[ 1.95174385e-02, 1.95181239e-02, 6.36351109e-03, ...,
8.23565647e-02, 9.88466516e-02, 9.88370925e-02],
[ 1.95158906e-02, 1.95220541e-02, 6.37008063e-03, ...,
8.23225230e-02, 9.88071561e-02, 9.88344252e-02],
[ 1.18577341e-02, 1.18539724e-02, 4.54341201e-03, ...,
4.92160693e-02, 6.86740801e-02, 6.86902404e-02],
...,
[-3.99552174e-02, -4.00430337e-02, -3.57519016e-02, ...,
1.67332381e-01, 1.43158346e-01, 1.62913084e-01],
[-3.87348309e-02, -3.86548266e-02, -4.35788371e-02, ...,
1.41215518e-01, 1.22055411e-01, 1.62635326e-01],
[-3.86345573e-02, -3.85769606e-02, -4.36166972e-02, ...,
1.36176065e-01, 1.31297901e-01, 1.80644274e-01]],
[[ 6.59154030e-03, 6.59158314e-03, -8.10191035e-03, ...,
1.05053291e-01, 1.20272815e-01, 1.20261833e-01],
[ 6.59009768e-03, 6.59585698e-03, -8.09310097e-03, ...,
1.05005577e-01, 1.20225564e-01, 1.20258242e-01],
[-6.75230194e-03, -6.75604679e-03, -1.05384765e-02, ...,
6.96224868e-02, 8.89233500e-02, 8.89445394e-02],
...,
[-4.66218218e-02, -4.66989353e-02, -5.07004037e-02, ...,
1.35923386e-01, 1.39581129e-01, 1.56487793e-01],
[-3.57712843e-02, -3.57054248e-02, -5.11264950e-02, ...,
1.17605835e-01, 1.18425503e-01, 1.52706578e-01],
[-3.56875099e-02, -3.56389545e-02, -5.11543639e-02, ...,
1.13287851e-01, 1.26217872e-01, 1.67763382e-01]],
...,
[[-4.47849296e-02, -4.47855368e-02, -6.30332753e-02, ...,
4.29634079e-02, 3.57846543e-02, 3.57831791e-02],
[-4.47816662e-02, -4.47865017e-02, -6.30316734e-02, ...,
4.29388434e-02, 3.57698388e-02, 3.57816070e-02],
[-6.73626959e-02, -6.73594549e-02, -4.77313846e-02, ...,
3.24938931e-02, 5.03025576e-02, 5.03171906e-02],
...,
[-1.08916372e-01, -1.08947843e-01, -1.27490088e-01, ...,
8.06099400e-02, 1.63134456e-01, 1.67818889e-01],
[-6.99967891e-02, -6.99824393e-02, -1.12165108e-01, ...,
1.21767908e-01, 1.38440162e-01, 1.47194624e-01],
[-6.99767619e-02, -6.99620694e-02, -1.12159580e-01, ...,
1.20651603e-01, 1.40502006e-01, 1.51199013e-01]],
[[-1.58361830e-02, -1.58361737e-02, -4.20171767e-02, ...,
3.23081249e-03, -1.08113363e-02, -1.08120069e-02],
[-1.58345737e-02, -1.58375446e-02, -4.20166217e-02, ...,
3.21588945e-03, -1.08177215e-02, -1.08104404e-02],
[-2.73063779e-02, -2.73035020e-02, -3.15086991e-02, ...,
1.01064956e-02, 1.85807515e-02, 1.85920112e-02],
...,
[-1.06948748e-01, -1.06970586e-01, -1.27894104e-01, ...,
9.86869857e-02, 1.38150871e-01, 1.41195774e-01],
[-6.52732030e-02, -6.52623773e-02, -1.07937858e-01, ...,
1.62120432e-01, 1.33735448e-01, 1.39789194e-01],
[-6.52577281e-02, -6.52468204e-02, -1.07938915e-01, ...,
1.61273435e-01, 1.35231227e-01, 1.42917693e-01]],
[[ 3.65034537e-03, 3.65046924e-03, -3.04184519e-02, ...,
-2.19731461e-02, -4.68177944e-02, -4.68180627e-02],
[ 3.65048903e-03, 3.64878308e-03, -3.04178949e-02, ...,
-2.19783746e-02, -4.68169525e-02, -4.68143001e-02],
[-3.38897994e-03, -3.38710798e-03, -3.42487320e-02, ...,
8.34795574e-05, -5.74416295e-03, -5.73810702e-03],
...,
[-9.23118368e-02, -9.23244432e-02, -1.17220066e-01, ...,
9.59281474e-02, 9.85913128e-02, 1.00409001e-01],
[-5.14872335e-02, -5.14793582e-02, -9.39545259e-02, ...,
1.61141366e-01, 1.11508891e-01, 1.15319595e-01],
[-5.14753163e-02, -5.14674783e-02, -9.39605534e-02, ...,
1.60376072e-01, 1.12537637e-01, 1.17749512e-01]]],
dtype=float32)
虽然数组的三维保留了,但是仅仅靠单点的高度判断整个空间的垂直速度是不靠谱的
那么有没有更加准确靠谱的函数索引wrf文件呢
什么年代了还在用np.where,既然是xarray格式。xarray库应该有对应的函数
果然有xr.where,一句话搞定
In [11]:
wa_filtered = wa.where((z >= height_min) & (z <= height_max),drop=True)
wa_filtered.shape
Out[11]:
(23, 437, 447)
In [12]:
wa_filtered
Out[12]:
如果想要保留筛选后的高度层,将drop参数去掉即可
In [14]:
wa_filtered = wa.where((z >= height_min) & (z <= height_max))
wa_filtered.shape
Out[14]:
(34, 437, 447)
截取3km到11km的垂直速度并绘制剖面图
In [22]:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.cm import get_cmap
import cartopy.crs as crs
from cartopy.feature import NaturalEarthFeature
from netCDF4 import Dataset
from wrf import to_np, getvar, CoordPair, vertcross
# Open the NetCDF file
filename = "/home/mw/input/typhoon9537/wrfout_d01_2019-08-09_03_00_00"
ncfile = Dataset(filename)
# Extract the model height and wind speed
z = getvar(ncfile, "z")
wa = getvar(ncfile, "wa")
# 定义高度范围
height_min = 3000 # 3km
height_max = 11000 # 11km
wa_filtered = wa.where((z >= height_min) & (z <= height_max))
# Create the start point and end point for the cross section
start_point = CoordPair(lat=24, lon=120)
end_point = CoordPair(lat=32, lon=128)
# Compute the vertical cross-section interpolation. Also, include the
# lat/lon points along the cross-section.
wa_cross = vertcross(wa_filtered, z, wrfin=ncfile, start_point=start_point,
end_point=end_point, latlon=True, meta=True)
# Create the figure
fig = plt.figure(figsize=(12,8))
ax = plt.axes()
# Make the contour plot
wa_contours = ax.contourf(to_np(wa_cross), cmap=get_cmap("jet"))
# Add the color bar
plt.colorbar(wa_contours, ax=ax)
# Set the x-ticks to use latitude and longitude labels.
coord_pairs = to_np(wa_cross.coords["xy_loc"])
x_ticks = np.arange(coord_pairs.shape[0])
x_labels = [pair.latlon_str(fmt="{:.1f}, {:.1f}")
for pair in to_np(coord_pairs)]
ax.set_xticks(x_ticks[::20])
ax.set_xticklabels(x_labels[::20], rotation=45, fontsize=12)
# Set the y-ticks to be height.
vert_vals = to_np(wa_cross.coords["vertical"])
v_ticks = np.arange(vert_vals.shape[0])
ax.set_yticks(v_ticks[::20])
ax.set_yticklabels(vert_vals[::20], fontsize=12)
# Set the x-axis and y-axis labels
ax.set_xlabel("Latitude, Longitude", fontsize=12)
ax.set_ylabel("Height (m)", fontsize=12)
plt.title("Vertical Cross Section of wa(m/s))")
plt.show()
/tmp/ipykernel_54/3329102611.py:37: MatplotlibDeprecationWarning: The get_cmap function was deprecated in Matplotlib 3.7 and will be removed two minor releases later. Use ``matplotlib.colormaps[name]`` or ``matplotlib.colormaps.get_cmap(obj)`` instead.
wa_contours = ax.contourf(to_np(wa_cross), cmap=get_cmap("jet"))