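The World Meteorological Organization (WMO) today launched a challenge to improve forecast skill at the sub-seasonal to seasonal (S2S) scale. It is supported by a number of institutions and the prize money is generous. Anyone interested may want to follow it.
How will it work? Renkulab will host all code and scripts, and the training and verification data will be easily accessible from the European Weather Cloud. All code and results will be made openly available after the competition ends (looking forward to it).
Prizes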
The World Meteorological Organization (WMO) is launching an open prize challenge to improve current forecasts of precipitation and temperature from today’s best computational fluid dynamical models 3 to 6 weeks into the future using Artificial Intelligence and/or Machine Learning techniques. The challenge is organised by the World Weather Research Programme (WWRP)/World Climate Research Programme (WCRP) Subseasonal-to-Seasonal Prediction Project (S2S Project), in collaboration with Swiss Data Science Center (SDSC) and European Centre for Medium-Range Weather Forecasts (ECMWF).
Improved sub-seasonal to seasonal (S2S) forecast skill would benefit multiple user sectors immensely, including water, energy, health, agriculture and disaster risk reduction. The creation of an extensive database of S2S model forecasts has provided a new opportunity to apply the latest developments in machine learning to improve S2S prediction of temperature and precipitation forecasts up to 6 weeks ahead, with focus on biweekly averaged conditions around the globe.
The competition will be implemented on the platform of Renkulab which hosts all the codes and scripts. The training and verification data will be easily accessible from the European Weather Cloud and relevant access scripts will be provided to the participants. All the codes and forecasts of the challenge will be made open access after the end of the competition.
This is the landing page of the competition, presenting static information about the competition and a continuously updating leaderboard. For code examples and how to contribute, please visit the contribution template repository on renkulab.io.
Prizes are issued for the top three submissions beating the re-calibrated ECMWF benchmark:
The 3rd prize is reserved for the top submission from a developing or least developed country or small island state as per the UN list (see table C, F, H p.166ff). If such a submission is already among the top 2, the third-ranked submission, regardless of origin, will get the 3rd prize.
The objective of the competition is to improve week 3+4 and 5+6 subseasonal global probabilistic 2m temperature and total precipitation forecasts issued in the year 2020 by using Machine Learning/Artificial Intelligence.
The evaluation will be continuously performed by a scorer bot on renkulab.io, following the verification notebook. Submissions are evaluated on the Ranked Probability Score (RPS) between the ML-based forecasts and ground truth CPC temperature and accumulated precipitation observations based on pre-computed observations-based terciles. This RPS is compared to the RPS of the re-calibrated real-time 2020 ECMWF forecasts to form the Ranked Probability Skill Score (RPSS).
RPS is calculated with the open-source package xskillscore over all 2020 forecast_reference_times. For deterministic forecasts:
xs.rps(observations, deterministic_forecasts, category_edges=precomputed_tercile_edges, dim='forecast_reference_time')
For probabilistic forecasts:
xs.rps(observations, probabilistic_forecasts, category_edges=None, input_distributions='p', dim='forecast_reference_time')
See the xskillscore.rps API for details.
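To make the score concrete, here is a minimal NumPy sketch of the RPS for a single tercile forecast. It assumes the unnormalised definition (sum of squared differences between cumulative forecast and cumulative observed probabilities); whether this matches xskillscore in every detail should be checked against its API. The helper names `rps_terciles` and `categorize` are hypothetical and not part of the competition code.

```python
import numpy as np

def rps_terciles(forecast_probs, observed_category):
    """RPS of one forecast given as probabilities over ordered categories.

    forecast_probs: probabilities for (below normal, near normal, above normal)
    observed_category: index (0, 1 or 2) of the category that was observed
    """
    forecast_probs = np.asarray(forecast_probs, dtype=float)
    observed = np.zeros_like(forecast_probs)
    observed[observed_category] = 1.0
    # RPS compares the cumulative distributions, not the raw probabilities.
    diff = np.cumsum(forecast_probs) - np.cumsum(observed)
    return float(np.sum(diff ** 2))

def categorize(value, lower_edge, upper_edge):
    """Map a deterministic value to a tercile index using two category edges."""
    return int(np.digitize(value, [lower_edge, upper_edge]))

# Observation falls into the upper tercile (index 2) in all three cases.
climatology = rps_terciles([1 / 3, 1 / 3, 1 / 3], observed_category=2)
sharp_hit = rps_terciles([0.0, 0.0, 1.0], observed_category=2)
sharp_miss = rps_terciles([1.0, 0.0, 0.0], observed_category=2)
print(climatology, sharp_hit, sharp_miss)
```

A confident forecast in the right category scores 0, a confident forecast two categories away scores 2, and the climatological 1/3-1/3-1/3 forecast lies in between. Lower is better.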
def RPSS(rps_ML, rps_benchmark):
    """Ranked Probability Skill Score. Compares two RPS.

    1: max
    (0, 1]: positive means ML better than benchmark
    0: equal performance
    (-inf, 0): negative means ML worse than benchmark
    """
    return 1 - rps_ML / rps_benchmark  # positive means ML better than ECMWF benchmark
The final RPSS relevant for the prizes is calculated globally with spatial weighting and averaged over the two variables and two steps. For diagnostics, we host leaderboards for the two variables in three regions:
Please find more details in the verification notebook.
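The spatial weighting and averaging described above could look roughly like the sketch below. It is not the scorer's implementation: the cos(latitude) weights, the order of operations (RPSS per grid cell first, then the weighted mean) and the constant RPS fields are all assumptions made for illustration. The verification notebook is the authoritative reference.

```python
import numpy as np

def rpss(rps_ml, rps_benchmark):
    """Skill score as defined in the text: positive means ML beats the benchmark."""
    return 1 - rps_ml / rps_benchmark

def weighted_mean(field, latitudes):
    """Area-weighted mean of a (latitude, longitude) field with cos(latitude) weights."""
    weights = np.cos(np.deg2rad(latitudes))
    weights = np.clip(weights, 0.0, None)  # guard against round-off at the poles
    weights_2d = np.broadcast_to(weights[:, None], field.shape)
    return float(np.sum(field * weights_2d) / np.sum(weights_2d))

# Global 1.5 degree grid: 121 latitudes, 240 longitudes.
latitudes = np.arange(90.0, -90.1, -1.5)
longitudes = np.arange(0.0, 360.0, 1.5)
shape = (latitudes.size, longitudes.size)

# Made-up constant RPS fields, one per (variable, lead time) combination.
rps_benchmark = np.full(shape, 0.50)
rps_ml_by_case = {
    ("t2m", "week 3-4"): np.full(shape, 0.45),
    ("t2m", "week 5-6"): np.full(shape, 0.50),
    ("pr", "week 3-4"): np.full(shape, 0.475),
    ("pr", "week 5-6"): np.full(shape, 0.55),
}

scores = {
    case: weighted_mean(rpss(rps_ml, rps_benchmark), latitudes)
    for case, rps_ml in rps_ml_by_case.items()
}
final_score = float(np.mean(list(scores.values())))
print(scores, final_score)
```

With cos(latitude) weights, grid cells near the poles contribute almost nothing, which compensates for the shrinking cell area on a regular latitude-longitude grid.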
We expect submissions to cover all bi-weekly week 3-4 and week 5-6 forecasts issued in 2020, see timings. We expect one submission netcdf file covering all 53 weekly forecasts issued in 2020. Submissions have to be gridded on a global 1.5 degree grid.
Each submission is a netcdf file with the following dimension sizes and coordinates:
>>> # in xarray
>>> ML_forecasts.sizes
Frozen(SortedKeysDict({'forecast_reference_time': 53, 'latitude': 121, 'longitude': 240, 'lead_time': 2, 'category': 3}))
>>> ML_forecasts.coords # coordinates; time(lead_time, forecast_reference_time) is optional
Coordinates:
* latitude (latitude) float64 90.0 88.5 87.0 ... -88.5 -90.0
* longitude (longitude) float64 0.0 1.5 3.0 ... 357.0 358.5
* forecast_reference_time (forecast_reference_time) datetime64[ns] 2020-01...
* lead_time (lead_time) timedelta64[ns] 14 days 28 days
* category (category) <U11 '[0., 0.33)' '[0.33, 0.66)' '[0.66, 1.]'
time (lead_time, forecast_reference_time) datetime64[ns] 2...
A template file for submissions can soon be found here.
Such submissions need to be committed in git with git lfs.
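As a rough sanity check before committing, the dimension sizes listed above can be reproduced with plain NumPy. The axis order, the `check_submission` helper and the climatological 1/3 probabilities are assumptions for illustration only. The official template file defines the real layout.

```python
import numpy as np

# Coordinates matching the sizes shown above.
latitude = np.arange(90.0, -90.1, -1.5)   # 90.0, 88.5, ..., -90.0
longitude = np.arange(0.0, 360.0, 1.5)    # 0.0, 1.5, ..., 358.5
forecast_reference_time = np.arange(
    np.datetime64("2020-01-02"), np.datetime64("2021-01-01"), np.timedelta64(7, "D")
)                                          # every Thursday of 2020
lead_time = np.array([14, 28], dtype="timedelta64[D]")
category = np.array(["[0., 0.33)", "[0.33, 0.66)", "[0.66, 1.]"])

# Assumed axis order: forecast_reference_time, latitude, longitude, lead_time, category.
EXPECTED_SHAPE = (
    forecast_reference_time.size, latitude.size, longitude.size,
    lead_time.size, category.size,
)

def check_submission(probabilities):
    """Return a list of problems; an empty list means the array passed the checks."""
    problems = []
    if probabilities.shape != EXPECTED_SHAPE:
        problems.append(f"shape {probabilities.shape} != {EXPECTED_SHAPE}")
        return problems
    if np.any(probabilities < 0) or np.any(probabilities > 1):
        problems.append("probabilities outside [0, 1]")
    if not np.allclose(probabilities.sum(axis=-1), 1.0, atol=1e-5):
        problems.append("category probabilities do not sum to 1")
    return problems

# A climatological forecast: equal probability for each tercile everywhere.
climatology = np.full(EXPECTED_SHAPE, 1 / 3, dtype=np.float32)
print(EXPECTED_SHAPE, check_submission(climatology))
```

NaN entries fail the sum check in this sketch, so any masking convention used by the real template would need separate handling.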
After the competition, the code for training must be made public, so that the competition maintainers can check that the data timing requirements were met. The prizes will be distributed to the top 3 requirements-complying contributions at the end of the competition. During the competition the organizers may ask top listed participants to provide access to their training pipeline. Please indicate the resources used (number of CPUs/GPUs, memory, platform; see examples) in your scripts/notebooks to allow reproducibility. Submissions which cannot be independently reproduced cannot win prizes.
1) Which forecast starts/target periods (weeks 3-4 & 5-6) to require to be submitted?
Please find a list of the dates when forecasts are issued (forecast_reference_time) and the corresponding start and end in valid_time for week 3-4 and week 5-6.
forecast_reference_time | week 3-4 start | week 3-4 end | week 5-6 start | week 5-6 end | |
---|---|---|---|---|---|
2020-01-02 | 2020-01-16 | 2020-01-29 | 2020-01-30 | 2020-02-12 | |
2020-01-09 | 2020-01-23 | 2020-02-05 | 2020-02-06 | 2020-02-19 | |
2020-01-16 | 2020-01-30 | 2020-02-12 | 2020-02-13 | 2020-02-26 | |
2020-01-23 | 2020-02-06 | 2020-02-19 | 2020-02-20 | 2020-03-04 | |
2020-01-30 | 2020-02-13 | 2020-02-26 | 2020-02-27 | 2020-03-11 | |
2020-02-06 | 2020-02-20 | 2020-03-04 | 2020-03-05 | 2020-03-18 | |
2020-02-13 | 2020-02-27 | 2020-03-11 | 2020-03-12 | 2020-03-25 | |
2020-02-20 | 2020-03-05 | 2020-03-18 | 2020-03-19 | 2020-04-01 | |
2020-02-27 | 2020-03-12 | 2020-03-25 | 2020-03-26 | 2020-04-08 | |
2020-03-05 | 2020-03-19 | 2020-04-01 | 2020-04-02 | 2020-04-15 | |
2020-03-12 | 2020-03-26 | 2020-04-08 | 2020-04-09 | 2020-04-22 | |
2020-03-19 | 2020-04-02 | 2020-04-15 | 2020-04-16 | 2020-04-29 | |
2020-03-26 | 2020-04-09 | 2020-04-22 | 2020-04-23 | 2020-05-06 | |
2020-04-02 | 2020-04-16 | 2020-04-29 | 2020-04-30 | 2020-05-13 | |
2020-04-09 | 2020-04-23 | 2020-05-06 | 2020-05-07 | 2020-05-20 | |
2020-04-16 | 2020-04-30 | 2020-05-13 | 2020-05-14 | 2020-05-27 | |
2020-04-23 | 2020-05-07 | 2020-05-20 | 2020-05-21 | 2020-06-03 | |
2020-04-30 | 2020-05-14 | 2020-05-27 | 2020-05-28 | 2020-06-10 | |
2020-05-07 | 2020-05-21 | 2020-06-03 | 2020-06-04 | 2020-06-17 | |
2020-05-14 | 2020-05-28 | 2020-06-10 | 2020-06-11 | 2020-06-24 | |
2020-05-21 | 2020-06-04 | 2020-06-17 | 2020-06-18 | 2020-07-01 | |
2020-05-28 | 2020-06-11 | 2020-06-24 | 2020-06-25 | 2020-07-08 | |
2020-06-04 | 2020-06-18 | 2020-07-01 | 2020-07-02 | 2020-07-15 | |
2020-06-11 | 2020-06-25 | 2020-07-08 | 2020-07-09 | 2020-07-22 | |
2020-06-18 | 2020-07-02 | 2020-07-15 | 2020-07-16 | 2020-07-29 | |
2020-06-25 | 2020-07-09 | 2020-07-22 | 2020-07-23 | 2020-08-05 | |
2020-07-02 | 2020-07-16 | 2020-07-29 | 2020-07-30 | 2020-08-12 | |
2020-07-09 | 2020-07-23 | 2020-08-05 | 2020-08-06 | 2020-08-19 | |
2020-07-16 | 2020-07-30 | 2020-08-12 | 2020-08-13 | 2020-08-26 | |
2020-07-23 | 2020-08-06 | 2020-08-19 | 2020-08-20 | 2020-09-02 | |
2020-07-30 | 2020-08-13 | 2020-08-26 | 2020-08-27 | 2020-09-09 | |
2020-08-06 | 2020-08-20 | 2020-09-02 | 2020-09-03 | 2020-09-16 | |
2020-08-13 | 2020-08-27 | 2020-09-09 | 2020-09-10 | 2020-09-23 | |
2020-08-20 | 2020-09-03 | 2020-09-16 | 2020-09-17 | 2020-09-30 | |
2020-08-27 | 2020-09-10 | 2020-09-23 | 2020-09-24 | 2020-10-07 | |
2020-09-03 | 2020-09-17 | 2020-09-30 | 2020-10-01 | 2020-10-14 | |
2020-09-10 | 2020-09-24 | 2020-10-07 | 2020-10-08 | 2020-10-21 | |
2020-09-17 | 2020-10-01 | 2020-10-14 | 2020-10-15 | 2020-10-28 | |
2020-09-24 | 2020-10-08 | 2020-10-21 | 2020-10-22 | 2020-11-04 | |
2020-10-01 | 2020-10-15 | 2020-10-28 | 2020-10-29 | 2020-11-11 | |
2020-10-08 | 2020-10-22 | 2020-11-04 | 2020-11-05 | 2020-11-18 | |
2020-10-15 | 2020-10-29 | 2020-11-11 | 2020-11-12 | 2020-11-25 | |
2020-10-22 | 2020-11-05 | 2020-11-18 | 2020-11-19 | 2020-12-02 | |
2020-10-29 | 2020-11-12 | 2020-11-25 | 2020-11-26 | 2020-12-09 | |
2020-11-05 | 2020-11-19 | 2020-12-02 | 2020-12-03 | 2020-12-16 | |
2020-11-12 | 2020-11-26 | 2020-12-09 | 2020-12-10 | 2020-12-23 | |
2020-11-19 | 2020-12-03 | 2020-12-16 | 2020-12-17 | 2020-12-30 | |
2020-11-26 | 2020-12-10 | 2020-12-23 | 2020-12-24 | 2021-01-06 | |
2020-12-03 | 2020-12-17 | 2020-12-30 | 2020-12-31 | 2021-01-13 | |
2020-12-10 | 2020-12-24 | 2021-01-06 | 2021-01-07 | 2021-01-20 | |
2020-12-17 | 2020-12-31 | 2021-01-13 | 2021-01-14 | 2021-01-27 | |
2020-12-24 | 2021-01-07 | 2021-01-20 | 2021-01-21 | 2021-02-03 | |
2020-12-31 | 2021-01-14 | 2021-01-27 | 2021-01-28 | 2021-02-10 |
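The table follows a simple pattern that can be reproduced with the standard library: week 3-4 covers days 14 to 27 after the issue date and week 5-6 covers days 28 to 41. The sketch below is inferred from the listed dates; the function name `target_periods` is hypothetical.

```python
from datetime import date, timedelta

def target_periods(forecast_reference_time):
    """Return valid_time start and end dates for both bi-weekly targets."""
    return {
        "week 3-4 start": forecast_reference_time + timedelta(days=14),
        "week 3-4 end": forecast_reference_time + timedelta(days=27),
        "week 5-6 start": forecast_reference_time + timedelta(days=28),
        "week 5-6 end": forecast_reference_time + timedelta(days=41),
    }

# All 53 Thursdays of 2020 on which forecasts are issued.
first_issue = date(2020, 1, 2)
issue_dates = [first_issue + timedelta(weeks=i) for i in range(53)]

for issued in issue_dates[:2]:
    print(issued, target_periods(issued))
```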
2) Which data to “allow” to be used to make a specific ML forecast?
Main datasets for this competition are already available as renku datasets for both variables temperature and precipitation:
tag in climetlab | Description | renku dataset |
---|---|---|
forecast-benchmark | ECMWF week 3+4 & 5+6 re-calibrated real-time 2020 forecasts | missing |
observations | CPC daily observations interpolated on 1.5 degree grid | missing |
training-input | daily real-time 2020 forecasts initialized on Thursdays from models ECMWF, ECCC, NCEP | missing |
forecast-input | daily reforecasts initialized once per week until 2019 from models ECMWF, ECCC, NCEP | missing |
tercile_edges | Observations-based tercile category_edges | missing |
We encourage the use of subseasonal forecasts from the S2S and SubX projects:
However, any other publicly available data sources (like CMIP, NMME, etc.) with dates prior to the forecast_reference_time can be used for training-input and forecast-input. Purely empirical methods like persistence or climatology could also be used. The only strong data requirement concerns time, see timings.
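One way to respect the timing requirement is to filter every input by date before it reaches the model. The sketch below reads "prior" as strictly earlier than the issue date, which is an assumption. The timings section remains the authoritative rule, and `allowed_dates` is a hypothetical helper.

```python
from datetime import date

def allowed_dates(data_dates, forecast_reference_time):
    """Keep only the dates that lie strictly before the forecast issue date."""
    return [d for d in data_dates if d < forecast_reference_time]

forecast_reference_time = date(2020, 1, 2)
candidate_dates = [
    date(2019, 12, 31),
    date(2020, 1, 1),
    date(2020, 1, 2),   # same day as the forecast: excluded in this reading
    date(2020, 1, 3),   # after the forecast was issued: excluded
]
usable = allowed_dates(candidate_dates, forecast_reference_time)
print(usable)
```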
Ground truth sources are CPC temperature and accumulated precipitation from IRIDL:
pr: precipitation rate to accumulate
t2m: 2m temperature
In progress…
Follow the steps in the template renku project.
Where to train?
renku clone or git clone your project onto your own laptop or supercomputer account for the heavy lifting
EWC (where large parts of the data are stored) upon request. This opportunity is specifically targeted at participants from developing or least developed countries or small island states and/or without institutional computing resources. Please get in touch with Aaron for access. Please note that we cannot make promises about these resources given the unknown demand.
How to train?
We are looking for smart solutions here. Find a quick start here.
Please use the issue tracker in the renkulab s2s-ai-challenge gitlab repository for discussions and questions to the organizers.
Answered questions from the issue tracker will be transferred to the FAQ.
The prizes will be awarded to the top three submissions beating the ECMWF re-calibrated benchmark and following the rules. The final score is the spatially weighted average RPSS over [90N-60S], both variables and both lead times.
index | group_name | score | timestamp |
---|---|---|---|
0 | awesome group 4 | 0.0279 | 05/03/2021, 14:30:59 |
0 | awesome group | 0.0195 | 2021-04-01 09:51:55.083114 |
0 | awesome group | 0.019476 | 2021-04-01 09:15:01.142246 |
0 | awesome group | 0.019476 | 2021-04-01 09:16:40.951526 |
0 | awesome group | 0.019476 | 2021-04-01 09:49:42.806092 |
The following subleaderboards are purely diagnostic and show RPSS for two variables, two lead times and three subregions.
index | group_name | week 3-4 score | week 5-6 score | timestamp |
---|---|---|---|---|
0 | awesome group 4 | 0.27 | 0.27 | 05/03/2021, 14:36:05 |
0 | awesome group 4 | 0.27 | 0.27 | 05/03/2021, 14:36:05 |
index | group_name | week 3-4 score | week 5-6 score | timestamp |
---|---|---|---|---|
0 | awesome group 4 | 0.41 | 0.41 | 05/03/2021, 14:36:05 |
0 | awesome group 4 | 0.41 | 0.41 | 05/03/2021, 14:36:05 |
index | group_name | week 3-4 score | week 5-6 score | timestamp |
---|---|---|---|---|
0 | awesome group 4 | 0.27 | 0.31 | 05/03/2021, 14:36:05 |
0 | awesome group 4 | 0.27 | 0.31 | 05/03/2021, 14:36:05 |
index | group_name | week 3-4 score | week 5-6 score | timestamp |
---|---|---|---|---|
0 | awesome group 4 | -0.21 | -0.15 | 05/03/2021, 14:36:05 |
0 | awesome group 4 | -0.21 | -0.15 | 05/03/2021, 14:36:05 |
0 | awesome group 4 | -0.21 | -0.15 | 05/03/2021, 14:36:05 |
0 | awesome group 4 | -0.21 | -0.15 | 05/03/2021, 14:36:05 |
index | group_name | week 3-4 score | week 5-6 score | timestamp |
---|---|---|---|---|
0 | awesome group 4 | -0.47 | -0.38 | 05/03/2021, 14:36:05 |
0 | awesome group 4 | -0.47 | -0.38 | 05/03/2021, 14:36:05 |
index | group_name | week 3-4 score | week 5-6 score | timestamp |
---|---|---|---|---|
0 | awesome group 4 | -0.23 | -0.17 | 05/03/2021, 14:36:05 |
0 | awesome group 4 | -0.23 | -0.17 | 05/03/2021, 14:36:05 |