In this document I explore how to build the first part of the evaluation system I proposed, under the working title “Forecast-Hour Evaluation.” The idea is to assess the performance of the model by comparing forecasts that were started at different initialization times (each using the most recent 00-hr forecast as input).
For this evaluation system we need to look at three different output folders, named forecast_day_minus_0, forecast_day_minus_1, and forecast_day_minus_2. The contents of each of these folders will be similar: wrfout files for 86 forecast hours and time-series data for different locations of interest. Here we will first read the forecast data.
Now we will read the observation data from the ASOS stations. The script that downloads the data is ./obs_station_day_minus_0/dl_ny_asos.py. The lines that set the dates to download need to be changed before running it. Once the files are downloaded, the lines below read the data and add column names.
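As an illustration of the "read the data and add column names" step, here is a minimal Python sketch using only the standard library. It is not the code used to produce the output in this document. The column names are copied from the observation data frame printed later on; the raw file layout (comma-separated, one header row, `M` as the missing-value marker) is an assumption, since the download script's output format is not shown here.

```python
import csv
import io
import math

# Column names copied from the observation data frame shown later in this document.
COLUMN_NAMES = ["Station", "Date.Time", "Temperature",
                "Relative.Humidity", "Wind.Direction", "Wind.Speed"]


def parse_value(text):
    """Convert a numeric field to float; blank or 'M' (assumed missing marker) becomes NaN."""
    text = text.strip()
    if text in ("", "M"):
        return math.nan
    return float(text)


def read_asos(handle, skip_header=True):
    """Read comma-separated ASOS rows and attach COLUMN_NAMES to each row."""
    reader = csv.reader(handle)
    if skip_header:
        next(reader, None)  # discard the file's own header row
    rows = []
    for fields in reader:
        if len(fields) != len(COLUMN_NAMES):
            continue  # skip malformed lines
        record = dict(zip(COLUMN_NAMES, fields))
        for name in COLUMN_NAMES[2:]:  # every column after Station and Date.Time is numeric
            record[name] = parse_value(record[name])
        rows.append(record)
    return rows


# Hypothetical two-row sample standing in for a downloaded file.
sample = io.StringIO(
    "station,valid,tmpf,relh,drct,sknt\n"
    "JFK,2020-03-01 00:00,M,M,320,18\n"
    "JFK,2020-03-01 00:15,M,M,M,20\n"
)
obs = read_asos(sample)
print(obs[0]["Station"], obs[0]["Wind.Speed"])
```

To read a real file, pass an open file handle instead of the `io.StringIO` sample.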
Model and observation data do not share the same units for the same variable. For temperature, WRF reports Kelvin and ASOS reports degrees Fahrenheit. For winds, WRF reports m/s and ASOS reports knots. The formulas used to convert the values to a common system are shown here. I will use Kelvin for temperature and m/s for wind speed.
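The two conversions can be written as a short Python sketch. The function names are illustrative, and the knot factor uses the exact definition of the nautical mile (1852 m); the original analysis may have used a rounded factor.

```python
KELVIN_OFFSET = 273.15
KNOT_TO_MS = 1852.0 / 3600.0  # one knot is one nautical mile (1852 m) per hour


def fahrenheit_to_kelvin(temp_f):
    """Convert degrees Fahrenheit to Kelvin: K = (F - 32) * 5/9 + 273.15."""
    return (temp_f - 32.0) * 5.0 / 9.0 + KELVIN_OFFSET


def knots_to_ms(speed_kt):
    """Convert a wind speed in knots to metres per second."""
    return speed_kt * KNOT_TO_MS


print(fahrenheit_to_kelvin(32.0))  # freezing point of water
print(knots_to_ms(18.0))
```

Converting the observations to the model's units (rather than the reverse) keeps the WRF output untouched, so only one side of the comparison is transformed.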
Now we have one data frame for all the observations, and three (3) data frames of the WRF data (one data frame per forecast initialization time). The lines below give a preview of each data frame: the three WRF data frames first, followed by the observations.
## Date.Time year mon day hour min sec Temperature Mixing.Ratio
## 1 2020-03-02 00:00:05 2020 3 2 0 0 5.0004 278.6126 0.00272
## 2 2020-03-02 00:00:10 2020 3 2 0 0 10.0008 278.6269 0.00272
## 3 2020-03-02 00:00:15 2020 3 2 0 0 15.0012 278.6368 0.00272
## 4 2020-03-02 00:00:20 2020 3 2 0 0 20.0016 278.6466 0.00272
## 5 2020-03-02 00:00:24 2020 3 2 0 0 24.9984 278.6567 0.00273
## 6 2020-03-02 00:00:29 2020 3 2 0 0 29.9988 278.6671 0.00273
## U_WIND V_WIND Wind.Speed Wind.Direction Station
## 1 2.20330 -0.56591 2.274815 284.4048 JFK
## 2 2.19619 -0.56805 2.268465 284.5019 JFK
## 3 2.17427 -0.55716 2.244522 284.3729 JFK
## 4 2.15956 -0.55426 2.229552 284.3945 JFK
## 5 2.14220 -0.55204 2.212186 284.4506 JFK
## 6 2.12870 -0.54625 2.197670 284.3922 JFK
## Date.Time year mon day hour min sec Temperature Mixing.Ratio
## 1 2020-03-01 00:00:05 2020 3 1 0 0 5.0004 271.9813 0.0017
## 2 2020-03-01 00:00:10 2020 3 1 0 0 10.0008 272.0778 0.0017
## 3 2020-03-01 00:00:15 2020 3 1 0 0 15.0012 272.1663 0.0017
## 4 2020-03-01 00:00:20 2020 3 1 0 0 20.0016 272.2466 0.0017
## 5 2020-03-01 00:00:24 2020 3 1 0 0 24.9984 272.3199 0.0017
## 6 2020-03-01 00:00:29 2020 3 1 0 0 29.9988 272.3868 0.0017
## U_WIND V_WIND Wind.Speed Wind.Direction Station
## 1 4.56979 -4.92495 6.718490 317.1422 JFK
## 2 4.45795 -4.72146 6.493497 316.6443 JFK
## 3 4.37077 -4.55255 6.311049 316.1670 JFK
## 4 4.28755 -4.42281 6.159897 315.8897 JFK
## 5 4.21225 -4.31989 6.033614 315.7228 JFK
## 6 4.14124 -4.22575 5.916657 315.5787 JFK
## Date.Time year mon day hour min sec Temperature Mixing.Ratio
## 1 2020-02-29 00:00:05 2020 2 29 0 0 5.0004 274.9658 0.00217
## 2 2020-02-29 00:00:10 2020 2 29 0 0 10.0008 275.0897 0.00217
## 3 2020-02-29 00:00:15 2020 2 29 0 0 15.0012 275.1934 0.00217
## 4 2020-02-29 00:00:20 2020 2 29 0 0 20.0016 275.2825 0.00216
## 5 2020-02-29 00:00:24 2020 2 29 0 0 24.9984 275.3604 0.00216
## 6 2020-02-29 00:00:29 2020 2 29 0 0 29.9988 275.4306 0.00216
## U_WIND V_WIND Wind.Speed Wind.Direction Station
## 1 10.80950 -0.60489 10.826411 273.2029 JFK
## 2 10.30843 -0.60953 10.326435 273.3839 JFK
## 3 9.95932 -0.60359 9.977594 273.4682 JFK
## 4 9.67320 -0.60363 9.692016 273.5708 JFK
## 5 9.40746 -0.60228 9.426720 273.6632 JFK
## 6 9.17261 -0.59789 9.192075 273.7294 JFK
## Station Date.Time Temperature Relative.Humidity Wind.Direction
## 1 JFK 2020-03-01 00:00:00 NaN NaN 320
## 2 JFK 2020-03-01 00:05:00 NaN NaN 320
## 3 JFK 2020-03-01 00:10:00 NaN NaN 320
## 4 JFK 2020-03-01 00:15:00 NaN NaN NaN
## 5 JFK 2020-03-01 00:20:00 NaN NaN 310
## 6 JFK 2020-03-01 00:25:00 NaN NaN 320
## Wind.Speed year mon day hour min sec
## 1 9.259259 2020 3 1 0 0 0
## 2 8.230453 2020 3 1 0 5 0
## 3 9.773663 2020 3 1 0 10 0
## 4 10.288066 2020 3 1 0 15 0
## 5 8.230453 2020 3 1 0 20 0
## 6 10.288066 2020 3 1 0 25 0
Time-matching is performed using a routine that can be found in Analysis01-Time_Matching_Problem.Rmd. The time matching will be done per variable. For the Forecast-Hour Evaluation product, we will focus on the temperature, wind speed and wind direction variables. Also, now that we have read all the TS data and ASOS data, we need to extract the day of interest, or doi, for the time-series.
Note that for this product the “day of interest” will always be the UTC date of the day before.
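A minimal Python sketch of extracting the day of interest follows. It assumes that "the day before" means the UTC date preceding the time the product is generated, and that each record carries a `Date.Time` value already parsed to a `datetime` in UTC; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def day_of_interest(now=None):
    """Return the UTC date of the day before `now`.

    `now` must be timezone-aware; it defaults to the current time.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return (now.astimezone(timezone.utc) - timedelta(days=1)).date()


def filter_to_day(records, doi, time_key="Date.Time"):
    """Keep only the records whose timestamp falls on the day of interest."""
    return [r for r in records if r[time_key].date() == doi]


# Example with a fixed run time so the result is reproducible.
run_time = datetime(2020, 3, 2, 15, 0, tzinfo=timezone.utc)
doi = day_of_interest(run_time)
records = [
    {"Date.Time": datetime(2020, 3, 1, 23, 55), "Temperature": 275.0},
    {"Date.Time": datetime(2020, 3, 2, 0, 0), "Temperature": 274.8},
]
print(doi, len(filter_to_day(records, doi)))
```

The same filter would be applied to the observation records and to each of the three forecast data sets.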
We now have filtered data frames for the observations and model data for the day of interest.
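The time-matching routine itself lives in a separate file and is not reproduced here. As a rough idea of what such a step can look like, the Python sketch below pairs each observation time with the nearest model time, and rejects pairs that are further apart than a tolerance. The nearest-neighbour strategy, the 30-second tolerance, and the function name are all assumptions for illustration, not a description of the routine in Analysis01-Time_Matching_Problem.Rmd.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def match_nearest(obs_times, model_times, tolerance=timedelta(seconds=30)):
    """For each observation time, return the index of the nearest model time.

    `model_times` must be sorted in ascending order. An entry is None when
    no model time lies within `tolerance` of the observation.
    """
    matches = []
    for t in obs_times:
        pos = bisect_left(model_times, t)
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = [i for i in (pos - 1, pos) if 0 <= i < len(model_times)]
        if not candidates:
            matches.append(None)
            continue
        best = min(candidates, key=lambda i: abs(model_times[i] - t))
        matches.append(best if abs(model_times[best] - t) <= tolerance else None)
    return matches


model_times = [
    datetime(2020, 3, 1, 0, 0, 5),
    datetime(2020, 3, 1, 0, 0, 10),
    datetime(2020, 3, 1, 0, 4, 58),
    datetime(2020, 3, 1, 0, 5, 3),
]
obs_times = [
    datetime(2020, 3, 1, 0, 0, 0),
    datetime(2020, 3, 1, 0, 5, 0),
    datetime(2020, 3, 1, 0, 10, 0),
]
print(match_nearest(obs_times, model_times))
```

Because the model time series is much denser than the observations (seconds versus minutes in the previews above), searching from the observation side keeps the number of matched pairs equal to the number of usable observations.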
Next, we will select only the temperature data for comparing the model and observations. This needs to be done on a per-station basis. Note that we use the function drop_na() to drop rows that contain NaN or NA values. Since each variable is measured at a different interval, not every variable has data available at every time step in the ASOS data. The statistics functions may be too sensitive to missing data, so we take care to remove it from the observations after we have isolated a particular variable.
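The order of operations matters: isolating one variable first and only then dropping missing rows avoids discarding a valid temperature just because, say, the wind direction is missing at that time step. A minimal Python sketch of that idea, with hypothetical function names and records stored as dictionaries:

```python
import math


def is_missing(value):
    """True for None or a floating-point NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def select_variable(records, variable, keep=("Station", "Date.Time")):
    """Keep the identifying columns plus one variable, then drop rows where it is missing."""
    selected = [{**{k: r[k] for k in keep}, variable: r[variable]} for r in records]
    return [r for r in selected if not is_missing(r[variable])]


observations = [
    {"Station": "JFK", "Date.Time": "2020-03-01 00:00:00",
     "Temperature": math.nan, "Wind.Direction": 320.0},
    {"Station": "JFK", "Date.Time": "2020-03-01 00:51:00",
     "Temperature": 275.4, "Wind.Direction": math.nan},
]
temperature_only = select_variable(observations, "Temperature")
print(temperature_only)
```

In this made-up example the second row survives for temperature even though its wind direction is missing; dropping missing rows before selecting the variable would have removed both rows.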
For the temperature data I will use bias, RMSE, and MAE as the comparison statistics.
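For reference, the three statistics can be sketched in Python as below. The sign convention (model minus observation, so a negative bias means the model is too low) is an assumption; the inputs are assumed to be time-matched sequences of equal length with missing values already removed.

```python
import math


def _errors(model, obs):
    """Pairwise model-minus-observation errors for two time-matched sequences."""
    if len(model) != len(obs) or not model:
        raise ValueError("model and obs must be non-empty and of equal length")
    return [m - o for m, o in zip(model, obs)]


def bias(model, obs):
    """Mean error: positive when the model is too high on average."""
    errors = _errors(model, obs)
    return sum(errors) / len(errors)


def rmse(model, obs):
    """Root-mean-square error: penalises large misses more heavily."""
    errors = _errors(model, obs)
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mae(model, obs):
    """Mean absolute error: the average size of the miss, regardless of sign."""
    errors = _errors(model, obs)
    return sum(abs(e) for e in errors) / len(errors)


# Made-up temperatures in Kelvin, purely to exercise the functions.
model_t = [271.0, 272.0, 273.0]
obs_t = [272.0, 272.0, 272.0]
print(bias(model_t, obs_t), rmse(model_t, obs_t), mae(model_t, obs_t))
```

When the errors all share the same sign, MAE equals the absolute value of the bias, which is a quick way to spot a forecast that is uniformly too warm or too cold.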
Forecast.Init | BIAS | RMSE | MAE
WRF D-0 | 0.325 | 1.273 | 0.935
WRF D-1 | -0.413 | 0.944 | 0.783
WRF D-2 | -7.730 | 7.958 | 7.730

Forecast.Init | BIAS | RMSE | MAE
WRF D-0 | -2.869 | 3.429 | 2.909
WRF D-1 | -2.903 | 3.467 | 2.986
WRF D-2 | -2.206 | 3.944 | 2.767

Forecast.Init | RMSE | MAE
WRF D-0 | 26.299 | 18.556
WRF D-1 | 61.804 | 38.299
WRF D-2 | 106.622 | 103.003

Forecast.Init | BIAS | RMSE | MAE
WRF D-0 | 0.430 | 1.018 | 0.822
WRF D-1 | -0.918 | 1.398 | 1.219
WRF D-2 | -9.322 | 9.438 | 9.322

Forecast.Init | BIAS | RMSE | MAE
WRF D-0 | -2.355 | 2.567 | 2.379
WRF D-1 | -2.368 | 2.612 | 2.385
WRF D-2 | -1.774 | 2.322 | 1.891

Forecast.Init | RMSE | MAE
WRF D-0 | 35.078 | 28.444
WRF D-1 | 56.790 | 37.865
WRF D-2 | 112.199 | 107.557

Forecast.Init | BIAS | RMSE | MAE
WRF D-0 | 0.468 | 1.209 | 0.894
WRF D-1 | -0.675 | 1.328 | 1.071
WRF D-2 | -9.641 | 9.706 | 9.641

Forecast.Init | BIAS | RMSE | MAE
WRF D-0 | -1.820 | 2.272 | 1.990
WRF D-1 | -1.809 | 2.260 | 1.953
WRF D-2 | -1.748 | 2.231 | 1.970

Forecast.Init | RMSE | MAE
WRF D-0 | 66.413 | 53.435
WRF D-1 | 58.055 | 49.541
WRF D-2 | 111.906 | 101.054