Filling Gaps in the Measurement Records

Because the ARM program records actual measurements rather than estimated values, gaps unavoidably appear in the temporal stream of measurements whenever instruments or infrastructure fail. Most gaps are short in duration and affect only one or a few related parameters. However, some failures, such as wide-area power outages or ice storms, occasionally affect nearly all recorded parameters at one or more ARM facilities simultaneously. One phase of this project seeks to statistically characterize the nature of the data gaps in the various ARM parameters.

Even more rarely, the instruments and sensors may themselves become unstable or defective. Each ARM data file contains in its header Quality Control (QC) information, usually provided by the instrument mentor, which establishes minimum and maximum limits on the range of reasonable measurements. We apply these criteria, filtering out measurements that fall outside the QC limits, when preparing the customized statistical summaries we generate for carbon modeling. Sensors may produce spurious values outside the QC limits immediately prior to a complete failure. Such Quality Control "trimming" is important for usability, but it creates new data gaps and makes existing gaps even larger.
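Purely as an illustration of this range trimming (the limit values, variable names, and data below are hypothetical, not taken from an actual ARM file or its mentor-supplied header), a minimal Python sketch might look like:

    import numpy as np

    def qc_trim(values, valid_min, valid_max):
        """Replace measurements outside the mentor-supplied QC range with NaN.
        Trimmed values become new gaps that imputation must later fill."""
        values = np.asarray(values, dtype=float)
        out_of_range = (values < valid_min) | (values > valid_max)
        trimmed = values.copy()
        trimmed[out_of_range] = np.nan
        return trimmed, int(out_of_range.sum())

    # A sensor drifting high before failing outright (trailing NaNs):
    hourly_temp = [21.3, 22.1, 23.0, 57.9, 61.2, np.nan, np.nan]
    clean, n_trimmed = qc_trim(hourly_temp, valid_min=-40.0, valid_max=50.0)
    # clean -> [21.3, 22.1, 23.0, nan, nan, nan, nan]; n_trimmed -> 2

Note how the two spurious pre-failure readings are removed, lengthening the existing gap from two hours to four.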

Potential gap-filling methods vary widely in sophistication and complexity. Effort invested in filling gaps must be justified by evaluating whether carbon models are sensitive to, or benefit from, the marginal improvements provided by more involved gap imputation methods.

Do various gap-filling methods make a difference?

The question above can be recognized as a variant of the one our "Make-a-Difference" (MAD) framework was designed to answer. Essentially a customized analysis for testing carbon model sensitivity, the MAD framework can also be used to evaluate and justify the necessity of increasingly complex gap-filling procedures. When a MAD framework evaluation no longer demonstrates a difference, there is no need to invest additional effort in more elaborate imputation methods.

Of course, this sensitivity will vary with the selection of particular parameters, carbon models, output estimates, and imputation methods. Nevertheless, the existing MAD framework provides a "yardstick" which can be used for the evaluation of each specific combination of parameter, model, model prediction, and gap-filling method.
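As a sketch only (the MAD framework's actual statistics and thresholds are not reproduced here; the metric and tolerance below are illustrative assumptions), the decision rule amounts to comparing model output driven by two versions of the gap-filled input:

    import numpy as np

    def makes_a_difference(output_simple, output_complex, tolerance):
        """MAD-style check: does driving the carbon model with a more
        elaborately gap-filled input change its output by more than
        `tolerance` (in the same units as the model output)?"""
        a = np.asarray(output_simple, dtype=float)
        b = np.asarray(output_complex, dtype=float)
        rmsd = np.sqrt(np.nanmean((a - b) ** 2))
        return rmsd > tolerance, rmsd

    # If the two runs agree within tolerance, the extra imputation effort
    # is not justified for that parameter/model/output combination.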

Imputation Methods

Imputation of missing values can draw on neighboring values in the dimension of time, the dimension of space, or a combination of both: values recorded at the same site at other times can inform the estimate, as can values recorded at other sites at the same time. We also distinguish univariate methods, which use information only from the same parameter that is missing and being estimated, from multivariate methods, which use information from parameters other than (or in addition to) the missing parameter. Because of their simplicity, our initial imputation approaches are univariate. We will build in complexity toward multivariate approaches as necessary and as justified by the MAD framework tests.

This project is not the first to face the problem of filling holes in a stream of measurements. The Ameriflux network of eddy-flux towers records measurements that are used in modeling and cannot tolerate measurement gaps. Eddy-flux correlation estimates are themselves based on a statistical model which, in turn, relies on a set of ancillary measurements, so missing data in either the primary or the ancillary measurements prevent flux estimates from being made or used. The Ameriflux sites have developed an elaborate approach to gap-filling to address this problem. Their methods, however, are too specialized to be of general use here.

A Generic Approach to Imputation

We are developing a generalized Univariate Generic Imputer Tool (UGIT) that automatically fills gaps in measurements by selecting the best method from a suite of standard statistical models based on time substitution, space substitution, or a combination of the two. This tool, written in SAS, uses temporally lagged measurements from the previous day, week, and year, as well as the best available alternative site and the average across all available sites, to build a regression model for estimating missing values. A combination of the best available alternative site and the previous day is also considered. Whichever model produces the lowest RMSE among this suite of regression approaches is the one actually used to estimate the missing values.
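The production tool is written in SAS; the Python fragment below is only a minimal sketch of this lowest-RMSE selection logic, with placeholder predictor names (prev_day, best_site, and so on) standing in for the lagged and alternative-site series described above:

    import numpy as np

    def fit_rmse(y, X):
        """Least-squares fit of y on X (with intercept) over rows where all
        values are present; returns (RMSE, coefficients)."""
        ok = ~np.isnan(y) & ~np.isnan(X).any(axis=1)
        A = np.column_stack([np.ones(ok.sum()), X[ok]])
        coef, *_ = np.linalg.lstsq(A, y[ok], rcond=None)
        rmse = np.sqrt(np.mean((y[ok] - A @ coef) ** 2))
        return rmse, coef

    def choose_best_model(y, candidates):
        """candidates maps a model name to its 2-D predictor array; the model
        with the lowest RMSE is the one used to estimate missing values."""
        results = {name: fit_rmse(y, X) for name, X in candidates.items()}
        best = min(results, key=lambda name: results[name][0])
        return best, results[best]

    # Candidate predictor sets mirroring the suite described above:
    # candidates = {
    #     "prev_day":            prev_day[:, None],
    #     "prev_week":           prev_week[:, None],
    #     "prev_year":           prev_year[:, None],
    #     "best_site":           best_site[:, None],
    #     "all_sites_avg":       site_avg[:, None],
    #     "best_site+prev_day":  np.column_stack([best_site, prev_day]),
    # }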

UGIT first assembles all available daily or hourly summarized data from all ARM facilities and Oklahoma MESONET sites. Day-, week-, and year-lagged values are calculated for each parameter at each site, and the single best alternative spatial predictor site is identified for each parameter at each site. The tool then fits each time-only, space-only, and time-plus-space regression model and constructs a table of results, including the R² and RMSE for each. Finally, UGIT uses this regression table to select the best model, patches the gaps, and adds a flag to the data set indicating which model was actually used to estimate each imputed value. The choice of regression depends not only on the performance of the model, but also on the presence of the data values required for each type of regression. Thus, if a gap resulted from a wide-area outage that lasted for two days, values from the best alternative site or from the previous day may not be available, precluding the use of those regressions for filling that gap.
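Again only as a hypothetical sketch of the fallback just described (not the SAS implementation), the per-gap choice can be expressed as: try the candidate models in order of increasing RMSE and use the first one whose required predictor values actually exist for that hour:

    import numpy as np

    def impute_one_gap(t, ranked_models, predictors, coefs):
        """Fill the gap at time index t with the best-ranked model whose
        predictors are all present at t.  Returns (estimate, model_used), or
        (nan, None) when nothing is usable, e.g. during a wide-area outage
        that also wipes out the previous day and the best alternative site."""
        for name in ranked_models:            # ordered by RMSE, best first
            x = predictors[name][t]           # predictor values at time t
            if not np.isnan(x).any():
                b = coefs[name]               # [intercept, slope_1, ...]
                return b[0] + float(np.dot(b[1:], x)), name
        return np.nan, None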

We used UGIT to fill gaps for several parameters at all ARM SGP facilities and Oklahoma MESONET sites. Shortwave radiation was measured at the greatest number of sites (53), while longwave radiation was measured at the fewest (21); most parameters were measured at 46 sites. UGIT produced as output a continuous record of all imputed variables at all sites.

The table below shows the relative counts of which models proved best and were actually used for imputing missing values in hourly summaries of the listed parameters:

Frequency distribution of best imputation method (temporal, spatial, or both) used to fill measurement gaps in selected ARM variables used as input for carbon simulations. Each cell gives the number of gaps filled by that model and, in parentheses, the row percentage.

Imputed Parameter         Prev Day      Prev Week    Prev Year    Best Facility   Spatial Avg   Facility + Day   Total Gaps
                          (temporal)    (temporal)   (temporal)   (spatial)       (spatial)     (time + space)
precipitation             0 (0.00)      72 (10.86)   4 (0.60)     512 (77.22)     5 (0.75)      70 (10.56)       663
windspeed                 31 (4.68)     0 (0.00)     6 (0.91)     276 (41.69)     5 (0.76)      344 (51.96)      662
barometric pressure       21 (2.58)     3 (0.37)     1 (0.12)     218 (26.75)     3 (0.37)      569 (69.82)      815
temperature               32 (3.88)     7 (0.85)     1 (0.12)     209 (25.36)     2 (0.24)      573 (69.54)      824
relative humidity         30 (2.92)     7 (0.68)     0 (0.00)     206 (20.02)     3 (0.37)      783 (76.09)      1029
vapor pressure deficit    120 (3.36)    18 (0.50)    0 (0.00)     1183 (33.13)    4 (0.11)      2246 (62.90)     3571
longwave radiation        707 (22.75)   196 (6.31)   25 (0.80)    770 (24.77)     151 (4.86)    1259 (40.51)     3108
shortwave radiation       149 (4.70)    0 (0.00)     0 (0.00)     1159 (36.58)    55 (1.74)     1805 (56.98)     3168
Total                     1090          303          37           4533            228           7649             13840

The hybrid time + space model (single best other facility plus the previous day's value) was the best predictor (i.e., lowest RMSE) for most parameters. Longwave radiation is not recorded at the MESONET sites, so the number of hours that must be imputed is much larger for this parameter. Longwave radiation and precipitation showed greater use of the time-only models.


The following two tables show the strength of the predictive relationship using each of the standard types of regressions at a "typical" MESONET site and a "typical" ARM facility:

R² and (RMSE) of the relationship for a "typical" Oklahoma MESONET site (BLAS)

Variable   Best site + Prev day   Best site     Previous day   Previous week   Previous year   All sites avg
precip     0.27 (0.09)            0.27 (0.09)   0.00 (0.14)    0.00 (0.14)     0.00 (0.15)     0.00 (0.14)
temp       0.99 (1.0)             0.99 (1.0)    0.83 (4.6)     0.62 (6.9)      0.59 (7.0)      0.58 (7.2)
vpd        0.96 (0.17)            0.96 (0.20)   0.72 (0.48)    0.48 (0.65)     0.36 (0.72)     0.37 (0.72)
windspd    0.79 (1.1)             0.78 (1.1)    0.08 (2.3)     0.02 (2.4)      0.02 (2.3)      0.03 (2.4)
swrad      0.97 (41.2)            0.97 (41.8)   0.75 (132.6)   0.57 (153.7)    0.66 (152.3)    0.74 (136.8)
lwrad      0.84 (19.2)            0.83 (19.8)   0.59 (39.8)    0.43 (47.1)     0.37 (43.3)     0.40 (48.1)

R² and (RMSE) of the relationship for a "typical" ARM facility (E13 Central Facility)

Variable   Best site + Prev day   Best site     Previous day   Previous week   Previous year   All sites avg
precip     0.39 (0.07)            0.40 (0.07)   0.00 (0.10)    0.00 (0.10)     0.00 (0.10)     0.00 (0.10)
temp       0.99 (1.0)             0.99 (1.0)    0.84 (4.54)    0.62 (6.88)     0.59 (7.10)     0.58 (7.24)
vpd        0.96 (0.18)            0.96 (0.18)   0.71 (0.49)    0.46 (0.67)     0.34 (0.75)     0.36 (0.73)
windspd    0.81 (1.17)            0.81 (1.17)   0.06 (2.63)    0.01 (2.71)     0.01 (2.64)     0.02 (2.70)
swrad      0.74 (48.2)            0.66 (55.1)   0.51 (68.5)    0.45 (72.9)     0.42 (75.2)     0.57 (64.6)
lwrad      0.91 (13.9)            0.90 (13.9)   0.62 (40.5)    0.47 (47.8)     0.49 (45.5)     0.45 (48.5)

Parameters differed widely in their predictability. Temperature was the easiest to predict; all models did a reasonable job with this temporally and spatially correlated variable. Vapor pressure deficit was also relatively easy to predict. Precipitation was an imputation challenge, presumably because it is highly variable both temporally and spatially. Windspeed was intermediate between these extremes.

Our project prepared and presented a poster, "Characterizing and Filling Data Gaps in ARM Measurements for Carbon Models," at the AGU meeting in December 2003.

The "Hole-Punching" Experiment

We are planning to perform an experiment to test the effectiveness of data imputation methods with respect to output from selected carbon models. We will start with a complete, unbroken data record of a number of measured ARM parameters. From this complete historical record, we will deliberately remove measurements for certain periods of time, creating artificial gaps of various types. We will retain the complete record of actual measurements with which we started for comparison.

There will be four "factors" in the experimental design of the "Hole-Punching" experiment.

We will compare the original, unbroken data record with the interrupted and gap-filled data stream using our MAD framework, relative to the output of several carbon models. We will likely have to restrict the investigated levels of each of these factors to prevent the size of the Hole-Punching experiment from becoming unwieldy.
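A minimal sketch of the hole-punching step itself (the gap positions and lengths below are arbitrary examples, not the experimental design):

    import numpy as np

    def punch_holes(series, gap_starts, gap_lengths):
        """Copy a complete record, blank out artificial gaps (runs of NaN), and
        keep the withheld truth so gap-filled values can be scored later."""
        punched = np.asarray(series, dtype=float).copy()
        withheld = {}
        for start, length in zip(gap_starts, gap_lengths):
            withheld[(start, length)] = punched[start:start + length].copy()
            punched[start:start + length] = np.nan
        return punched, withheld

    # e.g., punch a 2-hour and a 48-hour gap into an hourly record:
    # punched, truth = punch_holes(hourly_values, gap_starts=[100, 500],
    #                              gap_lengths=[2, 48])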

After a thorough examination of univariate methods, we plan to undertake (if warranted for particular "difficult" parameters like precipitation, according to the MAD framework evaluation) an examination of multivariate imputation methods. One of the strengths of the ARM data set is that so many different parameters are measured. By utilizing information present in these multiple cotemporaneous parameters, we should be able to improve substantially on the accuracy of estimated missing data values. Multivariate methods, however, will be substantially more complex than univariate approaches.

Spatial interpolation of measured ARM values across areas can be viewed as a special form of imputation. Experience gained in the gap-filling portion of this project should provide insights into the eventual extrapolation of point measurements taken at the spatial constellation of ARM facilities into fully populated, continuous grids of estimated values suitable for driving carbon models operating in regional grid mode.




William W. Hargrove (hnw@fire.esd.ornl.gov)
Last Modified: Fri Jul 2 13:38:04 EDT 2004