Rain Rate Interpolation Algorithm

The program rgmin generates 1-minute hourly rain rates from discrete tipping bucket rain gauge data by applying an interpolation algorithm. The interpolating routine is based on the cubic spline routine published in the book "Numerical Recipes". The mathematical theory underlying this algorithm is presented in section 3.3 of the book along with a copy of the source code. The mathematics involved will not be described in any detail, rather the spline will be described in context with its application to the broader algorithm being applied in rgmin. Tipping bucket data received by the TRMM Satellite Validation Office can differ in two ways:

1) Data sampling
       i) number of tips in minute
       ii) time of tip (to nearest second)

2) Amount of rain per tip
       i) .254 mm (0.01 in )
       ii) 1 mm (0.039 in)

Input directive parameters are provided before each run which aid in controlling the processing flow. These parameters determine which module is used to read the data and how much rain is accumulated for each tip. In addition the input directives file supplies a file listing identifying the file and its directory location .

While the data is being read from file a time of tip in seconds is attached to each record, thus the Julian day, hour, minute and second are converted to seconds to give the actual time (within a given year) in seconds. This time is stored in a 1-dimensional integer array, and later passed back to the main program after all of the data has been read from file. In the case where the data sampled is recorded as number of tips per minute, the tips recorded during that minute are spread out in even time intervals, so that each individual tip has a unique time stamp associated with it. As the data is being read from the file, consecutive records are compared by computing the time difference between them. If this difference is between 20 and 30 minutes, it is assumed that the current tip was accumulated gradually over that time period, and therefore should be spread more evenly over that period. In this special case, a time halfway between the computed time difference is added as an additional record and a half-tip is recorded at the current time and the time half way in between . For example, if the time difference between two consecutive records is 22 minutes, an additional record would be added at 11 minutes, and a half tip would be assigned to each record at 11 and 22 minutes. Thus no additional rainfall is added, rather the actual recorded rainfall is spread out over the longer time interval so that the interpolation derived from it is more representative of the actual rain event.

The time difference between consecutive tips is also used to discriminate between independent rain events. In this case a rain event is defined as being a series of tips which are contiguous in time, and which have a distinct beginning and end (this definition should be distinguished from that which defines an event as a single tip). The beginning and end of a particular rain event is determined based on a threshold parameter. If the time difference between tips exceeds the value established by this parameter, the event is determined by the algorithm to have ended with the previous tip. In this case, the array index corresponding to the current time record is stored in an array. This array of indexes represents a listing of pointers or position markers which allow for a simple recall of the beginning and ending times corresponding to independent rain events. The recorded index actually represents the beginning of the next event, thus the index that precedes it represents the index of the event just completed. The threshold parameter therefore sets an upper limit which explicitly defines how much time can elapse between tips before the event is declared to be over. More importantly, from this determination beginning and ending boundaries can be computed which allow for a separate processing of each rain event independent of the others.

Once an entire data file has been read, the three arrays containing the times, the half tips, and the "event pointers" are passed to the main program. Up to this point no actual processing has been made other than to apply the half tip criteria and to discriminate between independent events. The main processing is subsequently carried out by the spline routine. The spline routine requires as input an x and y vector corresponding to the independent and dependent variables, respectively. It also requires three scalar values corresponding to the number of elements contained in x and y, and the first derivatives at the endpoints for the specified range . In this case time is taken to be the independent variable (x vector) and rainfall (mm) is taken to be the dependent variable (y vector). The number of elements, n, corresponds to the number of tips (including half tips) for a given event. This value is computed from the array of pointers with the following simple equation:

n=indx(i+1) -indx(i) -1
Where i is a place setter which is incremented by one after the processing of each event. Beginning with the first time in the event defined by indx(i) above, the times get loaded into the x vector by a loop running from 1 to n. It was found that the best results were obtained by computing x in relative time units. Relative time for a given event is defined as follows
x(i) = tiptime(current tip) - tiptime(1st tip in event)

Thus, x(1) =0 by definition. For each time stored in x a corresponding y value is stored based on the amount of rainfall recorded for each tip. In this case, y is a running sum representing the accumulated rainfall over a given rain event. As noted above, the factor which converts the tip to a rainfall amount is determined based on an input directives parameter . In the case of a half tip, only half of a tip is stored in y ( i.e. half tip = tip*factor/2 ).

After the x and y vectors have been completely loaded, a module is called which computes the endpoint derivatives based on an average of the data over the first five minutes of the event. The x and y vectors together with n and the two derivatives are passed to the first of two modules which perform the cubic spline interpolation. The first module computes the second derivative at each discrete point in the interval. This data object is also a vector of size n, and is required as input in the next spline module called. The second spline module actually performs the interpolation based on the previous spline inputs plus the computed vector of second derivatives.

The next step involves converting the relative time back to actual time. This only need be done for the first and last times in the interval, for the processed data consists of one minute rain rate averages. The first and last times therefore represent the beginning and ending indexes of a loop, incremented in units of 60 (i.e. do xval = btime, etime, 60). The one minute rain rates subsequently are computed within the loop at the beginning and end of the minute. While inside the loop the second spline module is called twice consecutively and returns the interpolated rainfall at the beginning and end of the minute (i.e. xval and xval+59). Thus, given an arbitrary scalar value for x within the given domain, the spline returns an interpolated scalar value for y. The difference in the two values at xval and xval +59 represent the accumulated rainfall over one minute, or rather, the one minute rain rate. To obtain the 1 hour rain rate, this value need be simply multiplied by a factor of 60.

As a means of quality controlling the data the assumption is made that the interpolated rainfall should closely approximate the measured rainfall if the spline is performing its task correctly. From previous research it has been shown that due to the noise and random behavior of natural rain, the spline occasionally returns negative values. Moreover, because a spline by nature is an interpolation, random biases can occur in the interpolated data . In order to account for these biases the data is passed to a separate module which computes an event dependent bias. The bias is computed by summing up the interpolated rainfall for a given event and comparing it against the measured rainfall based on the raw tip data. The accumulated measured rainfall is obtained from the last value of the y vector used in the spline routine ( i.e. y (n) ) whereas the interpolated rainfall is computed using the spline procedure described above.

This simple procedure allows for a direct and immediate comparison of the interpolated rainfall vs. the measured rainfall from which a bias factor can be computed. This bias factor is then passed back to the main program and used to compute the interpolated rain rate. The rain rate is computed based on the following equation:

rate = bias*(rain(1st sec of minute) - rain(2nd sec of min))*60


Two additional quality control measures are applied to the data . First only rain rates above a threshold of .10 mm/hr are accepted. This condition is applied both in computing the bias and before the data is written to output.

8/27/96 - Modifications to rgmin.f

The cubic spline routine cannot adequately perform an interpolation if less than three data points exist. This logistical limitation is due to the computation of the second derivative used in deriving the concavity of the curve to be fitted. Consequently, rain intensities associated with events consisting of one or two tips respectively, must be derived by an independent methodology. The logistical dilemma of one and two tip rain events relates to the uncertainty introduced regarding the time interval over which the rainfall occurs. In the extreme case of a single tip event, no real information exists as to the beginning and end of a rain event. All of the rainfall is essentially received with a single time stamp. For instance, it is well known that often times a rain event will terminate leaving a bucket partially filled, such that any future rainfall will cause the first tip in the next event to measure a full bucket when in fact this is not the case. Moreover, with no information regarding the time interval over which the precipitation is collected, any inference of a rain rate is subject to large absolute uncertainties. Although two tip events provide more information about the time over which the rain occurs, large uncertainties still prevail in concurrence with the arguments provided above. It should be pointed out though that the uncertainty involved pertains primarily to the determination of a rain intensity which is an empirically derived quantity as opposed to the rainfall accumulation which is a measured quantity.

The objective of the data analysis is to provide meaningful rain rate information, and so a dilemma arises regarding the treatment of one and two tip events. Since this data can also be used to ascertain rainfall accumulation over various time periods, it is of near equal importance that the measured rainfall accumulation be preserved in the interpolated data product. The TRMM Satellite Validation Office has attempted to account for this need by incorporating a separate treatment of one and two tip events into the above algorithm. This treatment is necessitated more by the need for accurate rainfall totals than for the need for the rain rate information. For this reason, the intensities generated are flagged with a negative sign in the manner described for multiple tip records, informing the user of the data that these rain rates are suspect and should probably only be used in calculations of rainfall accumulation.

Single tip events are treated simply by assuming both a rain rate and a time interval over which that rain rate persists. Both of these estimated parameters were determined through discussions with AIW participants and other TRMM personnel. Subsequently, a rain rate of 3.00 mm/hr is assumed. This rain rate is assumed to occur over a 5 minute interval beginning at a time 4 minutes prior to the recorded time stamp, such that the last intensity record corresponds to the time stamp of the recorded tip. Two tip events are handled by applying a linear interpolation. A rate is computed by dividing the total rainfall (2 tips ~ .5 mm) by the time interval between tips. This rate is then applied to this interval. The TRMM Satellite Validation Office discourages the use of these rain intensities in any kind rain rate analysis. These rates have been derived primarily to preserve the actual measured rainfall in the data product, for many users of the data are interested in deriving accurate rainfall estimates. This treatment of the data satisfies this need by not excluding any measured tip events.

03/2003 - Major software upgrade

The previous Fortran codes for gag and rgmin were replaced with new Perl codes, and then IDL scripts. CubicSpline-based interpolation algorithm was improved. Linear interpolation was employed whenever cubic spline interpolation failed to get the desired results.

Data users are referred to Wang et al. 2008 (J. Atmos. Oceanic Technol., 25, 43-56.) or contact Jianxin Wang (jianxin.wang@nasa.gov) for details on the 2A56 or GMIN interpolation algorithm and software.