RobustFit is a free utility programme for robust locally-weighted scatterplot smoothing (LOWESS), or robustly fitting data to a polynomial function.
The image below shows the programme robustly smoothing a set of data points with many large outliers (which extend considerably above the top scale of the graph).
For full details on using the programme, press F1 after starting it.
A standard way of smoothing data to reduce noise is to apply a moving average (MA) filter. The MA filter works by replacing each data point with the average of itself and a number of data points within a user-determined window on either side of the point in question. However, MA filtering suffers from two disadvantages. First, it distorts genuine peaks in the data. Since data are averaged over the window, peaks within the window are inevitably "pulled down" towards the mean value of the data within the window. If the peaks are produced by noise then this is exactly what is wanted, but if the peaks are genuine, then it is bad news. The second problem is the mirror image of this: large noise outliers distort the smoothed data by pulling it in their direction. The LOWESS smoothing technique reduces both of these problems.
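The peak distortion described above can be seen in a few lines of Python. This is only an illustrative sketch, not RobustFit's own code: the function name moving_average, the half_width parameter and the choice to shorten the window at the ends of the data are all assumptions made for the example.

```python
def moving_average(y, half_width):
    """Replace each point by the mean of itself and up to `half_width`
    neighbours on either side (the window is shortened at the ends,
    which is an assumption of this sketch)."""
    n = len(y)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        window = y[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


# A single peak of height 10 on a flat baseline.
data = [0.0, 0.0, 0.0, 10.0, 0.0, 0.0, 0.0]
ma = moving_average(data, half_width=1)
print(ma)
```

With a three-point window the peak of 10 is reduced to about 3.33 and smeared over three points. That is welcome if the peak is a noise spike, and unwelcome if it is genuine.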
LOWESS stands for "locally-weighted scatterplot smoothing", a technique developed by William S. Cleveland (Robust locally weighted regression and smoothing scatterplots. J. Am. Stat. Assoc. 1979, 74: 829-836). LOWESS is an extension of weighted local polynomial (LP) smoothing, which is itself an extension of the moving average (MA) smoothing mentioned above.
Weighted local polynomial fitting works by taking a window subset of the data, and passing this window successively through the entire dataset, moving along one data point at a time. At each stage a polynomial function (usually 1st or 2nd order) is fitted to the data in the window. However, the data do not contribute equally to the fit, but instead the central point has the largest weighting, and the points towards the edge of the window have successively less influence on the fit. When the fit is complete, the central data point is replaced by the value of the polynomial at that point in the window. The advantage of this is that the smoothed line follows genuine peaks and troughs in the original data better than it does with the MA procedure.
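As a rough illustration of the procedure just described, here is a minimal Python sketch of weighted local fitting with a 1st-order polynomial (a straight line). It is not taken from RobustFit. The tricube weighting function is the one used in Cleveland's paper, but whether RobustFit uses it is an assumption, as are the function names, the half_width parameter and the way the weighting distance h is scaled so that the outermost points in the window keep a small non-zero weight.

```python
def tricube(u):
    """Location weight: 1 at the window centre, falling to 0 at |u| >= 1."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def weighted_line_value(xs, ys, ws, x0):
    """Fit a straight line by weighted least squares; return its value at x0."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx if sxx > 0.0 else 0.0
    return my + slope * (x0 - mx)


def local_linear_smooth(x, y, half_width):
    """Slide a window along the data one point at a time (half_width >= 1,
    x values distinct and sorted) and replace each central point by the
    value of the locally fitted line at that point."""
    n = len(x)
    smoothed = []
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        xs, ys = x[lo:hi], y[lo:hi]
        # Assumed scaling: h lies slightly beyond the furthest point in the window.
        h = max(abs(xj - x[i]) for xj in xs) * (half_width + 1) / half_width
        ws = [tricube((xj - x[i]) / h) for xj in xs]
        smoothed.append(weighted_line_value(xs, ys, ws, x[i]))
    return smoothed


x = [float(i) for i in range(7)]
peak = [0.0, 0.0, 0.0, 10.0, 0.0, 0.0, 0.0]
smoothed_peak = local_linear_smooth(x, peak, half_width=1)

line = [2.0 * xi + 1.0 for xi in x]
smoothed_line = local_linear_smooth(x, line, half_width=2)
print(smoothed_peak[3], smoothed_line)
```

In this example the central value of the peak is smoothed to about 4.27, whereas a plain three-point average of 0, 10 and 0 gives 3.33, so the weighted fit follows the peak more closely. Data lying exactly on a straight line are returned unchanged.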
Robust weighted local polynomial fitting takes this process a stage further. Once the initial polynomial fit to the data within a window has been achieved, each data point is assigned a weight that decreases as the difference between its original raw value and its fitted value grows; this difference is the residual. If the residual is large, it suggests that the raw data value might be an outlier. So large residuals result in small weights, and vice versa. This residual-specific weight is combined with the location-specific weight used in the initial non-robust polynomial fit, and a further polynomial fit is performed. This second stage of fitting can be repeated until some convergence criterion is met, or until the user runs out of patience.
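The robust stage can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the algorithm as implemented in RobustFit: it fits straight lines only, uses the tricube location weight and the bisquare residual weight (scaled by six times the median absolute residual) from Cleveland's paper, computes the residuals after a complete pass through the data, and stops after a fixed number of iterations or when the residuals become negligible. All function and parameter names are hypothetical.

```python
from statistics import median


def tricube(u):
    """Location weight: 1 at the window centre, falling to 0 at |u| >= 1."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def bisquare(u):
    """Residual weight: 1 for a zero residual, falling to 0 at |u| >= 1."""
    u = abs(u)
    return (1.0 - u * u) ** 2 if u < 1.0 else 0.0


def weighted_line_value(xs, ys, ws, x0):
    """Fit a straight line by weighted least squares; return its value at x0."""
    sw = sum(ws)
    if sw <= 0.0:
        return sum(ys) / len(ys)  # fallback if every weight is zero
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx if sxx > 0.0 else 0.0
    return my + slope * (x0 - mx)


def smooth_pass(x, y, half_width, robust_w):
    """One pass of the moving window, combining location and residual weights."""
    n = len(x)
    fitted = []
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        xs, ys = x[lo:hi], y[lo:hi]
        h = max(abs(xj - x[i]) for xj in xs) * (half_width + 1) / half_width
        ws = [tricube((xj - x[i]) / h) * rw
              for xj, rw in zip(xs, robust_w[lo:hi])]
        fitted.append(weighted_line_value(xs, ys, ws, x[i]))
    return fitted


def robust_lowess(x, y, half_width, iterations=3):
    """Initial weighted fit followed by up to `iterations` robust refits."""
    robust_w = [1.0] * len(x)
    fitted = smooth_pass(x, y, half_width, robust_w)
    for _ in range(iterations):
        residuals = [yi - fi for yi, fi in zip(y, fitted)]
        scale = 6.0 * median([abs(r) for r in residuals])
        if scale < 1e-9:  # arbitrary guard: residuals are already negligible
            break
        robust_w = [bisquare(r / scale) for r in residuals]
        fitted = smooth_pass(x, y, half_width, robust_w)
    return fitted


# Points on the line y = 2x + 1, with one large outlier in the middle.
x = [float(i) for i in range(11)]
y = [2.0 * xi + 1.0 for xi in x]
y[5] += 100.0
plain = robust_lowess(x, y, half_width=5, iterations=0)
robust = robust_lowess(x, y, half_width=5, iterations=3)
print(plain[5], robust[5])
```

In this small example the non-robust fit is dragged well above the true value of 11 at the outlier, while the first robust iteration gives the outlier a weight of zero and the smoothed values return to the underlying line.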
The image below shows the programme output after the initial weighted LP fit, but before the robust iterations (remember, there are many outliers above the top of the scale). Comparison with the image above clearly shows the importance of the robust aspect of LOWESS.
A more detailed description of LOWESS is given in a paper by Hen et al. (Hen, I., Sakov, A., Kafkafi, N., Golani, I. & Benjamini, Y. The dynamics of spatial behavior: how can robust smoothing techniques help? J. Neurosci. Methods 2004, 133: 161-172). I found the algorithm described in that paper very useful in developing the RobustFit programme.
Robust polynomial fitting does not use a moving window, but fits the entire data set to a polynomial function, without any location-specific weighting. However, like the LOWESS procedure, the initial fit is followed by further iterations of fitting, this time with residual-specific weights as described above.
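A minimal Python sketch of this whole-data-set variant is given below, restricted to a 1st-order polynomial (a straight line) to keep it short. It is an illustration only and not RobustFit's code: the bisquare residual weight, the scale of six times the median absolute residual, the stopping rule and all names are assumptions.

```python
from statistics import median


def bisquare(u):
    """Residual weight: 1 for a zero residual, falling to 0 at |u| >= 1."""
    u = abs(u)
    return (1.0 - u * u) ** 2 if u < 1.0 else 0.0


def weighted_line_fit(x, y, w):
    """Weighted least-squares straight line; returns (intercept, slope)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx if sxx > 0.0 else 0.0
    return my - slope * mx, slope


def robust_line_fit(x, y, iterations=5):
    """Ordinary fit to all the data, then refits with residual-specific weights."""
    weights = [1.0] * len(x)  # no location-specific weighting
    intercept, slope = weighted_line_fit(x, y, weights)
    for _ in range(iterations):
        residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
        scale = 6.0 * median([abs(r) for r in residuals])
        if scale < 1e-9:  # arbitrary guard: residuals are already negligible
            break
        weights = [bisquare(r / scale) for r in residuals]
        intercept, slope = weighted_line_fit(x, y, weights)
    return intercept, slope


# Points on the line y = 2x + 1, with one large outlier in the middle.
x = [float(i) for i in range(11)]
y = [2.0 * xi + 1.0 for xi in x]
y[5] += 100.0
ordinary = weighted_line_fit(x, y, [1.0] * len(x))
robust = robust_line_fit(x, y)
print(ordinary, robust)
```

Here the ordinary fit has its intercept raised from 1 to about 10.09 by the outlier, whereas the robust fit recovers an intercept of 1 and a slope of 2.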
RobustFit is free software. To obtain the programme, download RobustFit.zip. Right-click it and select Open, and then run the file setup.exe. Please note that RobustFit comes with NO GUARANTEES WHATSOEVER; although every effort has been made to ensure that it works correctly, you use it at your own risk.
If you make use of RobustFit, or have any comments on it, I would be glad to hear from you by e-mail.