User's Guide

Smoothing Data

If your data is noisy, you might need to apply a smoothing algorithm to expose
its features and to provide a reasonable starting approach for parametric
fitting. The two basic assumptions that underlie smoothing are:

• The relationship between the response data and the predictor data is smooth.
• The smoothing process results in a smoothed value that is a better estimate
  of the original value because the noise has been reduced.

The smoothing process attempts to estimate the average of the distribution
of each response value. The estimation is based on a specified number of
neighboring response values.
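
As a minimal sketch of this idea, the following Python code replaces each
response value with the mean of its neighbors. It is illustrative only and is
not the toolbox's implementation. The function name moving_average, the
requirement that the span be odd, and the rule that the window shrinks near
the end points are all assumptions made for this example.

```python
def moving_average(y, span):
    """Smooth y with a centered moving average over `span` points.

    Assumption of this sketch: near the ends, the window shrinks
    symmetrically so that it stays centered on the current point.
    """
    if span < 1 or span % 2 == 0:
        raise ValueError("span must be a positive odd integer")
    half = span // 2
    n = len(y)
    smoothed = []
    for i in range(n):
        # Largest half-width that fits on both sides of point i.
        h = min(half, i, n - 1 - i)
        window = y[i - h : i + h + 1]
        # The smoothed value is the average of the neighboring responses.
        smoothed.append(sum(window) / len(window))
    return smoothed


# Hypothetical noisy response values, used only for illustration.
noisy = [1.0, 3.0, 2.0, 6.0, 4.0, 5.0, 9.0]
smoothed = moving_average(noisy, span=3)
```

Here smoothed[1] is the mean of the first three values, and one new response
value is produced for every original response value.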

You can think of smoothing as a local fit because a new response value is
created for each original response value. Therefore, smoothing is similar to
some of the nonparametric fit types supported by the toolbox, such as
smoothing spline and cubic interpolation. However, this type of fitting is not
the same as parametric fitting, which results in a global parameterization of
the data.

Note: You should not fit data with a parametric model after smoothing,
because the act of smoothing invalidates the assumption that the errors are
normally distributed. Instead, you should consider smoothing to be a data
exploration technique.
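
One related way to see that smoothing changes the error structure is to smooth
pure noise and look at how neighboring values relate to each other. The Python
sketch below does not test normality itself. It only shows that errors that
were independent before smoothing become correlated afterward, because
adjacent smoothed values share most of their neighbors. The helper names, the
span of 5, and the simulated noise are assumptions made for this example.

```python
import random


def moving_average(y, span):
    """Centered moving average; the window shrinks near the ends."""
    half = span // 2
    n = len(y)
    out = []
    for i in range(n):
        h = min(half, i, n - 1 - i)
        window = y[i - h : i + h + 1]
        out.append(sum(window) / len(window))
    return out


def lag1_autocorrelation(e):
    """Sample correlation between each value and the next one."""
    n = len(e)
    mean = sum(e) / n
    num = sum((e[i] - mean) * (e[i + 1] - mean) for i in range(n - 1))
    den = sum((v - mean) ** 2 for v in e)
    return num / den


# Simulated independent noise (fixed seed so the run is repeatable).
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]
smoothed_noise = moving_average(noise, span=5)

raw_r1 = lag1_autocorrelation(noise)             # close to zero
smooth_r1 = lag1_autocorrelation(smoothed_noise)  # clearly positive
```

The raw noise shows almost no correlation between neighbors, while the
smoothed noise shows a strong positive correlation.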

There are two common types of smoothing methods: filtering (averaging) and
local regression. Each smoothing method requires a span. The span defines a
window of neighboring points to include in the smoothing calculation for each
data point. This window moves across the data set as the smoothed response
value is calculated for each predictor value. A large span increases the
smoothness but decreases the resolution of the smoothed data set, while a
small span decreases the smoothness but increases the resolution of the
smoothed data set. The optimal span value depends on your data set and the
smoothing method, and usually requires some experimentation to find.
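
The trade-off between smoothness and resolution can be sketched by smoothing
the same data with several spans. The Python example below uses a simple
averaging filter on a synthetic data set: a single narrow peak plus an
alternating disturbance that stands in for noise. The data, the helper names,
and the two summary measures (roughness as the sum of squared differences
between consecutive values, and the height of the tallest point) are
assumptions chosen for illustration, not measures defined by the toolbox.

```python
def moving_average(y, span):
    """Centered moving average; the window shrinks near the ends."""
    half = span // 2
    n = len(y)
    out = []
    for i in range(n):
        h = min(half, i, n - 1 - i)
        window = y[i - h : i + h + 1]
        out.append(sum(window) / len(window))
    return out


def roughness(y):
    """Sum of squared differences between consecutive values."""
    return sum((b - a) ** 2 for a, b in zip(y, y[1:]))


# Synthetic data: a narrow peak at index 10 plus an alternating +/-0.5 term.
n = 21
data = [(10.0 if i == 10 else 0.0) + (0.5 if i % 2 == 0 else -0.5)
        for i in range(n)]

# A span of 1 leaves the data unchanged and serves as the baseline.
results = {span: moving_average(data, span) for span in (1, 3, 7)}
rough = {span: roughness(s) for span, s in results.items()}
peak = {span: max(s) for span, s in results.items()}

for span in (1, 3, 7):
    print(f"span={span}: roughness={rough[span]:.2f}, peak={peak[span]:.2f}")
```

As the span grows, the roughness drops (the result is smoother), but the peak
is also flattened (resolution is lost). Trying a few spans in this way is one
simple form of the experimentation described above.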