The notes for this lecture are drawn from The Ecological Detective (Chapter 5).
The simplest technique for confronting models with data is the sum of squares
- It is simple and makes few assumptions
- Long and successful history in science
- Computers can do remarkable calculations associated with the sum of squares
Basic method
Consider a simple model: \( Y_i = A + B X_i + C X_i^2 + W_i \), where \(W_i\) is process uncertainty, and A, B, and C are parameters.
- For variables \(X_1, X_2, \ldots, X_n\) we can generate predictions for Y from candidate values of the parameters A, B, and C
- We can measure the deviation between the \(i^{th}\) predicted value and the \(i^{th}\) observed value: \( (Y_{pred,i} - Y_{obs,i})^2 \)
- We then sum the squared deviations to obtain a measure of fit between the model and the data
- The best model (i.e., values for A, B, and C) will have the lowest sum of squares
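The calculation in this list can be sketched in a few lines of Python. This is a minimal illustration, assuming the quadratic model \( Y_i = A + B X_i + C X_i^2 + W_i \); the function names `predict` and `sum_of_squares` and the data values are hypothetical, made up for the example rather than taken from the book.

```python
def predict(x, a, b, c):
    """Deterministic part of the assumed model: A + B*x + C*x^2."""
    return a + b * x + c * x ** 2


def sum_of_squares(xs, ys_obs, a, b, c):
    """Sum of squared deviations between predicted and observed Y."""
    return sum((predict(x, a, b, c) - y) ** 2 for x, y in zip(xs, ys_obs))


# Hypothetical observations, invented purely for illustration.
xs = [0.0, 1.0, 2.0, 3.0]
ys_obs = [1.0, 3.5, 9.0, 18.5]

# Two candidate parameter sets; the one with the lower sum of squares fits better.
ssq_good = sum_of_squares(xs, ys_obs, a=1.0, b=1.0, c=1.5)  # 1.0
ssq_poor = sum_of_squares(xs, ys_obs, a=0.0, b=2.0, c=1.0)  # 14.5
```

Here the first candidate misses only the last observation (by 1), so its sum of squares is 1.0, while the second misses all four points and scores 14.5.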
Basic approach: Pseudocode 5.1
- Input the data and generate a range of potential values for A, B, and C
- Potential values should go from a minimum value to a maximum value by set increments
- Starting at the minimum values of the parameters, generate a prediction of Y for each value of X and calculate the sum of squares
- Compare this sum of squares to the lowest value found so far. If it is lower, store it as the new lowest sum of squares, together with the parameter values that produced it.
- Keep going until the maximum values of the parameters have been reached.
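The steps of Pseudocode 5.1 might be written as follows. This is a sketch under assumptions, not the book's own code: it assumes the quadratic model \( Y_i = A + B X_i + C X_i^2 + W_i \), the helper names `frange` and `grid_search` are hypothetical, and the data are noise-free values invented so that the true parameters lie exactly on the grid.

```python
import itertools


def frange(start, stop, step):
    """Values from start to stop (inclusive) in increments of step."""
    n_steps = int(round((stop - start) / step))
    return [start + k * step for k in range(n_steps + 1)]


def sum_of_squares(xs, ys_obs, a, b, c):
    """Sum of squared deviations for the assumed model A + B*x + C*x^2."""
    return sum((a + b * x + c * x ** 2 - y) ** 2 for x, y in zip(xs, ys_obs))


def grid_search(xs, ys_obs, a_grid, b_grid, c_grid):
    """Try every parameter combination and keep the one with the lowest SSQ."""
    best_ssq = float("inf")
    best_params = None
    for a, b, c in itertools.product(a_grid, b_grid, c_grid):
        ssq = sum_of_squares(xs, ys_obs, a, b, c)
        if ssq < best_ssq:  # new lowest value: remember it and its parameters
            best_ssq, best_params = ssq, (a, b, c)
    return best_params, best_ssq


# Hypothetical noise-free data generated with A=1, B=2, C=0.5.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys_obs = [1.0 + 2.0 * x + 0.5 * x ** 2 for x in xs]

grid = frange(0.0, 3.0, 0.5)  # minimum 0, maximum 3, increment 0.5
best_params, best_ssq = grid_search(xs, ys_obs, grid, grid, grid)
```

The cost of this approach grows with the product of the grid sizes (here 7 × 7 × 7 = 343 evaluations), which is why the increment and the range need to be chosen with some care.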
Pseudocode 5.2
- Specify values of the parameters A, B, and C, the number of data points to be generated, and the distribution of the process uncertainty. Set i = 1
- Choose \(X_i\) (e.g., by systematic choice of the independent variable X).
- Choose a particular value \(W_i\) of the process uncertainty W.
- Determine \(Y_i\) according to \(Y_i = A + B X_i + C X_i^2 + W_i\)
- Increase i by 1. If i does not exceed the number of data points to be generated, return to the second step (choosing \(X_i\)). Otherwise, stop.
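A possible Python rendering of Pseudocode 5.2 is given below. It is only a sketch: the function name `generate_data`, the choice \(X_i = i\), and the normal distribution with standard deviation `sigma` for the process uncertainty are all assumptions made for illustration, since the pseudocode leaves the distribution and the choice of X open. The model is assumed to be \( Y_i = A + B X_i + C X_i^2 + W_i \).

```python
import random


def generate_data(a, b, c, n_points, sigma, seed=None):
    """Simulate n_points observations from Y = A + B*X + C*X^2 + W."""
    rng = random.Random(seed)  # a seed makes the simulated data reproducible
    xs, ys = [], []
    for i in range(1, n_points + 1):
        x_i = float(i)               # systematic choice of X (assumed: X_i = i)
        w_i = rng.gauss(0.0, sigma)  # process uncertainty (assumed normal)
        y_i = a + b * x_i + c * x_i ** 2 + w_i
        xs.append(x_i)
        ys.append(y_i)
    return xs, ys


# Hypothetical parameter values chosen only for the example.
xs, ys = generate_data(a=1.0, b=0.5, c=0.25, n_points=10, sigma=2.0, seed=1)
```

Simulated data of this kind are useful because the true parameter values are known, so a fitting procedure can be checked against them.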
Sum of squares
We can examine the relationship between the predicted and observed values.
Goodness of fit
- It is helpful to consider how sensitive the fit between the model and the data is to variation in the parameters.
- Tells us how the sum of squares behaves if one of the parameters (the one that we systematically vary) is known
- Tells us how sensitive the parameters are to one another
- Provides some notion of confidence in our estimate of the parameter
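One way to explore these points is to fix one parameter at each value on a grid and, for each, find the lowest sum of squares over the remaining parameters. The sketch below does this for A. It rests on assumptions: the quadratic model \( Y_i = A + B X_i + C X_i^2 + W_i \), a hypothetical function name `ssq_profile_for_a`, and noise-free data invented for the example.

```python
import itertools


def sum_of_squares(xs, ys_obs, a, b, c):
    """Sum of squared deviations for the assumed model A + B*x + C*x^2."""
    return sum((a + b * x + c * x ** 2 - y) ** 2 for x, y in zip(xs, ys_obs))


def ssq_profile_for_a(xs, ys_obs, a_grid, b_grid, c_grid):
    """For each fixed A, return (A, lowest SSQ, best B, best C)."""
    profile = []
    for a in a_grid:
        # Treat A as known and search only over B and C.
        ssq, b, c = min(
            (sum_of_squares(xs, ys_obs, a, b, c), b, c)
            for b, c in itertools.product(b_grid, c_grid)
        )
        profile.append((a, ssq, b, c))
    return profile


# Hypothetical noise-free data generated with A=1, B=2, C=0.5.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys_obs = [1.0 + 2.0 * x + 0.5 * x ** 2 for x in xs]

grid = [k * 0.5 for k in range(7)]  # 0.0, 0.5, ..., 3.0
profile = ssq_profile_for_a(xs, ys_obs, grid, grid, grid)
for a, ssq, b, c in profile:
    print(f"A={a:.1f}  min SSQ={ssq:.3f}  at B={b:.1f}, C={c:.1f}")
```

A profile that rises steeply away from its minimum suggests A is well determined by the data, while a flat profile suggests low confidence. Changes in the best B and C as A moves show how the parameters trade off against one another.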