GAMsetup {mgcv} | R Documentation |
Sets up the design matrix X, the penalty matrices S_i, and the linear equality constraint matrix C for a GAM defined in terms of penalized regression splines, and returns the locations of the knots of these regression splines in xp[][]. The output is such that the model can be fitted and smoothing parameters estimated by the method of Wood (2000), as implemented in routine mgcv(). This routine is largely superseded by gam.
GAMsetup(G)
The function takes a single argument G, which is a list containing several elements:
m | the number of smooth terms in the model |
n | the number of data to be modelled |
nsdf | the number of user-supplied columns of the design matrix for any parametric model parts |
df | an array of G$m integers specifying the maximum d.f. for each spline term |
x | an array of G$n-element arrays of data and (optionally) design matrix columns. The first G$nsdf elements of G$x should contain the columns of the design matrix corresponding to the parametric part of the model. The remaining G$m elements of G$x are the values of the covariates that are arguments of the spline terms. Note that the smooths will be centred, and no intercept term will be added unless an array of 1's is supplied as part of G$x. |
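As a rough sketch of how such a list might be assembled, the base R code below builds G for a model with one parametric column (an array of 1's for the intercept) and two smooth terms. It assumes that each variable occupies one row of G$x, parametric columns first. The helper check_G is hypothetical (it is not part of mgcv) and only checks that the dimensions agree with the description above.

```r
# Minimal sketch (base R only): assemble the input list described above.
# Assumption: each variable occupies one row of G$x, with the parametric
# design-matrix columns first and the smooth covariates after them.
set.seed(1)
n    <- 50                      # number of data
m    <- 2                       # number of smooth terms
nsdf <- 1                       # one parametric column: the intercept

x <- array(runif((nsdf + m) * n), dim = c(nsdf + m, n))
x[1, ] <- 1                     # array of 1's supplies the intercept

G <- list(m = m, n = n, nsdf = nsdf, df = c(10, 10), x = x)

# Hypothetical helper (not part of mgcv): check internal consistency of G.
check_G <- function(G) {
  length(G$df) == G$m &&
    nrow(G$x) == G$nsdf + G$m &&
    ncol(G$x) == G$n
}

check_G(G)
```

A check of this kind catches a mismatch between G$m, G$nsdf and the number of rows of G$x before the list is handed to the set-up routine.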
A list H, containing the elements of G (the input list) plus the following:
X | the full design matrix. |
S | an array of matrices containing the coefficients of the penalties. These are stored in a compact form, so that H$S[i] is the smallest square submatrix containing all the non-zero elements of S_i, the ith penalty matrix. Element 0,0 of H$S[i] is element off[i],off[i] of S_i, element 0,1 of H$S[i] is element off[i],off[i]+1 of S_i, and so on. |
off | an array of offsets, used to facilitate efficient storage of the penalty matrices and to indicate where in the overall parameter vector the parameters of the ith spline reside (e.g. the first parameter of the ith spline is at p[off[i]]). |
C | a matrix defining the linear equality constraints on the parameters used to define the model (i.e. C in Cp=0). |
xp | a matrix whose rows contain the covariate values corresponding to the parameters of each spline. The splines are parameterized using their y values at a series of x values; the rows of xp contain those x values. |
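The compact storage of the penalties can be illustrated with a small base R sketch. The helper expand_penalty is hypothetical (it is not part of mgcv), and the block, offset and parameter values are made up for illustration. The offsets described above are zero-based, so block element 0,0 lands at R position (off + 1, off + 1).

```r
# Minimal sketch (base R only): expand a compactly stored penalty block into
# the full penalty matrix S_i. Offsets are zero-based as in the text above,
# so block element (0,0) maps to R position (off + 1, off + 1).
# expand_penalty is a hypothetical helper, not part of mgcv.
expand_penalty <- function(S_block, off, n_par) {
  k <- nrow(S_block)
  stopifnot(ncol(S_block) == k, off >= 0, off + k <= n_par)
  S_full <- matrix(0, n_par, n_par)
  idx <- off + seq_len(k)        # R (one-based) indices covered by the block
  S_full[idx, idx] <- S_block
  S_full
}

# Illustrative values only: a 3 x 3 first-difference penalty block for a
# spline whose first parameter is at zero-based position 2 of 6 parameters.
D <- diff(diag(3), differences = 1)   # 2 x 3 first-difference matrix
S_block <- crossprod(D)               # 3 x 3 symmetric penalty block
S_full  <- expand_penalty(S_block, off = 2, n_par = 6)

# The penalty on the full parameter vector only involves the block's entries.
p <- c(1, 2, 3, 4, 6, 9)
drop(t(p) %*% S_full %*% p)           # equals sum(diff(p[3:5])^2)
```

Storing only the block and its offset avoids holding a mostly zero matrix for every smooth term, which is the efficiency gain the off array is meant to provide.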
Simon N. Wood snw@st-and.ac.uk
Wood, S.N. (2000) "Modelling and smoothing parameter estimation with multiple quadratic penalties" JRSSB 62(2):413-428
# This example modified from routine SANtest()
n <- 100                        # number of observations to simulate
x <- runif(5 * n, 0, 1)         # simulate covariates
x <- array(x, dim = c(5, n))    # put into array for passing to GAMsetup
pi <- asin(1) * 2               # begin simulating some data
y <- 2 * sin(pi * x[2, ])
y <- y + exp(2 * x[3, ]) - 3.75887
y <- y + 0.2 * x[4, ]^11 * (10 * (1 - x[4, ]))^6 + 10 * (10 * x[4, ])^3 * (1 - x[4, ])^10 - 1.396
sig2 <- -1                      # set magnitude of variance
e <- rnorm(n, 0, sqrt(abs(sig2)))
y <- y + e                      # simulated data
w <- matrix(1, n, 1)            # weight matrix
par(mfrow = c(2, 2))            # scatter plots of simulated data
plot(x[2, ], y)
plot(x[3, ], y)
plot(x[4, ], y)
plot(x[5, ], y)
x[1, ] <- 1
G <- list(m = 4, n = n, nsdf = 0, df = c(15, 15, 15, 15), x = x) # create list for passing to GAMsetup
H <- GAMsetup(G)
H$y <- y                        # add data to H
H$sig2 <- sig2                  # add variance (signalling GCV use in this case) to H
H$w <- w                        # add weights to H
H$sp <- array(-1, H$m)
H$fix <- array(FALSE, H$m)
H <- mgcv(H)                    # select smoothing parameters and fit model