GAMsetup {mgcv}    R Documentation

Set up GAM using penalized cubic regression splines

Description

Sets up the design matrix X, the penalty matrices S_i and the linear equality constraint matrix C for a GAM defined in terms of penalized regression splines, and returns the locations of the knots of these regression splines in xp. The output is such that the model can be fitted and its smoothing parameters estimated by the method of Wood (2000), as implemented in routine mgcv(). This routine is largely superseded by gam.

Usage

GAMsetup(G)

Arguments

The function takes a single argument G, which is a list containing the following elements:

m the number of smooth terms in the model
n the number of data to be modelled
nsdf the number of user supplied columns of the design matrix for any parametric model parts
df an array of G$m integers specifying the maximum d.f. for each spline term
x an array of G$n-element arrays of data and (optionally) design matrix columns. The first G$nsdf elements of G$x should contain the columns of the design matrix corresponding to the parametric part of the model. The remaining G$m elements of G$x are the values of the covariates that are arguments of the spline terms. Note that the smooths will be centred and no intercept term will be added unless an array of 1's is supplied as part of G$x.
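The layout of G$x described above can be sketched in base R as follows. This is only an illustration: the names X_par and covs, the sample size and the choice of 10 d.f. per term are assumptions made for the sketch, not part of the package.

```r
set.seed(1)
n <- 50

# Hypothetical inputs: one parametric column (an intercept) and two
# covariates that are to be arguments of spline terms
X_par <- cbind(intercept = rep(1, n))
covs  <- cbind(z1 = runif(n), z2 = runif(n))

# Rows of G$x: first the nsdf parametric columns, then the m covariates
G <- list(
  m    = ncol(covs),                 # number of smooth terms
  n    = n,                          # number of data
  nsdf = ncol(X_par),                # user-supplied parametric columns
  df   = rep(10L, ncol(covs)),       # maximum d.f. for each spline term
  x    = rbind(t(X_par), t(covs))    # (nsdf + m) arrays of n elements each
)

dim(G$x)
```

Because a row of 1's is supplied as the parametric part, this layout corresponds to a model with an intercept.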

Value

A list H, containing the elements of G (the input list) plus the following:

X the full design matrix.
S an array of matrices containing the coefficients of the penalties. These are stored in a compact form, so that H$S[i] is the smallest square submatrix containing all the non-zero elements of S_i, the ith penalty matrix. Element 0,0 of H$S[i] is element off[i],off[i] of S_i, element 0,1 of H$S[i] is element off[i],off[i]+1 of S_i, and so on.
off an array of offsets, used to facilitate efficient storage of the penalty matrices and to indicate where in the overall parameter vector the parameters of the ith spline reside (e.g. the first parameter of the ith spline is at p[off[i]]).
C a matrix defining the linear equality constraints on the parameters used to define the model (i.e. C in Cp=0).
xp a matrix whose rows contain the covariate values corresponding to the parameters of each spline. The splines are parameterized by their y values at a series of x values, and the rows of xp contain those x values.
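The compact storage of the penalties can be illustrated by embedding one compact block into the full penalty matrix it stands for. The sketch below is base R only; expand_penalty is a hypothetical helper (not an mgcv function), the 3 by 3 block is a toy stand-in for H$S[i], and the offset is assumed to be zero-based, as in the description of S above.

```r
# Hypothetical helper: place a compact k x k penalty block into a full
# q x q matrix of zeros, given a zero-based offset
expand_penalty <- function(S_compact, off, q) {
  k <- nrow(S_compact)
  stopifnot(ncol(S_compact) == k, off >= 0, off + k <= q)
  S_full <- matrix(0, q, q)
  idx <- off + seq_len(k)      # zero-based offset -> one-based R indices
  S_full[idx, idx] <- S_compact
  S_full
}

# Toy compact block (a first-difference penalty), standing in for H$S[i]
S_compact <- crossprod(diff(diag(3)))

# Suppose the ith spline's parameters start at zero-based position 4
# of a parameter vector of length 9
S_full <- expand_penalty(S_compact, off = 4, q = 9)

# Element 0,0 of the compact block is element off,off of the full matrix,
# which is S_full[5, 5] in R's one-based indexing
S_full[5:7, 5:7]
```

The same offset locates the spline's parameters in the overall parameter vector: in R's indexing they would occupy positions off + 1 to off + k.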

Author(s)

Simon N. Wood snw@st-and.ac.uk

References

Wood, S.N. (2000) "Modelling and smoothing parameter estimation with multiple quadratic penalties" JRSSB 62(2):413-428

See Also

mgcv, gam

Examples

    # This example is modified from routine SANtest()

    n <- 100 # number of observations to simulate
    x <- runif(5 * n, 0, 1) # simulate covariates
    x <- array(x, dim = c(5, n)) # put into array for passing to GAMsetup
    # begin simulating some data (R's built-in constant pi is used below)
    y <- 2 * sin(pi * x[2, ])
    y <- y + exp(2 * x[3, ]) - 3.75887
    y <- y + 0.2 * x[4, ]^11 * (10 * (1 - x[4, ]))^6 + 10 * (10 * 
        x[4, ])^3 * (1 - x[4, ])^10 - 1.396
    sig2 <- -1    # noise variance is abs(sig2); this value signals GCV use to mgcv()
    e <- rnorm(n, 0, sqrt(abs(sig2)))
    y <- y + e          # simulated data
    w <- matrix(1, n, 1) # weight matrix
    par(mfrow = c(2, 2)) # scatter plots of simulated data
    plot(x[2, ], y)
    plot(x[3, ], y)
    plot(x[4, ], y)
    plot(x[5, ], y)
    x[1, ] <- 1 # first array in x is a column of 1's, giving an intercept
    G <- list(m = 4, n = n, nsdf = 1, df = c(15, 15, 15, 15), 
        x = x) # create list for passing to GAMsetup
    H <- GAMsetup(G)
    H$y <- y    # add data to H
    H$sig2 <- sig2  # add variance (signalling GCV use in this case) to H
    H$w <- w       # add weights to H
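    H$sp <- array(-1, H$m)      # one smoothing parameter entry per smooth term
    H$fix <- array(FALSE, H$m)  # fix flag set to FALSE for every smooth term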
    H <- mgcv(H)  # select smoothing parameters and fit model
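Since this routine is largely superseded by gam, a model of the same kind can also be sketched with the formula interface. The following is a minimal sketch that assumes a current installation of mgcv. The data frame dat and the names x1 to x4 are illustrative, and bs = "cr" with k = 15 is chosen here to mimic cubic regression splines with at most 15 d.f. per term; it is not claimed to reproduce the fit above exactly.

```r
library(mgcv)

set.seed(1)   # make the simulation reproducible
n <- 100
dat <- data.frame(x1 = runif(n), x2 = runif(n), x3 = runif(n), x4 = runif(n))

# Same test functions as in the example above; x4 has no effect on y
f <- 2 * sin(pi * dat$x1) + exp(2 * dat$x2) - 3.75887 +
  0.2 * dat$x3^11 * (10 * (1 - dat$x3))^6 +
  10 * (10 * dat$x3)^3 * (1 - dat$x3)^10 - 1.396
dat$y <- f + rnorm(n, 0, 1)

# Four penalized cubic regression spline terms plus an intercept,
# with smoothing parameters selected by GCV
b <- gam(y ~ s(x1, bs = "cr", k = 15) + s(x2, bs = "cr", k = 15) +
           s(x3, bs = "cr", k = 15) + s(x4, bs = "cr", k = 15),
         data = dat, method = "GCV.Cp")

b$sp   # estimated smoothing parameters, one per smooth term
```

Here gam builds the design matrix, penalties and constraints internally, so no separate set-up step is needed.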