glmboost {mboost}
Description

Gradient boosting for optimizing arbitrary loss functions, where component-wise linear models are utilized as base-learners.
Usage

## S3 method for class 'formula'
glmboost(formula, data = list(), weights = NULL,
         na.action = na.pass, contrasts.arg = NULL,
         center = TRUE, control = boost_control(), ...)

## S3 method for class 'matrix'
glmboost(x, y, center = TRUE, control = boost_control(), ...)

## Default S3 method:
glmboost(x, ...)

## S3 method for class 'glmboost'
plot(x, main = deparse(x$call), col = NULL, off2int = FALSE, ...)
Arguments

formula
    a symbolic description of the model to be fit.

data
    a data frame containing the variables in the model.

weights
    an optional vector of weights to be used in the fitting process.

contrasts.arg
    a list, whose entries are contrasts suitable for input to the
    contrasts replacement function and whose names are the names of
    columns of data containing factors.

na.action
    a function which indicates what should happen when the data
    contain NAs.

center
    logical indicating whether the predictor variables should be
    centered before fitting.

control
    a list of parameters controlling the algorithm; see boost_control.

x
    design matrix or, for the plot method, an object of class glmboost.

y
    vector of responses.

main
    a title for the plot.

col
    (a vector of) colors for plotting the lines representing the
    coefficient paths.

off2int
    logical indicating whether the offset should be added to the
    intercept (if there is any) or neglected for plotting (default).

...
    additional arguments passed to mboost_fit.
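For illustration, a call to the formula interface might look as follows; this is a minimal sketch, and the data frame d, response y, covariate x1, factor grp and weight vector w are hypothetical names, not objects shipped with the package:

## sketch of the formula interface; `d`, `w`, `x1` and `grp` are
## made-up names used only for illustration
fit <- glmboost(y ~ x1 + grp, data = d, weights = w,
                na.action = na.omit,
                contrasts.arg = list(grp = "contr.sum"),
                center = TRUE,
                control = boost_control(mstop = 100))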
Details

A (generalized) linear model is fitted using a boosting algorithm based on component-wise univariate linear models. The fit, i.e., the regression coefficients, can be interpreted in the usual way. The methodology is described in Buehlmann and Yu (2003), Buehlmann (2006), and Buehlmann and Hothorn (2007).
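Since the loss function is exchangeable via the family argument (see Family), the same interface also fits, for example, a boosted logistic regression. A minimal sketch, assuming a hypothetical data frame df with a binary factor response cls; note that the Binomial() family codes the response as {-1, 1}, so coefficients are half of those returned by glm():

## sketch: boosted logistic regression; `df` and `cls` are made-up names
fit <- glmboost(cls ~ ., data = df, family = Binomial(),
                control = boost_control(mstop = 200))
coef(fit, off2int = TRUE)  ## coefficients with offset added to intercept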
Value

An object of class glmboost with print, coef, AIC and predict methods being available.
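For instance, a short sketch using the cars data (as in the examples below):

cars.gb <- glmboost(dist ~ speed, data = cars)
print(cars.gb)    ## model and family information
coef(cars.gb)     ## coefficients at the current stopping iteration
AIC(cars.gb)      ## corrected AIC
predict(cars.gb)  ## fitted values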
For inputs with longer variable names, you might want to change par("mai") before calling the plot method of glmboost objects visualizing the coefficient paths.
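For example, continuing the sketch above (the scaling factor 2.5 is only an illustrative choice, mirroring the examples below):

opar <- par(mai = par("mai") * c(1, 1, 1, 2.5))  ## widen right margin
plot(cars.gb)                                    ## coefficient paths
par(opar)                                        ## restore old margins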
References

Peter Buehlmann and Bin Yu (2003), Boosting with the L2 loss: regression and classification. Journal of the American Statistical Association, 98, 324–339.

Peter Buehlmann (2006), Boosting for high-dimensional linear models. The Annals of Statistics, 34(2), 559–583.

Peter Buehlmann and Torsten Hothorn (2007), Boosting algorithms: regularization, prediction and model fitting. Statistical Science, 22(4), 477–505.

Torsten Hothorn, Peter Buehlmann, Thomas Kneib, Matthias Schmid and Benjamin Hofner (2010), Model-based Boosting 2.0. Journal of Machine Learning Research, 11, 2109–2113.
See Also

See mboost for the generic boosting function, gamboost for boosted additive models and blackboost for boosted trees. See cvrisk for the cross-validated stopping iteration. Furthermore see boost_control, Family and methods.
Examples

### a simple two-dimensional example: cars data
cars.gb <- glmboost(dist ~ speed, data = cars,
                    control = boost_control(mstop = 5000),
                    center = FALSE)
cars.gb

### coefficients should coincide
coef(cars.gb) + c(cars.gb$offset, 0)
coef(lm(dist ~ speed, data = cars))

### plot fit
layout(matrix(1:2, ncol = 2))
plot(dist ~ speed, data = cars)
lines(cars$speed, predict(cars.gb), col = "red")

### now we center the design matrix for
### much quicker "convergence"
cars.gb_centered <- glmboost(dist ~ speed, data = cars,
                             control = boost_control(mstop = 2000),
                             center = TRUE)
par(mfrow = c(1, 2))
plot(cars.gb, main = "without centering")
plot(cars.gb_centered, main = "with centering")

### alternative loss function: absolute loss
cars.gbl <- glmboost(dist ~ speed, data = cars,
                     control = boost_control(mstop = 5000),
                     family = Laplace())
cars.gbl
coef(cars.gbl) + c(cars.gbl$offset, 0)
lines(cars$speed, predict(cars.gbl), col = "green")

### Huber loss with adaptive choice of delta
cars.gbh <- glmboost(dist ~ speed, data = cars,
                     control = boost_control(mstop = 5000),
                     family = Huber())
lines(cars$speed, predict(cars.gbh), col = "blue")
legend("topleft", col = c("red", "green", "blue"), lty = 1,
       legend = c("Gaussian", "Laplace", "Huber"), bty = "n")

### plot coefficient path of glmboost
par(mai = par("mai") * c(1, 1, 1, 2.5))
plot(cars.gb)

### sparse high-dimensional example
library("Matrix")
n <- 100
p <- 10000
ptrue <- 10
X <- Matrix(0, nrow = n, ncol = p)
X[sample(1:(n * p), floor(n * p / 20))] <- runif(floor(n * p / 20))
beta <- numeric(p)
beta[sample(1:p, ptrue)] <- 10
y <- drop(X %*% beta + rnorm(n, sd = 0.1))
mod <- glmboost(y = y, x = X, center = TRUE)
### mstop needs tuning
coef(mod, which = which(beta > 0))
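As noted above, mstop needs tuning; a sketch of how this could be done with cvrisk, using cars.gb from the examples and cvrisk's default bootstrap folds:

### sketch: cross-validated choice of the stopping iteration
cvm <- cvrisk(cars.gb)   ## bootstrap cross-validation (default folds)
plot(cvm)                ## cross-validated risk per iteration
mstop(cvm)               ## estimated optimal mstop
cars.gb[mstop(cvm)]      ## set the model to the selected mstop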