impute.rsf {randomSurvivalForest}    R Documentation

Impute Only Mode

Description

Imputation for right-censored survival and competing risk data. A random survival forest is grown and used to impute missing data. No ensemble estimates or error rates are calculated, which makes this a fast way to impute data.

Usage

impute.rsf(formula, data = NULL, ntree = 1000, mtry = NULL,
    nodesize = NULL, splitrule = NULL, nsplit = 0, big.data = FALSE,
    nimpute = 1, predictorWt = NULL, seed = NULL, do.trace = FALSE,
    ...)

Arguments

formula

A symbolic description of the model to be fit.

data

Data frame containing the data to be imputed.

ntree

Number of trees to grow.

mtry

Number of variables randomly sampled at each split.

nodesize

Minimum terminal node size.

splitrule

Splitting rule used to grow trees.

nsplit

Non-negative integer value used to specify random splitting.

big.data

Set this value to TRUE for large data.

nimpute

Number of iterations of missing data algorithm.

predictorWt

Weights for selecting variables for splitting on.

seed

Seed for random number generator.

do.trace

Should trace output be enabled?

...

Further arguments passed to or from other methods.

Details

Grows a random survival forest (RSF) and uses it to impute missing data. All other calculations, such as ensemble estimates and error rates, are turned off. Use this function if your only interest is imputing the data.

All options are the same as for rsf.

Value

Invisibly, the data frame containing the original data with the imputed data overlaid.
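
Because the value is returned invisibly, a bare call prints nothing and the imputed data frame is lost unless it is assigned. The following is a minimal sketch of capturing the result and checking how many missing entries remain. It assumes the randomSurvivalForest package is installed and reuses the pbc data from the Examples section; the names na.before and na.after are illustrative only and are not part of the package.

```r
library(randomSurvivalForest)

# Load the example data used in the Examples section.
data(pbc, package = "randomSurvivalForest")

# Total number of missing entries before imputation.
na.before <- sum(is.na(pbc))

# Assign the result; impute.rsf returns its value invisibly,
# so an unassigned call would print nothing and discard the data.
imputed.data <- impute.rsf(Surv(days, status) ~ ., data = pbc, nsplit = 3)

# Total number of missing entries after imputation.
na.after <- sum(is.na(imputed.data))

cat("Missing entries before:", na.before, "\n")
cat("Missing entries after: ", na.after, "\n")
```

Comparing the two counts is a quick sanity check that the imputation ran; it does not assess the quality of the imputed values.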

Author(s)

Hemant Ishwaran hemant.ishwaran@gmail.com

Udaya B. Kogalur kogalurshear@gmail.com

References

Ishwaran H., Kogalur U.B., Blackstone E.H. and Lauer M.S. (2008). Random survival forests. Ann. Appl. Statist., 2:841-860.

See Also

rsf.

Examples

## Not run: 
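# Load the pbc data set shipped with the package.
data(pbc, package = "randomSurvivalForest")
# Impute the missing values; a non-zero nsplit requests random splitting.
imputed.data <- impute.rsf(Surv(days, status) ~ ., data = pbc, nsplit = 3)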

## End(Not run)

[Package randomSurvivalForest version 3.6.3 Index]