stripplot {mice} | R Documentation |
Plotting methods for imputed data using lattice.
bwplot produces box-and-whisker plots, stripplot produces one-dimensional scatterplots, densityplot produces plots of the densities, and xyplot produces conditional scatterplots. Each function automatically separates the observed and imputed data. The functions extend the usual features of lattice.
## S3 method for class 'mids'
bwplot(
  x, data,
  na.groups = NULL, groups = NULL,
  as.table = TRUE,
  theme = mice.theme(),
  mayreplicate = TRUE,
  allow.multiple = TRUE, outer = TRUE,
  drop.unused.levels = lattice.getOption("drop.unused.levels"),
  ...,
  subscripts = TRUE, subset = TRUE)

## S3 method for class 'mids'
stripplot(
  x, data,
  na.groups = NULL, groups = NULL,
  as.table = TRUE,
  theme = mice.theme(),
  allow.multiple = TRUE, outer = TRUE,
  drop.unused.levels = lattice.getOption("drop.unused.levels"),
  panel = lattice.getOption("panel.stripplot"),
  default.prepanel = lattice.getOption("prepanel.default.stripplot"),
  jitter.data = TRUE, horizontal = FALSE,
  ...,
  subscripts = TRUE, subset = TRUE)

## S3 method for class 'mids'
densityplot(
  x, data,
  na.groups = NULL, groups = NULL,
  as.table = TRUE,
  plot.points = FALSE,
  theme = mice.theme(),
  mayreplicate = TRUE, thicker = 2.5,
  allow.multiple = TRUE, outer = TRUE,
  drop.unused.levels = lattice.getOption("drop.unused.levels"),
  panel = lattice.getOption("panel.densityplot"),
  default.prepanel = lattice.getOption("prepanel.default.densityplot"),
  ...,
  subscripts = TRUE, subset = TRUE)

## S3 method for class 'mids'
xyplot(
  x, data,
  na.groups = NULL, groups = NULL,
  as.table = TRUE,
  theme = mice.theme(),
  allow.multiple = TRUE, outer = TRUE,
  drop.unused.levels = lattice.getOption("drop.unused.levels"),
  ...,
  subscripts = TRUE, subset = TRUE)
x |
A |
data |
Formula that selects the data to be plotted. This argument follows the lattice rules for formulas, describing the primary variables (used for the per-panel display) and the optional conditioning variables (which define the subsets plotted in different panels) to be used in the plot. The formula is evaluated on the complete data set in the
Extended formula interface: The primary variable terms (both the LHS
For convenience, in The function |
na.groups |
An expression evaluating to a logical vector
indicating which two groups are distinguished (e.g. using
different colors) in the display. The environment in which this
expression is evaluated in the response indicator The default |
groups |
This is the usual |
plot.points |
A logical used in |
theme |
A named list containing the graphical parameters. The
default function |
mayreplicate |
A logical indicating whether colors, line widths,
and so on, may be replicated. The graphical functions attempt to
choose "intelligent" graphical parameters. For example, the same
color can be replicated for different elements, e.g. use all reds
for the imputed data. Replication may be switched off by setting the flag
to |
thicker |
Used in |
jitter.data |
See |
horizontal |
See |
as.table |
See |
panel |
See |
default.prepanel |
See |
outer |
See |
allow.multiple |
See |
drop.unused.levels |
See |
subscripts |
See |
subset |
See |
... |
Further arguments, usually not directly processed by the high-level functions documented here, but instead passed on to other functions. |
The argument na.groups may be used to specify (combinations of) missingness in any of the variables. The argument groups can be used to specify groups based on the variable values themselves. Only one of the two may be active at a time. When both are specified, na.groups takes precedence over groups.
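As a minimal sketch of how na.groups changes the grouping, the calls below reuse the boys data and the na.groups = wgt choice from the examples on this page. The object names imp, p_own and p_wgt are illustrative only, and the seed and printFlag settings are assumptions added to keep the run quiet and repeatable.

```r
library(mice)

# Two iterations keep the sketch fast; seed/printFlag are only for a quiet, repeatable run.
imp <- mice(boys, maxit = 2, seed = 1, printFlag = FALSE)

# Default (na.groups = NULL): hgt values are labeled by their own missingness.
p_own <- stripplot(imp, hgt ~ .imp, pch = c(1, 20))

# na.groups = wgt: the same hgt values, now labeled by the missingness of wgt.
p_wgt <- stripplot(imp, hgt ~ .imp, na.groups = wgt, pch = c(1, 20))
```

Printing p_own and p_wgt side by side shows the same points with different group labels.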
Use subset and na.groups together to plot parts of the data. For example, select the first imputed data set by subset=.imp==1.
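The selection described above might look as follows. This is a sketch rather than a canonical recipe: the formula hgt ~ .imp and the grouping by wgt are borrowed from the examples on this page, and the object names are illustrative.

```r
library(mice)

# Quiet, repeatable imputation (seed and printFlag are assumptions for this sketch).
imp <- mice(boys, maxit = 2, seed = 1, printFlag = FALSE)

# Keep only the rows of the first imputed data set (.imp == 1);
# the points are labeled by the missingness of wgt.
p_first <- stripplot(imp, hgt ~ .imp, subset = .imp == 1, na.groups = wgt)
```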
Graphical parameters like col, pch and cex can be specified in the arguments list to alter the plotting symbols. If length(col)==2, the color specification is used to define the observed and missing groups: col[1] is the color of the 'observed' data, and col[2] is the color of the missing or imputed data. A convenient color choice is col=mdc(1:2), a transparent blue color for the observed data and a transparent red color for the imputed data. A good choice is col=mdc(1:2), pch=20, cex=1.5. These choices can be set for the duration of the session by running mice.theme().
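The two styling choices mentioned on this page can be sketched as below. The variable hgt and the object names are illustrative assumptions; the col, pch and cex values are the ones suggested in the text and in the examples.

```r
library(mice)

# Quiet, repeatable imputation (seed and printFlag are assumptions for this sketch).
imp <- mice(boys, maxit = 2, seed = 1, printFlag = FALSE)

# Suggested choice: blue for observed (col[1]), red for imputed (col[2]).
p_mdc <- stripplot(imp, hgt ~ .imp, col = mdc(1:2), pch = 20, cex = 1.5)

# Alternative from the examples: grey open circles for observed data,
# filled red dots for imputed data.
p_grey <- stripplot(imp, hgt ~ .imp, col = c("grey", mdc(2)), pch = c(1, 20))
```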
The high-level functions documented here, as well as other high-level lattice functions, return an object of class "trellis". The update method can be used to subsequently update components of the object, and the print method (usually called by default) will plot it on an appropriate plotting device.
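A short sketch of working with the returned object, assuming the boys data from the examples; the title text and the names p and p_titled are made up for illustration.

```r
library(mice)

# Quiet, repeatable imputation (seed and printFlag are assumptions for this sketch).
imp <- mice(boys, maxit = 2, seed = 1, printFlag = FALSE)

# Assigning the result stores the "trellis" object without drawing it.
p <- xyplot(imp, hgt ~ age | .imp, pch = c(1, 20), cex = c(1, 1.5))
class(p)

# update() changes components of the stored object, here the main title.
p_titled <- update(p, main = "Height by age, per imputation")

# print() draws the plot on the active graphics device.
print(p_titled)
```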
The first two arguments (x and data) are reversed compared to the standard Trellis syntax implemented in lattice. This reversal was necessary in order to benefit from automatic method dispatch.
In mice the argument x is always a mids object, whereas in lattice the argument x is always a formula.
In mice the argument data is always a formula object, whereas in lattice the argument data is usually a data frame.
All other arguments have an identical interpretation.
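The reversal can be illustrated by placing a standard lattice call next to its mids counterpart. This is only a sketch: the formulas and object names are illustrative, and the lattice call plots the incomplete boys data directly.

```r
library(mice)
library(lattice)

# Standard lattice: x is a formula, data is a data frame.
p_lattice <- xyplot(hgt ~ age, data = boys)

# mice: x is a mids object, data is a formula.
imp <- mice(boys, maxit = 2, seed = 1, printFlag = FALSE)
p_mice <- xyplot(imp, hgt ~ age | .imp)
```

Both calls go through the same generic; the class of the first argument decides which method runs.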
densityplot fails on empty groups, which occur if all observations in the subgroup contain NA. The relevant error message is: Error in density.default: ... need at least 2 points to select a bandwidth automatically. There is as yet no workaround for this problem. Use the more robust bwplot or stripplot as a replacement.
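As a sketch of the suggested replacement, the calls below show head circumference (hc) per imputation with bwplot and stripplot in place of densityplot. The choice of hc and the object names are illustrative assumptions.

```r
library(mice)

# Quiet, repeatable imputation (seed and printFlag are assumptions for this sketch).
imp <- mice(boys, maxit = 2, seed = 1, printFlag = FALSE)

# Instead of densityplot(imp, ~ hc | .imp), which may fail on an empty group:
p_box   <- bwplot(imp, hc ~ .imp)      # box-and-whisker plot per imputation
p_strip <- stripplot(imp, hc ~ .imp)   # one-dimensional scatterplot per imputation
```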
Stef van Buuren
Sarkar, Deepayan (2008) Lattice: Multivariate Data Visualization with R, Springer. http://lmdvr.r-forge.r-project.org/
van Buuren S and Groothuis-Oudshoorn K (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3), 1-67. http://www.jstatsoft.org/v45/i03/
mice, Lattice for an overview of the package, as well as xyplot, densityplot, panel.bwplot, panel.stripplot, panel.densityplot, panel.xyplot, print.trellis, trellis.par.set
library(mice)

imp <- mice(boys, maxit = 2)

### box-and-whisker plot per imputation of all numerical variables
bwplot(imp)

### tv (testicular volume), conditional on region
bwplot(imp, tv ~ .imp | reg)

### same data, organized in a different way
bwplot(imp, tv ~ reg | .imp, theme = list())

### stripplot, all numerical variables
stripplot(imp)

### same, but with improved display
stripplot(imp, col = c("grey", mdc(2)), pch = c(1, 20))

### distribution per imputation of height, weight and bmi
### labeled by their own missingness
stripplot(imp, hgt + wgt + bmi ~ .imp,
  cex = c(2, 4), pch = c(1, 20), jitter.data = FALSE, layout = c(3, 1))

### same, but labeled with the missingness of wgt (just four cases)
stripplot(imp, hgt + wgt + bmi ~ .imp, na.groups = wgt,
  cex = c(2, 4), pch = c(1, 20), jitter.data = FALSE, layout = c(3, 1))

### distribution of age and height, labeled by missingness in height
### most height values are missing for those around
### the age of two years
### some additional missings occur in region WEST
stripplot(imp, age + hgt ~ .imp | reg, na.groups = hgt,
  col = c(hcl(0, 0, 40, 0.2), mdc(2)), pch = c(1, 20))

### heavily jittered relation between two categorical variables
### labeled by missingness of gen
### aggregated over all imputed data sets
stripplot(imp, gen ~ phb, factor = 2, cex = c(8, 1), horizontal = TRUE)

### circle fun
stripplot(imp, gen ~ .imp, na.groups = wgt, factor = 2, cex = c(8, 6),
  horizontal = FALSE, outer = TRUE, scales = "free", pch = c(1, 19))

### density plot of head circumference per imputation
### blue is observed, red is imputed
densityplot(imp, ~ hc | .imp)

### All combined in one panel.
densityplot(imp, ~hc)

### The more powerful density plot of all
### numerical variables with at least
### two missing values.
densityplot(imp)

### xyplot: scatterplot by imputation number
### observe the erroneous outlying imputed values
### (caused by imputing hgt from bmi)
xyplot(imp, hgt ~ age | .imp, pch = c(1, 20), cex = c(1, 1.5))

### same, but label with missingness of wgt (four cases)
xyplot(imp, hgt ~ age | .imp, na.groups = wgt, pch = c(1, 20), cex = c(1, 1.5))