dist2 {genefilter} | R Documentation |
Calculate an n-by-n matrix by applying a function to pairs of columns of an m-by-n matrix.
dist2(x, fun=function(a,b) mean(abs(a-b), na.rm=TRUE), diagonal=0)
x |
A matrix, or any object |
fun |
A symmetric function of two arguments that may be columns of x. |
diagonal |
The value to be used for the diagonal elements of the resulting matrix. |
With the default value of fun, this function calculates, for each pair of columns of x, the mean of the absolute values of their differences (which is proportional to the L1-norm of their difference). This is a distance metric.
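The proportionality to the L1-norm can be checked directly in base R. This is a minimal sketch with two made-up vectors a and b standing in for columns of x; it does not call dist2 itself.

```r
set.seed(1)
a <- rnorm(10)   # illustrative stand-ins for two columns of x
b <- rnorm(10)

l1 <- sum(abs(a - b))    # L1-norm of the difference
d  <- mean(abs(a - b))   # what the default fun computes when there are no NAs

# The proportionality constant is 1/m, with m the number of rows.
print(all.equal(d, l1 / length(a)))
```

When some entries are NA, na.rm=TRUE makes mean() average over the non-missing differences only, so the divisor is then the number of complete pairs rather than m.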
The implementation assumes that fun is symmetric, fun(a,b)=fun(b,a). Hence, the returned matrix is symmetric. fun(a,a) is not evaluated; instead, the value of diagonal is used to fill the diagonal elements of the returned matrix.
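The behaviour described above can be sketched in base R as follows. The helper name pairwise_cols is hypothetical and this is not the genefilter source; it only illustrates evaluating fun once per unordered pair of columns, mirroring the result, and filling the diagonal with a fixed value.

```r
# Hypothetical helper for illustration; not the genefilter implementation.
pairwise_cols <- function(x,
                          fun = function(a, b) mean(abs(a - b), na.rm = TRUE),
                          diagonal = 0) {
  x <- as.matrix(x)
  n <- ncol(x)
  # Start with a matrix filled with the diagonal value; fun(a, a) is never called.
  res <- matrix(diagonal, nrow = n, ncol = n,
                dimnames = list(colnames(x), colnames(x)))
  if (n >= 2) {
    for (j in 2:n) {
      for (i in seq_len(j - 1)) {
        # Evaluate fun once per pair and mirror it, relying on symmetry of fun.
        res[i, j] <- res[j, i] <- fun(x[, i], x[, j])
      }
    }
  }
  res
}

set.seed(1)
m <- matrix(rnorm(12), ncol = 3)
d <- pairwise_cols(m)
print(d)
```

If fun is not actually symmetric, a scheme like this still returns a symmetric matrix, but only one of fun(a,b) and fun(b,a) is reflected in it.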
A use for this function is the detection of outlier arrays in a microarray experiment. Assume that each column of x can be decomposed as z+β+ε, where z is a fixed vector (the same for all columns), ε is a vector of nrow(x) i.i.d. random numbers, and β is an arbitrary vector, the majority of whose entries are negligibly small (i.e. close to zero). In other words, z are the probe effects, ε the measurement noise and β the differential expression effects. Under this assumption, all off-diagonal entries of the resulting distance matrix should be approximately the same, namely a multiple of the standard deviation of ε. Arrays whose distance matrix entries differ markedly from the rest give cause for suspicion.
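The outlier check can be tried on simulated data. The sketch below is an illustration under stated assumptions: all sizes, means and standard deviations are made up, the noise is taken to be Gaussian, and the last array is deliberately given noisier measurements. To stay self-contained it uses base R's dist() with method = "manhattan" divided by the number of rows, which matches the default fun when there are no missing values, instead of calling dist2.

```r
set.seed(42)
m <- 1000   # number of probes (rows); illustrative value
n <- 6      # number of arrays (columns); illustrative value

z    <- rnorm(m, mean = 8, sd = 2)                          # probe effects, shared by all arrays
eps  <- matrix(rnorm(m * n, sd = 0.2), nrow = m, ncol = n)  # measurement noise
beta <- matrix(0, nrow = m, ncol = n)                       # differential expression, mostly zero
beta[sample(m, 20), ] <- rnorm(20 * n)                      # a few probes with real effects

x <- z + beta + eps              # z is recycled down each column
x[, n] <- z + rnorm(m, sd = 1)   # make the last array an outlier with larger noise

# Mean absolute difference between every pair of columns.
d <- as.matrix(dist(t(x), method = "manhattan")) / m

# Average distance of each array to all others; the largest value is the suspect.
score   <- rowSums(d) / (n - 1)
suspect <- which.max(score)
print(round(d, 2))
print(suspect)
```

In this setup the distances among the first five arrays should be close to one another, while the row and column belonging to the last array stand out.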
A symmetric matrix of size n x n.
z = matrix(rnorm(15693), ncol=3)
dist2(z)