dist2 {genefilter}    R Documentation

Calculate an n-by-n matrix by applying a function to pairs of columns of an m-by-n matrix.

Description

Calculate an n-by-n matrix by applying a function to pairs of columns of an m-by-n matrix.

Usage

  dist2(x, fun=function(a,b) mean(abs(a-b), na.rm=TRUE), diagonal=0)

Arguments

x

A matrix, or any object x for which ncol(x) and x[,j] return appropriate results.

fun

A symmetric function of two arguments that may be columns of x.

diagonal

The value to be used for the diagonal elements of the resulting matrix.

Details

With the default value of fun, this function calculates for each pair of columns of x the mean of the absolute values of their differences (which is proportional to the L1-norm of their difference). This is a distance metric.
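The relation between the default fun and the L1-norm can be checked by hand. The following is a minimal sketch in base R; the vectors a and b and the name default_fun are made up for illustration and are not part of the package.

```r
# Illustrative vectors (hypothetical data, not from the package)
a <- c(1, 5, 2, 8)
b <- c(3, 5, 0, 4)

# Same expression as the default value of the 'fun' argument
default_fun <- function(a, b) mean(abs(a - b), na.rm = TRUE)

d_mean <- default_fun(a, b)   # mean absolute difference
l1     <- sum(abs(a - b))     # L1-norm of the difference

# Without missing values, the two differ by the factor length(a)
all.equal(d_mean, l1 / length(a))
```

Note that when a or b contain NA, na.rm=TRUE averages over the remaining positions only, so the proportionality factor then depends on the number of non-missing pairs.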

The implementation assumes that fun is symmetric, i.e. fun(a,b)=fun(b,a). Hence, the returned matrix is symmetric. fun(a,a) is not evaluated; instead, the value of diagonal is used to fill the diagonal elements of the returned matrix.
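The behaviour described above can be sketched as follows. This is a hypothetical re-implementation written for illustration (the name dist2_sketch is an assumption); it is not the source code of genefilter and may differ from it in details.

```r
# Hypothetical sketch of the described behaviour; NOT the genefilter source.
dist2_sketch <- function(x,
                         fun = function(a, b) mean(abs(a - b), na.rm = TRUE),
                         diagonal = 0) {
  n <- ncol(x)
  # Start with a matrix filled with 'diagonal'; off-diagonal cells are overwritten
  res <- matrix(diagonal, nrow = n, ncol = n)
  dimnames(res) <- list(colnames(x), colnames(x))
  if (n >= 2) {
    for (j in 2:n) {
      for (i in 1:(j - 1)) {
        # Evaluate fun once per unordered pair and mirror the result,
        # which is only valid if fun(a, b) == fun(b, a)
        res[i, j] <- res[j, i] <- fun(x[, i], x[, j])
      }
    }
  }
  res
}

# Small made-up example: columns 'a' and 'c' are identical
x <- cbind(a = c(1, 2, 3), b = c(2, 4, 6), c = c(1, 2, 3))
d <- dist2_sketch(x)
d
```

Because fun is called only for i < j, a non-symmetric fun would still yield a symmetric matrix in this sketch, but one that reflects only one of the two argument orders.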

One use for this function is the detection of outlier arrays in a microarray experiment. Assume that each column of x can be decomposed as z+β+ε, where z is a fixed vector (the same for all columns), ε is a vector of nrow(x) i.i.d. random numbers, and β is an arbitrary vector most of whose entries are negligibly small (i.e. close to zero). In other words, z represents the probe effects, ε the measurement noise, and β the differential expression effects. Under this assumption, all off-diagonal entries of the resulting distance matrix should be approximately the same, namely a multiple of the standard deviation of ε. Arrays whose distance matrix entries differ markedly from the rest give cause for suspicion.
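The outlier-detection use described above can be illustrated with simulated data. The sketch below is an assumption-laden toy example: the noise is taken to be Gaussian, β is set to zero, the sizes and standard deviations are arbitrary, and base R's dist() with method="manhattan", divided by the number of rows, is used as a stand-in for the default dist2 (the two agree when there are no missing values).

```r
set.seed(1)
m <- 1000   # number of probes (rows); arbitrary choice
n <- 6      # number of arrays (columns); arbitrary choice

z <- rnorm(m, mean = 8, sd = 2)                 # probe effects, shared by all arrays
x <- z + matrix(rnorm(m * n, sd = 0.1), m, n)   # add i.i.d. measurement noise
x[, 6] <- z + rnorm(m, sd = 0.5)                # array 6 gets larger noise: simulated outlier

# Mean absolute difference between columns (stand-in for dist2(x))
d <- as.matrix(dist(t(x), method = "manhattan")) / m

# Average distance of each array to all other arrays
avg <- rowSums(d) / (n - 1)
suspect <- unname(which.max(avg))
suspect
```

For Gaussian noise with standard deviation σ, the expected mean absolute difference between two well-behaved arrays is 2σ/sqrt(pi), so the entries of d among arrays 1 to 5 should lie close to 0.113 here, while the row and column of array 6 stand out.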

Value

A symmetric matrix of size n x n.

Examples

  z = matrix(rnorm(15693), ncol=3)
  dist2(z)

[Package genefilter version 1.36.0 Index]