dperm {exactRankTests} | R Documentation
Density, distribution function and quantile function for the distribution of permutation tests using the Shift-Algorithm by Streitberg and Röhmel.
dperm(x, scores, m, paired=NULL, tol = 0.01, fact=NULL)
pperm(q, scores, m, paired=NULL, tol = 0.01, fact=NULL)
pperm2(q, scores, m, paired=NULL, tol = 0.01, fact=NULL)
qperm(p, scores, m, paired=NULL, tol = 0.01, fact=NULL)
rperm(n, scores, m)
x, q | vector of quantiles.
p | vector of probabilities.
scores | ranks, midranks or real-valued scores of the observations of the x sample (first m elements) and the y sample.
m | sample size of the x sample. If m = length(scores), scores of paired observations are assumed.
paired | logical. Indicates whether paired observations are used. Needed to distinguish between a paired problem and the distribution of the total sum of the scores (which has mass 1 at the point sum(scores)).
tol | real. Real-valued scores are mapped into integers by multiplication. The factor is chosen so that the absolute difference between the "true" quantile and the approximated quantile is less than tol. This might not be possible due to memory/time limitations.
fact | real. If fact is given, real-valued scores are mapped into integers using fact as the factor; tol is ignored.
n | number of observations.
The exact distribution of the sum of the first m scores is evaluated using the Shift-Algorithm by Streitberg and Röhmel under the hypothesis of exchangeability (or, equivalently, the hypothesis that all permutations of the scores are equally likely). The algorithm is able to deal with tied scores, so the conditional distribution can be evaluated.
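For small samples the same distribution can be obtained by brute force, which makes the definition concrete. The following is a minimal sketch in base R, not the Shift-Algorithm itself; `perm_dist` is a hypothetical helper name that does not exist in exactRankTests.

```r
# Brute-force permutation distribution of the sum of the first m scores.
# Base R only; feasible for small samples, unlike the Shift-Algorithm.
# perm_dist is a hypothetical helper, not part of exactRankTests.
perm_dist <- function(scores, m) {
  # under exchangeability every subset of size m is equally likely
  sums <- combn(scores, m, FUN = sum)
  tab <- table(sums)
  data.frame(value = as.numeric(names(tab)),
             prob  = as.numeric(tab) / length(sums))
}

# tied sample: midranks are used as scores
x <- c(0.5, 0.5, 0.6, 0.6, 0.7, 0.8, 0.9)
y <- c(0.5, 1.0, 1.2, 1.2, 1.4, 1.5, 1.9, 2.0)
r <- rank(c(x, y))
d <- perm_dist(r, length(x))

stat <- sum(r[seq_along(x)])        # observed sum of the x scores
p_lower <- sum(d$prob[d$value <= stat])  # one-sided P(T <= stat)
p_lower
```

Because ties are kept in `r`, the result is the conditional distribution given the observed tie pattern.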
The algorithm is defined for positive integer-valued scores only. There are two ways of dealing with real-valued scores. First, one can try to find integer-valued scores that lead to quantiles which differ by no more than tol from the quantiles induced by the original scores. This can be done as follows.
Without loss of generality let $a_i > 0$ denote real-valued scores and $f$ a positive factor. Let $R_i = f \cdot a_i - \mathrm{round}(f \cdot a_i)$. Then
$$\sum_{i=1}^m f \cdot a_i = \sum_{i=1}^m \left( \mathrm{round}(f \cdot a_i) + R_i \right).$$
Clearly, the maximum difference between $\sum_{i=1}^m f \cdot a_i$ and $\sum_{i=1}^m \mathrm{round}(f \cdot a_i)$ is given by $\left|\sum_{i=1}^m R_{(i)}\right|$ or $\left|\sum_{i=m+1}^N R_{(i)}\right|$, respectively, where $R_{(i)}$ denotes the ordered $R_i$. Therefore one searches for $f$ with
$$\max\left(\left|\sum_{i=1}^m R_{(i)}\right|, \left|\sum_{i=m+1}^N R_{(i)}\right|\right) / f \le tol.$$
If $f$ induces more than 20,000 columns in the Streitberg-Röhmel Shift-Algorithm, $f$ is restricted to the largest integer that does not.
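The search for a suitable factor can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the rounding residuals are taken as f * a - round(f * a), only integer factors are tried, and the cap `max_f` stands in loosely for the column limit. `find_factor` and `max_f` are hypothetical names, not the package's implementation.

```r
# Hypothetical helper, not part of exactRankTests: smallest integer factor f
# for which the rounding bound divided by f does not exceed tol.
find_factor <- function(a, m, tol = 0.01, max_f = 1000) {
  stopifnot(all(a > 0), m >= 1, m < length(a), max_f >= 1)
  N <- length(a)
  for (f in seq_len(max_f)) {
    R <- sort(f * a - round(f * a))   # ordered rounding residuals R_(i)
    bound <- max(abs(sum(R[1:m])), abs(sum(R[(m + 1):N]))) / f
    if (bound <= tol) return(list(f = f, bound = bound))
  }
  list(f = max_f, bound = bound)      # tolerance not reached within max_f
}

a <- c(0.5, 1.25, 2.75, 3.1, 4.6)     # made-up real-valued scores
res <- find_factor(a, m = 2)
res$f
int_scores <- round(res$f * a)        # integer scores for the shift algorithm
int_scores
```

A larger factor gives a tighter approximation but a wider range of integer sums, which is why some cap on the factor is needed in practice.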
The second idea is to map the scores themselves into {1, ..., N}. This induces additional ties, but the shape of the scores remains very similar. In other words, we do not try to approximate the original test but use a different test (with integer scores) that serves the same purpose because its scores have a similar shape.
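As an illustration of the second idea, the sketch below rescales normal quantile scores to integers in base R. It follows the shift-and-round rescaling used in the examples on this page, which yields integers between 0 and N rather than strictly {1, ..., N}; the seed and sample size are arbitrary choices.

```r
set.seed(1)                           # arbitrary seed, for reproducibility only
z <- rnorm(20)                        # made-up pooled sample
N <- length(z)
scores <- qnorm(rank(z) / (N + 1))    # normal quantile (v.d. Waerden) scores

# shift to start at 0, rescale to [0, N], then round to integers
int_scores <- scores - min(scores)
int_scores <- round(int_scores * N / max(int_scores))

range(int_scores)                     # integer range after rescaling
sum(duplicated(int_scores))           # number of additional ties induced
cor(scores, int_scores)               # shape is preserved closely
```

The correlation close to 1 is what "similar shape" means here; the ties are the price of rounding.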
Exact two-sided p-values are computed as suggested in the StatXact-4 manual, page 209, equation (9.32).
dperm gives the density, pperm gives the distribution function and qperm gives the quantile function. pperm2 can be used for the calculation of two-sided p-values. rperm is just a one-line wrapper to sample.
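What such a wrapper around sample might look like is sketched below. This is an assumption about its behaviour (drawing the sum of m scores sampled without replacement, n times), not the package's actual definition; `rperm_sketch` is a hypothetical name.

```r
# Hypothetical stand-in for rperm(); the package's own definition may differ.
# Draws n realisations of the sum of m scores sampled without replacement.
# Assumes length(scores) > 1 (sample() treats a single number differently).
rperm_sketch <- function(n, scores, m) {
  replicate(n, sum(sample(scores, m)))
}

set.seed(42)                          # arbitrary seed, for reproducibility only
r <- rank(c(0.5, 0.5, 0.6, 0.6, 0.7, 0.8, 0.9,
            0.5, 1.0, 1.2, 1.2, 1.4, 1.5, 1.9, 2.0))
draws <- rperm_sketch(1000, r, 7)
mean(draws)                           # close to 7 * mean(r) in expectation
```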
Torsten Hothorn <Torsten.Hothorn@rzmail.uni-erlangen.de>
Bernd Streitberg and Joachim Röhmel (1986). Exact Distributions For Permutations and Rank Tests: An Introduction to Some Recently Published Algorithms. Statistical Software Newsletter 12, No. 1, 10-17.
Bernd Streitberg and Joachim Röhmel (1987). Exakte Verteilungen für Rang- und Randomisierungstests im allgemeinen c-Stichprobenfall. EDV in Medizin und Biologie 18, No. 1, 12-19.
# exact one-sided p-value of the Wilcoxon test for a tied sample
x <- c(0.5, 0.5, 0.6, 0.6, 0.7, 0.8, 0.9)
y <- c(0.5, 1.0, 1.2, 1.2, 1.4, 1.5, 1.9, 2.0)
r <- rank(c(x,y))
pperm(sum(r[seq(along=x)]), r, 7)

# Compare the exact algorithm as implemented in ctest and the
# Streitberg-Roehmel for untied samples

# Wilcoxon:
n <- 10
x <- rnorm(n, 2)
y <- rnorm(n, 3)
r <- rank(c(x,y))

# exact distribution using Streitberg-Roehmel
dwexac <- dperm((n*(n+1)/2):(n^2 + n*(n+1)/2), r, n)
su <- sum(dwexac)   # should be something near 1 :-)
su
# compare with a tolerance; an exact floating-point test may fail spuriously
if (!isTRUE(all.equal(su, 1))) stop("sum(dwexac) not equal 1")

# exact distribution using dwilcox
dw <- dwilcox(0:(n^2), n, n)

# compare the two distributions:
plot(dw, dwexac, main="Wilcoxon", xlab="dwilcox", ylab="dperm")
# should give a "perfect" line

# Wilcoxon signed rank test
n <- 10
x <- rnorm(n, 5)
y <- rnorm(n, 5)
r <- rank(abs(x - y))
pperm(sum(r[x - y > 0]), r, length(r))
wilcox.test(x,y, paired=TRUE, alternative="less")
psignrank(sum(r[x - y > 0]), length(r))

# Ansari-Bradley
n <- 10
x <- rnorm(n, 2, 1)
y <- rnorm(n, 2, 2)

# exact distribution using Streitberg-Roehmel
r <- rank(c(x,y))
sc <- pmin(r, 2*n - r +1)
dabexac <- dperm(0:(n*(2*n+1)/2), sc, n)
sum(dabexac)
tr <- which(dabexac > 0)

# exact distribution using dansari (wrapper to ansari.c in ctest)
dab <- dansari(0:(n*(2*n+1)/2), n, n)

# compare the two distributions:
plot(dab[tr], dabexac[tr], main="Ansari", xlab="dansari", ylab="dperm")

# real scores are allowed (but only result in an approximation)
# e.g. v.d. Waerden test
n <- 10
x <- rnorm(n)
y <- rnorm(n)
N <- length(x) + length(y)
r <- rank(c(x,y))
scores <- qnorm(r/(N+1))
X <- sum(scores[seq(along=x)])  # <- v.d. Waerden normal quantile statistic

# critical value, two-sided test
abs(qperm(0.025, scores, length(x)))

# p-values
p1 <- pperm2(X, scores, length(x))

# generate integer valued scores with the same shape as normal quantile
# scores, this is no longer v.d. Waerden, but something very similar
scores <- scores - min(scores)
scores <- round(scores*N/max(scores))
X <- sum(scores[seq(along=x)])
p2 <- pperm2(X, scores, length(x))

# compare p1 and p2
p1 - p2

# the blood pressure example from StatXact:
treat <- c(94, 108, 110, 90)
contr <- c(80, 94, 85, 90, 90, 90, 108, 94, 78, 105, 88)

# compute the v.d. Waerden test and compare the results to StatXact:
r <- rank(c(contr, treat))
sc <- qnorm(r/16)
X <- sum(sc[seq(along=contr)])
round(pperm(X, sc, 11), 4)   # == 0.0462 (StatXact)
round(pperm2(X, sc, 11), 4)  # == 0.0799 (StatXact)

# the alternative method returns:
sc <- sc - min(sc)
sc <- round(sc*16/max(sc))
X <- sum(sc[seq(along=contr)])
round(pperm(X, sc, 11), 4)   # compare to 0.0462
round(pperm2(X, sc, 11), 4)  # compare to 0.0799

# check if [qp]wilcox and [pq]signrank are equal with [qp]perm
source(system.file("test1.R", package="exactRankTests")[1])
source(system.file("test2.R", package="exactRankTests")[1])