Supplement 2 for: A new u-statistic with superior
design sensitivity in matched observational studies.

This supplement contains an R session and R code.  It
is provided without guarantees of any kind.

These are the 21 matched pair differences for the 
methotrexate example in section 1.2 of the paper.

> dif
 [1]  0.03 -0.05 -0.02  0.06  0.00  0.05  0.03  0.08 -0.08
[10] -0.06  0.16 -0.03  0.45  0.08  0.25 -0.64  0.32  0.12
[19]  0.38  0.14  0.10

The function uscore computes the ranks for matched pair
differences y with the suggested adjustement for ties
in the web based supplement.

> uscore
function(y,m=5,mb=4,mB=5){
        I<-length(y)
        q<-rep(0,I)
        a<-rank(abs(y))
        for (l in mb:mB){
                q<-q+choose(a-1,l-1)*choose(I-a,m-l)
                }
        q*(abs(y)>0)
}

The following computes the rank scores with non-monotone 
ranks (m,mb,mB) = (8,6,7) as in section 5.2 of the paper.

> q<-uscore(dif,m=8,mb=6,mB=7)

These are the rank scores scaled to have maximum 1 to aid
viewing.  Notice that the largest rank 1.000 is awarded to a 
moderately large difference of 0.25.  In contrast, with monotone 
scores, mB=m, the largest |Y| receives the largest rank.
This is an illustration.  Use monotone scores except with 
long-tailed distributions.

> round(q/max(q),3)
 [1] 0.000 0.005 0.000 0.047 0.000 0.005 0.000 0.231 0.231
[10] 0.047 0.945 0.000 0.466 0.231 1.000 0.000 0.956 0.670
[19] 0.785 0.825 0.508

The following function uses expression (3) in section 2.2
to compute the approximate upper bound on the one sided 
P-value.

> approx
function(y,m=5,mb=4,mB=5,gamma=1){
        rk<-uscore(y,m=m,mb=mb,mB=mB)*(abs(y)>0)
        T<-sum((y>0)*rk)
        ET<-sum(rk)*gamma/(1+gamma)
        VT<-sum(rk*rk)*gamma/((1+gamma)^2)
        dev<-(T-ET)/sqrt(VT)
        1-pnorm(dev)    
}

> approx(dif,m=8,mb=6,mB=7,gamma=2)
[1] 0.02910648

The value above appears in Table 2 at gamma=2, 
(8,6,7).

The functions exact and lambda compute the exact upper
bound on the P-value for small sample sizes I.  The
function exact calls lambda, and the function lambda is 
defined in section 5.2.  The function lambda is a 
recursive function: it calls itself.

> exact
function(y,m=5,mb=4,mB=5,gamma=1){
        I<-length(y)
        y<-y[order(abs(y))]
        rk<-uscore(y,m=m,mb=mb,mB=mB)*(abs(y)>0)
        T<-sum((y>0)*rk)
        lambda(I,T,rk,gamma=gamma)
}

> lambda
function(I,c,rk,gamma=1){
        #calculates Pr(T>=c) where T is a signed rank statistic with ranks in the
        #vector rk, where the user calls lambda with I=length(rk), and gamma is
        #the sensitivity parameter
        kappa<-gamma/(1+gamma)
        if (I==length(rk)) rk<-sort(rk)
        if (c<=0) 1
                else if (I<=0) 0
                else lambda(I-1,c-rk[I],rk,gamma=gamma)*kappa+lambda(I-1,c,rk,gamma=gamma)*(1-kappa)
}

The following computes the exact upper bound on the P-value
for gamma=2 using (8,6,7) as reported in section 5.2

> exact(dif,m=8,mb=6,mB=7,gamma=2)
[1] 0.02611934