Find parameter estimates in an independent strata Generalized Linear Model using projected score equations. DESCRIPTION: Produces an object of class "psglm" which contains components of the solution to projected score equations for a Generalized Linear Model. USAGE: psglm(nu = matrix(1, nrow = dim(y)[1], ncol = dim(y)[2]), y, x, family = gaussian, weights = rep(1, length(y[, 1])), start = NULL, expinfo = T, conditional = T, paramtest = NULL, control = glm.control(epsilon = 0.0001, maxit = 100,...), ...) REQUIRED ARGUMENTS: y: a matrix of response variables with rows corresponding to strata. x: either a three dimensional array of covariates or a matrix of covariates. OPTIONAL ARGUMENTS: nu: A matrix of trials for when y is binomial. family: a family object - a list of functions and expressions for defining the link and inverse functions etc. Cumulants are obtained by turning this into a familytoo object. weights: a vector of weights, one for each stratum. start: a list of two components. The first contains initial estimates for the parameters of interest -- the within strata parameters, the second contains initial estimates for the nuisance parameters -- the parameters associated with the strata totals. conditional: when set to T the projected score equations are solved giving approximate conditional estimates. When set to F the full maximum likelihood score equations are solved. expinfo: a flag to indicate if the computationally faster maximum likelihood score information should be used to solve the projected score equations. If set to F then the projected score efficient information is calculated which requires a numerical differentiation. paramtest: a vector of integers denoting which parameters of interest should be tested using a projected score test. control: a list of iteration and algorithmic constants. See glm.control for their names and default values. In addition to the controls used in glm, psglm has to define a step size for the numerical differentiations. This is taken to be sqrt(eps). VALUE: An object of class "psglm". DETAILS: The output can be examined by print and summary. Some checking is done of the input but not a major amount. In particular no check is done for "conditionally uninformative strata" which at present will cause the estimation routine to run into trouble. These uninformative strata could be removed in a pre-processing step. Parameter estimates for a Generalized Linear Model are obtained by solving projected score equations -- resulting in approximately conditional estimates. The solutions are obtained by iterating between estimating the parameters of interest and estimating the nuisance parameters using either a Fisher scoring or a Newton-Raphson algorithm. The data is assumed to be described by a two way array in which the rows/strata are independent and conditional on an individual stratum parameter, the within strata observations are also independent. Standard examples are conditional logistic regression in case control studies and the one-parameter Rasch model. At present only canonical links are allowed for. Weights are available and have two uses. Firstly they are valuable in reducing the dimensionality of some data sets; identical strata (both response and covariates) can be collapsed into a single stratum with a weight that represents the number of collapsed strata. Secondly dispersion parameters for the Gaussian and Gamma families can be incorporated using weights equal to the inverse of the dispersion parameter. A projected score test is available for hypothesis testing. Currently the null hypothesis is that the selected parameters are equal to zero. There are differences between psglm and glm output because projected scores do not have to correspond to any likelihood and therefore notions such as deviance do not carry over in an unambiguous way. If there is a single stratum, and "conditional" is set to F, then the parameter estimates and standard errors from psglm and glm should be identical. (One caveat: the link for the Gamma family is taken to be -1/mu, so -psglm = glm.) Since psglm uses higher order asymptotics, in particular the third and fourth cumulant functions are needed, some of the common methods used to implement glm fail in psglm. An example is the way that binomial data is treated in glm -- the binomial trials are subsumed into the weights because they only enter the maximum likelihood score equations in a linear manner. In psglm they enter the projected score equation in a non-linear fashion and therefore can not be factored out into the weights. REFERENCES: Waterman, R.P. and Lindsay, B.G. (1996). A simple and accurate method for approximate conditional inference in generalized linear models. JRSSB 58: 177-78. Waterman, R.P. and Lindsay, B.G. (1996). The accuracy of projected score methods in approximating conditional scores. Biometrika 83: 1-14. Small, C.G. and McLeish, D.L. (1989). Projection as a method for increasing sensitivity and eliminating nuisance parameters. Biometrika 76: 693-703. SEE ALSO: familytoo, cumulants. EXAMPLES: # Conditional estimation of item parameters in a Rasch model. source("raschwei.q") summary(psglm(y=y,x=z,family=binomial,weights=w,trace=T, paramtest=c(1,2)),correlation=T) # Estimating a log-odds ratio in a proportional hazards model source("coxreg.q") summary(psglm(nu=n,y=y,x=a,family=binomial,paramtest=c(1))) # An exact conditional estimate of the log-odds ratio for paired # Bernoulli trials. The estimate should be 0.5*log(4/7) and the # estimated standard error 0.5* sqrt(1/7 + 1/4) y <- matrix(c(1,0,0,1),byrow=T,ncol=2) w <- c(4,7) z <- matrix(c(1,-1),ncol = 1) psglm(y = y,x = z,family=binomial,weights = w,expinfo=F)