I'll soon put another paper here that offers one approach to making the distinction, but it's not ready yet. Here are the slides from a recent talk.
The use of information theory in model selection is not new. The AIC (Akaike information criterion) originated as an unbiased estimate of the relative entropy, a key notion in comparing the lengths of codes. More closely tied to coding are MDL and stochastic complexity, both proposed by Rissanen.
MDL (minimum description length) is typically used in an asymptotic form which assumes many observations (large n) and fixed parameters. In this setting, MDL agrees with BIC, the large sample approximation to Bayes factors. Both penalize the likelihood by (1/2) log n for each added parameter.
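In symbols (standard large-sample forms, stated here for orientation rather than taken from the papers above): with maximized likelihood $\hat L_k$ for a model having $k$ parameters fit to $n$ observations, both criteria pick the model minimizing a penalized negative log-likelihood,

```latex
\mathrm{BIC}(k) \;=\; -\log \hat L_k + \tfrac{k}{2}\log n
\qquad
\mathrm{MDL}(k) \;\approx\; -\log \hat L_k + \tfrac{k}{2}\log n + O(1) .
```

(BIC is often written on the doubled scale as $-2\log\hat L_k + k\log n$.) The shared $\tfrac{k}{2}\log n$ term is the "(1/2) log n per parameter" penalty and is why MDL and BIC agree in this fixed-parameter, large-n regime.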
This R package implements the Polyshrink estimator described in the paper.
The prior manuscript is here as well, but is missing the figures and one or two references. The new version differs from the prior manuscript in many ways. For example, we no longer use subsampling, we use binomial variances, and we have included a 5-fold cross-validation that compares the predictions of stepwise selection to those of the tree-based classifiers C4.5 and C5.0.
For the truly adventurous, a compressed tar file holds all of the source code used for fitting the big models in this paper (written in C and a bit of C++). To build the program you need a Unix system with gcc; the build is fairly automatic if you have done this sort of thing before (see the README file). The software is distributed under the GNU General Public License (GPL). You can get a "sanitized" portion of the data in this compressed tar file (be patient: the file is a bit more than 24 MB). The data layout follows the format needed by C4.5. Each file represents a fold of 100,000 cases. Further instructions are at the top of the names file.
To see a collection of papers that consider credit modeling more generally, go to the Wharton Financial Institutions Center for proceedings from the Credit Risk Modeling and Decisioning conference, which was held here at Wharton, May 29-30, 2002.
A paper co-authored with J. Pickands appears in Biometrika, 84, 295-308.
A second paper considers two issues: the covariance structure of HMMs (including multivariate HMMs) and the use of model selection methods based on these covariances to find the order of the model (that is, the dimension of the underlying Markov chain). The idea is to exploit the connection between the order of the HMM and the implied ARMA structure of the covariances.
Autocovariance structure of Markov regime switching models and model selection, written with Jing Zhang (who did the hard parts), is to appear in the Journal of Time Series Analysis.
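The covariance connection is easy to see numerically. Here is a sketch (in Python, not from the paper; the two-state chain, state means, and noise variance are made-up illustration values) that computes the autocovariances of a two-state regime-switching model directly from its transition matrix. For lags k ≥ 1 they decay geometrically at the second eigenvalue of the chain, the signature of a low-order ARMA structure.

```python
import numpy as np

# Hypothetical two-state regime-switching model: X_t = mu[S_t] + eps_t,
# with S_t a stationary Markov chain and eps_t white noise (made-up values).
p, q = 0.9, 0.8                               # P(stay in state 0), P(stay in state 1)
P = np.array([[p, 1 - p], [1 - q, q]])        # transition matrix
pi = np.array([1 - q, 1 - p]) / (2 - p - q)   # stationary distribution
mu = np.array([0.0, 3.0])                     # state-dependent means
sigma2 = 1.0                                  # white-noise variance
m = pi @ mu                                   # overall mean of X_t

def gamma(k):
    """Autocovariance of X_t at lag k, computed from the chain itself."""
    if k == 0:
        return pi @ (mu - m) ** 2 + sigma2
    Pk = np.linalg.matrix_power(P, k)
    # E[(mu[S_0] - m)(mu[S_k] - m)] under the stationary chain
    return (pi * (mu - m)) @ (Pk @ (mu - m))

lam = p + q - 1   # second eigenvalue of P
for k in range(1, 5):
    print(k, round(gamma(k), 4), round(gamma(k + 1) / gamma(k), 4))
# for k >= 1 the ratio gamma(k+1)/gamma(k) equals lam = 0.7
```

Sample autocovariances of a simulated series estimate these same quantities, which is what makes selecting the order of the hidden chain from the covariance structure feasible.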
Examples for each show how to do kernel density estimation, robust regression, and bootstrap resampling. Additional articles focus on three extensions of LispStat.
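For readers who want the flavor of the bootstrap example without LispStat, here is a minimal sketch in Python (made-up data; the sample size and number of replicates are arbitrary choices) of bootstrap resampling for the standard error of a mean.

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.exponential(scale=2.0, size=50)   # made-up sample

# Resample the data with replacement and recompute the mean each time
n_boot = 1000
boot_means = np.array([
    rng.choice(data, size=data.size, replace=True).mean()
    for _ in range(n_boot)
])

se_boot = boot_means.std(ddof=1)                     # bootstrap standard error
se_formula = data.std(ddof=1) / np.sqrt(data.size)   # textbook formula
print(se_boot, se_formula)                           # the two should be close
```

For a statistic with no convenient standard-error formula, only the last lines change; the resampling loop stays the same.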
This paper illustrates the use of an interactive plotting tool implemented in Lisp-Stat to reveal the simple relationship among partial regression plots, component plots, and the variance inflation factor. The data sets from the paper are in the files fighters.dat and wildcats.dat.
The basic idea is that the ratio of t-statistics associated with these two plots is the square root of the associated VIF. The interactive plot shows how collinearity affects the relationship presented in regression diagnostic plots. One uses a slider to control how much of the present collinearity appears in the plot.
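That square-root relationship can be checked numerically. The sketch below (Python with made-up collinear data, not the Lisp-Stat tool; the component plot is interpreted here as a component-plus-residual plot) computes the t-statistic implied by each plot and compares their ratio with the square root of the VIF.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x2 = rng.normal(size=n)
x1 = 0.8 * x2 + 0.6 * rng.normal(size=n)   # x1 collinear with x2
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)

# Multiple regression of y on (1, x1, x2)
X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta
s = np.sqrt(e @ e / (n - 3))               # residual standard deviation

# Added-variable (partial regression) plot: x-axis is x1 adjusted for x2
Z = np.column_stack([np.ones(n), x2])
rx = x1 - Z @ np.linalg.lstsq(Z, x1, rcond=None)[0]
t_partial = beta[1] * np.sqrt(rx @ rx) / s

# Component-plus-residual plot: x-axis is the raw (centered) x1
x1c = x1 - x1.mean()
t_component = beta[1] * np.sqrt(x1c @ x1c) / s

vif = (x1c @ x1c) / (rx @ rx)              # = 1 / (1 - R^2 of x1 on x2)
print(t_component / t_partial, np.sqrt(vif))   # equal, by the algebra above
```

Both plots share the slope b1 and the residuals of the full fit; only the spread on the x-axis differs, and the ratio of those spreads is exactly the VIF.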
For more background information, have a look at my paper "An Introduction to Bootstrap Methods" (which appeared in Sociological Methods & Research back in 1989).
std::cout << make_unary_range(ret<double>(6.6 + _1), iz);  // call reconstructed from a truncated line; ret<double>(...) declares the lambda's return type
iz is a vector of doubles; this lambda function defines a new range whose elements are 6.6 plus those in the range defined by the container iz. Notice that the lambda function has to declare its return type explicitly.
Two useful files of documentation are axis.ps, which offers an overview of the use of the AXIS interface, and princomp.ps, which shows how to extend the interface by adding a command to perform principal components analysis. The associated lisp files for adding commands are
Some sample data sets for use with AXIS are:
This talk (ppt slides) looks at data mining from a business and modeling point of view, adding the comments of a statistician who has built models for business problems such as financial modeling and credit risk analysis. A pdf version of the powerpoint slides is available as well.