BY RICHARD WATERMAN
Department of Statistics, Wharton School, University of Pennsylvania
DONALD RUBIN
Department of Statistics, Harvard University
NEAL THOMAS
Datametrics Research Inc.
ANDREW GELMAN
Department of Statistics, Columbia University
Draft of June 9, 1999
The authors acknowledge the contribution of many individuals to both the design and construction of the model. Paul Kleindorfer and Michael Crew were instrumental in the design of the underlying econometric models. Without the inputs from the U.S. Postal Service, provided by Ross Bailey, John Reynolds, and their staff, there would be no data on which to run the model. The members of the LINX DQS team turned the Postal data into formats suitable for input to the economic models. Many discussions with both the Postal Rate Commission and the General Accounting Office helped steer the modeling in relevant directions.
Our involvement with the Postal Service costing dialog began as members of the LINX team on a large-scale Data Quality Study (DQS) of Postal Service data inputs to the Postal Service rate-making process. This study took place from June 1997 to April 1999 and is fully described in the Summary Report and four supporting Technical Reports. Though the study was broad in its approach, including economic and statistical studies and an industry survey, one component involved the construction of a simulation model to investigate a variety of questions, including the overall quality of specific marginal cost estimates as well as issues and concerns raised by intervenors during various Postal Service rate hearings. This paper describes the rationale for the simulation model, explains the key ideas on which it is founded, illustrates its use, and expands on some of the insights the model provided.
In particular, a benefit of the simulation model approach is that it forces users to think hard about their assumptions and to focus on exactly what needs to be measured. In doing so, it provides a means of exploring conjectures, and their consequences, from different or even opposing viewpoints.
Accurate costing of products is an essential activity within any large company with a diverse product mix, and it is a key requirement for identifying the organization's ultimate profitability. A diverse and complex product mix is likely to require an involved process to reveal individual product costs. In these circumstances it can be a major achievement simply to arrive at a product-level cost estimate. However, there is a second and even more demanding dimension to the cost estimation process: to ask how reliably, in terms of precision and accuracy, those costs have been estimated. If we agree that it is important to estimate costs, then it is clearly equally important to quantify the quality of those cost estimates. Cooper and Kaplan (1991) discuss possible reasons for, and the impact of, measurement errors in cost management systems.
The simulation model is one way to approach this second-level question: the question that asks not simply ``how should we estimate costs?'', but adds the caveats ``how well have these costs been estimated?'' and ``what are the likely consequences of potential errors in the cost estimation process?''.
In addition, the annual Cost and Revenue Analysis (CRA) presents attributable costs for numerous categories of mail and services. Though the CRA costs are based on accounting records, the accounts do not differentiate the costs by class and subclass of mail. In order to provide this breakdown by mail class and subclass, additional sources of information have to be utilized. These sources include large-scale multi-stage sample surveys, operating data systems, and special-purpose econometric studies. Data from these sources most often make their appearance in (i) the distribution keys used to distribute the attributable cost, and (ii) the elasticities of accrued component cost with respect to the cost driver. Because of the diversity of the inputs to the cost calculations, it is extremely difficult to characterize analytically the quality of the resulting cost estimates, even though this is a very legitimate question to ask. Further, one of the cost measures of interest to the Postal Service, the marginal cost estimate (Unit Volume Variable Cost in Postal Service parlance), is calculated by combining four multiplicative factors: the accrued cost of the component, the elasticity (volume variability) of that cost with respect to its cost driver, the distribution key share that allocates the volume-variable cost to the subclass, and the reciprocal of the subclass volume.
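In our own shorthand notation (a sketch of the structure just described, not an official Postal Service formula), the unit volume-variable cost of subclass $j$ arising from cost component $i$ combines as
$$\widehat{MC}_{ij} \;=\; \frac{\hat{C}_i \,\hat{\varepsilon}_i \,\hat{d}_{ij}}{\hat{V}_j},$$
where $\hat{C}_i$ is the accrued component cost, $\hat{\varepsilon}_i$ the elasticity of that cost with respect to its driver, $\hat{d}_{ij}$ the distribution key share, and $\hat{V}_j$ the subclass volume.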
An estimate of each of the four components must be derived, and it is not at all obvious how the uncertainty in each component relates to the uncertainty in the overall marginal cost estimate. One immediate practical application of measuring this uncertainty is that it enables the analyst to begin addressing the following question: ``if there were one million dollars to spend on improved information collection, where should those dollars be spent: on better elasticity estimates, distribution key shares, or volume estimates?'' Furthermore, the simulation model helps direct the analyst to the specific cost components (for example Delivery, Transportation, or Mail Processing) where better component estimates would provide significantly better overall cost estimates.
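As an illustration of this idea only, and not the DQS model itself, the following Python sketch (with made-up point estimates and made-up error magnitudes) propagates lognormal errors through the multiplicative formula and decomposes the resulting uncertainty by input:

import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical point estimates for one cost component and one subclass.
accrued_cost = 1.0e9   # accrued component cost, dollars
elasticity = 0.75      # elasticity of cost with respect to the driver
key_share = 0.20       # distribution key share for the subclass
volume = 4.0e9         # subclass volume, pieces

# Assumed (illustrative) coefficients of variation for each input.
cv = {"cost": 0.01, "elasticity": 0.10, "key": 0.05, "volume": 0.02}

def draw(point, c, size):
    # Lognormal draws whose mean equals the point estimate.
    sigma = np.sqrt(np.log(1.0 + c**2))
    return point * rng.lognormal(-sigma**2 / 2.0, sigma, size)

cost = draw(accrued_cost, cv["cost"], n)
elas = draw(elasticity, cv["elasticity"], n)
key = draw(key_share, cv["key"], n)
vol = draw(volume, cv["volume"], n)

# Unit volume-variable (marginal) cost: the multiplicative combination.
mc = cost * elas * key / vol
print(f"marginal cost: mean {mc.mean():.4f}, cv {mc.std() / mc.mean():.3f}")

# With independent multiplicative errors, the variances of the log factors
# add; each input's share of the total shows where better data help most.
total = sum(np.log(1.0 + c**2) for c in cv.values())
for name, c in cv.items():
    print(f"{name:10s} share of log-variance: {np.log(1.0 + c**2) / total:.2%}")

With these made-up inputs the elasticity accounts for most of the uncertainty in the marginal cost, so, in this hypothetical, the million dollars would be better spent on the econometric elasticity studies than on the sampling systems.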
Information Technology is often cited as the key driver of current productivity increases. A sometimes overlooked component is the raw material of the IT system itself: the data and information that these systems work with. If the IT system is, ideally, a machine that constructs knowledge, then the DQS examined the sometimes less than glamorous, but clearly essential, raw material inputs to that machine. Simply put, without quality inputs there are unlikely to be quality outputs.
Value from an endeavor often arises indirectly, even serendipitously, and that appears to have been the case during the implementation of the simulation model. The reason value from the model may be gained indirectly is that the construction of an acceptable cost estimation model requires dialog concerning:
Two examples illustrate the manner in which the model led to potentially useful insights. The project had initially focused entirely on marginal cost estimates for the mail subclasses of interest. After considering results from the model, parties to the project began to focus attention also on relative marginal costs, which represented a major change in the main outcome measure of interest. Further, concern had been expressed over the consequences of a recent decrease in data collection resources for the core statistical sampling systems. The simulation model suggested that this was not of primary concern, because other components in the cost estimation process contributed more to the overall uncertainty of the cost estimates. This example illustrates how the simulation model can focus attention on those parts of the process that are most influential with respect to the outcome of interest.
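One possible reason relative marginal costs are attractive as an outcome measure, offered here as our own conjecture rather than a finding of the study, is that estimation errors shared across subclasses cancel in ratios. A minimal hypothetical sketch:

import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Hypothetical: two subclasses share the same accrued cost and elasticity
# estimates (and hence the same estimation error); only their distribution
# key shares are estimated separately.
shared = rng.lognormal(0.0, 0.10, n)         # common error in cost x elasticity
key_a = 0.20 * rng.lognormal(0.0, 0.05, n)   # key share, subclass A
key_b = 0.10 * rng.lognormal(0.0, 0.05, n)   # key share, subclass B

mc_a = shared * key_a
mc_b = shared * key_b

for label, x in [("MC_A alone", mc_a), ("MC_A / MC_B", mc_a / mc_b)]:
    print(f"{label:12s} cv = {x.std() / x.mean():.3f}")

# The shared factor cancels in the ratio, so in this hypothetical the
# relative cost is estimated more precisely than either marginal cost alone.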
Cooper, R. and Kaplan, R. S. (1991). The Design of Cost Management Systems. Prentice Hall, Englewood Cliffs, NJ.