COST‐BENEFIT CONSIDERATIONS IN CHOOSING AMONG CROSS‐VALIDATION METHODS

Research output: Contribution to journal › Article › peer-review

Abstract

There are two general methods of cross‐validation: (a) empirical estimation, and (b) formula estimation. In choosing a specific cross‐validation procedure, one should consider both costs (e.g., inefficient use of available data in estimating regression parameters) and benefits (e.g., accuracy in estimating population cross‐validity). Empirical cross‐validation methods involve significant costs, since they are typically laborious and wasteful of data, but under conditions represented in Monte Carlo studies, they are generally not more accurate than formula estimates. Consideration of costs and benefits suggests that empirical estimation methods are typically not worth the cost, except in a limited number of cases in which Monte Carlo sampling assumptions are not met in the derivation sample. Designs that use multiple samples to estimate the cross‐validity of a single regression equation are clearly preferable to single‐sample designs; the latter are never expected to be more accurate than formula estimates and thus are never worth the cost. Multi‐equation designs are more accurate than single‐equation designs, but they appear to estimate the wrong parameter, and thus are difficult to interpret.
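The contrast between the two approaches can be sketched in code. The abstract does not name a specific formula or design, so everything below is an assumption made for illustration: the formula is a Stein-type cross-validity estimate, the empirical method is a single-sample split into halves, and all function names are hypothetical.

```python
import random


def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y on two predictors; returns (intercept, b1, b2)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    det = s11 * s22 - s12 * s12  # determinant of the 2x2 normal equations
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return my - b1 * m1 - b2 * m2, b1, b2


def squared_correlation(a, b):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    saa = sum((v - ma) ** 2 for v in a)
    sbb = sum((v - mb) ** 2 for v in b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    return sab * sab / (saa * sbb)


def sample_r2(x1, x2, y):
    """Squared multiple correlation when the equation is fit on all cases."""
    b0, b1, b2 = fit_two_predictors(x1, x2, y)
    preds = [b0 + b1 * u + b2 * v for u, v in zip(x1, x2)]
    return squared_correlation(preds, y)


def formula_estimate(r2, n, k):
    """Stein-type formula estimate of squared cross-validity (assumed formula).

    r2: sample squared multiple correlation, n: sample size, k: predictors.
    """
    shrink = ((n - 1) / (n - k - 1)) * ((n - 2) / (n - k - 2)) * ((n + 1) / n)
    return 1 - shrink * (1 - r2)


def split_sample_estimate(x1, x2, y):
    """Empirical estimate: fit on the first half, validate on the second."""
    half = len(y) // 2
    b0, b1, b2 = fit_two_predictors(x1[:half], x2[:half], y[:half])
    preds = [b0 + b1 * u + b2 * v for u, v in zip(x1[half:], x2[half:])]
    return squared_correlation(preds, y[half:])


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the illustration is repeatable
    n = 60
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    x2 = [rng.gauss(0, 1) for _ in range(n)]
    y = [0.5 * u + 0.3 * v + rng.gauss(0, 1) for u, v in zip(x1, x2)]
    r2 = sample_r2(x1, x2, y)
    print("sample R^2:           ", round(r2, 3))
    print("formula estimate:     ", round(formula_estimate(r2, n, 2), 3))
    print("split-sample estimate:", round(split_sample_estimate(x1, x2, y), 3))
```

In this sketch the formula estimate uses a regression equation fit on all 60 cases, whereas the split-sample estimate fits the equation on only 30, which is one way to picture the data cost the abstract attributes to empirical methods.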

Original language: English
Pages (from-to): 15-22
Number of pages: 8
Journal: Personnel Psychology
Volume: 37
Issue number: 1
Publication status: Published - Mar 1984
Externally published: Yes
