A bootstrapping approach to reduce over-fitting in genetic programming

Jeannie Fitzgerald, R. Muhammad Atif Azad, Conor Ryan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Historically, the quality of a solution in Genetic Programming (GP) was often assessed based on its performance on a given training sample. However, in Machine Learning, we are more interested in achieving reliable estimates of the quality of the evolving individuals on unseen data. In this paper, we propose to simulate the effect of unseen data during training without actually using any additional data. We do this by employing a technique called bootstrapping, which repeatedly re-samples with replacement from the training data and helps estimate the sensitivity of the individual in question to small variations across these re-sampled data sets. We minimise this sensitivity, as measured by the Bootstrap Standard Error, together with the training error, in an effort to evolve models that generalise better to unseen data. We evaluate the proposed technique on four binary classification problems and compare it with a standard GP approach. The results show that, for the problems undertaken, the proposed method not only generalises significantly better than standard GP while also improving training performance, but also has the strong side effect of containing tree sizes.
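
The idea of a Bootstrap Standard Error can be illustrated with a minimal Python sketch. It is not the authors' implementation: the helper names (`error_rate`, `bootstrap_standard_error`, `fitness`), the number of resamples, the toy data, and in particular the weighted sum used to combine training error with the standard error are all assumptions made for illustration, since the abstract does not state how the two terms are combined.

```python
import random
import statistics


def error_rate(predict, samples):
    """Fraction of (features, label) pairs that predict() gets wrong."""
    wrong = sum(1 for features, label in samples if predict(features) != label)
    return wrong / len(samples)


def bootstrap_standard_error(predict, samples, n_resamples=200, seed=0):
    """Standard deviation of the error rate across bootstrap resamples.

    Each resample draws len(samples) items with replacement from the
    training data, so no additional data is needed.
    """
    rng = random.Random(seed)
    n = len(samples)
    errors = []
    for _ in range(n_resamples):
        resample = [samples[rng.randrange(n)] for _ in range(n)]
        errors.append(error_rate(predict, resample))
    return statistics.stdev(errors)


def fitness(predict, samples, weight=1.0):
    """Lower is better. The weighted sum is an assumption of this sketch,
    not a formula taken from the paper."""
    return error_rate(predict, samples) + weight * bootstrap_standard_error(
        predict, samples
    )


# Hypothetical toy data: one feature per sample, binary label.
training = [
    ((0.10,), 0), ((0.40,), 0), ((0.60,), 1),
    ((0.90,), 1), ((0.45,), 1), ((0.55,), 0),
]


def threshold_classifier(features):
    """Stand-in for an evolved GP individual."""
    return int(features[0] > 0.5)


print("training error:", error_rate(threshold_classifier, training))
print("bootstrap SE:  ", bootstrap_standard_error(threshold_classifier, training))
print("fitness:       ", fitness(threshold_classifier, training))
```

In a GP run, a function like `fitness` would be evaluated for each individual in place of plain training error, so that individuals whose error varies widely across resamples are penalised.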

Original language: English
Title of host publication: GECCO 2013 - Proceedings of the 2013 Genetic and Evolutionary Computation Conference Companion
Pages: 1113-1120
Number of pages: 8
Publication status: Published - 2013
Event: 15th Annual Conference on Genetic and Evolutionary Computation, GECCO 2013 - Amsterdam, Netherlands
Duration: 6 Jul 2013 to 10 Jul 2013

Keywords

  • Binary classification
  • Generalisation
  • Genetic programming
