Bootstrapping to reduce bloat and improve generalisation in genetic programming

Jeannie Fitzgerald, R. Muhammad Atif Azad, Conor Ryan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Typically, the quality of a solution in Genetic Programming (GP) is represented by a score on a given training sample. However, in Machine Learning, we are most interested in estimating the quality of the evolving individuals on unseen data. In this paper, we propose to simulate the effect of unseen data to direct training without actually using additional data, by employing a technique called bootstrapping that repeatedly re-samples with replacement from the training data and helps estimate sensitivity of the individual in question to small variations across these re-sampled data sets. We minimise this sensitivity, as measured by the Bootstrap Standard Error, alongside the training error, in a bid to evolve models that generalise better to the unseen data. We evaluate the proposed technique on four binary classification problems and compare with a standard GP approach. The results show that for the problems undertaken, the proposed method not only generalises significantly better than standard GP while the training performance improves, but also demonstrates a strong side effect of containing the tree sizes.
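The resampling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each individual is scored by a list of 0/1 misclassification indicators on the training cases, that the statistic of interest is the mean error rate, and that the two objectives are simply returned side by side. The abstract does not say how they are combined. The function names, the `n_resamples` default, and the fixed `seed` are hypothetical choices made for illustration.

```python
import random
import statistics


def bootstrap_standard_error(errors, n_resamples=200, seed=0):
    """Estimate the bootstrap standard error of the mean per-case error.

    errors: per-training-case errors for one individual
            (assumed here to be 0/1 misclassification indicators).
    n_resamples: number of bootstrap resamples (hypothetical default).
    """
    rng = random.Random(seed)
    n = len(errors)
    resampled_means = []
    for _ in range(n_resamples):
        # Draw n cases with replacement from the training sample.
        sample = rng.choices(errors, k=n)
        resampled_means.append(sum(sample) / n)
    # Sample standard deviation of the statistic across the resamples.
    return statistics.stdev(resampled_means)


def fitness(errors, n_resamples=200, seed=0):
    """Return (training_error, bootstrap_se); both are to be minimised.

    How the two objectives are combined is an assumption left open here.
    """
    training_error = sum(errors) / len(errors)
    return training_error, bootstrap_standard_error(errors, n_resamples, seed)
```

An individual that misclassifies no training case has a Bootstrap Standard Error of zero under this sketch, because every resample yields the same error rate. An individual whose errors are spread across the cases produces a positive value that shrinks as the training sample grows.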

Original language: English
Title of host publication: GECCO 2013 - Proceedings of the 2013 Genetic and Evolutionary Computation Conference Companion
Pages: 141-142
Number of pages: 2
Publication status: Published - 2013
Event: 15th Annual Conference on Genetic and Evolutionary Computation, GECCO 2013 - Amsterdam, Netherlands
Duration: 6 Jul 2013 to 10 Jul 2013

Publication series

Name: GECCO 2013 - Proceedings of the 2013 Genetic and Evolutionary Computation Conference Companion

Conference

Conference: 15th Annual Conference on Genetic and Evolutionary Computation, GECCO 2013
Country/Territory: Netherlands
City: Amsterdam
Period: 6/07/13 to 10/07/13

Keywords

  • Binary classification
  • Generalisation
  • Genetic programming
