On size, complexity and generalisation error in GP

Jeannie Fitzgerald, Conor Ryan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

For some time, Genetic Programming research has lagged behind the wider Machine Learning community in the study of generalisation, where the decomposition of generalisation error into bias and variance components is well understood. However, recent Genetic Programming contributions focusing on complexity, size and bloat as they relate to over-fitting have opened up some interesting avenues of research. In this paper, we carry out a simple empirical study on five binary classification problems. The study is designed to discover what effects may be observed when program size and complexity are varied in combination, with the objective of gaining a better understanding of relationships which may exist between solution size, operator complexity and variance error. The results of the study indicate that the simplest configuration, in terms of operator complexity, consistently results in the best average performance, and in many cases, the result is significantly better. We further demonstrate that the best results are achieved when this minimum complexity set-up is combined with a less than parsimonious permissible size.

Original language: English
Title of host publication: GECCO 2014 - Proceedings of the 2014 Genetic and Evolutionary Computation Conference
Publisher: Association for Computing Machinery
Pages: 903-910
Number of pages: 8
ISBN (Print): 9781450326629
Publication status: Published - 2014
Event: 16th Genetic and Evolutionary Computation Conference, GECCO 2014 - Vancouver, BC, Canada
Duration: 12 Jul 2014 to 16 Jul 2014

Publication series

Name: GECCO 2014 - Proceedings of the 2014 Genetic and Evolutionary Computation Conference

Conference

Conference: 16th Genetic and Evolutionary Computation Conference, GECCO 2014
Country/Territory: Canada
City: Vancouver, BC
Period: 12/07/14 to 16/07/14

Keywords

  • Classification
  • Generalisation
  • Genetic programming
