Efficient interleaved sampling of training data in genetic programming

R. Muhammad Atif Azad, David Medernach, Conor Ryan

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

The ability to generalize beyond the training set is important for Genetic Programming (GP). Interleaved Sampling is a recently proposed approach to improve generalization in GP. In this technique, GP alternates between using the entire data set and only a single data point. Initial results showed that the technique not only produces solutions that generalize well, but also does so at a reduced computational expense, since half of the generations evaluate only a single data point. This paper further investigates the merit of interleaving the use of the training set with two alternative approaches: the use of random search instead of a single data point, and simply minimising the tree size. Both of these alternatives are computationally even cheaper than the original setup, as they do not invoke the fitness function at all half the time. We test the utility of these new methods on four well-cited, high-dimensional problems from the symbolic regression domain. The results show that the new approaches continue to produce general solutions despite using only half the fitness evaluations. Size minimisation also prevents bloat while producing competitive results on both training and test data sets. The tree sizes with size minimisation are substantially smaller than those of the other setups, which further reduces training costs.
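The interleaving schedule described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function name `interleaved_scores`, the mode names, the choice of even generations for the full training set, the use of mean squared error, and the reading of "random search" as assigning random scores are all assumptions made for the example.

```python
import random
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")                         # an individual (e.g. a GP tree)
Case = Tuple[Sequence[float], float]     # (inputs, target)


def interleaved_scores(
    population: Sequence[T],
    generation: int,
    data: Sequence[Case],
    predict: Callable[[T, Sequence[float]], float],
    size: Callable[[T], int],
    mode: str,
    rng: random.Random,
) -> Tuple[List[float], int]:
    """Score a population for one generation; lower scores are better.

    Returns (scores, number of training-case evaluations performed).
    Assumption: even generations use the full training set, odd
    generations use the cheaper alternative selected by `mode`.
    """
    if generation % 2 == 0:
        # Full training set: mean squared error over every case.
        scores = [
            sum((predict(ind, x) - y) ** 2 for x, y in data) / len(data)
            for ind in population
        ]
        return scores, len(population) * len(data)

    if mode == "single_point":
        # Original interleaved sampling: one randomly chosen data point.
        x, y = rng.choice(list(data))
        return [(predict(ind, x) - y) ** 2 for ind in population], len(population)

    if mode == "random":
        # Random search (assumed reading): scores are random numbers,
        # so selection is random and no training case is evaluated.
        return [rng.random() for _ in population], 0

    if mode == "size":
        # Size minimisation: the score is the individual's size,
        # again without evaluating any training case.
        return [float(size(ind)) for ind in population], 0

    raise ValueError(f"unknown mode: {mode!r}")
```

Under these assumptions, the "random" and "size" modes report zero training-case evaluations on every odd generation, which corresponds to the abstract's point that the fitness function is not invoked half the time.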

Original language: English
Title of host publication: GECCO 2014 - Companion Publication of the 2014 Genetic and Evolutionary Computation Conference
Publisher: Association for Computing Machinery
Pages: 127-128
Number of pages: 2
ISBN (Print): 9781450328814
Publication status: Published - 2014
Event: 16th Genetic and Evolutionary Computation Conference Companion, GECCO 2014 Companion - Vancouver, BC, Canada
Duration: 12 Jul 2014 to 16 Jul 2014

Publication series

Name: GECCO 2014 - Companion Publication of the 2014 Genetic and Evolutionary Computation Conference

Conference

Conference: 16th Genetic and Evolutionary Computation Conference Companion, GECCO 2014 Companion
Country/Territory: Canada
City: Vancouver, BC
Period: 12/07/14 to 16/07/14

Keywords

  • Computational efficiency
  • Genetic Programming
  • Interleaved Sampling
  • Over-fitting
  • Robustness of solutions
  • Speedup technique
