TY - GEN
T1 - Error and Correlation as fitness functions for Scaled Symbolic Regression in Grammatical Evolution
AU - Murphy, Aidan
AU - Dias, Douglas Mota
AU - De Lima, Allan
AU - Ryan, Conor
N1 - Publisher Copyright: © 2023 Copyright held by the owner/author(s).
PY - 2023/7/15
Y1 - 2023/7/15
N2 - Linear scaling has greatly improved the performance of genetic programming when performing symbolic regression. Linear scaling transforms the output of an expression to reduce its error. Mean squared error and correlation have been used with scaling, often interchangeably and with assumed equivalence. We examine whether this equivalence is justified by investigating the differences between an error-based metric and a correlation-based metric on 11 well-known symbolic regression benchmarks. We investigate the effect a change of fitness function has on performance, individual size and diversity. Error-based scaling and correlation were seen to attain equivalent performance and found solutions with very similar size and diversity on the majority of problems, but not all. In order to ascertain whether the strengths of both approaches could be combined, we explored a double tournament selection strategy, where two tournaments are conducted sequentially to select individuals for recombination. Double tournament selection found smaller solutions and the best solution in five benchmarks, including the best solutions on both real-world datasets used in our experiments.
AB - Linear scaling has greatly improved the performance of genetic programming when performing symbolic regression. Linear scaling transforms the output of an expression to reduce its error. Mean squared error and correlation have been used with scaling, often interchangeably and with assumed equivalence. We examine whether this equivalence is justified by investigating the differences between an error-based metric and a correlation-based metric on 11 well-known symbolic regression benchmarks. We investigate the effect a change of fitness function has on performance, individual size and diversity. Error-based scaling and correlation were seen to attain equivalent performance and found solutions with very similar size and diversity on the majority of problems, but not all. In order to ascertain whether the strengths of both approaches could be combined, we explored a double tournament selection strategy, where two tournaments are conducted sequentially to select individuals for recombination. Double tournament selection found smaller solutions and the best solution in five benchmarks, including the best solutions on both real-world datasets used in our experiments.
KW - Grammatical Evolution
KW - Linear Scaling
KW - Symbolic Regression
UR - http://www.scopus.com/inward/record.url?scp=85168991808&partnerID=8YFLogxK
U2 - 10.1145/3583133.3590709
DO - 10.1145/3583133.3590709
M3 - Conference contribution
AN - SCOPUS:85168991808
T3 - GECCO 2023 Companion - Proceedings of the 2023 Genetic and Evolutionary Computation Conference Companion
SP - 607
EP - 610
BT - GECCO 2023 Companion - Proceedings of the 2023 Genetic and Evolutionary Computation Conference Companion
PB - Association for Computing Machinery, Inc
T2 - 2023 Genetic and Evolutionary Computation Conference Companion, GECCO 2023 Companion
Y2 - 15 July 2023 through 19 July 2023
ER -