EuroGP 2026 · EvoStar

Revisiting SLIM

Improved Learning Dynamics and Model Compactness in Symbolic Regression

Lachlan Stewart¹ (presenting)  ·  Gorka Silva²  ·  Leonardo Trujillo³  ·  Illya Bakurov⁴  ·  Mauro Castelli⁵  ·  Davide Farinati⁵  ·  Jose Manuel Muñoz Contreras³  ·  Leonardo Vanneschi⁵

¹ Australian National University  ·  ² Universidad Complutense de Madrid  ·  ³ Tecnológico Nacional de México  ·  ⁴ Michigan State University  ·  ⁵ NOVA IMS, Universidade Nova de Lisboa

Université Toulouse Capitole, Rempart Building

Toulouse, France  ·  9 April 2026

A 2025 SPECIES Summer School Production

The Tension in Symbolic Regression

[Figure: inputs x₁, x₂, x₃ pass through an unknown model (?) to produce ŷ]

Deep Learning · Ensembles

Accurate but opaque

[Figure: expression tree built from ×, +, and sin over x₁, x₂, x₃]

Genetic Programming

Interpretable but hard to search

Can we get both?

Vanneschi, L. (2024). SLIM_GSGP: The Non-bloating Geometric Semantic Genetic Programming. In Genetic Programming, Giacobini, Xue & Manzoni (Eds.). Springer Nature Switzerland, 125–141.

Geometric Semantic GP

[Figure: semantic (output) space, where each axis is the model output on one training case; shows the target (desired output), the parent (current model), and an offspring within ± ms of the parent]
Unimodal error surface → in semantic space, we always know that local progress can be made towards the target

ABS mutation:

= random tree semantics;
each component bounded in (, )

Moraglio, A., Krawiec, K. & Johnson, C.G. (2012). Geometric Semantic Genetic Programming. In Parallel Problem Solving from Nature — PPSN XII, Coello Coello et al. (Eds.). LNCS Vol. 7491. Springer Berlin Heidelberg, 21–31.

But each GSGP mutation adds syntax — models grow at every step

Linear Combination of Nonlinear Functions

[Figure: twelve random basis functions g1 to g12, each a small tree built from +, ×, sin, cos, and exp over x₁, x₂, x₃; coefficients c are either chosen at random or fitted by least squares]
Select basis functions gᵢ at random
Combine: ŷ = Σᵢ cᵢ gᵢ(x)
One approach: pick random functions, then solve for the optimal c via least squares
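
The "pick random functions, then fit by least squares" idea can be sketched as follows. This is a minimal illustration, not code from the talk: the basis pool, the toy target, and the name `fit_random_basis` are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pool of nonlinear basis functions over three inputs.
BASIS_POOL = [
    lambda X: X[:, 0] + X[:, 1],
    lambda X: X[:, 1] * X[:, 2],
    lambda X: np.sin(X[:, 0]),
    lambda X: np.cos(X[:, 2]),
    lambda X: np.exp(X[:, 1]),
    lambda X: X[:, 0] * np.cos(X[:, 2]),
]

def fit_random_basis(X, y, n_terms, rng):
    """Pick n_terms basis functions at random, then solve for c by least squares."""
    chosen = rng.choice(len(BASIS_POOL), size=n_terms, replace=False)
    # One column per selected basis function g_i, evaluated on every sample.
    G = np.column_stack([BASIS_POOL[i](X) for i in chosen])
    c, *_ = np.linalg.lstsq(G, y, rcond=None)
    return chosen, c, G @ c

X = rng.uniform(-1.0, 1.0, size=(50, 3))
y = 2.0 * np.sin(X[:, 0]) - 0.5 * X[:, 1] * X[:, 2]  # toy target for illustration
chosen, c, y_hat = fit_random_basis(X, y, n_terms=4, rng=rng)
rmse = float(np.sqrt(np.mean((y - y_hat) ** 2)))
```

The quality of the fit depends entirely on which functions happen to be drawn: the coefficients are optimal for the chosen set, but the set itself is not searched.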

SLIM GSGP Does It a Different Way

[Figure: the pool of basis functions g1 to g12]


IGSM adds a term: fit improves
IGSM adds another term: model grows
DGSM removes a (redundant) term · IGSM adds a new one
Model: compact, pruned of redundancy

How SLIM Works

[Figure: the model as a sum of terms, evolving step by step: c₁g₁ → (IGSM) c₁g₁ + c₂g₂ → (IGSM) c₁g₁ + c₂g₂ + c₃g₃ → (DGSM) c₁g₁ + c₃g₃ → (IGSM) c₁g₁ + c₃g₃ + c₄g₄]
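
A rough sketch of this grow-and-shrink behaviour treats the model as a plain list of terms. The representation, the `random_term` stand-in for a random tree, and the rule that the first term is never removed are assumptions for illustration and may differ from the SLIM implementation.

```python
import math
import random

def random_term(rng):
    """Hypothetical stand-in for a random tree: returns (label, function of x)."""
    freq = rng.uniform(0.5, 3.0)
    return (f"sin({freq:.2f}*x)", lambda x, f=freq: math.sin(f * x))

def inflate(model, ms, rng):
    """IGSM-style step: append one new term ms * g to the sum."""
    label, g = random_term(rng)
    return model + [(ms, label, g)]

def deflate(model, rng):
    """DGSM-style step: drop one term; the first term is kept so the model is never empty."""
    if len(model) <= 1:
        return model
    drop = rng.randrange(1, len(model))
    return model[:drop] + model[drop + 1:]

def predict(model, x):
    """Evaluate the sum of c * g(x) over all terms."""
    return sum(c * g(x) for c, _, g in model)

rng = random.Random(0)
model = [(1.0, "x", lambda x: x)]   # initial single-term model
model = inflate(model, 0.1, rng)    # 2 terms
model = inflate(model, 0.1, rng)    # 3 terms
model = deflate(model, rng)         # 2 terms
model = inflate(model, 0.1, rng)    # 3 terms
```

Because deflation can undo earlier inflations, model size is not forced to increase with every mutation.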


What could be improved about SLIM GSGP?

OMS Coefficients chosen randomly — why not optimally?
LS No global rescaling of the full output
PT No size pressure during selection — trees bloat
AS Final expression is never simplified

Optimal Mutation Step (OMS)

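ms* = argmin over ms of ‖T − (P + ms · S_R)‖², solved via the pseudoinverse
[Figure: 2D projection of semantic space showing the target T, the parent P, the random-tree semantics S_R, the residual, and the offspring P + ms*·S_R]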

Clip extreme values  ·  zero out negligible ones → mutation cancelled → implicit size reduction
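
A minimal sketch of the optimal step, assuming a single mutation direction S_R: in that case the pseudoinverse solution reduces to a ratio of two dot products. The function name and the `clip` and `tol` thresholds are illustrative assumptions, not values from the talk.

```python
def optimal_mutation_step(parent, target, s_r, clip=10.0, tol=1e-6):
    """Least-squares step size along s_r.

    For one direction, pinv(s_r) @ (target - parent) equals
    <s_r, target - parent> / <s_r, s_r>.
    clip and tol are hypothetical thresholds chosen for this example.
    """
    residual = [t - p for t, p in zip(target, parent)]
    denom = sum(s * s for s in s_r)
    if denom == 0.0:
        return 0.0                      # no direction to move along
    ms = sum(s * r for s, r in zip(s_r, residual)) / denom
    ms = max(-clip, min(clip, ms))      # clip extreme values
    if abs(ms) < tol:
        return 0.0                      # negligible step: mutation is cancelled
    return ms

parent = [1.0, 2.0, 3.0]
target = [2.0, 2.0, 5.0]
s_r = [0.5, 0.0, 1.0]
ms = optimal_mutation_step(parent, target, s_r)
offspring = [p + ms * s for p, s in zip(parent, s_r)]
```

In this toy case the residual lies exactly along S_R, so the offspring lands on the target. When S_R is orthogonal to the residual, the step is zero and the mutation adds nothing to the model.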

Ivo Gonçalves, Sara Silva, and Carlos M. Fonseca. 2015. On the Generalization Ability of Geometric Semantic Genetic Programming. In Genetic Programming, Penousal Machado, Malcolm I. Heywood, James McDermott, Mauro Castelli, Pablo García-Sánchez, Paolo Burelli, Sebastian Risi, and Kevin Sim (Eds.). Springer International Publishing, Cham, 41–52.

James McDermott, Alexandros Agapitos, Anthony Brabazon, and Michael O’Neill. 2014. Geometric semantic genetic programming for financial data. In Applications of Evolutionary Computation: 17th European Conference, EvoApplications 2014, Granada, Spain, April 23-25, 2014, Revised Selected Papers 17. Springer, 215–226.

Linear Scaling (LS)

Fit slope and intercept via OLS on training data. Evolution searches for shape, not scale.
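
As a sketch, the slope and intercept can be computed in closed form from the raw model outputs and the training targets. The function name and the handling of a constant model are assumptions for this example.

```python
def linear_scaling(outputs, targets):
    """Return (slope, intercept) minimising the squared error of intercept + slope * output."""
    n = len(outputs)
    mean_o = sum(outputs) / n
    mean_t = sum(targets) / n
    var_o = sum((o - mean_o) ** 2 for o in outputs)
    if var_o == 0.0:
        # Constant model output: the best affine fit is the target mean.
        return 0.0, mean_t
    cov = sum((o - mean_o) * (t - mean_t) for o, t in zip(outputs, targets))
    slope = cov / var_o
    intercept = mean_t - slope * mean_o
    return slope, intercept

raw = [0.0, 1.0, 2.0, 3.0]            # model has the right shape but the wrong scale
targets = [5.0, 8.0, 11.0, 14.0]      # equals 3 * raw + 5
slope, intercept = linear_scaling(raw, targets)
scaled = [intercept + slope * o for o in raw]
```

Here the raw outputs already have the correct shape, so scaling alone recovers the targets without any further evolution.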

[Figure: model fit without LS vs. with LS]

OMS: local step optimality   ·   LS: global calibration   ·   Complementary

Maarten Keijzer. 2003. Improving Symbolic Regression with Interval Arithmetic and Linear Scaling. In Genetic Programming, Conor Ryan, Terence Soule, Maarten Keijzer, Edward Tsang, Riccardo Poli, and Ernesto Costa (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 70–82.

Giorgia Nadizar, Fraser Garrow, Berfin Sakallioglu, Lorenzo Canonne, Sara Silva, and Leonardo Vanneschi. 2023. An Investigation of Geometric Semantic GP with Linear Scaling. In Proceedings of the Genetic and Evolutionary Computation Conference (Lisbon, Portugal) (GECCO ’23). Association for Computing Machinery, New York, NY, USA, 1165–1174. doi:10.1145/3583131.3590418

Pareto Tournament Selection

[Figure: error (RMSE) against model size; fitness-only selection compared with a tournament that selects from the non-dominated set; the BS, OC, and BF solutions are marked]
BF = Best Fitness · BS = Best Size · OC = Optimal Compromise

* BF, OC, BS are used for selection from the final population, not during each evolutionary step

Works best when combined with LS and OMS
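
One way to sketch a Pareto tournament: sample a few individuals, keep those that are not dominated on (error, size), and pick one of them at random. The tie-breaking rule and the toy population are assumptions for illustration.

```python
import random

def dominates(a, b):
    """a and b are (error, size) pairs; lower is better on both objectives."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])

def non_dominated(candidates):
    """Keep every candidate that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]

def pareto_tournament(population, k, rng):
    """Sample k individuals and return a random member of their non-dominated set."""
    pool = rng.sample(population, k)
    return rng.choice(non_dominated(pool))

# Hypothetical population of (RMSE, model size) pairs.
population = [(0.9, 3), (0.5, 10), (0.5, 40), (0.3, 25), (0.8, 30)]
front = non_dominated(population)
winner = pareto_tournament(population, k=5, rng=random.Random(0))
```

A fitness-only tournament would always favour (0.3, 25) here, whereas the Pareto tournament can also return the smaller models (0.5, 10) and (0.9, 3).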

Edwin D. de Jong and Jordan B. Pollack. 2003. Multi-Objective Methods for Tree Size Control. Genetic Programming and Evolvable Machines 4, 3 (Sept. 2003), 211–233. doi:10.1023/a:1025122906870

Mark Kotanchek, Guido Smits, and Ekaterina Vladislavleva. 2007. Pursuing the Pareto Paradigm: Tournaments, Algorithm Variations and Ordinal Optimization. Springer US, 167–185. doi:10.1007/978-0-387-49650-4_11

Guido F. Smits and Mark Kotanchek. 2005. Pareto-Front Exploitation in Symbolic Regression. Springer US, Boston, MA, 283–299. doi:10.1007/0-387-23254-0_17

Automatic Algebraic Simplification

Applied to Pareto-front models before final selection.

Modest effect alone — ABS and SIG functions resist simplification. Amplifies OMS: zeroed mutation steps get cleaned up.
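
The interaction with OMS can be illustrated with a deliberately small cleanup pass that merges like terms and drops terms whose coefficient is zero. This is far simpler than a full algebraic simplifier; the term labels and the function name are hypothetical.

```python
from collections import defaultdict

def simplify_sum(terms, tol=0.0):
    """Merge terms with the same label and drop those with |coefficient| <= tol."""
    merged = defaultdict(float)
    for coef, label in terms:
        merged[label] += coef
    return [(c, label) for label, c in merged.items() if abs(c) > tol]

# The second term stands for a mutation step that OMS zeroed out.
terms = [(0.5, "x1"), (0.0, "abs(x2*x3)"), (0.25, "x1"), (-1.2, "sin(x2)")]
simplified = simplify_sum(terms)
```

The zeroed term disappears entirely, so its subtree no longer counts towards model size.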

How did we study all of these changes simultaneously?

We studied the baseline SLIM:

BASE

We studied the addition of each enhancement to BASE:

BASE + OMS
BASE + LS
BASE + PT
BASE + AS

We studied the combination of all additions:

ALL

We studied the subtraction of each enhancement from ALL:

ALL − OMS
ALL − LS
ALL − PT
ALL − AS
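
The resulting add-one and leave-one-out design can be enumerated mechanically. This sketch only builds the variant labels and their enabled components; the dictionary layout is an assumption for illustration.

```python
COMPONENTS = ["OMS", "LS", "PT", "AS"]

def build_variants(components):
    """Map each variant label to the set of enhancements it enables."""
    variants = {"BASE": frozenset()}
    for c in components:
        variants[f"BASE + {c}"] = frozenset([c])          # add one to BASE
    variants["ALL"] = frozenset(components)
    for c in components:
        variants[f"ALL − {c}"] = frozenset(components) - {c}  # remove one from ALL
    return variants

VARIANTS = build_variants(COMPONENTS)
for name, enabled in VARIANTS.items():
    print(name, sorted(enabled))
```

With four enhancements this gives 1 + 4 + 1 + 4 = 10 variants.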

Experimental Setup

Datasets (S: samples, F: features)
Airfoil · S: 1502 · F: 5
Bike Sharing · S: 730 · F: 14
Bioavailability · S: 359 · F: 241
Boston · S: 506 · F: 13
Breast Cancer · S: 569 · F: 30
Concrete Slump · S: 103 · F: 7
Concrete Strength · S: 1030 · F: 8
Diabetes · S: 442 · F: 10
Efficiency Cooling · S: 768 · F: 8
Efficiency Heating · S: 768 · F: 8
Forest Fires · S: 517 · F: 12
Parkinson Updrs · S: 5875 · F: 19
PPB · S: 131 · F: 626
Resid Build Sale Price · S: 1460 · F: 79

1. For each of the 10 variants, we performed 30 independent runs on every dataset.
2. From each run's final population we extracted the BS, BF, and OC solutions, recording the percentage improvement in accuracy and size vs. the baseline.
3. The median was taken across all runs and datasets to give the aggregated improvements shown next.
4. Parameters: population 100 · generations 100 · tournament size 2 (5 for PT)
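
The aggregation in steps 2 and 3 can be sketched as a percentage change relative to the baseline followed by a median. The sign convention (negative means smaller or more accurate than the baseline) and the numbers below are assumptions made for the example, not results from the study.

```python
from statistics import median

def pct_change(value, baseline):
    """Percentage change relative to the baseline; negative means a reduction."""
    return 100.0 * (value - baseline) / baseline

# Hypothetical (variant RMSE, baseline RMSE) pairs, one per run and dataset.
pairs = [(8.0, 10.0), (4.5, 5.0), (2.4, 2.0)]
changes = [pct_change(v, b) for v, b in pairs]
summary = median(changes)
```

The same calculation applies to model size by substituting node counts for RMSE values.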

Results: Accuracy vs Size Tradeoff

[Figure: scatter plot of mean change in size (%) and mean change in RMSE (%) relative to the SLIM 2024 baseline; points towards the negative end of both axes are smaller and more accurate. Marker types: BF (Best Fitness), OC (Optimal Compromise), BS (Best Size). Variants plotted: BASE, BASE + OMS, BASE + LS, BASE + PT, BASE + AS, ALL, ALL − OMS, ALL − LS, ALL − PT, ALL − AS]

Future Work

  • Early stopping — overfitting observed in some experiments; a promising mitigation strategy, though estimating the optimal stopping criterion is non-trivial
  • SRBench — evaluate our approach on this well-established symbolic regression benchmark

Conclusions

Four simple, well-known enhancements combine synergistically to evolve more accurate and dramatically smaller models

OMS Optimal coefficients
LS Global affine correction
PT Size-aware selection
AS Algebraic simplification of final model

Thank You

Co-Authors

Gorka Silva¹  ·  Lachlan Stewart²  ·  Leonardo Trujillo³  ·  Illya Bakurov⁴  ·  Mauro Castelli⁵  ·  Davide Farinati⁵  ·  Jose Manuel Muñoz Contreras³  ·  Leonardo Vanneschi⁵

¹ Universidad Complutense de Madrid  ·  ² Australian National University  ·  ³ Tecnológico Nacional de México  ·  ⁴ Michigan State University  ·  ⁵ NOVA IMS, Universidade Nova de Lisboa

Special thanks to Leonardo Trujillo for outstanding mentorship throughout the 2025 SPECIES Summer School and beyond.

Thank you to SPECIES and its organisers for putting on the conference and the Summer School. Apply for 2026! species-society.org/summer-school-2026
