1. Title of the Publications

Article A: Discovering Evolution Strategies via Meta-Black-Box Optimization
Article B: Discovering Attention-Based Genetic Algorithms via Meta-Black-Box Optimization

2. Author Information

Name: Robert Tjarko Lange
Institution/Company: Technical University Berlin
Email: robert.t.lange@tu-berlin.de

Name: Tom Schaul
Institution/Company: Google DeepMind
Email: schaul@google.com

Name: Yutian Chen
Institution/Company: Google DeepMind
Email: yutianc@deepmind.com

Name: Tom Zahavy
Institution/Company: Google DeepMind
Email: tomzahavy@google.com

Name: Chris Lu
Institution/Company: Oxford University
Email: luchris429@gmail.com

Name: Valentin Dalibard
Institution/Company: Google DeepMind
Email: vdalibard@google.com

Name: Satinder Singh
Institution/Company: Google DeepMind
Email: baveja@umich.edu

Name: Sebastian Flennerhag
Institution/Company: Google DeepMind
Email: flennerhag@google.com

3. Corresponding Author

Name: Robert Tjarko Lange
Institution/Company: Technical University Berlin
Email: robert.t.lange@tu-berlin.de

4. Paper Abstracts

Article A: Optimizing functions without access to gradients is the remit of black-box methods such as evolution strategies. While highly general, their learning dynamics are often heuristic and inflexible, exactly the limitations that meta-learning can address. Hence, we propose to discover effective update rules for evolution strategies via meta-learning. Concretely, our approach employs a search strategy parametrized by a self-attention-based architecture, which guarantees that the update rule is invariant to the ordering of the candidate solutions. We show that meta-evolving this system on a small set of representative low-dimensional analytic optimization problems is sufficient to discover new evolution strategies capable of generalizing to unseen optimization problems, population sizes and optimization horizons. Furthermore, the same learned evolution strategy can outperform established neuroevolution baselines on supervised and continuous control tasks. As additional contributions, we ablate the individual neural network components of our method; reverse engineer the learned strategy into an explicit heuristic form, which remains highly competitive; and show that it is possible to self-referentially train an evolution strategy from scratch, with the learned update rule used to drive the outer meta-learning loop.

Article B: Genetic algorithms constitute a family of black-box optimization algorithms, which take inspiration from the principles of biological evolution. While they provide a general-purpose tool for optimization, their particular instantiations can be heuristic and motivated by loose biological intuition. In this work we explore a fundamentally different approach: Given a sufficiently flexible parametrization of the genetic operators, we discover entirely new genetic algorithms in a data-driven fashion. More specifically, we parametrize selection and mutation rate adaptation as cross- and self-attention modules and use Meta-Black-Box-Optimization to evolve their parameters on a set of diverse optimization tasks. The resulting Learned Genetic Algorithm outperforms competitive baseline genetic algorithms and generalizes far beyond its meta-training settings: it can be applied to previously unseen optimization problems, search dimensions and evaluation budgets. We conduct an extensive analysis of the discovered operators and provide ablation experiments, which highlight the benefits of flexible module parametrization and the ability to transfer ('plug in') the learned operators to conventional genetic algorithms.
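To make the attention-based, order-invariant update concrete, the following is a minimal JAX sketch of the core mechanism (our own illustration for this entry, not the exact architecture of the papers; all names, feature choices and dimensions are illustrative). Per-candidate fitness features are processed by scaled dot-product self-attention, so the resulting recombination weights are equivariant to permutations of the population and the weighted mean update is invariant to candidate ordering.

    # Minimal sketch: permutation-invariant ES update via self-attention.
    import jax
    import jax.numpy as jnp

    def fitness_features(fitness):
        # Per-candidate features computed from the whole set:
        # z-scored fitness and centered ranks (both order-free).
        z = (fitness - fitness.mean()) / (fitness.std() + 1e-8)
        ranks = jnp.argsort(jnp.argsort(fitness)) / (fitness.shape[0] - 1) - 0.5
        return jnp.stack([z, ranks], axis=-1)            # (popsize, 2)

    def attention_weights(params, feats):
        # Scaled dot-product self-attention over population members,
        # followed by a softmax to obtain recombination weights.
        Q, K, V = feats @ params["Wq"], feats @ params["Wk"], feats @ params["Wv"]
        A = jax.nn.softmax(Q @ K.T / jnp.sqrt(K.shape[-1]), axis=-1)
        return jax.nn.softmax((A @ V).squeeze(-1))       # (popsize,), sums to 1

    def es_update(params, mean, x, fitness, lrate=1.0):
        # Move the search mean toward attention-weighted candidates; the
        # weighted sum makes the update invariant to candidate ordering.
        w = attention_weights(params, fitness_features(fitness))
        return mean + lrate * (w @ (x - mean))

    # Toy usage on a 4-D sphere objective with a population of 16.
    key = jax.random.PRNGKey(0)
    k_init, k_x = jax.random.split(key)
    shapes = {"Wq": (2, 8), "Wk": (2, 8), "Wv": (2, 1)}
    params = {n: 0.1 * jax.random.normal(k, s)
              for (n, s), k in zip(shapes.items(), jax.random.split(k_init, 3))}
    x = jax.random.normal(k_x, (16, 4))
    fitness = jnp.sum(x ** 2, axis=1)
    new_mean = es_update(params, jnp.zeros(4), x, fitness)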
5. Competition Criteria

(D) The result is publishable in its own right as a new scientific result independent of the fact that the result was mechanically created.
(E) The result is equal to or better than the most recent human-created solution to a long-standing problem for which there has been a succession of increasingly better human-created solutions.
(F) The result is equal to or better than a result that was considered an achievement in its field at the time it was first discovered.
(G) The result solves a problem of indisputable difficulty in its field.

6. Statement Why the Results Satisfy the Criteria (D), (E), (F) and (G)

For the sake of clarity, we consider the individual competition criteria in order of importance instead of alphabetical order.

(E) The result is equal to or better than the most recent human-created solution to a long-standing problem for which there has been a succession of increasingly better human-created solutions.

A key challenge of Evolutionary Optimization (EO) research is the design of efficient black-box optimization (BBO) algorithms. Traditionally, this has been done in a semi-manual fashion: a human specifies the operations of the optimizer based on a combination of intuition, theory and empirical evidence. This inherently limits the EO design space and can lead to algorithms that overfit to specific downstream tasks. To overcome this challenge, we take a meta-learning ('learning to learn') perspective and introduce two new, flexibly parametrized families of evolutionary optimizers (Evolution Strategies and Genetic Algorithms). More specifically, we leverage self- and cross-attention to parametrize the set operations underlying EO updates and candidate sampling. The weights of these neural network operators are then meta-evolved to maximize performance on a selection of downstream tasks, as sketched below. The resulting meta-trained algorithms outperform competitive human-created baseline EO algorithms on both standard BBO benchmarks (e.g. BBOB and HPO-B) and challenging neuroevolution tasks (including robotic control and image classification).
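The meta-evolution procedure referenced above can be sketched as a nested loop: an outer evolution strategy searches over the weights of the learned optimizer, and each candidate weight vector is scored by running the inner (learned) optimizer on a batch of tasks. Below is a minimal illustration reusing es_update and params from the previous sketch; the toy task set and the simple truncation-selection outer ES are our own stand-ins, not the papers' meta-training setup, which uses a stronger meta-ES and richer task distributions.

    # Minimal sketch of the Meta-Black-Box-Optimization loop, reusing
    # es_update/params from the previous sketch. Task set and outer ES are
    # illustrative stand-ins for the papers' meta-training setup.
    from jax.flatten_util import ravel_pytree

    def inner_run(theta, task, key, popsize=16, gens=20):
        # Run the learned EO (the attention-based ES above) on one task.
        mean = jnp.zeros(task["dim"])
        for _ in range(gens):
            key, sub = jax.random.split(key)
            pop = mean + 0.5 * jax.random.normal(sub, (popsize, task["dim"]))
            mean = es_update(theta, mean, pop, task["fn"](pop))
        return task["fn"](mean[None]).squeeze()        # final score (lower = better)

    def meta_objective(theta, tasks, key):
        # Meta-fitness: average final performance across the task set.
        keys = jax.random.split(key, len(tasks))
        return jnp.mean(jnp.array([inner_run(theta, t, k)
                                   for t, k in zip(tasks, keys)]))

    tasks = [{"dim": 4, "fn": lambda z: jnp.sum(z ** 2, axis=1)},
             {"dim": 4, "fn": lambda z: jnp.sum(jnp.abs(z), axis=1)}]

    # Outer loop: truncation-selection ES over the flattened operator weights.
    flat, unravel = ravel_pytree(params)
    meta_mean, sigma, lam, elite = flat, 0.1, 16, 4
    key = jax.random.PRNGKey(1)
    for gen in range(50):
        key, k1, k2 = jax.random.split(key, 3)
        thetas = meta_mean + sigma * jax.random.normal(k1, (lam, flat.shape[0]))
        scores = jnp.array([meta_objective(unravel(t), tasks, k)
                            for t, k in zip(thetas, jax.random.split(k2, lam))])
        meta_mean = thetas[jnp.argsort(scores)[:elite]].mean(axis=0)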
(G) The result solves a problem of indisputable difficulty in its field.

Manual algorithm design significantly hinders the discovery of powerful EO algorithms that are as yet unknown to human designers. Our proposed meta-black-box optimization approach, together with the attention-based parametrization of EO algorithm families, circumvents this problem and discovers new algorithms in a purely data-driven fashion. We claim that this opens up many exciting directions for future work and enables the next generation of EO research, shifting the focus from manual algorithm design to meta-task distribution specification, effective meta-evolution and downstream generalization of EO.

(F) The result is equal to or better than a result that was considered an achievement in its field at the time it was first discovered.

Both the Learned Evolution Strategy (LES) and the Learned Genetic Algorithm (LGA) outperform a plethora of human-created baseline EO algorithms on synthetic BBO tasks and neuroevolution tasks. More specifically, our experiments show that an LES/LGA meta-evolved to perform well on a small set of BBOB functions is capable of generalizing to vastly different settings. For example, on a set of 8 MLP robotic control tasks, with a population size of 256 and an optimization horizon of 2000 generations, LES performs 20% better (in terms of normalized episode return) than 5 tuned baseline ES (OpenAI-ES, PGPE, ASEBO, SNES and Sep-CMA-ES; Article A, Figure 1). Furthermore, LGA significantly outperforms strong Genetic Algorithm baselines such as SAMR-GA or GESMR-GA on visual control tasks (using a CNN policy) such as the MinAtar suite of environments (Article B, Figure 7). This highlights strong generalization and state-of-the-art performance relative to human-designed baselines.

(D) The result is publishable in its own right as a new scientific result independent of the fact that the result was mechanically created.

Our main results highlight the strong meta-generalization and performance of LES/LGA on challenging EO problems, but we also want to highlight three previously unstudied perspectives that our work introduces to the EO literature:

a. Attention-based parametrization of set operations in the context of EO: EO update equations operate on a set of population member evaluations and have to be invariant to their ordering. Scaled dot-product attention provides a natural inductive bias for such set operations, and we leverage it to introduce a new perspective on the characterization and definition of EO algorithms.

b. Self-referential meta-evolution of learned EO: In Article A we show that it is possible to replace the outer-loop meta-EO with an initially randomly initialized Learned ES. Together with a simple hill-climbing heuristic, we can then meta-evolve LES in a fully self-referential fashion: whenever we observe an improvement in the inner-loop evaluation of a sampled LES candidate, we replace the meta-parameters and thereby bootstrap the observed progress (see the sketch after this list). The resulting optimizer is highly competitive and circumvents the need to choose an appropriate meta-EO.

c. Reverse-engineering learned EO into human-interpretable operators: After successful meta-evolution we obtain a set of neural-network-based EO operators. To unravel what exactly has been discovered during meta-optimization, we additionally investigate the learned operators through their impulse response behavior. We then show that we can distill an analytical ES and even transfer individual operators to other ES or GA settings. This enables future theoretical work and integration into different research streams.
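As referenced in contribution (b), the self-referential variant replaces the hand-designed outer optimizer with the learned ES itself. A minimal sketch follows, reusing the definitions from the two sketches above (es_update, params, tasks, meta_objective); the sampling scheme and constants are illustrative, not the papers' exact procedure. The current LES proposes new weight vectors for itself, and a hill-climbing rule adopts a candidate only if it improves the meta-objective.

    # Self-referential meta-evolution sketch: the learned ES drives its own
    # meta-training (reuses es_update, params, tasks, meta_objective above).
    theta, unravel = ravel_pytree(params)            # randomly initialized LES weights
    best = meta_objective(params, tasks, jax.random.PRNGKey(2))
    outer_mean, key = theta, jax.random.PRNGKey(3)
    for gen in range(100):
        key, k1, k2 = jax.random.split(key, 3)
        # Sample candidate LES weight vectors around the outer mean.
        cands = outer_mean + 0.1 * jax.random.normal(k1, (8, theta.shape[0]))
        scores = jnp.array([meta_objective(unravel(c), tasks, k)
                            for c, k in zip(cands, jax.random.split(k2, 8))])
        # The current LES is itself the outer-loop update rule:
        outer_mean = es_update(unravel(theta), outer_mean, cands, scores)
        i = int(jnp.argmin(scores))
        if scores[i] < best:                         # hill-climbing replacement
            theta, best = cands[i], scores[i]        # bootstrap the improvement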
7. Full Citation

Article A: Lange, Robert Tjarko, Tom Schaul, Yutian Chen, Tom Zahavy, Valentin Dalibard, Chris Lu, Satinder Singh, and Sebastian Flennerhag. "Discovering Evolution Strategies via Meta-Black-Box Optimization." 11th International Conference on Learning Representations, ICLR 2023. DOI: https://doi.org/10.48550/arXiv.2211.11260

Article B: Lange, Robert Tjarko, Tom Schaul, Yutian Chen, Chris Lu, Tom Zahavy, Valentin Dalibard, and Sebastian Flennerhag. "Discovering Attention-Based Genetic Algorithms via Meta-Black-Box Optimization." Genetic and Evolutionary Computation Conference Companion, GECCO 2023 (Best Paper Award Nomination, GA Track). DOI: https://doi.org/10.48550/arXiv.2304.03995

8. Prize Money Breakdown

If any prize money is granted, it should be awarded entirely to the first author of both articles (Robert Tjarko Lange). This decision was made in mutual agreement between all authors.

9. A Statement Indicating Why this Entry Could Be the "Best"

While many recent papers have used manually designed evolutionary algorithms to reach strong results on specific Machine Learning tasks, our proposed meta-black-box perspective aims to discover evolutionary optimizers themselves from scratch. As such, we believe our work represents an ambitious but feasible direction for advancing evolutionary optimization going forward. More specifically, our work is an example of leveraging recent advances from the Machine Learning community on meta-learning and set-based inductive biases such as self-attention to advance evolutionary computation. Furthermore, the meta-optimized EO algorithms can in turn be used to optimize Machine Learning systems and potentially to discover new Machine Learning algorithms. Such interdisciplinary cross-fertilization is rarely seen in the research community and will hopefully inspire many new directions. We are convinced that, as the amount of available computational resources increases over time, this approach can scale to provide ever more powerful general-purpose EO.

10. Evolutionary Computation Type

Evolution Strategies (ES) and Genetic Algorithms (GA)

11. Publication Date

Article A: The publication was accepted on the 20th of January 2023 and presented at ICLR 2023. As proof, the decision and rebuttal discussion can be found on OpenReview: https://openreview.net/forum?id=mFDU0fP3EQH

Article B: The publication has been unconditionally accepted at GECCO 2023 and has been nominated for the Best Paper Award in the GA Track. As proof, the list of accepted papers can already be found on the GECCO website: https://gecco-2023.sigevo.org/Accepted-Papers

The code and a set of meta-evolved network weights for both the Learned Evolution Strategy and the Learned Genetic Algorithm are available in the open-source library evosax (https://github.com/RobertTLange/evosax); a usage sketch follows below.
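The sketch below shows how the released strategies can be loaded through evosax's ask-tell interface. It follows the library's README at the time of writing; exact class names, constructor arguments and signatures may differ between versions, so treat this as an assumption-laden illustration rather than authoritative documentation.

    # Usage sketch of the meta-evolved LES shipped with evosax (API per the
    # evosax README at the time of writing; may change across versions).
    import jax
    from evosax import LES                       # the LGA is exposed analogously

    rng = jax.random.PRNGKey(0)
    strategy = LES(popsize=32, num_dims=8)       # meta-evolved weights per the docs
    es_params = strategy.default_params
    state = strategy.initialize(rng, es_params)

    for gen in range(100):
        rng, rng_ask = jax.random.split(rng)
        x, state = strategy.ask(rng_ask, state, es_params)   # sample candidates
        fitness = (x ** 2).sum(axis=1)                       # substitute your objective
        state = strategy.tell(x, fitness, state, es_params)  # learned update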