Dynamic algorithm configuration for interactive learning

Forum 'Stages' - Sujet créé le 06/12/2018 par Thibaut Lust (1370 vues)

Le 06/12/2018 par Thibaut Lust :

Encadrants :

Carola Doerr (CNRS Researcher, LIP6, Sorbonne Université), http://www-ia.lip6.fr/~doerr/, carola.doerr@mpi-inf.mpg.de
Thibaut Lust (Associate Professor, LIP6, Sorbonne Université), http://www-desir.lip6.fr/~lustt/, thibaut.lust@lip6.fr

Lieu : Sorbonne Université, LIP6, 4 place Jussieu, 75005 Paris.

Sujet : The goal of this internship is to study the adaptation of Pareto local search to solve multi-objective combinatorial problems with at least three objectives to optimize. In order to reduce the number of Pareto optimal solutions, the decision-maker's preferences will be learned interactively during the progression of the algorithm.
Our key objective is to study the benefits of changing, dynamically during the optimization process, the parameters that dictate the search behavior of the algorithms (e.g., neighborhood sizes in local search heuristics) and those parameters related to the preference model to learn. To this end we will compare different parameter control techniques such as dynamic random choices, reinforcement learning and multi-armed bandit approaches. Our benchmark problems comprise classic combinatorial optimization problems like the traveling salesman problem but also more complex problems with more general preference models will be considered.

Résultats attendus :

Solving real applications is the main motivation behind this project. For example, there are currently no effective algorithms to solve complex vehicle routing problems (in this transportation problem a set of vehicles leave a depot and must make a tour to visit different clients), which makes it possible to optimize the various criteria (cost, time, pollution emitted, distance, etc.) and to adapt to the complex preferences of the various decision-makers (vehicle drivers and managers).

With this project, we plan to publish new theoretical and practical results for solving such problems, more precisely:

New algorithms for solving multi-objective combinatorial optimization problems by interactively learning the decision maker's preferences
Interactive version of PLS with a dominance relation based on preferences allowing to solve high size instances with high number of objectives, in low computational times
Automatic algorithm configuration
Application to real problems of large size (traveling salesman problem, vehicles routing, etc.)

At the end of this project, we therefore expect to develop new algorithms to effectively solve multi-objective combinatorial optimization problems directly from real themes, such as optimization problems encountered in transport or logistics, often presenting different contradictory criteria.

Moteur de recherche

Dynamic algorithm configuration for interactive learning

Tous les forums