Learning and Optimization in Combinatorial Spaces
Abstract
In this thesis, we develop methods for using machine learning to solve combinatorial optimization problems, with a focus on vehicle routing problems. The thesis consists of two parts. In Part i (Chapters 3 and 4), we develop practical methods using machine learning models to solve different variants of vehicle routing problems. As these models represent probability distributions over combinatorial spaces, in Part ii (Chapters 5 and 6), we focus on sampling from such models and optimizing their parameters.
Specifically, in Chapter 3, we use reinforcement learning to train the attention model, which represents a construction heuristic, to solve different variants of routing problems. In Chapter 4, we present deep policy dynamic programming, which uses another learned model to guide a restricted dynamic programming algorithm, for improved performance on routing problems and the ability to handle complex constraints such as time windows.
Given the deterministic nature of combinatorial problems, duplicate samples from the models in Part i are uninformative, so Part ii focuses on sampling without replacement from such models. In Chapter 5, we present ancestral Gumbel-top-k sampling as an efficient method for drawing samples without replacement from structured models over combinatorial domains, and we illustrate the general applicability beyond routing problems. In Chapter 6, we derive statistical gradient estimators based on such samples without replacement, which can be used to improve the gradient-based training procedure for the model in Chapter 3.