Explaining Predictions from Machine Learning Models: Algorithms, Users, and Pedagogy
Abstract
Model explainability has become an important problem in artificial intelligence (AI) due to the growing effect that algorithmic predictions have on humans. Explanations can help users understand not only why AI models make certain predictions, but also how these predictions can be changed. In the first part of this thesis, we investigate counterfactual explanations: given a data point and a trained model, we want to find the minimal perturbation to the input such that the prediction changes. We frame the problem of finding counterfactual explanations as a gradient-based optimization task and first focus on tree ensembles. We then extend this method to graph neural networks (GNNs), motivated by the increasing promise of GNNs in real-world applications such as fake news detection and molecular simulation.
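As a loose illustration of the idea (not the thesis's actual algorithm, which handles non-differentiable tree ensembles and GNNs), a gradient-based counterfactual search for a simple differentiable classifier can be sketched as minimizing a prediction-flip term plus a distance penalty; the model, loss weights, and step sizes below are all assumptions chosen for the toy example:

```python
import numpy as np

def counterfactual(x, w, b, target=1.0, lam=0.1, lr=0.5, steps=500):
    """Gradient-descent search for a counterfactual example.

    Minimizes  (sigmoid(w.x' + b) - target)^2 + lam * ||x' - x||^2
    so that x' flips the prediction while staying close to x.
    All hyperparameters here are illustrative, not from the thesis.
    """
    x_cf = x.copy()
    for _ in range(steps):
        z = w @ x_cf + b
        p = 1.0 / (1.0 + np.exp(-z))                 # model prediction at x'
        grad_pred = 2.0 * (p - target) * p * (1.0 - p) * w  # flip-the-label term
        grad_dist = 2.0 * lam * (x_cf - x)                  # stay-close term
        x_cf -= lr * (grad_pred + grad_dist)
    return x_cf

# Toy logistic model: predicts class 1 when x1 + x2 > 1.
w = np.array([1.0, 1.0])
b = -1.0
x = np.array([0.2, 0.2])          # currently predicted class 0
x_cf = counterfactual(x, w, b)    # nudged across the decision boundary
```

The distance penalty `lam * ||x' - x||^2` is what makes the perturbation *minimal*: raising `lam` yields counterfactuals closer to the original input at the cost of a weaker prediction flip.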
In the second part of this thesis, we investigate explanations in the context of a real-world use case: sales forecasting. We propose an algorithm that generates explanations for large errors in forecasting predictions based on Monte Carlo simulations. To evaluate, we conduct a user study with 75 users and find that most users can accurately answer objective questions about the model's predictions when provided with our explanations, and that users shown our explanations develop a significantly better understanding of why the model makes large prediction errors than users in the control group.
In the final part of the thesis, we describe the design of a technical, graduate-level course at the University of Amsterdam that teaches responsible AI concepts through the lens of reproducibility. The focal point of the course is a group project in which students reproduce existing responsible AI algorithms from top AI conferences and write a corresponding report. We reflect on our experiences teaching the course over two years and propose guidelines for incorporating reproducibility into graduate-level AI study programs.