The 27th European Conference on Artificial Intelligence (ECAI-2024) took place from 19 to 24 October in Santiago de Compostela, Spain. The venue also played host to the 13th Conference on Prestigious Applications of Intelligent Systems (PAIS-2024). During the week, both conferences announced their outstanding paper award winners.
The winning articles were chosen based on the reviews written during the paper selection process, nominations submitted by individual members of the programme committee, additional input solicited from outside experts, and the judgement of the programme committee chairs.
Improving Calibration by Relating Focal Loss, Temperature Scaling, and Properness
Viacheslav Komisarenko and Meelis Kull
Abstract: Proper losses such as cross-entropy incentivize classifiers to produce class probabilities that are well-calibrated on the training data. Due to the generalization gap, these classifiers tend to become overconfident on the test data, mandating calibration methods such as temperature scaling. The focal loss is not proper, but training with it has been shown to often result in classifiers that are better calibrated on test data. Our first contribution is a simple explanation about why focal loss training often leads to better calibration than cross-entropy training. For this, we prove that focal loss can be decomposed into a confidence-raising transformation and a proper loss. This is why focal loss pushes the model to provide under-confident predictions on the training data, resulting in being better calibrated on the test data, due to the generalization gap. Secondly, we reveal a strong connection between temperature scaling and focal loss through its confidence-raising transformation, which we refer to as the focal calibration map. Thirdly, we propose focal temperature scaling – a new post-hoc calibration method combining focal calibration and temperature scaling. Our experiments on three image classification datasets demonstrate that focal temperature scaling outperforms standard temperature scaling.
Read the paper in full here.
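For readers unfamiliar with the two ingredients named in this abstract, the following is a minimal sketch of the standard textbook definitions of focal loss and temperature scaling. It is not the authors' code, and it does not implement their focal calibration map or focal temperature scaling; the function names, the example logits, and the values of `gamma` and `temperature` are assumptions chosen for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a temperature (T > 1 softens the output)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Cross-entropy (log loss) for a single example."""
    return -math.log(probs[label])

def focal_loss(probs, label, gamma=2.0):
    """Standard focal loss: cross-entropy down-weighted by (1 - p_true)^gamma."""
    p = probs[label]
    return -((1.0 - p) ** gamma) * math.log(p)

# Hypothetical logits for a 3-class problem; class 0 is the true label.
logits = [2.0, 0.5, -1.0]
p_raw = softmax(logits)                    # T = 1: unscaled prediction
p_soft = softmax(logits, temperature=2.0)  # T = 2: less confident prediction

print("top probability at T=1:", max(p_raw))
print("top probability at T=2:", max(p_soft))
print("cross-entropy:", cross_entropy(p_raw, 0))
print("focal loss (gamma=2):", focal_loss(p_raw, 0))
```

With `gamma=0` the focal loss reduces to cross-entropy, and a temperature above 1 lowers the top predicted probability without changing which class is ranked first.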
Scale-Adaptive Balancing of Exploration and Exploitation in Classical Planning
Stephen Wissow and Masataro Asai
Abstract: Balancing exploration and exploitation has been an important problem in both adversarial games and automated planning. While it has been extensively analyzed in the Multi-Armed Bandit (MAB) literature, and the game community has achieved great success with MAB-based Monte Carlo Tree Search (MCTS) methods, the planning community has struggled to advance in this area. We describe how Upper Confidence Bound 1’s (UCB1’s) assumption of reward distributions with known bounded support shared among siblings (arms) is violated when MCTS/Trial-based Heuristic Tree Search (THTS) in previous work uses heuristic values of search nodes in classical planning problems as rewards. To address this issue, we propose a new Gaussian bandit, UCB1-Normal2, and analyze its regret bound. It is variance-aware like UCB1-Normal and UCB-V, but has a distinct advantage: it neither shares UCB-V’s assumption of known bounded support nor relies on UCB1-Normal’s conjectures on Student’s t and χ² distributions. Our theoretical analysis predicts that UCB1-Normal2 will perform well when the estimated variance is accurate, which can be expected in deterministic, discrete, finite state-space search, as in classical planning. Our empirical evaluation confirms that MCTS combined with UCB1-Normal2 outperforms Greedy Best First Search (traditional baseline) as well as MCTS with other bandits.
Read the paper in full here.
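To give a sense of why reward scale matters for UCB1, here is a minimal sketch of the classic UCB1 selection rule. It is not the authors' UCB1-Normal2, whose exact formula is not given in the abstract, and the arm statistics below are made-up numbers for illustration. The exploration bonus in UCB1 is tuned for rewards in a known bounded range such as [0, 1]; when the same rule is fed rewards on a much larger scale, the bonus becomes negligible relative to the means.

```python
import math

def ucb1_index(mean_reward, pulls, total_pulls, c=math.sqrt(2)):
    """Classic UCB1 index: empirical mean plus an exploration bonus.

    With c = sqrt(2) the bonus equals sqrt(2 * ln(total_pulls) / pulls).
    """
    if pulls == 0:
        return math.inf  # always try an arm that has never been pulled
    return mean_reward + c * math.sqrt(math.log(total_pulls) / pulls)

def select_arm(means, counts):
    """Return the index of the arm with the highest UCB1 index."""
    total = sum(counts)
    scores = [ucb1_index(m, n, total) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)

# Hypothetical statistics: arm 0 is well explored, arm 1 has only two pulls.
counts = [100, 2]

# Rewards in [0, 1]: the bonus dominates, so the rarely tried arm is selected.
print(select_arm([0.6, 0.5], counts))

# Same arms with rewards scaled by 100 (e.g. raw heuristic values):
# the bonus is now tiny relative to the means, so exploration stops.
print(select_arm([60.0, 50.0], counts))
```

The two calls differ only in the scale of the rewards, yet they pick different arms, which is one way to see why a bandit that assumes a fixed, known reward range can behave poorly when that assumption does not hold.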
FairCognizer: A Model for Accurate Predictions with Inherent Fairness Evaluation
Adda-Akram Bendoukha, Nesrine Kaaniche, Aymen Boudguiga and Renaud Sirdey
Abstract: Algorithmic fairness is a critical challenge in building trustworthy Machine Learning (ML) models. ML classifiers strive to make predictions that closely match real-world observations (ground truth). However, if the ground truth data itself reflects biases against certain sub-populations, a dilemma arises: prioritize fairness and potentially reduce accuracy, or emphasize accuracy at the expense of fairness. This work proposes a novel training framework that goes beyond achieving high accuracy. Our framework trains a classifier to not only deliver optimal predictions but also to identify potential fairness risks associated with each prediction. To do so, we specify a dual-labeling strategy where the second label contains a per-prediction fairness evaluation, referred to as an unfairness risk evaluation. In addition, we identify a subset of samples as highly vulnerable to group-unfair classifiers. Our experiments demonstrate that our classifiers attain optimal accuracy levels on both the Adult-Census-Income and Compas-Recidivism datasets. Moreover, they identify unfair predictions with nearly 75% accuracy at the cost of expanding the size of the classifier by a mere 45%.
Read the paper in full here.
More (Enough) Is Better: Towards Few-Shot Illegal Landfill Waste Segmentation
Matias Molina, Carlos Ferreira, Bruno Veloso, Rita P. Ribeiro and João Gama
Abstract: Image segmentation for detecting illegal landfill waste in aerial images is essential for environmental crime monitoring. Despite advancements in segmentation models, the primary challenge in this domain is the lack of annotated data due to the unknown locations of illegal waste disposals. This work mainly focuses on evaluating segmentation models for identifying individual illegal landfill waste segments using limited annotations. This research seeks to lay the groundwork for a comprehensive model evaluation to contribute to environmental crime monitoring and sustainability efforts by proposing to harness the combination of agnostic segmentation and supervised classification approaches. We mainly explore different metrics and combinations to better understand how to measure the quality of this applied segmentation problem.
Read the paper in full here.
You can find the conference proceedings here.