The AAAI-20 outstanding paper awards were presented on Tuesday 11th February at the AAAI conference in New York. Awards and honourable mentions were given for: outstanding paper, outstanding student paper and outstanding paper in the special track on AI for social impact. You can read about the award-winning work below.
WINOGRANDE: An Adversarial Winograd Schema Challenge at Scale
Keisuke Sakaguchi, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi
The Winograd Schema Challenge (WSC) (Levesque, Davis, and Morgenstern 2011), a benchmark for commonsense reasoning, is a set of 273 expert-crafted pronoun resolution problems originally designed to be unsolvable for statistical models that rely on selectional preferences or word associations. However, recent advances in neural language models have already reached around 90% accuracy on variants of WSC. This raises an important question whether these models have truly acquired robust commonsense capabilities or whether they rely on spurious biases in the datasets that lead to an overestimation of the true capabilities of machine commonsense. To investigate this question, we introduce WinoGrande, a large-scale dataset of 44k problems, inspired by the original WSC design, but adjusted to improve both the scale and the hardness of the dataset. The key steps of the dataset construction consist of (1) a carefully designed crowdsourcing procedure, followed by (2) systematic bias reduction using a novel AfLite algorithm that generalizes human-detectable word associations to machine-detectable embedding associations. The best state-of-the-art methods on WinoGrande achieve 59.4-79.1%, which are 15-35% below human performance of 94.0%, depending on the amount of the training data allowed. Furthermore, we establish new state-of-the-art results on five related benchmarks – WSC (90.1%), DPR (93.1%), COPA (90.6%), KnowRef (85.6%), and Winogender (97.1%). These results have dual implications: on one hand, they demonstrate the effectiveness of WinoGrande when used as a resource for transfer learning. On the other hand, they raise a concern that we are likely to be overestimating the true capabilities of machine commonsense across all these benchmarks. We emphasize the importance of algorithmic bias reduction in existing and future benchmarks to mitigate such overestimation.
Read the full paper on arXiv.
A Unifying View on Individual Bounds and Heuristic Inaccuracies in Bidirectional Search
Vidal Alcázar, Pat Riddle, Mike Barley
In the past few years, new very successful bidirectional heuristic search algorithms have been proposed. Their key novelty is a lower bound on the cost of a solution that includes information from the g values in both directions. Kaindl and Kainz (1997) proposed measuring how inaccurate a heuristic is while expanding nodes in the opposite direction, and using this information to raise the f value of the evaluated nodes. However, this comes with a set of disadvantages and remains yet to be exploited to its full potential. Additionally, Sadhukhan (2013) presented BAE∗, a bidirectional best-first search algorithm based on the accumulated heuristic inaccuracy along a path. However, no complete comparison in regards to other bidirectional algorithms has yet been done, neither theoretical nor empirical. In this paper we define individual bounds within the lower-bound framework and show how both Kaindl and Kainz’s and Sadhukhan’s methods can be generalized thus creating new bounds. This overcomes previous shortcomings and allows newer algorithms to benefit from these techniques as well. Experimental results show a substantial improvement, up to an order of magnitude in the number of necessarily-expanded nodes compared to state-of-the-art near-optimal algorithms in common benchmarks.
Fair Division of Mixed Divisible and Indivisible Goods
Xiaohui Bei, Zihao Li, Jinyan Liu, Shengxin Liu, Xinhang Lu
We study the problem of fair division when the resources contain both divisible and indivisible goods. Classic fairness notions such as envy-freeness (EF) and envy-freeness up to one good (EF1) cannot be directly applied to the mixed goods setting. In this work, we propose a new fairness notion envy-freeness for mixed goods (EFM), which is a direct generalization of both EF and EF1 to the mixed goods setting. We prove that an EFM allocation always exists for any number of agents. We also propose efficient algorithms to compute an EFM allocation for two agents and for agents with piecewise linear valuations over the divisible goods. Finally, we relax the envy-free requirement, instead asking for -envy-freeness for mixed goods (-EFM), and present an algorithm that finds an -EFM allocation in time polynomial in the number of agents, the number of indivisible goods, and .
Read the full paper on arXiv.
Lifelong Learning with a Changing Action Set
Yash Chandak, Georgios Theocharous, Chris Nota, Philip S. Thomas
In many real-world sequential decision making problems, the number of available actions (decisions) can vary over time. While problems like catastrophic forgetting, changing transition dynamics, changing rewards functions, etc. have been well-studied in the lifelong learning literature, the setting where the action set changes remains unaddressed. In this paper, we present an algorithm that autonomously adapts to an action set whose size changes over time. To tackle this open problem, we break it into two problems that can be solved iteratively: inferring the underlying, unknown, structure in the space of actions and optimizing a policy that leverages this structure. We demonstrate the efficiency of this approach on large-scale real-world lifelong learning problems.
Read the full paper on arXiv.
A Distributed Multi-Sensor Machine Learning Approach to Earthquake Early Warning
Kévin Fauvel, Daniel Balouek-Thomert, Diego Melgar, Pedro Silva, Anthony Simonet, Gabriel Antoniu, Alexandru Costan, Véronique Masson, Manish Parashar, Ivan Rodero, Alexandre Termier
Our research aims to improve the accuracy of Earthquake Early Warning (EEW) systems by means of machine learning. EEW systems are designed to detect and characterize medium and large earthquakes before their damaging effects reach a certain location. Traditional EEW methods based on seismometers fail to accurately identify large earthquakes due to their sensitivity to the ground motion velocity. The recently introduced high-precision GPS stations, on the other hand, are ineffective to identify medium earthquakes due to its propensity to produce noisy data. In addition, GPS stations and seismometers may be deployed in large numbers across different locations and may produce a significant volume of data consequently, affecting the response time and the robustness of EEW systems. In practice, EEW can be seen as a typical classification problem in the machine learning field: multi-sensor data are given in input, and earthquake severity is the classification result. In this paper, we introduce the Distributed Multi-Sensor Earthquake Early Warning (DMSEEW) system, a novel machine learning-based approach that combines data from both types of sensors (GPS stations and seismometers) to detect medium and large earthquakes. DMSEEW is based on a new stacking ensemble method which has been evaluated on a real-world dataset validated with geoscientists. The system builds on a geographically distributed infrastructure, ensuring an efficient computation in terms of response time and robustness to partial infrastructure failures. Our experiments show that DMSEEW is more accurate than the traditional seismometer-only approach and the combined-sensors (GPS and seis-mometers) approach that adopts the rule of relative strength.
Read the full paper here.
The Unreasonable Effectiveness of Inverse Reinforcement Learning in Advancing Cancer Research
John Kalantari, Heidi Nelson and Nicholas Chia
The “No Free Lunch” theorem states that for any algorithm, elevated performance over one class of problems is offset by its performance over another. Stated differently, no algorithm works for everything. Instead, designing effective algorithms often means exploiting prior knowledge of data relationships speciﬁc to a given problem. This “unreasonable efﬁcacy” is especially desirable for complex and seemingly intractable problems in the natural sciences. One such area that is rife with the need for better algorithms is cancer biology—a ﬁeld where relatively few insights are being generated from relatively large amounts of data. In part, this is due to the inability of mere statistics to reﬂect cancer as a genetic evolutionary process—one that involves cells actively mutating in order to navigate host barriers, out-compete neighboring cells, and expand spatially. Our work is built upon the central proposition that the Markov Decision Process (MDP) can better represent the process by which cancer arises and progresses. More speciﬁcally, by encoding a cancer cell’s complex behavior as a MDP, we seek to model the series of genetic changes, or evolutionary trajectory, that leads to cancer as an optimal decision process. We posit that using an Inverse Reinforcement Learning(IRL) approach will enable us to reverse engineer an optimal policy and reward function based on a set of expert demonstrations extracted from the DNA of patient tumors. The inferred reward function and optimal policy can subsequently be used to extrapolate the evolutionary trajectory of any tumor. Here,we introduce a Bayesian nonparametric IRL model (PUR-IRL) where the number of reward functions is a priori unbounded in order to account for uncertainty in cancer data, i.e., the existence of latent trajectories and non-uniform sampling. We show that PUR-IRL is “unreasonably effective” in gaining interpretable and intuitive insights about cancer progression from high-dimensional genome data.