Have you ever watched a mystery movie and noticed a critical link before the characters did? Maybe this critical link even helped you solve the mystery before the end of the story. If so, think about how you made that connection: did you notice something that “just didn’t add up”? Or maybe one character’s behavior seemed suspicious from the beginning. Whether you made a logical deduction or a prediction based on behavioral patterns, you performed link prediction, a common task that researchers carry out on graph structures.
In our recent survey paper, we focus specifically on methods that operate on graph structures. Graphs can be visualized as a set of nodes, which represent objects or entities, connected by edges, which represent the relationships between them. To make this concrete, let’s use an analogy: imagine that a crime was committed, and an investigator is trying to determine who was responsible. To solve this mystery, the investigator might create one of those photo-covered corkboards that we see in the movies, such as the one shown below. Each item on the board is a suspect or a piece of evidence, and the investigator connects the items with strings to represent known interactions or relationships between them.
Image by macrovector on Freepik
The corkboard is like a graph structure. In this case, the nodes are the items, and the strings connecting the photos are edges. While the investigator knows about some of the connections between suspects and pieces of evidence, she wants to figure out who or what else is likely connected. Oftentimes, we do not know all of the edges in a graph, and we need a method to predict whether an edge, or a link, exists between two nodes. This is called link prediction.
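To make the corkboard picture concrete, here is a minimal Python sketch of a graph with a few known edges, plus a helper that lists the pairs a link prediction method would have to judge. The node names and the `candidate_links` helper are invented for illustration and do not come from our paper.

```python
from itertools import combinations

# Hypothetical corkboard: every name below is invented for illustration.
nodes = {"Suspect A", "Suspect B", "Suspect C", "Glove"}

# Each string on the board is an unordered pair of items, stored as a frozenset.
edges = {
    frozenset({"Suspect A", "Suspect B"}),
    frozenset({"Suspect B", "Suspect C"}),
    frozenset({"Suspect C", "Glove"}),
}

def candidate_links(nodes, edges):
    """Return every pair of nodes that is not yet connected by an edge."""
    return sorted(
        tuple(sorted(pair))
        for pair in map(frozenset, combinations(sorted(nodes), 2))
        if pair not in edges
    )

# These are the pairs for which link prediction must decide "connected or not?".
candidates = candidate_links(nodes, edges)
print(candidates)
```

With four items and three known strings, three pairs remain open. Link prediction amounts to deciding which of those open pairs deserve a string.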
There are various ways to accomplish this. Some methods use logic: for example, if Suspect A claimed to be with Suspect B on the evening of the crime, and Suspect B met with Suspect C, then Suspect A and Suspect C are also likely to have met that evening. We call those methods symbolic. Other methods use a machine learning algorithm to discover patterns across suspects based on previous cases. Based on those patterns, these methods, which we call neural or deep learning methods, predict how likely it is that two given nodes are connected. However, there is a tradeoff between these two categories of link prediction. Symbolic methods are completely interpretable to a human, but they do not scale well to large datasets. In contrast, the predictions of neural methods are not easily interpretable, but the methods scale and perform well on large graphs.
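The contrast between the two families can be sketched in a few lines of Python. This is a toy illustration under loud assumptions: the meetings, the rule encoding, and especially the embedding vectors are made up by hand, whereas a real neural method would learn such vectors from data.

```python
import math
from itertools import combinations

# Hypothetical known meetings on the evening of the crime.
met = {("A", "B"), ("B", "C")}

def neighbors(edges):
    """Build an undirected adjacency map from a set of pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def apply_rule(edges):
    """Symbolic step: met(X, Y) and met(Y, Z) suggests met(X, Z)."""
    adj = neighbors(edges)
    known = {frozenset(e) for e in edges}
    inferred = set()
    for middle, around in adj.items():
        for x, z in combinations(sorted(around), 2):
            if frozenset((x, z)) not in known:
                # Keep the shared contact so the prediction stays explainable.
                inferred.add((x, z, middle))
    return inferred

# Neural-style step: score a pair from vector representations of the nodes.
# These vectors are invented by hand; a trained model would learn them.
embeddings = {"A": [0.9, 0.1], "B": [0.8, 0.3], "C": [0.7, 0.2], "D": [-0.6, 0.9]}

def link_score(u, v):
    """Sigmoid of the dot product: a value in (0, 1), higher means more likely."""
    dot = sum(a * b for a, b in zip(embeddings[u], embeddings[v]))
    return 1.0 / (1.0 + math.exp(-dot))

inferred = apply_rule(met)
print(inferred)  # the rule names its witness, here Suspect B
print(link_score("A", "C"), link_score("A", "D"))
```

The rule returns its conclusion together with the shared contact that justifies it, which is what makes it interpretable. The score is just a number: useful for ranking many pairs quickly, but it carries no explanation of its own.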
We studied novel approaches, called neurosymbolic, which aim to combine the complementary strengths of these two categories.
Imagine two people are asked to look at the corkboard: a detective and a scientist. The detective is best at discovering patterns and interrogating suspects for details, such as lie indicators and suspicious behavior, so he is more like our neural approaches. The scientist is more knowledgeable about forensic science, and she thinks very logically. She will represent our symbolic methods. There are many different ways these two could work together to find new connections on the corkboard.
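As one purely illustrative way the two could cooperate, the sketch below lets a logical rule adjust scores that stand in for a neural model's output. It is not one of the surveyed methods: the scores, the rule, and the `bonus` value are all assumptions chosen to keep the example small.

```python
# Hand-picked numbers standing in for a trained model's output (hypothetical).
neural_scores = {("A", "C"): 0.55, ("A", "D"): 0.60, ("C", "D"): 0.20}

# Hypothetical known meetings.
known = {("A", "B"), ("B", "C")}

def partners(person, known):
    """Everyone the given person is known to have met."""
    return {b if a == person else a for a, b in known if person in (a, b)}

def rule_supports(pair, known):
    """True if some third person is known to have met both members of the pair."""
    x, z = pair
    return bool(partners(x, known) & partners(z, known))

def combined_score(pair, bonus=0.3):
    """Add a fixed bonus when the rule agrees, capped at 1.0 (bonus is arbitrary)."""
    score = neural_scores[pair]
    if rule_supports(pair, known):
        score += bonus
    return min(score, 1.0)

# Rank the open pairs by the combined judgment of "detective" and "scientist".
ranking = sorted(neural_scores, key=combined_score, reverse=True)
print(ranking)
```

On its own, the stand-in model would rank the pair A and D first. Once the rule's evidence is added, A and C move to the top, and the shared contact B is available as an explanation.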
The main contribution of our paper is a taxonomy of neurosymbolic approaches for reasoning on graph structures. We classify the methods into three categories:
Each category is further divided into subcategories, 10 in total. We also relate every subcategory to the more general classification of neurosymbolic AI by Kautz. In all, we analyzed and categorized 34 different tools and collected links to all of their publicly available code, which can be found on our GitHub.
Our survey also covers prospective directions in the field. We identify some underexplored application areas and potential improvements to current approaches. These include working end-to-end, integrating multi-modal data, adding conditional edge types, reasoning about spatiotemporal features, and adapting the tools to few-shot learning problems.
Reasoning about graphs using neurosymbolic approaches is a relatively young area of research; in fact, the tools we surveyed were all developed between 2015 and 2022. Even so, the area has shown great potential and is applicable to many interesting problems. By making the landscape of approaches clearer, we hope to make the field more accessible. We encourage researchers to find more and better ways for detectives and scientists to work together.