Mitigating biases in machine learning

17 May 2023

share this:
example of clustering with blue and red circles

Machine Learning (ML) is increasingly being used to simplify and automate a number of important computational tasks in modern society. From the disbursement of bank loans to job application screenings, these computer systems streamline several processes that have a considerable impact on our day to day lives. However, these artificially intelligent systems are most often devised to emulate human decision making — an inherently biased framework. For example, Microsoft’s Tay online chatbot quickly learned to tweet using racial slurs as a result of the biased online input stream (Caton and Haas 2020), and the COMPAS tool often flagged black individuals as more likely to commit a crime (even if two individuals were statistically similar with respect to many other attributes) (Flores, Bechtel and Lowenkamp 2016).

Crucially, these issues are not the product of a malevolent computer programmer instilling radical beliefs, but rather a byproduct of machines learning to optimize for a particular objective, which can inadvertently leverage underlying biases present in the data. These biases are most often the result of two (not mutually exclusive) pervasive issues in data science: bad training data and objectives that are blind to demographic (or comparable sensitive information) considerations. The former is nicely depicted again by the Tay chatbot: training on text from anonymous online forums is unfortunately not a good representation of how the general population talks. Additionally, when models are trained to optimize for an objective that ignores demographic considerations, discriminatory practices can inadvertently be learned, as was the case for the COMPAS algorithm. The algorithm was trained to predict future criminal behavior but inadvertently learned to weigh an individual’s race heavily in its decision-making process.

To provide a clearer understanding of how these “blind” objectives work, we here restrict our attention to the process of clustering data. This involves grouping objects so that those in the same cluster are more like each other than those in other clusters. Data scientists use cluster analysis to garner insight from data by merely observing which cluster a point belongs to and assuming it shares the typical properties of that group. For instance, if we represent a patient’s medical data as a set of relevant numbers such as height, weight, and age, we can visualize the data as points in a three-dimensional space. We can then use standard methods to cluster the data by identifying data points that are close together in this space. To do this, we minimize an “objective function” that measures this closeness. However, this process may inadvertently segregate men from women, as these groups are likely to be similar in terms of height and weight. This is a toy example of the inherent biases that clustering algorithms learn which can have a significant impact on the treatment that a physician provides if not recognized.

While under-representative data has a (somewhat) clean solution in improved collection procedures, mitigating bias as a result of objective minimization is less clear. In an effort to counteract this issue, researchers have recently turned to propose algorithms that ensure some quantified form of “fair representation” at the cost of a degradation in the optimality of a clustering. Consider the following figure that depicts a traditional clustering result versus what one would deem “fair”, where the two colors are representative of some protected attribute such as gender or race. We see that the standard (blind) solution may yield clusters of data points that disproportionately segregate the data based on an implicit attribute, leading to potentially discriminatory or even illegal behavior by the algorithm. A more desirable solution in practice might yield clusters that contain equal representation of the two protected attributes while also mitigating the intra-cluster distances between data points.

example of clustering with blue and red circles

In essence, balancing solutions that minimize our objective with those that yield equal representation is the crux of modern fair clustering work. While in the past few years a considerable number of studies have examined fair flat clustering (partition data into disjoint clusters), the major problem of hierarchical clustering has been largely neglected. Like the flat variant depicted above, hierarchical starts with a collection of data points whose similarity is measured as the distance between points — the closer two points are, the more similar. Now to build a hierarchy, we first let each point be its own cluster and subsequently merge these with neighboring clusters up to a certain size. We can then repeat this process on our enlarged groupings until we are left with one that encompasses all our data. A major advantage here is that we do not need to impose a restriction on the number of clusters on the data, rather we try to group things at a wide variety of granularities. For this reason, hierarchical clustering is common in the division of a geographic region or for phylogenetic trees.

In our recent work (Knittel et al. 2023), we examine the notion of fairness in hierarchical clustering due to its pervasive utility in data science. Specifically, we demonstrate that a relatively intuitive algorithm that strategically folds clusters onto one another according to their protected attribute representations can produce a fair clustering with only a modest decrease in the optimality of the solution (as measured by the objective function). Such results are highly nontrivial as well as vital to demonstrating the utility of fairness constraints.

Although technological innovation is accelerating daily, it is important to be cognizant of, and mitigate, potentially harmful algorithmic practices that can perpetuate biases. We hope that our work serves as further proof of concept that incorporating fairness constraints or demographic information into the optimization process can reduce biases in ML models without significantly sacrificing performance – a result that is critical where automated decision-making can have a harmful impact.


Max Springer is a PhD candidate in Applied Mathematics at the University of Maryland.
Max Springer is a PhD candidate in Applied Mathematics at the University of Maryland.

            AIhub is supported by:

Related posts :

Interview with Mike Lee: Communicating AI decision-making through demonstrations

We hear from AAAI/SIGAI Doctoral Consortium participant Mike Lee about his research on explainable AI.
23 April 2024, by

Machine learning viability modelling of vertical-axis wind turbines

Researchers have used a genetic learning algorithm to identify optimal pitch profiles for the turbine blades.
22 April 2024, by

The Machine Ethics podcast: What is AI? Volume 3

This is a bonus episode looking back over answers to our question: What is AI?
19 April 2024, by

DataLike: Interview with Tẹjúmádé Àfọ̀njá

"I place an emphasis on wellness and meticulously plan my schedule to ensure I can make meaningful contributions to what's important to me."

Beyond the mud: Datasets, benchmarks, and methods for computer vision in off-road racing

Off-road motorcycle racing poses unique challenges that push the boundaries of what existing computer vision systems can handle
17 April 2024, by

Interview with Bálint Gyevnár: Creating explanations for AI-based decision-making systems

PhD student and AAAI/SIGAI Doctoral Consortium participant tells us about his research.
16 April 2024, by

AIhub is supported by:

©2024 - Association for the Understanding of Artificial Intelligence


©2021 - ROBOTS Association