There were three invited talks at this year’s virtual ICML. The first was given by Lester Mackey, who highlighted some of his efforts to do good with machine learning. During the talk he also outlined several ways in which social good efforts can be organised, and described numerous social good problems that would benefit from the community’s attention.
Lester took the audience on a journey from his grad school days to the present, focusing on the social good projects he’s been involved in along the way. His research in this area has included work on combating nuclear proliferation, on climate forecasting, and on COVID-19.
Alongside his research, Lester has also taken part in various machine learning for good challenges. He told us about the Prize4Life competition, where he and Lilly Fang were awarded first prize for their work. Prize4Life is a non-profit organisation whose mission is to accelerate the discovery of treatments and a cure for ALS (Amyotrophic Lateral Sclerosis, also known as Lou Gehrig’s disease). The goal of this particular competition was to predict the rate of disease progression in ALS patients. Given three months of patient progression data, teams had to predict what would happen over the subsequent nine months. Lester and Lilly’s model placed first and outperformed 12 clinicians. Prize4Life estimated that their predictions would reduce drug trial sizes by 20%.
While at Stanford University, Lester set up the Statistics for Social Good working group. This was inspired in no small way by the Data Science for Social Good summer fellowship programme, then organised by the University of Chicago (it has since moved to Carnegie Mellon University). The group met regularly, with the aim of finding social good organisations they could work with and determining whether each had a problem they could help solve using their machine learning and data science skills. Lester described a few of the partners that the Statistics for Social Good group worked with, and the associated projects:
SparkPoint is an anti-poverty organization that helps low-income individuals and families reach financial stability. Clients work together with SparkPoint coaches to design a personalized path towards their financial goals. The Statistics for Social Good group used a combination of data visualization and regression modelling to produce a model which predicted the services SparkPoint should offer to its clients, and in which order, to be most effective.
Working with Healthcare Denmark and various Stanford organisations, the team used patient data from Denmark from 2004 to 2011 to predict which individuals would need healthcare interventions. These are people who, without intervention, could incur the greatest future healthcare costs. Their model achieved a 30% improvement in cost capture over standard forecasts.
The Global Oncology Initiative is a volunteer organization dedicated to helping improve cancer care worldwide. They asked Statistics for Social Good to find, consolidate, and visualize opiate consumption data in an Opioid Atlas for palliative care researchers. Their work on this project was showcased at the 2016 Global Cancer Research Symposium.
GreatNonprofits is an online review platform for nonprofits. They help donors, volunteers, and clients learn about organizations matching their interests. The goal for Statistics for Social Good was to study the relationship between reviewer background and review content. In particular, they were tasked with investigating the influence of “courtesy bias” on reviews. That is, do those who depend on nonprofit services to meet their basic needs avoid writing negative reviews? Interestingly, the team’s analysis of feedback revealed the opposite effect: for example, those who relied on food banks and shelters were more likely to leave harsh reviews about those services.
You can find out more about the group’s projects on their GitHub page.
The group also collaborated with other Social Good teams. These included:
DS 4 Social Good. A Google Groups forum started by Patrick Meier, and open to the public.
Stanford Data Lab. Stanford courses designed to teach students data science skills in the context of tackling real social challenges, run by Bill Behrman.
Data-Pop Alliance. A collaborative laboratory created by the Harvard Humanitarian Initiative, MIT Connection Science, and the Overseas Development Institute.
Berkeley D-Lab. Intelligent research design for data intensive social science.
To end his talk, Lester proposed four challenges for the machine learning community:
1) Teach more machine learning for good
2) Publish more machine learning for good
3) Incentivize more machine learning for good
4) Prioritise more machine learning for good
Lester no doubt inspired many researchers to concentrate their efforts in this area. For those looking to do some good with machine learning, he provided a list of resources and challenges. You can find these in the slides for the talk and we’ve also listed the links below.
Statistics Without Borders
Solve for Good
Carnegie Mellon University
University of Washington
The Alan Turing Institute
You can access the slides for Lester’s talk here.
Lester Mackey is a machine learning researcher at Microsoft Research, where he develops new tools, models, and theory for large-scale learning tasks. He applies these to areas such as healthcare, climate, recommender systems, and social good. Lester moved to Microsoft from Stanford University, where he was an assistant professor of Statistics and (by courtesy) of Computer Science. He earned his PhD in Computer Science and MA in Statistics from UC Berkeley and his BSE in Computer Science from Princeton University. He co-organized the second place team in the Netflix Prize competition for collaborative filtering, won the Prize4Life ALS disease progression prediction challenge, won prizes for temperature and precipitation forecasting in the year-long real-time Subseasonal Climate Forecast Rodeo, and received a best student paper award at the International Conference on Machine Learning.