AI Policy Matters – AI data, facial recognition, and more
By Larry Medsker
Confusion in the popular media about terms such as algorithm and what constitutes AI technology cause critical misunderstandings among the public and policymakers. More importantly, the role of data is often ignored in ethical and operational considerations. Even if AI systems are perfectly built, low quality and biased data cause unintentional and even intentional hazards.
Language models and data
A generative pre-trained transformer GPT-3 is currently in the news. For example, James Vincent in the July 30, 2020, article in The Verge writes about GPT-3, which was created by OpenAI. Language models, GPT-3 the current ultimate product, have ethics issues on steroids for products being made. Inputs to the system have all the liabilities discussed about Machine Learning and Artificial Neural Network products. The dangers of bias and mistakes are raised in some writings but are likely not a focus among the wide range of enthusiastic product developers using the open-source GPT-3. Language models suggest output sequences of words given an input sequence. Thus, samples of text from social media can be used to produce new text in the same style as the author and potentially can be used to influence public opinion. Cases have been found of promulgating incorrect grammar and misuse of terms based on poor quality inputs to language models. An article by David Pereira includes examples and comments on the use of GPT-3. The article “GPT-3: an AI Game-Changer or an Environmental Disaster?” by John Naughton gives examples of and commentary on results from GPT-3.
A possible meta solution for policymakers to keep up with technological advances and AI data issues is discussed by Alex Woodie in “AI Ethics and Data Governance: A Virtuous Cycle.” He quotes James Cotton, who is the international director of the Data Management Centre of Excellence at Information Builders’ Amsterdam office: “as powerful as the AI technology is, it can’t be implemented in an ethical manner if the underlying data is poorly managed and badly governed. It’s critical to understand the relationship between data governance and AI ethics. One is foundational for the other. You can’t preach being ethical or using data in an ethical way if you don’t know what you have, where it came from, how it’s being used, or what it’s being used for.”
AI and facial recognition
Concerns about facial recognition: discrimination, privacy, and democratic freedom
While including ethical and moral issues, a broader list of issues is concerning to citizens and policymakers about face recognition technology and AI. Areas of concerns include accuracy; surveillance; data storage, permissions, and access; discrimination, fairness, and bias; privacy and video recording without consent; democratic freedoms, including right to choose, gather, and speak; and abuse of technology such as non-intended uses, hacking, and deep fakes. Used responsibly and ethically, face recognition can be valuable for finding missing people, responsible policing and law enforcement, medical uses, healthcare, virus tracking, legal system and court uses, and advertising. Various guidelines by organizations such as the AMA and legislation like S.3284 – Ethical Use of Facial Recognition Act are being developed to encourage the proper use of AI and face recognition. Some of the above issues do specifically require ethical analysis as in the following by Yaroslav Kuflinski:
- Accuracy — FR systems naturally discriminate against non-white people, women, and children, presenting errors of up to 35% for non-white women.
- Surveillance issues — concerns about “big brother” watching society.
- Data storage — use of images for future purposes stored alongside genuine criminals.
- Finding missing people — breaches of the right to a private life.
- Advertising — invasion of privacy by displaying information and preferences that a buyer would prefer to keep secret.
- Studies of commercial systems are increasingly available, for example an analysis of Amazon Rekognition.
- Biases deriving from sources of unfairness and discrimination in machine learning have been identified in two areas: the data and the algorithms. Biases in data skew what is learned in machine learning methods, and flaws in algorithms can lead to unfair decisions even when the data is unbiased. Intentional or unintentional biases can exist in the data used to train FR systems.
- New human-centered design approaches seek to provide intentional system development steps and processes in collecting data and creating high quality databases, including the elimination of naturally occurring bias reflected in data about real people.
Bias that pertains especially to facial recognition (Mehrabi, et al. and Barocas et al.)
- Direct discrimination: “Direct discrimination happens when protected attributes of individuals explicitly result in non-favorable outcomes toward them”. Some traits like race, color, national origin, religion, sex, family status, disability, exercised rights under CCPA, marital status, receipt of public assistance, and age are identified as sensitive attributes or protected attributes in the machine learning world.
- Indirect discrimination: Even if sensitive or protected attributes are not used against an individual, indirect discrimination can still happen. For example, residential zip code is not categorized as a protected attribute, but from the zip code one might infer race, which is a protected attribute. So, “protected groups or individuals still can get treated unjustly as a result of implicit effects from their protected attributes”.
- Systemic discrimination: “policies, customs, or behaviors that are a part of the culture or structure of an organization that may perpetuate discrimination against certain subgroups of the population”.
- Statistical discrimination: In law enforcement, racial profiling is an example of statistical discrimination. In this case, minority drivers are pulled over more than compared to white drivers — “statistical discrimination is a phenomenon where decision-makers use average group statistics to judge an individual belonging to that group.”
- Explainable discrimination: In some cases, discrimination can be explained using attributes like working hours and education, which is legal and acceptable. In “the UCI Adult dataset, a widely-used dataset in the fairness domain, males on average have a higher annual income than females; however, this is because, on average, females work fewer hours than males per week. Work hours per week is an attribute that can be used to explain low income. If we make decisions without considering working hours such that males and females end up averaging the same income, we could lead to reverse discrimination since we would cause male employees to get lower salary than females.
- Unexplainable discrimination: This type of discrimination is not legal as explainable discrimination because “the discrimination toward a group is unjustified”.
How to discuss facial recognition
Recent controversies about FR mix technology issues with ethical imperatives and ignore that people can disagree on which are the “correct” ethical principles. A recent ACM tweet on FR and face masks was interpreted in different ways and ACM issued an official clarification. A question that emerges is if AI and other technologies should be, and can be, banned rather than controlled and regulated. In early June, 2020, IBM CEO Arvind Krishna said in a letter to Congress that IBM is exiting the facial recognition business and asking for reforms to combat racism: “IBM no longer offers general purpose IBM facial recognition or analysis software. IBM firmly opposes and will not condone uses of any technology, including facial recognition technology offered by other vendors, for mass surveillance, racial profiling, violations of basic human rights and freedoms, or any purpose which is not consistent with our values and Principles of Trust and Transparency,” Krishna said in his letter to members of congress, “We believe now is the time to begin a national dialogue on whether and how facial recognition technology should be employed by domestic law enforcement agencies.”
Policy and AI ethics
The Alan Turing Institute Public Policy Programme
Among the complexities of public policy making, the new world of AI and data science requires careful consideration of ethics and safety in addressing complex and far-reaching challenges in the public domain. Data and AI systems lead to opportunities that can produce both good and bad outcomes. Ethical and safe systems require intentional processes and designs for organizations responsible for providing public services and creating public policies. An increasing amount of research focuses on developing comprehensive guidelines and techniques for industry and government groups to make sure they consider the range of issues in AI ethics and safety in their work.
An excellent example is the Public Policy Programme at The Alan Turing Institute under the direction of Dr David Leslie. Their work complements and supplements the Data Ethics Framework, which is a practical tool for use in any project initiation phase. Data Ethics and AI Ethics regularly overlap. The Public Policy Programme describes AI Ethics as “a set of values, principles, and techniques that employ widely accepted standards of right and wrong to guide moral conduct in the development and use of AI technologies. These values, principles, and techniques are intended both to motivate morally acceptable practices and to prescribe the basic duties and obligations necessary to produce ethical, fair, and safe AI applications. The field of AI ethics has largely emerged as a response to the range of individual and societal harms that the misuse, abuse, poor design, or negative unintended consequences of AI systems may cause.” They cite the following as some of the most consequential potential harms:
- Bias and discrimination
- Denial of individual autonomy, recourse, and rights
- Non-transparent, unexplainable, or unjustifiable outcomes
- Invasions of privacy
- Isolation and disintegration of social connection
- Unreliable, unsafe, or poor-quality outcomes
The Ethical Platform for the Responsible Delivery of an AI Project, strives to enable the “ethical design and deployment of AI systems using a multidisciplinary team effort. It demands the active cooperation of all team members both in maintaining a deeply ingrained culture of responsibility and in executing a governance architecture that adopts ethically sound practices at every point in the innovation and implementation lifecycle.” The goal is to “unite an in-built culture of responsible innovation with a governance architecture that brings the values and principles of ethical, fair, and safe AI to life.”
1. Leslie, D. (2019). Understanding artificial intelligence ethics and safety: A guide for the responsible design and implementation of AI systems in the public sector. The Alan Turing Institute.
2. Data Ethics Framework (2018).
Principled artificial intelligence
In January, 2020, the Berkman Klein Center released a report by Jessica Fjeld and Adam Nagy “Mapping Consensus in Ethical and Rights-Based Approaches to Principles for AI”, which summarizes contents of 36 documents on AI principles. This work acknowledges the surge in frameworks based on ethical and human rights to guide the development and use of AI technologies. The authors focus on understanding ethics efforts in terms of eight key thematic trends:
- Safety and security
- Transparency and explainability
- Fairness and non-discrimination
- Human control of technology
- Professional responsibility
- Promotion of human values
They report “our analysis examined the forty seven individual principles that make up the themes, detailing notable similarities and differences in interpretation found across the documents. In sharing these observations, it is our hope that policymakers, advocates, scholars, and others working to maximize the benefits and minimize the harms of AI will be better positioned to build on existing efforts and to push the fractured, global conversation on the future of AI toward consensus.”
Ben Shneiderman’s research emphasizes human autonomy as opposed to the popular notion of autonomous machines. The ideas are now available in the International Journal of Human–Computer Interaction. The abstract is as follows: “Well-designed technologies that offer high levels of human control and high levels of computer automation can increase human performance, leading to wider adoption. The Human-Centered Artificial Intelligence (HCAI) framework clarifies how to (1) design for high levels of human control and high levels of computer automation so as to increase human performance, (2) understand the situations in which full human control or full computer control are necessary, and (3) avoid the dangers of excessive human control or excessive computer control. The methods of HCAI are more likely to produce designs that are Reliable, Safe and Trustworthy (RST). Achieving these goals will dramatically increase human performance, while supporting human self-efficacy, mastery, creativity, and responsibility.”
Please join our discussions at the SIGAI Policy Blog.