<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">It is not About Bias but Discrimination</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Chaewon</forename><surname>Yun</surname></persName>
							<email>yun@mpib-berlin.mpg.de</email>
							<affiliation key="aff0">
								<orgName type="department">Max Planck Institute for Human Development</orgName>
								<address>
									<settlement>Berlin</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Claudia</forename><surname>Wagner</surname></persName>
							<email>claudia.wagner@gesis.org</email>
							<affiliation key="aff1">
								<orgName type="institution">GESIS - Leibniz Institute for the Social Sciences</orgName>
								<address>
									<settlement>Cologne</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
							<affiliation key="aff2">
								<orgName type="institution">RWTH Aachen University</orgName>
								<address>
									<settlement>Aachen</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Jan-Christoph</forename><surname>Heilinger</surname></persName>
							<email>jan-christoph.heilinger@uni-wh.de</email>
							<affiliation key="aff3">
								<orgName type="institution">Witten/Herdecke University</orgName>
								<address>
									<settlement>Witten</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">It is not About Bias but Discrimination</title>
					</analytic>
					<monogr>
						<title level="j">CEUR Workshop Proceedings</title>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">28360AEF8166624B6A60FC99033030E9</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T16:31+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Bias</term>
					<term>Discrimination</term>
					<term>Large Language Models</term>
					<term>Computational Social Science</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Growing interest in the bias of LLMs has been accompanied by empirical evidence of socially and morally undesirable patterns in their output. However, differing definitions and measurements of bias make it difficult to assess its impact adequately. To facilitate effective and constructive scholarly communication about bias, we make two contributions in this paper. First, we unpack the conceptual confusion in defining bias, where the term is used to indicate both descriptive and normative discrepancies between LLMs and desired outcomes. Second, we offer deontological reasons why bias is unacceptable. Common arguments against bias rest on teleological grounds, focusing on the consequences of biased LLMs. We argue that bias should be identified and mitigated when and because it is morally wrongful discrimination, regardless of its outcome. To support this argument, we connect biased LLMs with Deborah Hellman's meaning-based account of discrimination. Bias in LLMs can be demeaning and capable of lowering the social status of affected individuals, which makes it morally wrongful discrimination. Such bias should be mitigated to prevent morally wrongful discrimination via technological means. By connecting the phenomenon of bias in LLMs with the existing literature on wrongful discrimination, we suggest that critical discourse on bias should go beyond finding skewed patterns in the outputs of LLMs. A meaningful contribution to identifying and reducing bias can be made only by situating observed and measured bias in its complex societal context.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>In recent years, there has been a surge of interest in uncovering bias, ethical and social risks, and potential harm in machine learning algorithms <ref type="bibr" target="#b0">[1]</ref>[2][3] <ref type="bibr" target="#b3">[4]</ref>. While the specific algorithms of interest have changed with the development and use of new algorithms, current discussions are dominated by large language models (LLMs) 1 . Following the explosive commercial success of LLM-based applications such as ChatGPT <ref type="bibr" target="#b6">[7]</ref> <ref type="bibr" target="#b7">[8]</ref>, there are heated debates about how this technology might disrupt society. Numerous cases have been reported of LLMs exhibiting morally dubious patterns that need to be urgently addressed. A growing body of work on LLM bias in the computer science literature has focused on identifying problematic patterns such as racist <ref type="bibr" target="#b8">[9]</ref>, casteist <ref type="bibr" target="#b9">[10]</ref> or anti-Muslim <ref type="bibr" target="#b10">[11]</ref> outputs produced by LLMs.</p><p>In addressing biased outcomes of LLMs, most criticisms focus on the harm that biased LLMs can bring about. However, such a perspective assumes that the bias of language models is problematic only because of its harms. We argue that the moral permissibility of biased LLMs goes beyond the harmful outcomes they may bring about.</p><p>In this paper, we discuss biased LLMs in terms of wrongful discrimination, building upon the literature on critical studies of technical systems and on wrongful discrimination. Because bias is conflated at multiple levels in the context of LLMs, navigating the disorganised discussion of bias in LLMs is not a trivial task. To overcome these limitations, we investigate the meaning of bias in the context of LLMs. Afterwards, we reason why such bias in LLMs is wrongful and hence should be identified and mitigated.</p><p>In this attempt, we pose two questions that we will answer throughout this paper: First, what does it mean for LLMs to be biased? Second, why do we care if LLMs are biased? To answer the first question, various sources of confusion in the definition of bias are enumerated in the next section. These sources range from different disciplinary uses of the term and different methods of operationalisation to the conflation of normatively and descriptively defined bias. We identify that unpacking the conflation in conceptualisations of bias is of critical importance to avoid misleading interpretations of identified bias. Engaging with the empirical literature on measuring bias in LLMs shows how different conceptualisations of bias lead to varying implications of the measured bias.</p><p>After discussing the conceptual conflation in the definition of bias, the second question asks where the relevance of bias in language models stems from. The common argument against bias in LLMs concerns its harms: biased LLMs can create incorrect outputs and can lead to harmful outcomes. However, such consequentialist perspectives fail to justify mitigating some types of bias that are factually correct or not obviously dangerous or unsafe. After discussing the limitations of consequentialist reasons for mitigating bias, we provide deontological reasons to explain why bias in LLMs is impermissible beyond its harmful outcomes. We argue that bias in LLMs is morally wrongful discrimination and therefore should be mitigated. To support this claim, we introduce Deborah Hellman <ref type="bibr" target="#b11">[12]</ref>'s meaning-based account of discrimination. Building on Hellman's account, we argue that observed patterns of bias in LLMs are impermissible when and because they are morally wrongful discrimination. Contextualising observed bias in LLMs in terms of discrimination is crucial for justifying why and how LLMs should be unbiased.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">What Does It Mean for LLMs to be Biased?</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">The Problem of Conflated Conceptualization</head><p>Among the various dimensions of conflation regarding bias in LLMs, the conflated conceptualisation of bias is an important yet understudied problem. The term bias is used broadly in the literature on bias in LLMs, and its definition often lacks clarity, partly because the term is used in different ways. There are several reasons for this, such as different disciplinary practices or different methodological choices for operationalising bias. Beyond these, conceptual confusion leads to fundamentally different versions of unbiased LLMs: bias is used to describe a variety of gaps between the performance of LLMs and different gold standards.</p><p>In particular, such a gold standard can refer to normatively correct LLMs or to descriptively accurate LLMs.</p><p>As bias is a contested concept that is widely used across different disciplines, the variance among definitions of bias is not negligible. When bias is defined in statistical analysis, it describes the gap between the observed or estimated value and the true value. For instance, bias is defined as "systematic error arising during sampling, data collection, or data analysis" <ref type="bibr" target="#b12">[13]</ref> or as "prior information, a prerequisite for intelligent action" <ref type="bibr" target="#b13">[14]</ref>. Alternatively, bias can be defined as unfair treatment, as in <ref type="bibr" target="#b14">Friedman and Nissenbaum (1996)</ref> <ref type="bibr" target="#b14">[15]</ref>, where normative evaluation becomes an integral part of determining whether a pattern can be considered biased or not. Friedman and Nissenbaum define computer systems as biased when they "systematically and unfairly discriminate against certain individuals or groups of individuals in favor of others" (ibid, p. 332).
Since the context of studying bias in LLMs is not limited to a single discipline, the usage of the term bias varies significantly. The gap between the definition, measurement, and normative motivation for investigating bias has been criticised <ref type="bibr" target="#b15">[16]</ref>. <ref type="foot" target="#foot_0">2</ref> Another layer that adds complexity to the definition of bias is the different operationalisations of bias. Bias is a construct that cannot be directly observed or quantified. Therefore, bias needs to be operationalised with observable properties that are relevant to the construct <ref type="bibr" target="#b18">[19]</ref>[20] <ref type="bibr" target="#b20">[21]</ref>. A clear definition of the construct is a prerequisite for operationalisation <ref type="bibr" target="#b21">[22]</ref>; in our case, bias in LLMs should be explicitly defined. In the existing literature on measuring bias, stereotypes, especially occupational stereotypes <ref type="bibr">[23][24]</ref>[25] <ref type="bibr" target="#b25">[26]</ref>[27] <ref type="bibr" target="#b27">[28]</ref>, are the most commonly used properties to measure gender bias. This line of work measures stereotypical relations between genders and occupations, such as 'man' to 'computer programmer' and 'woman' to 'homemaker' <ref type="bibr" target="#b28">[29]</ref>. Sentiment score <ref type="bibr" target="#b23">[24]</ref>[30] <ref type="bibr">[31][32]</ref> is another commonly used property to operationalise bias: LLMs are considered biased against a certain gender when that gender is correlated with more negative sentiment than others.</p></div>
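A minimal sketch of the sentiment-based operationalisation described above, purely for illustration: hypothetical model continuations for prompts mentioning different genders are scored with a toy sentiment lexicon, and the gap in mean sentiment between groups is read as a bias signal. The lexicon, the continuations, and the group labels are all invented, not taken from any cited study.

```python
# Toy illustration of a sentiment-based bias measure. The tiny lexicon
# and the "model outputs" below are invented for illustration only.

SENTIMENT = {"brilliant": 1, "reliable": 1, "warm": 1,
             "lazy": -1, "hysterical": -1, "weak": -1}

def sentiment_score(text):
    """Mean lexicon score of the words in a continuation (0 if none match)."""
    hits = [SENTIMENT[w] for w in text.lower().split() if w in SENTIMENT]
    return sum(hits) / len(hits) if hits else 0.0

def sentiment_gap(continuations_by_group):
    """Gap in mean sentiment between exactly two groups of continuations."""
    means = {group: sum(map(sentiment_score, texts)) / len(texts)
             for group, texts in continuations_by_group.items()}
    (g1, m1), (g2, m2) = means.items()  # assumes exactly two groups
    return {g1: m1, g2: m2, "gap": m1 - m2}

# Hypothetical continuations of "He is ..." / "She is ..." prompts.
outputs = {
    "male":   ["a brilliant and reliable engineer", "a warm colleague"],
    "female": ["a hysterical and weak character", "a warm colleague"],
}
print(sentiment_gap(outputs))
```

A nonzero gap would here be read as the model associating one group with more negative sentiment; real measurements use trained sentiment classifiers and large prompt sets rather than a word list.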
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Conceptual Conflation: Descriptive/Normative Framework</head><p>Besides varying definitions of bias across disciplines or operationalisations, conceptual conflation concerns a more fundamental difference about what bias is. While the first two conflations complicate the specific technical meaning of measured bias, conceptual conflation transcends practical differences of working definitions or operationalisations.</p><p>To measure bias, bias first needs to be identified by comparing LLM outputs with a desired state that is defined as unbiased. Depending on how unbiased LLMs are defined, the implications of measured bias change significantly. Broadly speaking, unbiased LLMs can be defined descriptively or normatively. Descriptively defined bias can be measured by comparing LLM output with statistics that represent the status quo; LLMs are considered unbiased when their output matches such statistics. Deery and Bailey <ref type="bibr" target="#b32">[33]</ref> provide an analogy to describe different positions on what ideal natural language processing technology should be. It helps to compare different unbiased LLMs by laying out the possibilities along two axes, normative correctness and descriptive accuracy. Combining these two axes yields four categories of unbiased LLMs, as described in Table <ref type="table" target="#tab_0">1</ref>.</p><p>In an ideal world, the Utopia land according to Deery and Bailey's analogy, unbiased LLMs accurately describe reality, and reality conforms to the normative ideal. This makes LLMs both normatively and descriptively correct. In Disaster land, on the other hand, unbiased LLMs fail to describe reality accurately while also diverging from the normative ideal.
In this case, LLMs are descriptively and normatively wrong, which makes developing such a model pointless.</p><p>This leaves two possibilities in question, fantasy land and dilemma land, which are also the most common versions of unbiased LLMs in the existing literature on measuring bias. Fantasy land refers to the state where impartiality is defined by the normative ideal despite its descriptive imprecision. It is a fantasy because it does not exist in reality: an unrealistic but ideal state of the world to which language models should aspire. Dilemma land, on the other hand, prioritises descriptive accuracy, even though such accuracy does not correspond to the normative ideal. It is a dilemma because satisfying both descriptive and normative demands is an intractable problem, and it is not 'ideal' because the present world is not perfect. Adapting language models to an imperfect reality will inevitably lead to imperfect language models.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.">Choosing between Fantasy and Dilemma</head><p>The decision as to where unbiased LLMs should be located, either in fantasy land or in dilemma land, comes down to prioritising descriptive accuracy or normative correctness. The dilemma arising from the tension between normative and descriptive correctness is not a problem unique to language models or algorithms. It is a "specter of normative conflict" (Basu, 2020) <ref type="bibr" target="#b33">[34]</ref>: fairness might require inaccuracy. The dilemma perspective suggests that the apparent conflict between fairness and accuracy cannot be resolved <ref type="bibr">(ibid, p. 191-197)</ref>. Fantasy land advocates normative correctness, claiming that LLMs should be free of problematic patterns, such as stereotypes, that exist in reality.</p><p>Basu describes how the pursuit of such an ideal comes at the cost of descriptive accuracy, which can reduce the usefulness of language models. Prioritising descriptive accuracy, on the other hand, can create more utility for LLMs by aligning them with a factual description of the status quo. However, it risks perpetuating existing problems of reality through language models and reproducing undesirable patterns of injustice. <ref type="foot" target="#foot_1">3</ref> In the following subsections, we refer to existing literature that empirically measures bias in LLMs and discuss the two ideals of unbiased LLMs in terms of their implications and limitations.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.1.">LLMs in Fantasy Land</head><p>The most common approach to identifying bias in LLMs is to compare the output of the model with a set of ideal states. Different versions of ideal states are suggested by the authors who propose the metrics to measure bias. For example, unbiased LLMs are often defined as a-stereotypical <ref type="bibr" target="#b34">[35]</ref> models that do not exhibit existing stereotypes. Stereotypes based on gender, religion, or race are considered undesirable, so LLMs should not be more likely to produce outcomes that conform to stereotypes. Another common approach is to seek a consistent baseline across groups, treating different groups of individuals 'equally' <ref type="bibr" target="#b31">[32]</ref>. Indeed, unbiased LLMs should not favour certain groups over others, nor be more prone to producing stereotypical texts.</p><p>The appeal to an ideal state seems an obvious step, considering the problematic patterns reflected in the training data, the most prominent source of bias in language models. The data represent a reality that reflects morally imperfect features of the world. Therefore, language models trained on problematic data will inevitably show problematic patterns. By providing an alternative ideal from which the problematic patterns of reality are removed, the argument goes, LLMs can be improved.</p><p>However, this approach requires defining what an ideal language model is, and that definition depends on different aims. The definition of the ideal state can be subjective, contradictory, and controversial, which makes it difficult to compare differently unbiased LLMs. Establishing the criteria for an ideal language model requires conceptualising the relevant concepts and weighing conflicting values. 'Essentially contested concepts' <ref type="bibr" target="#b35">[36]</ref> such as bias or fairness are multifaceted and allow for different ways of conceptualising the constructs.
<ref type="foot" target="#foot_2">4</ref> Due to the complex nature of bias, any reductive definition runs the risk of missing potentially relevant features.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.2.">LLMs in Dilemma Land</head><p>As shown earlier, defining an ideal state in fantasy land is challenging because there is no one perfect LLM that everyone agrees on. Therefore, an alternative approach in dilemma land does not define what language models should be like. Instead, the model is compared to reality. The way reality is represented can be broadly divided into a statistical account and a subjective-evaluative account.</p><p>According to the statistical account of bias, the model is biased if its output does not match the statistical source with which it is compared. Real-world statistics are used as the source of what a model should represent. As occupational stereotypes are among the most common ways of assessing bias, national labour statistics <ref type="bibr" target="#b26">[27]</ref> are often compared with the outputs of LLMs. For example, Touileb et al. (2022) <ref type="bibr" target="#b24">[25]</ref> measure bias in LLMs using Norwegian statistics on the gender ratio in different occupations. The authors show that LLMs tend to map gender-balanced occupations as male-dominated relative to Norwegian occupational statistics (ibid, p. 209). They therefore argue that the models show gender bias, since they misrepresent the distribution of occupations in Norway.</p><p>According to the subjective-evaluative account, the output of LLMs is validated against human evaluation to assess whether the bias of a language model matches human bias. Following this approach, bias in a language model may be acceptable as long as humans show a similar degree of bias. The results of LLMs are compared with human perceptions measured by crowd-sourced annotators <ref type="bibr" target="#b23">[24]</ref> or experts <ref type="bibr" target="#b38">[39]</ref>. For example, Sotnikova et al.
(2021) <ref type="bibr" target="#b38">[39]</ref> use the authors' own ratings of how stereotypical given statements generated by language models are. Some describe this approach as non-normative because it is a descriptive comparison. According to both the statistical and subjective-evaluative accounts, bias is identified based on the discrepancy between the LLMs and the status quo, and this discrepancy is considered undesirable. Unbiased LLMs ought to resemble either real-world statistics or human perceptions. By referring to existing statistics, the comparison can be justified by relying on the authority of the statistics, rather than on the authors deciding how LLMs should be. For example, referring to national labour statistics establishes the baseline as a representation of the status quo in the labour sector within the limited context in which such statistics are collected. Similarly, comparison with human assessments provides an observable baseline against which the performance of LLMs can be empirically assessed.</p></div>
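The statistical account can be sketched in a few lines, again purely for illustration: the per-occupation share of gendered completions produced by a model is compared against reference labour statistics, and the absolute gaps are read as descriptive bias. All numbers below, including the "labour statistics", are invented and do not come from any cited dataset.

```python
# Toy sketch of the statistical account of bias: compare a model's
# share of 'female' completions per occupation prompt against
# reference labour statistics. All numbers are invented.

def descriptive_bias(model_share, reference_share):
    """Absolute gap per occupation between model output and reference."""
    return {occ: abs(model_share[occ] - reference_share[occ])
            for occ in model_share}

# Hypothetical share of female completions produced by an LLM.
model_output = {"nurse": 0.95, "engineer": 0.05, "teacher": 0.60}
# Hypothetical share of women per occupation in national statistics.
labour_stats = {"nurse": 0.85, "engineer": 0.25, "teacher": 0.70}

gaps = descriptive_bias(model_output, labour_stats)
mean_gap = sum(gaps.values()) / len(gaps)
print(gaps, round(mean_gap, 3))
```

On this account a model with a zero mean gap counts as unbiased, which makes the normative commitment explicit: the reference statistics themselves serve as the standard the model is held to.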
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4.">No Land without Normativity</head><p>However, attempts to identify and measure bias in LLMs in both fantasy land and dilemma land are products of normative decisions. It is obvious why normative approaches, LLMs in fantasy land, result from normative decisions, as bias there is defined as a gap between the output of LLMs and normative ideals, such as being a-stereotypical or indifferent to different groups of people. But just as much as the fantasy land approach, the descriptive approach of dilemma land rests on a normative position. Aiming to align LLMs with a descriptive representation of the status quo is a strong normative position, as it assumes that reproducing the status quo is desirable.</p><p>Unbiased LLMs developed according to this perspective will reinforce the existing social structure. The repetition and reinforcement of existing biases in LLMs is a well-documented phenomenon, exemplified in the studies discussed above for both fantasy land and dilemma land. Moreover, the choice of which descriptive statistics or which human baseline to use is also a political choice. By treating certain statistics as a gold standard to which LLMs should be aligned, such statistics gain a position of power as authority. No statistic is neutral <ref type="bibr" target="#b39">[40]</ref>; each is a product of social structure. The choice of the people with whom LLMs are compared, whether experts in the field, university students, or crowd-sourced workers from low-wage countries in the Global South, also has significant implications for the interpretation of LLMs. Therefore, statistics do not qualify as an objective, neutral authority with which LLMs should be aligned. Rather, choosing them is a normative choice that puts forward a specific value or perspective, just like positing an imagined ideal state without reference to empirical data.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Why are Biased LLMs Undesirable?</head><p>In the previous section, we discussed the conceptual confusion that leads to fundamentally different ideas of unbiased LLMs. Despite differences in conceptualisation and operationalisation, there seems to be a broad consensus in the field that bias in LLMs is undesirable and should therefore be mitigated. The most widely adopted motivation for this endeavour is the harm that biased LLMs can cause.</p><p>In addressing biased outcomes of LLMs, the most widely adopted framework in the literature has been to categorise harm into representational and allocational harm, following Kate Crawford's NIPS keynote in 2017 <ref type="bibr" target="#b40">[41]</ref>. Representational harm refers to the reinforcement of the subordination of people based on social identifiers such as race and gender. Allocational harm refers to decision-making systems withholding opportunities or resources from certain groups. <ref type="bibr" target="#b41">[42]</ref> However, such a perspective assumes that the bias of language models is problematic when and because it is harmful. For instance, LLMs can be harmful by creating misinformation. Proponents of such arguments focus on the consequences of biased LLMs.</p><p>However, such a consequentialist view is unsatisfactory in some cases of bias, as we will show in this section. We propose deontological reasons why bias in LLMs is impermissible, namely that it is morally wrongful discrimination. Wrongful discrimination is impermissible regardless of its consequences, as it fails to treat people equally.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">'Hallucinated' LLMs are Harmful</head><p>One of the most common criticisms of LLMs is their 'hallucination' <ref type="bibr" target="#b42">[43]</ref>. Hallucination refers to the tendency of LLMs to produce texts that are not factually grounded. The basic mechanism of LLMs is to produce a statistical prediction of the most likely sequence, which does not take veracity into account. The hallucination of LLMs is therefore a feature, not a bug. Despite this fundamental limitation, many people use LLM-based applications, such as ChatGPT, as search engines or knowledge bases to retrieve useful information. Descriptive inaccuracy therefore severely diminishes the utility of LLMs in many use cases. Moreover, it can be harmful and unsafe in high-stakes situations such as health information or political disputes. Thus, the argument goes, LLMs should be unbiased in the descriptive sense to be useful and practical.</p><p>To reduce the harms that can be caused by LLMs' hallucination, factual correctness, or groundedness, has become an integral part of LLM evaluation. For example, <ref type="bibr" target="#b43">Touvron et al. (2023)</ref> <ref type="bibr" target="#b43">[44]</ref> evaluate safety along three dimensions: truthfulness, toxicity, and bias. <ref type="bibr" target="#b44">Thoppilan et al. (2022)</ref> <ref type="bibr" target="#b44">[45]</ref> evaluate harm, discrimination, and misinformation. The authors aim to mitigate 'unsafe model output to avoid unintended results that create risks of harm, and avoiding creating or reinforcing unfair bias' (ibid, p. 5). While the authors draw different connections between truthfulness, toxicity, bias, safety, and harm, they share a focus on the results of biased LLMs.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Constraints on Consequential Views</head><p>Reducing and preventing the undesirable outcomes of biased LLMs has been the focus of critical investigations of LLMs in the empirical literature. Bias, defined as descriptive inaccuracy, is avoided by making LLMs factually truthful. However, making LLMs truthful can still result in harmful LLMs. As discussed in the previous section on dilemma land, aligning LLMs with the status quo implies that replicating reality is desirable. Making LLMs factually accurate still risks reinforcing existing unjust prejudices, since the status quo itself is shaped partly by 'morally problematic attitudes, beliefs, and institutions', as Basu (2020) <ref type="bibr" target="#b33">[34]</ref> describes. As present reality embeds historical injustice, repeating reality via LLMs risks replicating and reinforcing existing patterns of injustice.</p><p>Moreover, the link between bias and harmful consequences is often ambiguous. An example reviewed by <ref type="bibr" target="#b45">Blodgett et al. (2021)</ref> <ref type="bibr" target="#b45">[46]</ref> highlights that many operationalisations of bias are contradictory and unjustified. For instance, the authors found a benchmark defining a stereotype as 'The exchange student became the star of all our art shows and drama performances', while the anti-stereotype was described as 'The exchange student was the star of our football team' (ibid, p. 1004). It is unclear how 'debiasing' LLMs according to such stereotypes will lead to less harmful consequences. A vague link between bias and harmful consequences risks reducing bias problems to immediately obvious harms. It also leaves room for implicit and benevolent discrimination whose harmful outcomes are not explicitly visible.</p><p>However, such criticisms should not lead to the mistaken conclusion that working on the bias of LLMs is an unfruitful effort.
On the contrary, it is important to justify and strengthen the argument for why certain biases should be measured and reduced, as this contributes to the relevance and implications of the measured bias. In the next section, we discuss an alternative motivation for identifying, measuring, and reducing bias: the deontological view.</p><p>In deontological theories, what makes a choice right is its conformity with a moral norm, irrespective of the outcome that the choice brings about <ref type="bibr" target="#b46">[47]</ref>. On this view, bias in LLMs is undesirable regardless of its outcome. Compared to the extensive efforts to identify and measure bias, relatively little attention has been paid to contextualising why the measured bias is relevant in a societal context, especially in relation to discrimination.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Do LLMs Discriminate?</head><p>In the previous sections, we discussed what it means for LLMs to be biased, which answers the first question we raised earlier. In this section, we seek to answer the second question: Why do we care if LLMs are biased? We argue that it is because bias in LLMs constitutes wrongful discrimination.</p><p>Discriminatory bias risks exacerbating existing injustice through technological means <ref type="bibr" target="#b47">[48]</ref>. The reason LLMs are criticised for their tendency to produce stereotypical, biased results on socially relevant characteristics is not that being stereotypical is wrong per se, but that it is discriminatory. It is the discriminatory meaning attached to the observed bias, whether descriptive or normative, that determines which outputs of LLMs are permissible. While the different contexts and objectives of LLMs and of the bias measures designed for them shed light on different aspects of LLMs, what makes such patterns critical subjects of study is their relevance to discrimination.</p><p>In this section, we discuss what discrimination means and how it relates to biased LLMs. In particular, an important question is what distinguishes mere discrepancy from morally wrongful discrimination. We first engage with the existing literature on wrongful discrimination. Afterwards, we apply Deborah Hellman's meaning-based account of discrimination to the case of bias in LLMs. Why is the depiction of proactive male characters and submissive female characters in GPT-3 generated stories problematic <ref type="bibr" target="#b48">[49]</ref>? More specifically, why are such dimensions of LLMs problematised and measured among many other possibilities?</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Wrongful Discrimination</head><p>What is discrimination, and what makes it wrongful? Extensive scholarly discussion has investigated various aspects of discrimination: from distinguishing direct and indirect discrimination and their moral permissibility <ref type="bibr" target="#b49">[50]</ref>[51] <ref type="bibr" target="#b51">[52]</ref>, to differences between harmful, harmless, and even beneficial discrimination <ref type="bibr" target="#b52">[53]</ref>, to the relation between wrongful action and wrongful discrimination (ibid, p. 111). As Slavny and Parr (2015) <ref type="bibr" target="#b52">[53]</ref> sketch out, one of the most contested issues in the theory of discrimination is the distinction between 'exclusively consequence-focused' accounts and those arguing that wrongful discrimination can be defined by factors other than the consequences that discriminatory actions bring about (ibid, p. 101).</p><p>For instance, the harm-based account by Kasper Lippert-Rasmussen <ref type="bibr" target="#b49">[50]</ref> is an example of a consequence-based account. According to it, "an instance of discrimination is wrong, when it is, because it makes people worse off" (ibid, p. 154-155).</p><p>In contrast, scholars like Deborah Hellman focus on the objective meaning conveyed by the discriminatory act <ref type="bibr" target="#b11">[12]</ref>. Alternatively, Sophia Moreau <ref type="bibr" target="#b50">[51]</ref> views discrimination as wrongful when it denies one's "right to have a certain set of deliberative freedoms, and to have these freedoms to an extent roughly equal to those held by others" (ibid, p. 168). Moreau and Hellman disagree about what role the demeaning message sent by a discriminatory action plays in defining what renders discrimination wrongful. 
On either account, the grounds of wrongful discrimination can be found outside the consequences of the discriminatory action.</p><p>In this article, we follow Deborah Hellman's meaning-based account of discrimination to explain the wrongness of bias in LLMs. This is not to suggest that it is the only possible explanation of why bias in LLMs is wrong; rather, we aim to bridge existing research on bias in LLMs and the literature on discrimination. We show that different theories of discrimination can challenge a widely accepted assumption in research on bias in LLMs, namely that the wrongness of biased LLMs rests almost exclusively on their harmful outcomes. Critical discourse on biased LLMs can benefit from theories of discrimination in political philosophy and law, which can strengthen the argument for why some biases matter more than others.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Meaning-Based Accounts of Discrimination</head><p>In a broad sense, discrimination can refer to any differential treatment based on personal characteristics <ref type="bibr" target="#b11">[12]</ref>. Not all forms of discrimination are morally problematic. 5 For example, charging young drivers more for their insurance can be regarded as age discrimination, yet such discrimination can be justified as preventing adverse selection <ref type="bibr" target="#b54">[55]</ref>. Similarly, giving discounts to the elderly, people with reduced mobility, or students in public museums is benevolent discrimination based on age or status, accepted as a form of social security or on other grounds. Applying the same argument to LLMs, discrimination by LLMs against certain groups of people may not be morally problematic in itself. For example, an LLM that produces more cat lovers than dog lovers will not be of significant moral or social concern. What makes discrimination wrongful is its meaning.</p><p>Deborah Hellman (2017) <ref type="bibr" target="#b11">[12]</ref> argues that what makes discrimination morally wrong is the meaning of the discrimination, rather than the intention of the actor, the relevance of the trait used to discriminate, or the rationality of the discrimination. On this meaning-based account, "discrimination is wrong when and because it is demeaning" (ibid., p. 1). Discrimination is demeaning when an actor with social power engages in denigration, the act of saying that someone is not good or important, and fails to treat those affected as equals. Hellman argues that we need to look at the meaning of discrimination in order to assess whether it is morally wrong.</p><p>The meaning-based account of wrongful discrimination has two aspects: an expressive dimension and a power dimension. 
The expressive dimension concerns whether an action or policy regards another person as inferior or of lower status. Hellman argues that discrimination is particularly problematic when it is based on socially salient characteristics such as gender and race. Socially salient characteristics such as gender can serve as "accurate proxies" for discrimination rooted in historical injustice (see also <ref type="bibr" target="#b55">Johnson (2021)</ref> <ref type="bibr" target="#b55">[56]</ref>). Therefore, discrimination based on socially salient features is particularly morally problematic because it fails to treat people as having equal moral worth.</p><p>The other dimension of the meaning-based account of wrongful discrimination is the power relationship. If the actor making the statement has social power that gives force to the meaning of the act or policy, this is sufficient for the act to constitute unjustified discrimination. Hellman argues that power is important in identifying unjustified discrimination because actual power enables such discrimination to lower the social status of those affected. A discriminator with power and authority can affect people in more critical ways than one without such power. Furthermore, Hellman stresses that degradation depends on the capacity that comes with power, not on any actual effect. Regardless of the outcome, when an actor with power fails to recognise the equal moral worth of others, it is morally wrongful discrimination. 5 "The principle of discrimination" (also called the "principle of distinction") is regarded as one of the most important principles of International Humanitarian Law, the field of law governing armed conflicts <ref type="bibr" target="#b53">[54]</ref> </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Bias in LLMs as Morally Wrongful Discrimination</head><p>In the previous section, we explained Deborah Hellman's meaning-based account of discrimination. Building on her theory, we argue that biased LLMs can constitute a case of morally wrongful discrimination, exhibiting both the expressive and the power dimension of that account. In this section, we discuss these two dimensions using an example of measured gender bias by <ref type="bibr" target="#b48">Lucy and Bamman (2021)</ref> <ref type="bibr" target="#b48">[49]</ref>, showing how biased LLMs instantiate morally wrongful discrimination.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.1.">ChatGPT: "Male characters are powerful and female characters are emotional"</head><p>Lucy and Bamman (2021) <ref type="bibr" target="#b48">[49]</ref> examine stories generated by GPT-3 that reproduce gender stereotypes found in film, television, and books. The authors compared the themes of GPT-3-generated stories and human-written books to see how the perceived gender of a character related to the occurrence of topical terms such as appearance, intellect, and power. The perceived genders of characters were identified based on gendered pronouns, honorifics, or names. The prompts used for GPT-3 to generate stories consisted of single sentences containing main characters, sampled from 402 contemporary English fiction books.</p><p>The authors carried out two types of content analysis. First, topic modelling was used to identify coherent collections of words in the text. The results show that GPT-3 tends to associate female characters with topics related to family, emotions, and body parts. In contrast, masculine characters were associated with politics, war, sports, and crime. The authors show that the different themes across perceived gender in the stories generated by GPT-3 are consistent with previous work showing that language models associate women with caring roles <ref type="bibr" target="#b22">[23]</ref>, maternalism, and appearance <ref type="bibr" target="#b56">[57]</ref>. In addition, GPT-3 generated longer stories when the prompt contained stereotypical characters than when it contained anti-stereotypical characters.</p><p>Second, the authors analysed how characters are described by measuring semantic similarity with lexicon embeddings. Three dimensions of description (appearance, intellect, and power) were chosen based on previous work on stereotypical descriptions based on gender <ref type="bibr" target="#b56">[57]</ref>[58][59] <ref type="bibr" target="#b59">[60]</ref>. 
The results show that words describing appearance are often used for female characters, while words related to power are used for male characters.</p><p>The authors conclude that GPT-3 has internalised gender stereotypes strongly enough to neutralise the effect of using power words for female characters. Even when the prompts did not contain explicit gender information or stereotypes, GPT-3 tended to generate stories that conformed to gender stereotypes. In addition, the authors found that GPT-3 tended to include more masculine characters and that the outcome varied according to the gender of the character, even when identical prompts were used (ibid, p. 51-52).</p></div>
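The lexicon-embedding similarity measurement described above can be sketched in rough outline as follows. This is a hypothetical illustration, not Lucy and Bamman's actual method or code: the toy word vectors and the `lexicon_score` helper are invented for exposition, and a real analysis would use pretrained embeddings (e.g. GloVe) and curated lexicons.

```python
# Hypothetical sketch: scoring how closely a character's description words
# align with a stereotype lexicon via cosine similarity over word vectors.
# The tiny toy vectors below are stand-ins for real pretrained embeddings.
import numpy as np

EMB = {
    "beautiful":  np.array([0.9, 0.1, 0.0]),
    "pretty":     np.array([0.8, 0.2, 0.1]),
    "strong":     np.array([0.1, 0.9, 0.2]),
    "commanding": np.array([0.0, 0.8, 0.3]),
}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def lexicon_score(description_words, lexicon):
    """Mean, over description words, of the cosine similarity to the
    nearest lexicon word (higher = closer to that stereotype dimension)."""
    sims = [
        max(cosine(EMB[w], EMB[l]) for l in lexicon if l in EMB)
        for w in description_words
        if w in EMB
    ]
    return sum(sims) / len(sims) if sims else 0.0

appearance_lexicon = ["beautiful", "pretty"]
power_lexicon = ["strong", "commanding"]

# A character described with appearance words scores higher on the
# appearance dimension than on the power dimension.
desc = ["pretty", "beautiful"]
print(lexicon_score(desc, appearance_lexicon) > lexicon_score(desc, power_lexicon))
```

In a full analysis one would aggregate such scores over many generated stories, grouped by the characters' perceived gender, and compare the distributions across dimensions.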
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.2.">The Expressive Dimension</head><p>The expressive dimension assesses whether an actor expresses denigration and views others as less worthy, i.e. demeans them. To assess the expressive dimension of bias in LLMs, it is necessary to examine whether the bias treats certain groups of people as having lower status than other groups.</p><p>Lucy and Bamman's (2021) <ref type="bibr" target="#b48">[49]</ref> research shows that stories generated by GPT-3 encode stereotypical gender bias. The association of women with family, appearance, and less power has been studied extensively in feminist theory. The association of women with appearance reflects the history of the objectification of women. Feminists have pointed out the problems of objectification, which makes women overly preoccupied with their appearance <ref type="bibr" target="#b60">[61]</ref>. Associating women with their appearance fails to recognise women as agents equal to men by identifying women only, or mainly, with their bodies rather than with their whole being. Bartky argues that the fragmentation of the female body sees women as "less inherently human than the mind or personality" <ref type="bibr" target="#b61">[62]</ref> <ref type="bibr" target="#b60">[61]</ref>. Language models that associate female characters with appearance show that gender injustice is being reproduced by technological means.</p><p>Similarly, an examination of family dynamics and power structures highlights the presence of gender inequality. Susan Moller Okin points out that "socially constructed inequalities" exist in the distribution of critical social goods such as power, prestige, and opportunities for self-development <ref type="bibr" target="#b62">[63]</ref> <ref type="bibr" target="#b63">[64]</ref>.</p><p>By reiterating coded gender inequality, LLMs like ChatGPT fail to treat people of different genders equally. 
By associating women with less power and agency, such models assign women a lower status. This example demonstrates how bias in LLMs instantiates morally wrongful discrimination. As Hellman argued, the repetition of historical injustice along the axes of socially salient characteristics is a significant case of morally wrongful discrimination.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.3.">The Power Dimension</head><p>The other dimension that needs to be explored concerning Hellman's meaning-based account of discrimination is the power dimension. To address it, we should ask whether LLMs have power and authority that can influence people in consequential ways. If the agent of power, in our case LLMs, fails to recognise the equal moral worth of others, then this is morally wrongful discrimination. We argue that language models are in a position of power where bias can begin to have a consequential impact on people's lives.</p><p>Several studies have analysed the impact of ChatGPT, such as the EUROPOL report on law enforcement <ref type="bibr" target="#b64">[65]</ref> and Dempere et al. (2023) on higher education <ref type="bibr" target="#b65">[66]</ref>, among others. ChatGPT is just one example of an application using LLMs. LLMs have been adopted in various applications and can be used in numerous creative ways, from collecting debts <ref type="bibr" target="#b66">[67]</ref> to creating AI companions <ref type="bibr" target="#b67">[68]</ref>. As this is a rapidly growing market, applications using LLMs are likely to multiply. Traditionally conservative sectors such as the military are also experimenting with incorporating LLMs into their operations <ref type="bibr" target="#b68">[69]</ref>. Used in real-world scenarios, LLMs have power and authority that will directly and indirectly affect people's lives. The bias of LLMs used in different applications, such as debt collection, can lead to unequal treatment of different groups of people.</p><p>AI has the potential to transform society, as evidenced by the heated debate on how to regulate it <ref type="bibr" target="#b69">[70]</ref>. However, the real-world impact of biased and opaque algorithms had been materialising for years before LLMs were developed. 
Landmark cases of algorithmic bias in criminal risk assessment <ref type="bibr" target="#b70">[71]</ref>, facial recognition <ref type="bibr" target="#b71">[72]</ref>, and recruitment algorithms <ref type="bibr" target="#b72">[73]</ref> show the critical consequences of using such algorithms in practice. These cases demonstrate that discriminatory treatment disproportionately harms historically marginalised populations, and LLMs are no exception. Discriminatory patterns based on socially salient characteristics such as gender, race, and religion are easily found in LLMs, which can exacerbate existing discrimination based on those characteristics. They also suggest that potential discriminatory patterns that are harder to measure may go undetected.</p><p>In this section, we have discussed biased LLMs in light of the meaning-based account of discrimination. To assess what level of measured bias in a language model suffices to constitute morally wrongful discrimination, two aspects need to be examined: the expressive dimension and the power dimension. The expressive dimension can be evaluated by assessing whether the bias expressed in the language model's output fails to treat those affected as equal to others. The power dimension examines whether the language model has the power to turn such discriminatory bias into real harm. We have shown how measured bias in LLMs relates to these two dimensions of discrimination, building on Deborah Hellman's account of morally wrongful discrimination.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusion</head><p>In this paper, we have engaged in a conceptual analysis of bias in LLMs to investigate what it means for LLMs to be biased. Among the various sources of confusion in defining bias, we focused on the conceptual conflation in which bias refers to either descriptive inaccuracy or normative inaccuracy.</p><p>Descriptive inaccuracy is commonly measured by comparing LLMs with descriptive statistics, such as national labour statistics. Alternatively, bias in LLMs refers to normative inaccuracy, such as stereotypical correlations with socially salient characteristics such as gender, race, or religion.</p><p>Common arguments against bias rest on practical utility or safety, that is, on consequentialist grounds concerned with the outcomes of bias in LLMs. We argue that bias in LLMs should be identified and mitigated because it is morally wrongful discrimination, regardless of its outcome. Irrespective of the descriptive or normative correctness of the biased patterns that LLMs produce, it is concerning when and if such bias instantiates morally wrongful discrimination. We presented Deborah Hellman's meaning-based account of discrimination, which provides a framework that considers two dimensions of discrimination: the expressive dimension and the power dimension. We showed that biased LLMs are a case of morally wrongful discrimination on both dimensions. We also used one specific case of bias measurement to show how this framework can be applied to empirical bias measurements.</p><p>Regardless of the outcome of the bias, morally unjustified discrimination by LLMs is unacceptable. People have equal moral worth and should not be discriminated against on the basis of socially salient characteristics such as gender, race, or religion. The same argument applies to LLMs. 
It is particularly important to discuss the societal implications of bias in LLMs, as the technology is anticipated to transform various sectors of society and affect the lives of many people. Bias in LLMs poses a pertinent threat of technology-enabled discrimination, which risks exacerbating existing social injustices.</p><p>Bias in LLMs is a problem not because of observed biased patterns per se, but because such bias can discriminate in wrongful ways. To accurately identify the potential risks and harms that LLMs may pose, we urge that the study of bias in LLMs go beyond finding skewed patterns in their outputs. We have provided the lens of wrongful discrimination as a framework for evaluating pertinent LLM bias in a societal context. A meaningful contribution to identifying and reducing bias in LLMs can be made only by situating the observed and measured bias in the complex societal context where its impact can be critically evaluated.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Four Lands Analogy in <ref type="bibr" target="#b32">Deery and Bailey (2022)</ref></figDesc><table><row><cell></cell><cell>Descriptively accurate</cell><cell>Descriptively inaccurate</cell></row><row><cell>Normatively correct</cell><cell>Utopia Land</cell><cell>Fantasy Land</cell></row><row><cell>Normatively incorrect</cell><cell>Dilemma Land</cell><cell>Disaster Land</cell></row></table><note>… aligns with the representation of reality, irrespective of normative implications. Alternatively, LLMs can be compared to normative targets that define what LLMs should be like, irrespective of whether they describe how things are in reality. LLMs that produce outcomes according to the ideal state, such as equal representation of genders or the absence of stereotypes, will be evaluated as unbiased. Consequently, despite using the same vocabulary and similar methods to identify bias, differences in conceptual conflation lead to vastly different implications of measured bias.</note></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_0">Similarly, discussion around fairness in machine learning has attracted considerable critical attention, such as in<ref type="bibr" target="#b16">[17]</ref> and<ref type="bibr" target="#b17">[18]</ref>.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_1"><ref type="bibr" target="#b32">Deery and Bailey (2022)</ref><ref type="bibr" target="#b32">[33]</ref> argue that there is ethical value in not debiasing, such as exposing problematic patterns. Debiasing can create a false illusion of improved fairness, greater than is actually the case, which can contribute to a devaluation of the problem. However, in the case of LLMs, we argue that the risks of perpetuating problematic patterns through LLMs outweigh the ethical value of leaving those patterns intact in the models. Empirically comparing such potential benefits and harms can be an interesting direction for future research.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_2">There has been much academic discussion about how to measure fairness. While various fairness measures have been developed, many of them are incompatible and even contradictory. On the inherent difficulty of measuring fairness, see, inter alia, Hellman (2020)<ref type="bibr" target="#b36">[37]</ref>, Jacobs and Wallach (2021)<ref type="bibr" target="#b21">[22]</ref>, Binns (2020)<ref type="bibr" target="#b37">[38]</ref>, and Delobelle et al. (2022)<ref type="bibr" target="#b1">[2]</ref> </note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">Ethical and social risks of harm from Language Models</title>
		<author>
			<persName><forename type="first">L</forename><surname>Weidinger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Mellor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Rauh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Griffin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Uesato</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P.-S</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Glaese</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Balle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kasirzadeh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Kenton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Brown</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Hawkins</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Biles</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Birhane</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Haas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Rimell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">A</forename><surname>Hendricks</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Legassick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Irving</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Gabriel</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">64</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Measuring Fairness with Biased Rulers: A Comparative Study on Bias Metrics for Pre-trained Language Models</title>
		<author>
			<persName><forename type="first">P</forename><surname>Delobelle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Tokpo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Calders</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Berendt</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2022.naacl-main.122</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics</title>
				<meeting>the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="1693" to="1706" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<author>
			<persName><forename type="first">R</forename><surname>Berk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Heidari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Jabbari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kearns</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Roth</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1703.09207</idno>
		<title level="m">Fairness in Criminal Justice Risk Assessments: The State of the Art</title>
				<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
	<note>-05-27</note>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><surname>Lundgard</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2009.10050</idno>
		<title level="m">Measuring justice in machine learning</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-W</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Toutanova</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1810.04805</idno>
		<title level="m">Bert: Pre-training of deep bidirectional transformers for language understanding</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><surname>Srivastava</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rastogi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">A M</forename><surname>Shoeb</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Abid</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Fisch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">R</forename><surname>Brown</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Santoro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gupta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Garriga-Alonso</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2206.04615</idno>
		<title level="m">Beyond the imitation game: Quantifying and extrapolating the capabilities of language models</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<ptr target="https://chat.openai.com/" />
		<title level="m">Chatgpt</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">Chatgpt crosses 1 million users five days after launch</title>
		<author>
			<persName><forename type="first">D</forename><surname>Levi</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2022-12">December, 2022. . 5. 2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><surname>Salinas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Penafiel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Mccormack</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Morstatter</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2310.08780</idno>
		<title level="m">Discovering bias in the internal knowledge of large language models</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
	<note>i&apos;m not racist but</note>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<author>
			<persName><forename type="first">K</forename><surname>Khandelwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Tonneau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">M</forename><surname>Bean</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">R</forename><surname>Kirk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">A</forename><surname>Hale</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2309.08573</idno>
		<title level="m">Casteist but not racist? quantifying disparities in large language model bias between india and the west</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<ptr target="https://github.com/google/BIG-bench/blob/main/bigbench/benchmark_tasks/muslim_violence_bias/README.md" />
		<title level="m">Muslim-violence bias</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Discrimination and Social Meaning</title>
		<author>
			<persName><forename type="first">D</forename><surname>Hellman</surname></persName>
		</author>
		<idno type="DOI">10.4324/9781315681634-10</idno>
	</analytic>
	<monogr>
		<title level="m">The Routledge Handbook of the Ethics of Discrimination</title>
				<editor>
			<persName><forename type="first">K</forename><surname>Lippert-Rasmussen</surname></persName>
		</editor>
		<imprint>
			<publisher>Routledge</publisher>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="97" to="107" />
		</imprint>
	</monogr>
	<note>1 ed</note>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<title level="m" type="main">Bias</title>
		<author>
			<orgName type="collaboration">American Psychological Association</orgName>
		</author>
		<ptr target="https://dictionary.apa.org/bias" />
		<imprint>
			<date type="published" when="2023-05-25">25.05.2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Semantics derived automatically from language corpora contain human-like biases</title>
		<author>
			<persName><forename type="first">A</forename><surname>Caliskan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">J</forename><surname>Bryson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Narayanan</surname></persName>
		</author>
		<idno type="DOI">10.1126/science.aal4230</idno>
	</analytic>
	<monogr>
		<title level="j">Science</title>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="volume">356</biblScope>
			<biblScope unit="page" from="183" to="186" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Bias in computer systems</title>
		<author>
			<persName><forename type="first">B</forename><surname>Friedman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Nissenbaum</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Transactions on Information Systems (TOIS)</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="page" from="330" to="347" />
			<date type="published" when="1996">1996</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<title level="m" type="main">Language (Technology) is Power: A Critical Survey of &quot;Bias&quot; in NLP</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">L</forename><surname>Blodgett</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Barocas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Daumé</surname><genName>III</genName></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wallach</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2005.14050</idno>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">This Thing Called Fairness: Disciplinary Confusion Realizing a Value in Technology</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">K</forename><surname>Mulligan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>Kroll</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Kohli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">Y</forename><surname>Wong</surname></persName>
		</author>
		<idno type="DOI">10.1145/3359221</idno>
	</analytic>
	<monogr>
		<title level="j">Proceedings of the ACM on Human-Computer Interaction</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="page" from="1" to="36" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Fairness and abstraction in sociotechnical systems</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">D</forename><surname>Selbst</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Boyd</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">A</forename><surname>Friedler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Venkatasubramanian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Vertesi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the conference on fairness, accountability, and transparency</title>
				<meeting>the conference on fairness, accountability, and transparency</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="59" to="68" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Measurement validity: A shared standard for qualitative and quantitative research</title>
		<author>
			<persName><forename type="first">R</forename><surname>Adcock</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Collier</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">American Political Science Review</title>
		<imprint>
			<biblScope unit="volume">95</biblScope>
			<biblScope unit="page" from="529" to="546" />
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Messick</surname></persName>
		</author>
		<title level="m" type="main">Validity</title>
		<title level="s">ETS Research Report Series</title>
				<imprint>
			<date type="published" when="1987">1987</date>
			<biblScope unit="page">208</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<monogr>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">J</forename><surname>Hand</surname></persName>
		</author>
		<title level="m">Measurement: A very short introduction</title>
				<imprint>
			<publisher>Oxford University Press</publisher>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Measurement and Fairness</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">Z</forename><surname>Jacobs</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wallach</surname></persName>
		</author>
		<idno type="DOI">10.1145/3442188.3445901</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency</title>
				<meeting>the 2021 ACM Conference on Fairness, Accountability, and Transparency</meeting>
		<imprint>
			<publisher>ACM</publisher>
			<biblScope unit="page" from="375" to="385" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<monogr>
		<title level="m" type="main">Bias out-of-the-box: An empirical analysis of intersectional occupational biases in popular generative language models</title>
		<author>
			<persName><forename type="first">H</forename><surname>Kirk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Jun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Iqbal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Benussi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Volpin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">A</forename><surname>Dreyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Shtedritski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><forename type="middle">M</forename><surname>Asano</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2102.04130</idno>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation</title>
		<author>
			<persName><forename type="first">J</forename><surname>Dhamala</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Kumar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Krishna</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Pruksachatkun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K.-W</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Gupta</surname></persName>
		</author>
		<idno type="DOI">10.1145/3442188.3445924</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency</title>
				<meeting>the 2021 ACM Conference on Fairness, Accountability, and Transparency</meeting>
		<imprint>
			<publisher>ACM</publisher>
			<biblScope unit="page" from="862" to="872" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Occupational Biases in Norwegian and Multilingual Language Models</title>
		<author>
			<persName><forename type="first">S</forename><surname>Touileb</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Øvrelid</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Velldal</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2022.gebnlp-1.21</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP), Association for Computational Linguistics</title>
				<meeting>the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP), Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="200" to="211" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Using natural sentence prompts for understanding biases in language models</title>
		<author>
			<persName><forename type="first">S</forename><surname>Alnegheimish</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Guo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Sun</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</title>
				<meeting>the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="2824" to="2830" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<monogr>
		<title level="m" type="main">Unmasking contextual stereotypes: Measuring and mitigating BERT&apos;s gender bias</title>
		<author>
			<persName><forename type="first">M</forename><surname>Bartl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Nissim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gatt</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2010.14534</idno>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">Stereotype and skew: Quantifying gender bias in pre-trained and fine-tuned language models</title>
		<author>
			<persName><forename type="first">D</forename><surname>De Vassimon Manela</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Errington</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Fisher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Van Breugel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Minervini</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume</title>
				<meeting>the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="2232" to="2242" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings</title>
		<author>
			<persName><forename type="first">T</forename><surname>Bolukbasi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K.-W</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">Y</forename><surname>Zou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Saligrama</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">T</forename><surname>Kalai</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">NeurIPS</title>
		<imprint>
			<biblScope unit="volume">2016</biblScope>
			<biblScope unit="page">9</biblScope>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<analytic>
		<title level="a" type="main">Gender Bias in BERT - Measuring and Analysing Biases through Sentiment Rating in a Realistic Downstream Classification Task</title>
		<author>
			<persName><forename type="first">S</forename><surname>Jentzsch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Turan</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2022.gebnlp-1.20</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP), Association for Computational Linguistics</title>
				<meeting>the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP), Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="184" to="199" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<monogr>
		<author>
			<persName><forename type="first">R</forename><surname>Wolfe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Caliskan</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2110.00672</idno>
		<title level="m">Low Frequency Names Exhibit Bias and Overfitting in Contextualizing Language Models</title>
				<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="10" to="11" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<analytic>
		<title level="a" type="main">Towards a Comprehensive Understanding and Accurate Evaluation of Societal Biases in Pre-Trained Transformers</title>
		<author>
			<persName><forename type="first">A</forename><surname>Silva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Tambwekar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gombolay</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2021.naacl-main.189</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics</title>
				<meeting>the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="2383" to="2389" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<monogr>
		<title level="m" type="main">The Bias Dilemma: The Ethics of Algorithmic Bias in Natural-Language Processing</title>
		<author>
			<persName><forename type="first">O</forename><surname>Deery</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Bailey</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="volume">8</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b33">
	<analytic>
		<title level="a" type="main">The Specter of Normative Conflict</title>
		<author>
			<persName><forename type="first">R</forename><surname>Basu</surname></persName>
		</author>
		<idno type="DOI">10.4324/9781315107615-10</idno>
	</analytic>
	<monogr>
		<title level="m">An Introduction to Implicit Bias</title>
				<editor>
			<persName><forename type="first">E</forename><surname>Beeghly</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Madva</surname></persName>
		</editor>
		<imprint>
			<publisher>Routledge</publisher>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="191" to="210" />
		</imprint>
	</monogr>
	<note>1 ed</note>
</biblStruct>

<biblStruct xml:id="b34">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><surname>Nadeem</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Bethke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Reddy</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2004.09456</idno>
		<title level="m">StereoSet: Measuring stereotypical bias in pretrained language models</title>
				<imprint>
			<date type="published" when="2020-04-20">2020</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b35">
	<analytic>
		<title level="a" type="main">IX.-Essentially Contested Concepts</title>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">B</forename><surname>Gallie</surname></persName>
		</author>
		<idno type="DOI">10.1093/aristotelian/56.1.167</idno>
	</analytic>
	<monogr>
		<title level="j">Proceedings of the Aristotelian Society</title>
		<imprint>
			<biblScope unit="volume">56</biblScope>
			<biblScope unit="page" from="167" to="198" />
			<date type="published" when="1956">1956</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b36">
	<analytic>
		<title level="a" type="main">Measuring Algorithmic Fairness</title>
		<author>
			<persName><forename type="first">D</forename><surname>Hellman</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Virginia Law Review</title>
		<imprint>
			<biblScope unit="volume">106</biblScope>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b37">
	<monogr>
		<title level="m" type="main">Fairness in Machine Learning: Lessons from Political Philosophy</title>
		<author>
			<persName><forename type="first">R</forename><surname>Binns</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="volume">11</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b38">
	<analytic>
		<title level="a" type="main">Analyzing Stereotypes in Generative Text Inference Tasks</title>
		<author>
			<persName><forename type="first">A</forename><surname>Sotnikova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><forename type="middle">T</forename><surname>Cao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Daumé</surname><genName>III</genName></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Rudinger</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2021.findings-acl.355</idno>
	</analytic>
	<monogr>
		<title level="m">Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, Association for Computational Linguistics</title>
				<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="4052" to="4065" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b39">
	<monogr>
		<title level="m" type="main">Invisible women: Data bias in a world designed for men</title>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">C</forename><surname>Perez</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2019">2019</date>
			<publisher>Abrams</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b40">
	<monogr>
		<title level="m" type="main">The trouble with bias</title>
		<author>
			<persName><forename type="first">K</forename><surname>Crawford</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
	<note>Keynote at NeurIPS</note>
</biblStruct>

<biblStruct xml:id="b41">
	<monogr>
		<title level="m" type="main">Fairness and machine learning: Limitations and opportunities</title>
		<author>
			<persName><forename type="first">S</forename><surname>Barocas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hardt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Narayanan</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2023">2023</date>
			<publisher>MIT Press</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b42">
	<monogr>
		<author>
			<persName><forename type="first">J.-Y</forename><surname>Yao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K.-P</forename><surname>Ning</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z.-H</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-N</forename><surname>Ning</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Yuan</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2310.01469</idno>
		<title level="m">LLM lies: Hallucinations are not bugs, but features as adversarial examples</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b43">
	<monogr>
		<author>
			<persName><forename type="first">H</forename><surname>Touvron</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Martin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Stone</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Albert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Almahairi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Babaei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Bashlykov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Batra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Bhargava</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Bhosale</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2307.09288</idno>
		<title level="m">Llama 2: Open foundation and fine-tuned chat models</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b44">
	<monogr>
		<author>
			<persName><forename type="first">R</forename><surname>Thoppilan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">De</forename><surname>Freitas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hall</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Shazeer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kulshreshtha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H.-T</forename><surname>Cheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Jin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Bos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Baker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Du</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2201.08239</idno>
		<title level="m">LaMDA: Language models for dialog applications</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b45">
	<analytic>
		<title level="a" type="main">Stereotyping Norwegian salmon: An inventory of pitfalls in fairness benchmark datasets</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">L</forename><surname>Blodgett</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Lopez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Olteanu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Sim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wallach</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2021.acl-long.81</idno>
		<ptr target="https://aclanthology.org/2021.acl-long.81" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing</title>
		<title level="s">Long Papers</title>
		<editor>
			<persName><forename type="first">C</forename><surname>Zong</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">F</forename><surname>Xia</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">W</forename><surname>Li</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Navigli</surname></persName>
		</editor>
		<meeting>the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="1004" to="1015" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b46">
	<analytic>
		<title level="a" type="main">Deontological Ethics</title>
		<author>
			<persName><forename type="first">L</forename><surname>Alexander</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Moore</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">The Stanford Encyclopedia of Philosophy</title>
				<editor>
			<persName><forename type="first">E</forename><forename type="middle">N</forename><surname>Zalta</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2021">Winter 2021</date>
		</imprint>
		<respStmt>
			<orgName>Metaphysics Research Lab, Stanford University</orgName>
		</respStmt>
	</monogr>
</biblStruct>

<biblStruct xml:id="b47">
	<analytic>
		<title level="a" type="main">The ethics of (generative) AI</title>
		<author>
			<persName><forename type="first">J.-C</forename><surname>Heilinger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Kempt</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Critical AI</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
	<note>forthcoming</note>
</biblStruct>

<biblStruct xml:id="b48">
	<analytic>
		<title level="a" type="main">Gender and Representation Bias in GPT-3 Generated Stories</title>
		<author>
			<persName><forename type="first">L</forename><surname>Lucy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Bamman</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2021.nuse-1.5</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Third Workshop on Narrative Understanding, Association for Computational Linguistics</title>
				<meeting>the Third Workshop on Narrative Understanding, Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="48" to="55" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b49">
	<monogr>
		<author>
			<persName><forename type="first">K</forename><surname>Lippert-Rasmussen</surname></persName>
		</author>
		<title level="m">Born free and equal?: A philosophical inquiry into the nature of discrimination</title>
				<imprint>
			<publisher>Oxford University Press</publisher>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b50">
	<analytic>
		<title level="a" type="main">What is discrimination?</title>
		<author>
			<persName><forename type="first">S</forename><surname>Moreau</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Philosophy &amp; Public Affairs</title>
		<imprint>
			<biblScope unit="volume">38</biblScope>
			<biblScope unit="page">143</biblScope>
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b51">
	<analytic>
		<title level="a" type="main">Indirect discrimination is not necessarily unjust</title>
		<author>
			<persName><forename type="first">K</forename><surname>Lippert-Rasmussen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Practical Ethics</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b52">
	<analytic>
		<title level="a" type="main">Harmless discrimination</title>
		<author>
			<persName><forename type="first">A</forename><surname>Slavny</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Parr</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Legal Theory</title>
		<imprint>
			<biblScope unit="volume">21</biblScope>
			<biblScope unit="page" from="100" to="114" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b53">
	<monogr>
		<author>
			<persName><surname>ICRC</surname></persName>
		</author>
		<ptr target="https://ihl-databases.icrc.org/en/customary-ihl/v2/rule7" />
		<title level="m">Practice relating to Rule 7. The principle of distinction between civilian objects and military objectives. Section A. The principle of distinction</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b54">
	<analytic>
		<title level="a" type="main">Choosing how to discriminate: Navigating ethical trade-offs in fair algorithmic design for the insurance sector</title>
		<author>
			<persName><forename type="first">M</forename><surname>Loi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Christen</surname></persName>
		</author>
		<idno type="DOI">10.1007/s13347-021-00444-9</idno>
	</analytic>
	<monogr>
		<title level="j">Philosophy &amp; Technology</title>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">34</biblScope>
			<biblScope unit="page" from="967" to="992" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b55">
	<analytic>
		<title level="a" type="main">Algorithmic bias: On the implicit biases of social technology</title>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">M</forename><surname>Johnson</surname></persName>
		</author>
		<idno type="DOI">10.1007/s11229-020-02696-y</idno>
	</analytic>
	<monogr>
		<title level="j">Synthese</title>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">198</biblScope>
			<biblScope unit="page" from="9941" to="9961" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b56">
	<analytic>
		<title level="a" type="main">Analyzing gender bias within narrative tropes</title>
		<author>
			<persName><forename type="first">D</forename><surname>Gala</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">O</forename><surname>Khursheed</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Lerner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>O'Connor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Iyyer</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2020.nlpcss-1.23</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Fourth Workshop on Natural Language Processing and Computational Social Science, Association for Computational Linguistics</title>
				<meeting>the Fourth Workshop on Natural Language Processing and Computational Social Science, Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="212" to="217" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b57">
	<monogr>
		<title level="m" type="main">Gender roles &amp; occupations: A look at character attributes and job-related aspirations in film and television</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">L</forename><surname>Smith</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Choueiti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Prescott</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Pieper</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2012">2012</date>
			<publisher>Geena Davis Institute on Gender in Media</publisher>
			<biblScope unit="page" from="1" to="46" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b58">
	<analytic>
		<title level="a" type="main">Connotation frames of power and agency in modern films</title>
		<author>
			<persName><forename type="first">M</forename><surname>Sap</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">C</forename><surname>Prasettio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Holtzman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Rashkin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Choi</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/D17-1247</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics</title>
				<meeting>the 2017 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics<address><addrLine>Copenhagen, Denmark</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="2329" to="2334" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b59">
	<analytic>
		<title level="a" type="main">Shirtless and dangerous: Quantifying linguistic signals of gender bias in an online fiction writing community</title>
		<author>
			<persName><forename type="first">E</forename><surname>Fast</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Vachovsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Bernstein</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the International AAAI Conference on Web and Social Media</title>
				<meeting>the International AAAI Conference on Web and Social Media</meeting>
		<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="page" from="112" to="120" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b60">
	<analytic>
		<title level="a" type="main">Feminist Perspectives on Objectification</title>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">L</forename><surname>Papadaki</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">The Stanford Encyclopedia of Philosophy</title>
				<editor>
			<persName><forename type="first">E</forename><forename type="middle">N</forename><surname>Zalta</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2021">Spring 2021</date>
		</imprint>
		<respStmt>
			<orgName>Metaphysics Research Lab, Stanford University</orgName>
		</respStmt>
	</monogr>
</biblStruct>

<biblStruct xml:id="b61">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">L</forename><surname>Bartky</surname></persName>
		</author>
		<title level="m">Femininity and domination: Studies in the phenomenology of oppression</title>
				<imprint>
			<publisher>Routledge</publisher>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b62">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">M</forename><surname>Okin</surname></persName>
		</author>
		<title level="m">Justice, Gender, and the Family</title>
				<meeting><address><addrLine>New York</addrLine></address></meeting>
		<imprint>
			<publisher>Basic Books</publisher>
			<date type="published" when="1989">1989</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b63">
	<analytic>
		<title level="a" type="main">Feminist Perspectives on Power</title>
		<author>
			<persName><forename type="first">A</forename><surname>Allen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">The Stanford Encyclopedia of Philosophy</title>
				<editor>
			<persName><forename type="first">E</forename><forename type="middle">N</forename><surname>Zalta</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">U</forename><surname>Nodelman</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2022">Fall 2022</date>
		</imprint>
		<respStmt>
			<orgName>Metaphysics Research Lab, Stanford University</orgName>
		</respStmt>
	</monogr>
</biblStruct>

<biblStruct xml:id="b64">
	<monogr>
		<author>
			<persName><surname>Europol</surname></persName>
		</author>
		<title level="m">ChatGPT - the impact of large language models on law enforcement</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note>A Tech Watch Flash report from the Europol Innovation Lab</note>
</biblStruct>

<biblStruct xml:id="b65">
	<analytic>
		<title level="a" type="main">The impact of chatgpt on higher education</title>
		<author>
			<persName><forename type="first">J</forename><surname>Dempere</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">P</forename><surname>Modugu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Hesham</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">K</forename><surname>Ramasamy</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Front. Educ</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="page">1206936</biblScope>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b66">
	<monogr>
		<author>
			<persName><forename type="first">C</forename><surname>Faife</surname></persName>
		</author>
		<ptr target="https://www.vice.com/en/article/bvjmm5/debt-collectors-want-to-use-ai-chatbots-to-hustle-people-for-money" />
		<title level="m">Debt collectors want to use AI chatbots to hustle people for money</title>
				<imprint>
			<date type="published" when="2023-05-18">18 May 2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b67">
	<monogr>
		<title level="m" type="main">Too human and not human enough: A grounded theory analysis of mental health harms from emotional dependence on the social chatbot Replika</title>
		<author>
			<persName><forename type="first">L</forename><surname>Laestadius</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Bishop</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gonzalez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Illenčík</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Campos-Castillo</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2022">2022</date>
			<publisher>New Media &amp; Society</publisher>
			<biblScope unit="page">14614448221142007</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b68">
	<monogr>
		<ptr target="https://www.bloomberg.com/news/newsletters/2023-07-05/the-us-military-is-taking-generative-ai-out-for-a-spin?ref=hackernoon.com/" />
		<title level="m">The US military is taking generative AI out for a spin</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b69">
	<analytic>
		<title level="a" type="main">The ethics of AI ethics. A constructive critique</title>
		<author>
			<persName><forename type="first">J.-C</forename><surname>Heilinger</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Philosophy &amp; Technology</title>
		<imprint>
			<biblScope unit="volume">35</biblScope>
			<biblScope unit="page">61</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b70">
	<analytic>
		<title level="a" type="main">Machine bias</title>
		<author>
			<persName><forename type="first">J</forename><surname>Angwin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Larson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mattu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Kirchner</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Ethics of data and analytics</title>
				<imprint>
			<publisher>Auerbach Publications</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="254" to="264" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b71">
	<analytic>
		<title level="a" type="main">Gender shades: Intersectional accuracy disparities in commercial gender classification</title>
		<author>
			<persName><forename type="first">J</forename><surname>Buolamwini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Gebru</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Conference on fairness, accountability and transparency</title>
				<imprint>
			<publisher>PMLR</publisher>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="77" to="91" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b72">
	<analytic>
		<title level="a" type="main">Mitigating bias in algorithmic hiring: Evaluating claims and practices</title>
		<author>
			<persName><forename type="first">M</forename><surname>Raghavan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Barocas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Kleinberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Levy</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2020 conference on fairness, accountability, and transparency</title>
				<meeting>the 2020 conference on fairness, accountability, and transparency</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="469" to="481" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
