=Paper=
{{Paper
|id=Vol-3276/SSS-22_FinalPaper_83
|storemode=property
|title=Advancing Fairness in Public Funding Using Domain Knowledge
|pdfUrl=https://ceur-ws.org/Vol-3276/SSS-22_FinalPaper_83.pdf
|volume=Vol-3276
|authors=Thomas Goolsby,Sheikh Rabiul Islam,Ingrid Russell
}}
==Advancing Fairness in Public Funding Using Domain Knowledge==
Thomas Goolsby, Sheikh Rabiul Islam, Ingrid Russell
University of Hartford
goolsby@hartford.edu, shislam@hartford.edu, irussell@hartford.edu

In T. Kido, K. Takadama (Eds.), Proceedings of the AAAI 2022 Spring Symposium "How Fair is Fair? Achieving Wellbeing AI", Stanford University, Palo Alto, California, USA, March 21–23, 2022. Copyright © 2022 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

===Abstract===
Artificial Intelligence (AI) has become an integral part of several modern-day solutions impacting many aspects of our lives. Therefore, it is of paramount importance that AI-powered applications are fair and unbiased. In this work, we propose a domain knowledge infused AI-based system for public funding allocation in the transportation sector, keeping potential fairness-related pitfalls in mind. In the transportation sector, in general, the funding allocation in a particular geographic area corresponds to the population in that area. However, we found that areas with a high diversity index have higher public transit ridership, and this is a crucial piece of information to consider for an equitable distribution of funding. Therefore, in our proposed approach, we use the above fact as domain knowledge to guide the developed model to detect and mitigate the hidden bias in funding distribution. Our intervention has the potential to improve the declining rate of public transit ridership, which has decreased by 3% in the last decade. An increase in public transit ridership has the potential to reduce the use of personal vehicles as well as the carbon footprint.

Keywords: domain knowledge, artificial intelligence, machine learning, federal funding, federal transit administration, public transportation, bias, fairness

===Introduction===
Available public data establishes a set of criteria based on census data to determine how funding is tabulated and granted to federal transit agencies in major Urbanized Areas (UZAs) in the United States (Giorgis 2020).
The current system takes into consideration a range of census-based criteria (Giorgis 2020) and is supposed to take into consideration protected attributes defined in Title VI of the Civil Rights Act of 1964 (Title VI 1964), among other determinants. This raises the question of how, and whether, it is possible to use AI-based systems to allocate federal funding in an equitable fashion while abiding by Title VI guidelines.

In this work, we investigate the federal allocation of funds for public transportation while keeping fairness issues in mind. When we talk about fairness in this paper, we are speaking of the mitigation of hidden bias that can be introduced inadvertently during the machine learning process. Ultimately, fairness in AI, as treated in this paper, looks to employ known techniques to eliminate hidden bias. Furthermore, the FTA is supposed to distribute public funds in an equitable fashion, as defined in Title VI of the Civil Rights Act of 1964; thus, it is our goal to replicate that equity using a machine learning approach that mitigates bias that may arise during the process. In the transportation sector, in general, the funding allocation in a particular geographic area corresponds to the population in that area. However, we found that areas with a high diversity index have higher public transit ridership, and this is a crucial piece of information to consider for an equitable distribution of funding. Therefore, in our proposed approach, we use the above fact as domain knowledge to guide the developed model to detect and mitigate the hidden bias in funding distribution.

Domain knowledge is a high-level, abstract concept that encompasses the problem area. For example, in a car classification problem from images, the domain knowledge could be that a convertible has no roof, or that a sedan has four doors. However, encoding this domain knowledge in a black-box model is challenging. Bias can occur during data collection, data preprocessing, algorithm processing, or the act of making an algorithmic decision. Through the comparison of machine learning models with and without domain knowledge, this work measures the effectiveness of domain knowledge integration. We use different machine learning classifiers, such as Random Forests (RF), Extra Trees (ET), and K-nearest neighbor, for the experiments. We also use IBM AI Fairness 360 to detect and mitigate bias, and we evaluate different standard fairness metrics to further emphasize the effect of incorporating domain knowledge into our proposed approach.

===Background===
A good amount of work has been conducted in the domain of bias and fairness in AI. Mehrabi et al. developed a general survey exploring this topic. They emphasize the importance of a continuous feedback loop between data, algorithms, and users (Mehrabi et al. 2021). This accentuates how susceptible AI algorithms are to bias. This bias can be introduced when data is collected, and it is important to be aware of the kinds of bias that can occur.

Given how unique the interaction between data and users is, there are two biases in particular that apply to the data we are working with. One of these is omitted variable bias, which occurs when one or more important variables are left out of the model (Riegg 2008, Mustard 2003, Clarke 2005). A simple example of this type of bias could be an algorithm that is trained to predict when users will unsubscribe from a company's service. A possible omitted variable here could be a strong competitor entering the market that the algorithm was unaware of (Mehrabi et al. 2021). The introduction of this competitor would be the omitted variable, which would then lead to bias being introduced in the algorithm when it tries to predict when a particular customer will unsubscribe.
The other important form of bias is aggregation bias. Aggregation bias occurs when a one-size-fits-all model is used for groups with different conditional distributions (Suresh and Guttag 2019). Both omitted variable bias and aggregation bias are unique in machine learning applications since they are technical biases that can occur at any point in the machine learning process, which makes them particularly difficult to counteract. The authors of the survey discussed how the introduction of discrimination in AI is unique since it is a direct interaction between data and users. Again, domain knowledge is being used to attempt to counteract specific instances of bias like this.

Furthermore, it is important to understand the problematic nature of introducing racial categories to machine learning. Programmers face a unique dilemma in this problem domain, since they can either be blind to racial group disparities or be conscious of those racial categories (Benthall et al. 2019). However, regardless of which path the programmer chooses, both options ultimately reify the negative and inaccurate implications of race in society. Moreover, this observes differences in races in the United States, which is inherently problematic. Race differences are created by ascribing race classifications onto individuals who were previously racially unspecified, which ultimately leads to the newly racially classified individuals being linked to stereotyped and stigmatized beliefs about non-white groups (Omi and Winant 2014). When observing domain knowledge in the allocation of federal funds, we must be extremely cautious of these implications. Link and Phelan provide a clear definition of what stigma is: they define stigma as "the co-occurrence of labeling, stereotyping, separation (segregation), status debasement, and discrimination" (AI Fairness 360 2021). By understanding the systemic instillment of stigma in racial categories, this work looks for ways to introduce fair domain knowledge without reifying those dangerous stigmas. This ultimately leads to some implications in the development of a fair AI algorithm for allocating federal funds for public transportation.

Public transit agencies are supposed to abide by Title VI of the Civil Rights Act of 1964. The Federal Transit Agency (FTA) follows closely the rules written in Title VI, which protects people from discrimination based on race, color, and national origin in programs and activities receiving federal financial assistance (Title VI 1964). Within this work, we also abide by these laws to develop a legally applicable AI for allocating federal funds, and we investigate the disparities. A fair and unbiased AI algorithm for allocating federal funds for public transportation could further help combat the national decline in public transit ridership. William J. Mallett from the Congressional Research Service emphasized that public transit ridership has declined nationally by 7% over the last decade (Mallett et al. 2018). Competing transportation options like personal vehicles, ride-sourcing (e.g., Uber), and bike-sharing are partially at the forefront of the national decline. Some solutions proposed in that report are incentive funding, raising user fees on personal automobiles, and improving general funding for public transportation (Mallett et al. 2018). That is where this work comes in: to attempt to answer the question of whether an AI algorithm embedded with fairness can contribute to a more equitable solution.

===Experiments and Results===
This project explores how domain knowledge can be integrated to ensure fairness in AI. A publicly available dataset on the allocation of federal funds to public transportation agencies is being used (Giorgis 2020). This dataset is the basis on which this exploration and application of machine learning is built. The dataset includes official data from 2014-2019 on 449 Federal Transit Agency (FTA) defined public transportation agencies in the continental United States, Alaska, Hawaii, and Puerto Rico. The dataset is read into RStudio using R version 4.1.0 and Python version 3.8.0. The R programming language is being used in a simple R script, while Python is being used in isolated code chunks within an R Markdown (Rmd) file. For bias detection and mitigation, we use the IBM AI Fairness 360 open-source toolkit (AI Fairness 360 2021).
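As a rough, illustrative sketch of that setup, a Python chunk in the workflow might load such an export of the funding-allocation data as shown below; the file name and the year column label are placeholders rather than the exact identifiers used in the project.

<pre>
import pandas as pd

# Hypothetical local export of the federal funding allocation dataset (Giorgis 2020);
# the file name and the "Year" column label are illustrative placeholders.
funding = pd.read_csv("federal_funding_allocation.csv")

# Quick sanity checks on the reporting window described in the text (2014-2019).
print(funding.shape)
print(funding["Year"].min(), funding["Year"].max())
</pre>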
====Data Preprocessing====
This dataset (Giorgis 2020) is preprocessed into a summarized form which gives totals for individual transit agencies per year. The data started off with 42 columns and 36,656 rows. Empty columns and rows are deleted, which leads to a dataset containing 40 columns and 18,673 rows. The overall dataset is then split up into separate data containers for individual years, producing six separate datasets for six individual years (2014-2019). Each of the six datasets contains 13 columns and anywhere from 440-444 rows depending on the year. Finally, the separate data containers are combined back into a single data container which now consists of summarized data for every given FTA UZA per year. This summarized data container, covering all data from 2014-2019, has 13 columns and 2,615 rows.

Furthermore, the measure of operating expenses is converted to classes on which supervised machine learning can take place. Operating expense classes are determined by examining the distribution of operating expenses across transit agencies. It was found that the distribution was skewed towards the lower end (< $100,000,000). However, it is also found that the total amount of operating expenses for a specific transit agency has a high correlation, roughly 95%, with the population of its service area. These are the factors that lead to the current distribution of operating expense level classes. Data is also being utilized from the 2020 national census, specifically diversity indices at the state and county level. Data engineering techniques are being used to incorporate both state and county-level diversity indices into the summarized public funds allocation dataset.

To evaluate the fairness of the models with domain knowledge, the diversity index by county had to be sorted into classes. Diversity index by county is being used as the primary form of domain knowledge here since it provides a clearer vision of the diversity across populations. The diversity index serves as a measure of how likely it is that two individuals chosen at random from a population are from different racial and ethnic groups (Bureau et al. 2021). The diversity index is bounded between 0 and 1, where a value of 0 indicates that everyone in the population has the same racial and ethnic characteristics, while a value closer to 1 indicates that everyone in the population has different racial and ethnic characteristics (Bureau et al. 2021). Therefore, we observe the diversity index by county for each of the 449 FTA-defined public transportation agencies, and found it to be an effective incorporation of census-based domain knowledge.

To convert the diversity index by county into classes, the distribution of its values is evaluated. As seen in Figure 1, a great number of observations (roughly 55%) have a diversity index between 0.25 and 0.5.

Figure 1: Histogram of diversity index by county

Based on this distribution, the diversity index by county was split into 4 classes. The first class is "Very Low", which constitutes all observations that have a diversity index greater than or equal to 0 and less than 0.25. The next class is "Low", which is made up of observations that have a diversity index greater than or equal to 0.25 and less than 0.5. The "Moderate" class includes all observations that have a diversity index greater than or equal to 0.5 and less than 0.75. Finally, the last class is "High", which includes the remaining observations, those that have a diversity index greater than or equal to 0.75 and less than 1 (the maximum value possible). These class bounds are also supported by the fact that the diversity index by county had the largest correlation with the population of a particular UZA. It is found that the correlation between these two values is 0.26, which was the highest correlation that the diversity index by county had with any other variable in the data set (see Figure 2). Furthermore, one of the variables that has the highest correlation with primary UZA population is unlinked passenger trips (0.76). Total unlinked passenger trips serves as an FTA-defined measure of public transportation ridership. Therefore, we can see the relation here: urban areas with higher population tend to have higher public transit ridership as well as a higher diversity index by county.
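A minimal sketch of this class conversion, assuming hypothetical column names for the summarized data, is shown below; the cut points mirror the four bins described above.

<pre>
import pandas as pd

# Toy stand-in for the summarized agency-per-year data; the column name is illustrative.
summary = pd.DataFrame({"diversity_index": [0.12, 0.31, 0.47, 0.58, 0.81]})

# right=False yields the bins [0, 0.25), [0.25, 0.5), [0.5, 0.75), [0.75, 1.0), matching the
# "greater than or equal to ... and less than ..." bounds described in the text.
summary["diversity_class"] = pd.cut(summary["diversity_index"],
                                    bins=[0.0, 0.25, 0.50, 0.75, 1.0],
                                    labels=["Very Low", "Low", "Moderate", "High"],
                                    right=False)
print(summary)
</pre>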
Figure 2: Heatmap of all numeric variables in the data set

Furthermore, 7 out of the top 10 UZAs by ridership (see Table 1) are from the top 10 most diverse states (Jensen et al. 2021): Hawaii, California, Nevada, Maryland, District of Columbia, Texas, New Jersey, New York, Georgia, and Florida.

Table 1. Top 10 UZAs with the highest ridership
* New York-Newark, NY-NJ-CT
* Los Angeles-Long Beach-Anaheim, CA
* Chicago, IL-IN
* Washington, DC-VA-MD
* San Francisco-Oakland, CA
* Boston, MA-NH-RI
* Philadelphia, PA-NJ-DE-MD
* Seattle, WA
* Miami, FL

Although the diversity index of a county has the highest correlation (.26) with the population of a UZA, it has a comparatively low correlation (.14) with total operating expenses in that area. This finding encourages us to develop an equitable distribution technique.

====Model Creation====
Both the R and Python programming languages are being used to create machine learning models on the dataset. R is primarily being used to preprocess the dataset, while Python is being used to develop classification models using a 70/30 training and test set split. Random forest, extra trees, and k-nearest neighbor models without domain knowledge (i.e., without considering the diversity index) are developed and analyzed. The scikit-learn package is being used to develop the Python-based supervised learning model (Random Forest), while the class package is being used to develop the k-nearest neighbor algorithm in R.

For the models without domain knowledge, 12 columns are being used. The 11 predictors are all numeric values, and some of the variables include Primary UZA Population, Total Unlinked Passenger Trips, and Total Passenger Miles Traveled, to name a few. These 11 predictors are being used to predict Total Operating Expenses, which serves as a general measure of how much money a specific FTA transportation agency is receiving/spending. The models with domain knowledge have 12 predictors: the same 11 predictors as the models without domain knowledge, plus our variable representing Diversity Index by County employed as domain knowledge.

The goal of measuring the accuracy, precision, recall, and ROC performance metrics was to take an initial look at whether incorporating domain knowledge into some simple classification models would drastically affect those values. As seen in Tables 2-4, the accuracy, precision, recall, and ROC metrics are calculated, each of which has a value of roughly 0.99. The metrics with domain knowledge (i.e., after incorporating the diversity index as encoded domain knowledge) deviated only slightly from the metrics produced by the models without domain knowledge. The largest difference between metrics of models with and without domain knowledge can be seen in the K-Nearest Neighbor models. The average difference between the models without domain knowledge and the models with domain knowledge is 0.00265. This difference is negligible and expected, considering the overall societal impact from it.

Table 2. Accuracy, precision, recall, and ROC metrics for Random Forest models w/ and w/o domain knowledge
{| class="wikitable"
! Random Forest !! Without Domain Knowledge !! With Domain Knowledge
|-
| Accuracy || 0.99492 || 0.99490
|-
| Precision || 0.99488 || 0.99501
|-
| Recall || 0.99492 || 0.99490
|-
| ROC || 0.99998 || 0.99998
|}

Table 3. Accuracy, precision, recall, and ROC metrics for Extra Trees models w/ and w/o domain knowledge
{| class="wikitable"
! Extra Trees !! Without Domain Knowledge !! With Domain Knowledge
|-
| Accuracy || 0.99619 || 0.99618
|-
| Precision || 0.99617 || 0.99622
|-
| Recall || 0.99619 || 0.99618
|-
| ROC || 0.99998 || 0.99998
|}

Table 4. Accuracy, precision, recall, and ROC metrics for K-Nearest Neighbor models w/ and w/o domain knowledge
{| class="wikitable"
! K-Nearest Neighbor !! Without Domain Knowledge !! With Domain Knowledge
|-
| Accuracy || 0.99111 || 0.98854
|-
| Precision || 0.99099 || 0.98849
|-
| Recall || 0.99111 || 0.98854
|-
| ROC || 0.99952 || 0.99655
|}
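To make the comparison behind Tables 2-4 concrete, the sketch below fits the same scikit-learn classifier once without and once with an encoded diversity-index feature and reports the corresponding metrics; the synthetic data, feature names, and toy target are placeholders rather than the project's actual columns.

<pre>
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; the real predictors include Primary UZA Population,
# Total Unlinked Passenger Trips, Total Passenger Miles Traveled, etc.
rng = np.random.default_rng(0)
n = 1000
X = pd.DataFrame({
    "primary_uza_population": rng.lognormal(12, 1, n),
    "total_unlinked_passenger_trips": rng.lognormal(14, 1.5, n),
    "diversity_class_code": rng.integers(0, 4, n),  # encoded domain-knowledge feature
})
y = pd.qcut(X["primary_uza_population"], 3, labels=["Low", "Medium", "High"])  # toy expense level

def evaluate(feature_names):
    X_tr, X_te, y_tr, y_te = train_test_split(X[feature_names], y,
                                              test_size=0.3, random_state=42)  # 70/30 split
    pred = RandomForestClassifier(random_state=42).fit(X_tr, y_tr).predict(X_te)
    return (accuracy_score(y_te, pred),
            precision_score(y_te, pred, average="weighted"),
            recall_score(y_te, pred, average="weighted"))

base = ["primary_uza_population", "total_unlinked_passenger_trips"]
print("without domain knowledge:", evaluate(base))
print("with domain knowledge:   ", evaluate(base + ["diversity_class_code"]))
</pre>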
====Fairness Evaluation Preprocessing====
For evaluating fairness in the models with domain knowledge, the IBM AI Fairness 360 tool is being used. We use the R package of this tool for our experiment. To begin the process of evaluating fairness, the data set needs to be converted into a binary representation of itself. The most important columns are chosen to be present in the fairness evaluation. These variables were deemed the most important since they all presented the highest correlation with the variable being predicted: Total Operating Expenses. Furthermore, these variables are all numeric values, which is imperative to the development of classification models that can be evaluated using IBM AI Fairness 360 (AI Fairness 360 2021). Considering that all these variables are numeric values, it is much clearer where to set bounds when converting variables to binary representations. The variables include the population of the UZA in which a transit agency exists, total unlinked passenger trips, year, and operating expense level. Since the operating expense level is already broken into classes (Low, Medium, High), a separate column is made for each class. For example, there is one column labeled "Operating Expense Level Low", which has a 1 if the operating expenses are categorized as "Low" and a 0 in every other row.
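As a rough sketch of this selection and encoding step (the column names are hypothetical), the correlation screen and the per-class indicator columns might look like the following.

<pre>
import pandas as pd

# Toy summary rows; real columns include Primary UZA Population, Total Unlinked
# Passenger Trips, and Total Operating Expenses (the names here are illustrative).
summary = pd.DataFrame({
    "primary_uza_population":         [180_000, 640_000, 4_200_000, 950_000],
    "total_unlinked_passenger_trips": [2.0e6, 4.0e7, 4.5e8, 6.0e7],
    "total_operating_expenses":       [3.0e7, 2.5e8, 2.1e9, 4.0e8],
    "operating_expense_level":        ["Low", "Medium", "High", "Medium"],
})

# Rank candidate columns by their correlation with the predicted variable.
corr = summary.select_dtypes("number").corr()["total_operating_expenses"].abs()
print(corr.sort_values(ascending=False))

# One indicator column per operating expense class, e.g. "Operating Expense Level_Low".
print(pd.get_dummies(summary["operating_expense_level"],
                     prefix="Operating Expense Level").astype(int))
</pre>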
A little more nuance is needed to convert the UZA population and total unlinked passenger trips columns to binary representations. The density of both of these variables shows a heavy concentration of observations at the lower end (Figures 3 and 4). Since both variables have so many observations near the lower end of the range, the ranges for the classes are chosen to reflect this trend. For UZA population, three classes are created to split this column into a binary representation. The following is the range for each class for the UZA population:

* Low: population [0, 250K)
* Medium: population [250K, 1M)
* High: population [1M, MAX)

Figure 3: Density plot for Primary UZA Population

Figure 4: Density plot for Total Unlinked Passenger Trips

A very similar idea is being used to split up total unlinked passenger trips into classes. The National Transit Database (NTD) and the FTA explain that unlinked passenger trips are the number of boardings on public transportation vehicles in a fiscal year for a specific transportation agency (Federal Transit Administration 2021). Transit agencies must count each passenger that boards their vehicles, regardless of how many vehicles the passenger boards from origin to destination (Federal Transit Administration 2021). Similar to the previous variables, 3 classes are created with the following ranges:

* Low: total unlinked passenger trips [0, 5M)
* Medium: total unlinked passenger trips [5M, 100M)
* High: total unlinked passenger trips [100M, MAX)

The Year variable is also split up into a binary representation. The year in this data set ranges from 2014 to 2019. Thus, a separate column for each year is made, where a value of 1 means the specific observation is from that year. The last variable that is converted to a binary representation is, of course, the diversity index by county. For this column, a value of 1 is given if the diversity index is categorized as "Moderate" or "High", and a value of 0 if the diversity index is categorized as "Very Low" or "Low". Figure 5 provides a snapshot of the data after all variables have been converted to binary representations. The data set still has 2,615 rows; however, the binary data set has 16 columns.

Figure 5: Snapshot of the binary representation of the data (columns include High Diversity Index by County; High, Medium, and Low Operating Expenses; and High and Medium Primary UZA Population)
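Pulling these encodings together, a condensed sketch of the binary representation (with hypothetical column names and toy rows) is given below.

<pre>
import pandas as pd

# Toy rows standing in for the summarized data; column names are illustrative only.
df = pd.DataFrame({
    "primary_uza_population":         [180_000, 640_000, 4_200_000],
    "total_unlinked_passenger_trips": [2_000_000, 40_000_000, 450_000_000],
    "year":                           [2014, 2017, 2019],
    "diversity_class":                ["Very Low", "Low", "High"],
    "operating_expense_level":        ["Low", "Medium", "High"],
})

# Population classes: Low [0, 250K), Medium [250K, 1M), High [1M, max).
pop_class = pd.cut(df["primary_uza_population"], bins=[0, 250_000, 1_000_000, float("inf")],
                   labels=["Low", "Medium", "High"], right=False)
# Ridership classes: Low [0, 5M), Medium [5M, 100M), High [100M, max).
trips_class = pd.cut(df["total_unlinked_passenger_trips"],
                     bins=[0, 5_000_000, 100_000_000, float("inf")],
                     labels=["Low", "Medium", "High"], right=False)

binary = pd.concat([
    pd.get_dummies(df["operating_expense_level"], prefix="Operating Expense Level"),
    pd.get_dummies(pop_class, prefix="Primary UZA Population"),
    pd.get_dummies(trips_class, prefix="Unlinked Passenger Trips"),
    pd.get_dummies(df["year"], prefix="Year"),
    # Protected attribute: 1 for "Moderate"/"High" diversity, 0 for "Very Low"/"Low".
    df["diversity_class"].isin(["Moderate", "High"]).astype(int).rename("High Diversity Index"),
], axis=1).astype(int)
print(binary)
</pre>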
====Fairness Metrics Calculation====
We create a new R script to calculate the desired fairness metrics. A simple definition of a fairness metric, as provided in the documentation of the IBM AI Fairness 360 tool, is a quantification of unwanted bias in training data or models (AI Fairness 360 2021). The fairness metrics that are evaluated in this project are statistical parity difference, disparate impact, equal opportunity difference, and the Theil index. A brief definition for each observed fairness metric is as follows:

* Statistical parity difference: the difference in the rate of favorable outcomes received by the unprivileged group compared to the privileged group.
* Disparate impact: the ratio of the rate of a favorable outcome for the unprivileged group to that of the privileged group.
* Equal opportunity difference: the difference in true positive rates between the unprivileged and the privileged groups.
* Theil index: measures the inequality in benefit allocation for individuals.

These four fairness metrics were chosen based on the information provided on the IBM AI Fairness 360 tool. Furthermore, these four metrics specifically evaluate privileged versus unprivileged groups in terms of individual and group fairness. Regarding this project, we are looking at the distribution of funds between FTA transportation agencies that are based in a county with a high diversity index (>= 0.75). By observing these specific fairness metrics, we can see how favorable outcomes, or higher federal funding, may be unequally distributed among privileged and unprivileged groups. Furthermore, we chose to employ the IBM AI Fairness 360 toolkit as it provides a compact and efficient collection of fairness evaluation libraries. The problem area of the project is well encapsulated in the recommended uses of the toolkit. The creators of the toolkit explain that it should be used in very limited settings, one of which is allocation assessment problems with well-defined protected attributes (AI Fairness 360 2021). This project's problem area deals with the allocation of funds. Moreover, and more importantly, the dataset being used for the fairness evaluation has a well-defined protected attribute, the diversity index by county, which, as we have explained earlier in the paper, carries unintentional bias as defined by the FTA and is protected by Title VI of the Civil Rights Act of 1964.

The reweighing function is our tool of choice in the IBM AI Fairness 360 toolkit, as it assigns weights to training set tuples instead of changing class labels (Kamiran et al. 2012). This is favorable since we want to analyze how the diversity index by county plays a role in the mitigation of bias in this problem.

In the R environment, the "aif360" library is being used, which includes all the metrics and capabilities provided by the IBM AI Fairness 360 project. The library is loaded into the R environment, and the binary data set from Figure 5 is also loaded in. To run any metric calculations with this library, any R data frames must be converted into an aif dataset, which asks for the protected attribute, the privileged (i.e., reference group) and unprivileged values for the protected attribute, and the target variable. For our case, the target variable is the "Operating Expense Level High" column. To reiterate, a value of 1 is given in this column if the observation is considered to have "High" operating expenses, or operating expenses of more than $1,000,000,000. The protected attribute in this project is the diversity index by county column that was added as a piece of domain knowledge. To capture the nature of the protected attribute, the privileged group consists of observations that have a value of 0, or "Very Low" and "Low" diversity indices, and the unprivileged group consists of observations that have a value of 1, or "Moderate" and "High" diversity indices.

The IBM AI Fairness 360 library uses underlying classification models to help develop and calculate fairness metrics. Since the library uses classification models, we need two data sets to compare the true data with the predicted data. Thus, we have one aif dataset that is the raw binary data, and another that is nearly identical except that the "Total Operating Expenses High" variable was predicted by a simple logistic regression model (this is called the newly classified dataset). The reweighing technique (Kamiran et al. 2012, Aif360 2021), which modifies the weights of different training examples, is being used to help mitigate any bias that is present in this project. The IBM AI Fairness 360 tool includes a reweighing option that modifies the weight of different training instances. The reweighing algorithm is applied to both the original binary data set and the classified data set.
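The experiments use the toolkit's R interface, but the same steps can be sketched against the Python API that the R package wraps; the toy data frame and column names below are illustrative stand-ins for the binary data set described above.

<pre>
import pandas as pd
from aif360.datasets import BinaryLabelDataset
from aif360.metrics import BinaryLabelDatasetMetric, ClassificationMetric
from aif360.algorithms.preprocessing import Reweighing

# Toy binary-encoded rows; "high_diversity" stands in for the protected attribute and
# "high_operating_expenses" for the target column described above.
df = pd.DataFrame({
    "high_diversity":          [1, 0, 1, 0, 1, 0, 1, 0],
    "high_operating_expenses": [1, 1, 0, 1, 1, 0, 0, 1],
})
pred = df.copy()
pred["high_operating_expenses"] = [1, 1, 0, 1, 0, 0, 1, 1]  # e.g. logistic-regression output

def to_aif(frame):
    return BinaryLabelDataset(df=frame, label_names=["high_operating_expenses"],
                              protected_attribute_names=["high_diversity"],
                              favorable_label=1, unfavorable_label=0)

orig, clf = to_aif(df), to_aif(pred)
priv, unpriv = [{"high_diversity": 0}], [{"high_diversity": 1}]  # groups as defined above

group = BinaryLabelDatasetMetric(orig, unprivileged_groups=unpriv, privileged_groups=priv)
both = ClassificationMetric(orig, clf, unprivileged_groups=unpriv, privileged_groups=priv)
print("statistical parity difference:", group.statistical_parity_difference())
print("disparate impact:             ", group.disparate_impact())
print("equal opportunity difference: ", both.equal_opportunity_difference())
print("Theil index:                  ", both.theil_index())

# Reweighing assigns instance weights rather than changing labels (Kamiran & Calders 2012).
rw = Reweighing(unprivileged_groups=unpriv, privileged_groups=priv)
reweighed = rw.fit_transform(orig)
after = BinaryLabelDatasetMetric(reweighed, unprivileged_groups=unpriv, privileged_groups=priv)
print("statistical parity difference after reweighing:", after.statistical_parity_difference())
</pre>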
Once both data sets are reweighed, the fairness metrics can be calculated and compared to those of the original data. Graphs are produced to show the difference and improvement after bias is mitigated through reweighing. Figures 6, 7, 8, and 9 show the comparison of fairness metrics between the original data and the reweighed data with bias mitigated. Calculating all four desired fairness metrics shows that mitigating bias through reweighing leads to the metrics either staying the same or slightly improving. As seen in all graphs, both the original data and the mitigated data are within the fair range. Statistical parity difference (i.e., discrimination) was reduced from .051 to .035 using domain knowledge (see Figure 6). Statistical parity, also called demographic parity, ensures each group has an equal probability of being assigned to the positive predicted class. By mitigating bias, we can produce fairness metrics that are closer to true fairness, which is a value of 0 for statistical parity difference, equal opportunity difference, and the Theil index, and a value of 1 for disparate impact. Currently, we are infusing the diversity index as domain knowledge. However, in the future, we would also like to investigate the infusible domain knowledge further by examining other criteria such as native language spoken and family income.

Figure 6: Statistical parity difference of original vs mitigated data

Figure 7: Disparate impact of original vs mitigated data

Figure 8: Equal opportunity difference of original vs mitigated data

Figure 9: Theil index of original vs mitigated data

===Contributions and Future Works===
By investigating the implications of domain knowledge on creating fair decision-making, this work explores how true fairness in AI can be achieved within the application of public funding allocation. This work investigates how federal agencies like the FTA could apply AI in the process of allocating funds. In general, the allocation of FTA funds corresponds to the population in an area (i.e., UZA). However, it is found that areas with a higher diversity index have higher public transport ridership. Our proposed domain knowledge infused approach can reduce the statistical parity difference, which helps to ensure each group has an equal probability of being assigned to the positive predicted class. Finding the right domain knowledge is very challenging. Going forward, we want to incorporate and investigate the impact of other protected variables (e.g., native language spoken, family income), and find a way to enhance the infusible domain knowledge so that it reduces different disparities. An increase in public transit ridership has the potential to reduce the use of personal vehicles as well as to reduce the carbon footprint. A quantitative analysis of this possibility could be another direction of research.
===References===
[1] Giorgis, J. D. (2020). Federal Funding Allocation [Dataset]. United States Department of Transportation, FTA Federal Funding Allocation Since 2014. https://catalog.data.gov/dataset/federal-funding-allocation
[2] Title VI of the Civil Rights Act of 1964, Pub. L. 88-352, 78 Stat. 241 (1964).
[3] Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K., & Galstyan, A. (2021). A survey on bias and fairness in machine learning. ACM Computing Surveys (CSUR), 54(6), 1-35.
[4] Benthall, S., & Haynes, B. D. (2019, January). Racial categories in machine learning. In Proceedings of the Conference on Fairness, Accountability, and Transparency (pp. 289-298).
[5] Mallett, W. J. (2018, March). Trends in public transportation ridership: Implications for federal policy (No. R45144). Congressional Research Service.
[6] U.S. Census Bureau. (2021, October 14). Racial and ethnic diversity in the United States: 2010 census and 2020 census. Census.gov. Retrieved November 22, 2021, from https://census.gov/library/visualizations/interactive/racial-and-ethnic-diversity-in-the-united-states-2010-and-2020-census.html
[7] Jensen, E., Jones, N., Rabe, M., Pratt, B., Medina, L., Orozco, K., & Spell, L. (2021, August 12). The chance that two people chosen at random are of different race or ethnicity groups has increased since 2010. Census.gov. Retrieved November 24, 2021, from https://www.census.gov/library/stories/2021/08/2020-united-states-population-more-racially-ethnically-diverse-than-2010.html
[8] Kamiran, F., & Calders, T. (2012). Data preprocessing techniques for classification without discrimination. Knowledge and Information Systems, 33(1), 1-33.
[9] Aif360 Algorithms Preprocessing Reweighing. aif360.algorithms.preprocessing.Reweighing - aif360 0.4.0 documentation. (n.d.). Retrieved November 24, 2021, from https://aif360.readthedocs.io/en/latest/modules/generated/aif360.algorithms.preprocessing.Reweighing.html
[10] Riegg, S. K. (2008). Causal inference and omitted variable bias in financial aid research: Assessing solutions. The Review of Higher Education, 31(3), 329-354.
[11] Mustard, D. B. (2003). Reexamining criminal behavior: The importance of omitted variable bias. Review of Economics and Statistics, 85(1), 205-211.
[12] Clarke, K. A. (2005). The phantom menace: Omitted variable bias in econometric research. Conflict Management and Peace Science, 22(4), 341-352.
[13] Suresh, H., & Guttag, J. V. (2019). A framework for understanding unintended consequences of machine learning. arXiv preprint arXiv:1901.10002.
[14] Omi, M., & Winant, H. (2014). Racial formation in the United States. Routledge.
[15] AI Fairness 360. Retrieved November 25, 2021, from https://aif360.mybluemix.net/
[16] Federal Transit Administration Office of Budget and Policy. (2021, December 13). National Transit Database 2021 policy manual. Retrieved January 24, 2022, from https://www.transit.dot.gov/sites/fta.dot.gov/files/2021-12/2021-NTD-Reduced-Reporting-Policy-Manual_1-1.pdf