=Paper=
{{Paper
|id=Vol-2622/paper7
|storemode=property
|title=Big Data Seeks Value, Make it Ethical
|pdfUrl=https://ceur-ws.org/Vol-2622/paper7.pdf
|volume=Vol-2622
|authors=Elie Chamoun,Nour Charara
|dblpUrl=https://dblp.org/rec/conf/bdcsintell/ChamounC19
}}
==Big Data Seeks Value, Make it Ethical==
Elie Chamoun, Computer Science Department, American University of Culture and Education (AUCE), Beirut, Lebanon, elie.chamoun@ul.edu.lb
Nour Charara, Computer Science Department, American University of Culture and Education (AUCE), Beirut, Lebanon, nourcharara@ul.edu.lb
Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Abstract: Over the decades, data sets have become the lifeblood of business, the new oil of the economy, and a driver of change. Accordingly, data has become more critical and is now known as Big Data. Compared to traditional datasets (formatted with titled fields), Big Data typically includes unstructured masses of data that require real-time analysis. In addition, Big Data creates new opportunities for discovering new experiences and values, helps us gain a deeper understanding of hidden values, and poses new challenges, including how to organize and effectively manage values for these massive and complex datasets. Therefore, industry owners and business people have become interested in the potential of Big Data; they tackle challenges by leveraging the power of advanced technologies (Artificial Intelligence, Healthcare, Smart City, Banking and Finance, Oil & Gas, Data Mining, Internet of Things) and engage in an ethical debate (Specialists, Individuals) to build a better world.

Ethical debates are typically articulated within the context of ethical theories. An ethical theory is framed and reviewed in this paper. This theory, named SAS Theory, categorizes the Value (in the ethical sense, not as monetary revenue) in Big Data into three functions (Sustain, Align, Support). These functions facilitate and stimulate the ethical behavior of Big Data to turn Values into Actions. Furthermore, this theory outlines the ethical behavior of the user personas and organizations offline and online. It also examines and shows how it might be applied to preserve a high level of ethical service in some industries and technologies. Finally, this theory presents three functions that group the technologies in categorized properties using a specific parameter, the Value Personas. Categorizing means assigning a specific ethical solution to a given technology or business.

Keywords: Big Data, Ethics, SAS Theory, Value, Value Personas.

I. INTRODUCTION

Big Data refers to extremely large data sets. The features of Big Data have been classified according to five fundamental elements (5Vs): Volume (size of data), Variety (different types of data from several sources), Velocity (data collected in real time), Veracity (uncertainty of data) and Value (benefits to various industrial and academic fields) [1].

Clark [2] indicates that three elements have been added to the basic five: Visualization (interpretation of data and identification of the most relevant information for the users), Viscosity (latency of data transmission between the source and destination), and Variability (context of data).

Other researchers, like Manogaran [3], introduce additional characteristics beyond the 8Vs model, such as Validity (correct processing of the data) and Virality (speed of the data sent and received from various sources), for a total of 10Vs.

IBM (International Business Machines) data scientists break Big Data into four dimensions: Volume, Variety, Velocity and Veracity [4]. Notably, the element Value has been removed from the five original dimensions. Without a Code of Ethics such as Value, Big Data is unethical and, therefore, there would be no need to argue about the rights of Privacy, Security and Policies.

The frequently cited features of "volume, speed, and variety" are useful benchmarks: persistent features such as the size of the datasets, the speed at which they can be acquired and queried, and the wide range of file formats and types that generate data.

The Value element in Big Data is important enough to pose practical rather than theoretical problems in computer ethics. Hence, privacy breaches arise from some actions taken by businesses as an outcome of Big Data analytics and lead to embarrassment and even lost jobs for those involved.

The thin line dividing new data from old data in Big Data is related to the development of technology. New developments in this space make old privacy issues and other ethical issues much more pressing. Compared with the cell phone of ten years ago, today's phone can record conversations, take photos and videos, and stream directly online.

Moreover, Big Data is «Ethically Neutral» [5]. This means that Big Data does not include an integrated perspective on what is good data or bad data upon its creation, or on what is good or bad usage of this data, without analyzing it.

However, Big Data generates a "forcing function" in our lives by its size and speed. Today, millions of people want to share the same information (flash news, for example) with each other. This is a direct example of how the "forcing function"¹ of Big Data literally influences our lives.

¹ "A forcing function is an aspect of a design that prevents the user from taking an action without consciously considering information relevant to that action." It forces conscious attention upon something ("bringing to consciousness") and thus deliberately disrupts the efficient or automatized performance of a task. [6]

Influence is a two-way street: just as in the scientific principle that we cannot observe a system without changing it, Big Data cannot be used without impact. This impact is where ethical issues live. Big Data can amplify our values,
making them much more powerful and influential, especially when they are collected and focused on a specific desired outcome (for example, commenting on and disliking an act of bullying found on the net).

As the forcing function of Big Data pushes data into our organizations and individual lives, balancing risk and innovation will remain an urgent need in order to maintain the ability of Big Data to generate benefits from values rather than prejudices.

II. BALANCING BIG DATA AND ETHICS

A. Big Data tends to be a wide-ranging category

One approach is to balance risk and innovation. A CBS News article on innovation describes a "Predictive Policing" program that can actually predict where crimes will happen. The crime prediction boxes come from the same kind of mathematical calculation used to predict earthquakes and aftershocks. Los Angeles Police Chief Charlie Beck says «The real measure of this is not how many people you catch, it is how much crime you prevent» [7].

Massive data represents tremendous opportunities for the benefit of business, education, health care, the government sector, manufacturing, and many other areas. The risk, however, concerns privacy: the ability to manage our reputation and online identity. What it might mean to lose or gain ownership of our personal data is just the beginning of the ethical issues. To take advantage of the benefits of Big Data innovations, we need to understand the practical risks of implementing them.

Another approach is the anonymization of data sets before they are published, used for targeted advertising, and so on. As the lawyer Paul Ohm [8] points out, «the data can be useful or perfectly anonymous, but never both». Suppose we know things about one particular person: where did he eat, and what did he eat? It is very unlikely that we will end up violating his privacy by broadcasting "the information" that a particular person likes Pizza House and pizza. But suppose we have this information for about 10 million people: patterns emerge that do make it possible to tie data points to particular named, "located individuals".

Big Data tends to be a wide-ranging technological category, spanning Artificial Intelligence, data mining, globalization, medicine and others. The question is whether we can categorize these technologies using functions and parameters. Where would Big Data then stand regarding ethics?

B. Ethics and Big Data

As mentioned before, Big Data itself, like any technology, is ethically neutral. The use of Big Data is not [9]. Although the ethics involved are abstract concepts, they can have very real implications. The goal is to develop better ways to engage in an intentional ethical inquiry to inform and align our actions with our values. How could this alignment be reflected?

There is a significant effort to create a digital «Bill of Rights» [10] for the acceptable use of Big Data. In 2012, the White House called for a Consumer Privacy Bill of Rights to protect consumers. The values that this bill supports include transparency, security and accountability.

On a technological level, businesses need to work within ethical frameworks, such as IEEE P7001 (Transparency in Autonomous Systems) [11], but what about data? More machines are learning to operate on their own. Does one have to make sure that the data uploaded into them, with which they operate, are clean and impartial? For example, could a smart computer run the defense of a given country?

The challenge is how to honor these values in our daily actions and decisions.

C. The ethical decision points

Davis and Patterson define: ‘Ethical decision points consist of a series of four activities that form a continuous loop’ [12].

a) Review: the discovery and discussion of fundamental organizational values. This requires an understanding of what the values actually are, not what they are estimated to be or what others think. Do not jump to solutions without first identifying the ethical issue(s) in the situation. Example: how to value transparency in the use of Big Data.

b) Analysis: a review of the current ethical situation that gathers the facts and actual data-handling practices and assesses how well they align with core organizational values. It explores whether a particular use of Big Data technology aligns with the values that have been identified. Example: should they create this new product feature using Big Data?

c) Express: a simple, clear written expression of where, when and how identified values and actions align (and where they don’t), using a common vocabulary for discussion. What specific virtues are relevant in the situation? Example: does this new product feature that uses Big Data technology support the value of transparency?

d) Action: tactical plans to close the alignment gaps that have been identified and to encourage and educate on how to maintain that alignment as conditions change over time. Example: if we build this new product feature, we must explicitly share (in a transparent way) with our customers and ourselves how that feature will use personal data.

These four activities are validated by three functions where ethics can converge with Big Data.

III. CONVERGING BIG DATA WITH ETHICS

A discussion of ethics and Big Data depends on how people define ethics. In general, ethics involves analyzing the conduct that may be beneficial or harmful to other people [13]. Ethics is a subject that has been studied for at least 2,400 years, and there have since been a number of formulations of ethical principles (like those of Hippocrates, 460 BCE, “revered for his ethical standards in medical practice” [14]). Solid ethical theories share a common property. They
allow the individual to persuade using logical and reasoned arguments based on the principles defined by the ethical theory. To illustrate this, the ethical theories will outline the ethical behavior of the user personas and organizations offline and online to preserve the best ethical service. At the same time, this theory helps to frame our understanding of values and moral issues, and then examines and shows how it might be applied to fix Big Data values in some industries and technologies.

While the nine elements mentioned earlier are related directly to data (Volume, Variety, etc.), Value(s) «are things that people care about» [15]. Thus Values are connected to fundamental human and personal aspects.

A. The Value Personas:

Patterson defines the Value Personas [16] as an «evolution of traditional user personas» that express how a specific value shows up and influences action within an organization or even society. Value Personas shed light on moments when the use of Big Data technologies raises an ethical (or value-focused) decision point. Value Personas can suggest options for how to align shared values with proposed action from various organizational role perspectives.

B. The Three Ethical Functions in Big Data:

Value Personas provide a means to frame an explicit ethical inquiry and are very flexible and modifiable in many ways. How? The Value Personas play the role of the parameter in the functions presented under "Values". In this paper, the Value Personas deliver an ethical behavior which appears in the three functions that control Big Data. This leads us to the SAS (Sustain, Align, and Support) theory:

a) Sustain: this function arises when ethics are set upon the creation of Big Data, where the ethical decision should be taken instantly.

− Challenge: Our ethical values are inherent in our actions all the time. The ‘Sustain Function’ shows up in online technology, while surfing the net and using social media, or in technologies like Artificial Intelligence and robotics.

− Solution: Values can’t take action; persons do. Value Personas can help analyze the conflicts between what you value and how you should act based on these values. The behavioral response should be spontaneous. That means that only when taking action (online, for example) does the ethical or non-ethical value show up. The advantage of Value Personas is identifying which values are showing up in actions and how.

The Value Personas can provide a mechanism for developing a common vocabulary based on and inspired by people's own personal moral codes, aimed at developing a set of common and shared values, which helps to reduce barriers online and could inspire collaboration toward productivity and innovation.

b) Align: Values can be closely aligned with Big Data. This function applies to technology and organizations more than to individuals.

− Challenge: An organization may hold conflicting values, and this can lead to contradictory actions. This conflict could arise in some fields, like medicine, whose professional ethics were established before the rise of the Big Data concept. For example, changing the data-handling policy of an organization or product without notifying anyone means it is not acting in alignment with its values.

− Solution: The Value Personas can help make those conflicts transparent. Value Personas can help analyze the conflicts between what we value and how we should act based on those values.

c) Support: Values are not ethics. Ethics are derived from values. This function illustrates values as a “stone” holding Big Data.

− Challenge: The word “Ethics” is an expression of which action is valued and which action is not. Values measure whether an action is ethical or not.

− Solution: The Value Personas is the ‘key’ by which ethical alignment can be measured. In practice, the weight of this ‘key’ starts with ethical information literacy in education: courses about computer ethics need to be included in education programs, workshops and conferences. Additionally, public libraries can provide technical support for students and other attendees who need answers regarding ethical issues, by offering online services and seminars.

These functions are useful for understanding how technologies are properly classified in Big Data, and the importance of the Value Personas as a parameter or ‘Key’.

IV. APPLYING ETHICAL THEORIES FUNCTIONS TO BIG DATA

The ethical perspectives described above are useful for understanding how issues are revealed in Big Data ethics. However, it is possible to better understand how and why ethics helps to shed light on a problem such as Big Data ethical concerns.

A. Role of Value Personas

As a key, the Value Personas help clarify these values. Following organizational actions, Value Personas provide a way to facilitate discussion about organizational alignment in actions, business practices and individual behaviors based on a common set of values. Value Personas provide a description of key roles, ethical decision points, alignment, actions and anticipated outcomes. Value Personas help identify shared values and create a vocabulary for explicit
dialogue, reducing the risk of misalignment (produced by some politics in an organization) and encouraging collaboration and innovation among teams working in an e-type (e-business, e-HR, …) technology.

Value Personas may either evolve regularly based on changes in market conditions, new technologies, legislation, common practices, or evolutions in business model, or may stay essentially intact and unchanged for long periods. The conditions under which business decisions are made are highly variable and subject to influences that are often difficult to anticipate. Value Personas should therefore be expected to be revised, updated, and adjusted.

B. Turn discussion into action

Technologists are constantly working to change the capabilities of Big Data. Social norms and legislation evolve more slowly. Competitive market forces, depending on the sector, can evolve at many different rates: instantaneously, quarterly or annually. There is no reason to hope that the ability to maintain alignment between values and actions can be fully expressed in advance of all commercial conditions. Indeed, one of the advantages of the innovation opportunities offered by Big Data is that it allows companies and organizations to adapt quickly to these market forces and competition.

The Google Code of Conduct is an example. This code is one of the ways in which Google puts its values into practice, with the motto «Don't be evil». The code is built around the recognition that everything they do in connection with work at Google will be, and should be, measured against the highest possible standards of ethical business conduct [17]. In this code, Google expects its employees and Board members to know and follow the Code. Moreover, Google is ready to answer, through the Ethics & Compliance Helpline, any question from its employees regarding a suspected violation of the Code or any other Google policy. Finally, if an employee believes a violation of law has occurred, he or she can always raise it with the Ethics & Compliance Helpline or contact a government agency. This is an example of how organizations can “create and develop” a Value Personas.

The ability to align values with actions allows organizations to create a common and shared sense of action and purpose about any given business initiative. What matters, ethically speaking, is turning the question of "should we do this" into "how can we do this" and eventually freeing up more thinking and collaborative work.

Value Personas, as a tool for developing this capacity, are inherently evolving. Comprehensive, multi-day workshops can give organizations enough time to develop an awareness of their values and articulate suggested actions at various ethical decision points. However, these methods and tools work equally well in conference conversations or in an informal meeting where ethical issues suddenly arise and ethical discussions begin.

In general, organizations have a set of ideal core values to provide a starting point for the less formal use cases. However, even in the event of disagreement over core values, Value Personas can serve as a tool to start productive ethical conversations. Such a conversation could begin about A.I., which represents a totally different intelligence from ours as humans.

These conversations become productive when made transparent and explicit. Value Personas can help as a tool. The goal is clear, and divergent plans of action are required. One must take into account the role of each person involved, define exactly what each person will do and in what order, and highlight and predict the results or objectives of each action. McEwan argues "A.I. becomes responsible for helping in their design and their generation" [18]. The real challenge is that these designs might have real problems understanding each other. This is a question the Value Personas have to deal with.

Therefore the products, features and services should make the organization more beneficial for all its users, with one guiding principle: is what they’re offering useful in ensuring ethical action?

C. Standard ethical models an organization should earn

There are four critical Values that could establish standard models for Big Data in organizations:

a) Privacy vs. secrecy: Privacy doesn’t always mean secrecy. Ensuring data confidentiality means defining and enforcing information rules, not just rules for collecting data, but also for its use, maintenance and retention. Data owners must have the ability to manage the flow of their private information on massive third-party analytical systems.

b) Private information, although generated and shared, can still remain confidential: It is unrealistic to think about information as secret or shared, totally private or completely public. For many reasons, data (and metadata) are shared or generated by design with trusted services (e.g. address books, images, GPS, cell tower, and Wi-Fi localization of our cell phones). But the fact that we are talking about medical data, financial data, address book data, location data, reading data, or other things does not mean that we share the generated information.

c) Transparency in Big Data: Massive data is powerful when secondary uses of datasets produce new predictions and readings. Obviously, this leads to commercial data, with actors like data brokers collecting massive amounts of data about data owners (generally clients), often without their knowledge or permission, and sharing it unexpectedly. For Big Data to work in ethical terms, data owners (people who create and process data) need to have a transparent view of how their data is used or sold.

d) Big Data is able to compromise identity: Privacy and secrecy protections are not enough anymore. Big Data analysis can compromise identity by allowing official organization and surveillance institutional
oversight, and even to determine who we are before we form our own opinion. Big Data owners need to start thinking about what kinds of Big Data predictions and inferences they would allow and which ones they should not.

D. Sighting the Qualitative.

With the help of social media technologies, the data created by users are regenerated by developers and analysts, reducing the human experience to a limited set of quantitative variables. Capitalist modernity is the source that shapes the world into numbers through statistical analysis. Therefore, this paper suggests the need for reflection on the qualitative nature of ethics in Big Data with regard to large, heterogeneous, quantitative datasets.

E. Turning action to law

There is a lot of work to be done to translate the S.A.S. theory with its three functions (Sustain, Align, and Support) into laws and rules that will lead to an ethical management of Big Data. Moreover, we must certainly develop more principles by building more powerful technological tools. Every person involved in Big Data management should engage in the ethical discussion of how Big Data is used. Database developers and administrators are at the forefront of the issue. The law is a powerful part of Big Data ethics, but it is far from being able to handle the many use cases and nuanced scenarios that arise. Organizational principles, institutional ethics statements, self-monitoring and alternate forms of ethical guidelines are also needed. The technology itself can help provide an important element of the ethical mix.

How? This could be in the form of an Intelligent Data Tracer (IDT) that would tell us how our data is being used and would allow us to decide whether or not we want our data to be used in analysis that takes place beyond our spheres of awareness and control. We also need clear rules to determine (by default) what types of personal data processing are allowed and what types of decisions based on these data are acceptable, especially when data affect people's lives.

But the important point is this: first we need good data ethics rules, and second a Value Personas at the center of these critical ethical discussions, keeping in mind that Big Data ethics is for everyone.

V. CONCLUSION

The ethical contexts described in this paper examine the ethical behavior of Big Data to turn Values into Actions using functions (Sustain, Align, and Support) and parameters like the Value Personas described in the SAS theory.

What makes ethical action so valuable is that it helps formulate arguments about what is right or wrong using logical and coherent opinions. This can help to evaluate and understand whether the usage of Big Data is morally right. Feasible ethical theories take all people, other than the decision maker, into consideration. They assume that morality is good and that moral principles are objective and based on reasoning, facts and commonly held values [19]. The use of viable ethical theories helps to better describe our problems with Big Data based on a set of clearly articulated moral values.

In our daily lives, Big Data has become a main force. It affects what others know about us and vice versa, as well as how we act because of the information they share with us. Beyond our own contribution, like it or not, the tools of this force, such as cameras and sensors, surround us and we use them.

By examining this ethical theory, we can better recognize converging and diverging views on moral data, better understand the context and logic of the arguments presented and, in doing so, better assess how the future course of action is or should be justified.

This general conclusion sheds light on the use of Big Data. It also opens the way to finding ways to reduce ethical gaps. Big Data is here to stay, to value ethics, with results facilitating advances in technologies such as artificial intelligence, medicine, and globalization, to name a few. The objective of this paper is to categorize these technologies in functions and assign the right Value Personas parameters. The positive ethical usage results provide the balance point that supports the use of Big Data. Moreover, the use of ethical theories, such as SAS, helps us better recognize and manage how Big Data affects our lives.

REFERENCES

[1] J. W. B. S. S. J. Shilpa G. Kolte, "Big Data Summarization: Framework, Challenges and Possible Solutions," Advanced Computational Intelligence: An International Journal (ACII), vol. 3, no. 4, pp. 1-9, October 2016.

[2] D. Clark, "5 Things You Need to Know about Big Data," KDnuggets, 03 March 2018. [Online]. Available: https://www.kdnuggets.com/2018/03/index.html. [Accessed 31 October 2019].

[3] Gunasekaran Manogaran et al., "Big Data Knowledge System in Healthcare," in Internet of Things and Big Data Technologies for Next Generation Healthcare, Springer International Publishing, 2017, pp. 133-157.

[4] "Infographics & Animations," IBM Big Data & Analytics Hub, 2018. [Online]. Available: https://www.ibmbigdatahub.com/infographic/four-vs-big-data. [Accessed 31 October 2019].

[5] K. Davis and D. Patterson, "Big Data, Big Impact," in Ethics of Big Data, O'Reilly Media, Inc, 2012, p. 64.

[6] Bill Papantoniou, "The Glossary of Human Computer Interaction," Interaction Design Foundation, 2002. [Online]. Available: https://www.interaction-design.org/literature/book/the-glossary-of-human-computer-interaction/forcing-functions. [Accessed 30 November 2019].

[7] B. Orr, "LAPD computer program prevents crime by predicting it," April 11, 2012. [Online]. Available: https://www.cbsnews.com/news/lapd-computer-program-prevents-crime-by-predicting-it/. [Accessed 31 October 2019].

[8] P. Ohm, "Broken promises of privacy: responding to the surprising failure of anonymization," 2010. [Online]. Available: https://www.uclalawreview.org/pdf/57-6-3.pdf. [Accessed 31 October 2019].

[9] D. Boyd and K. Crawford, "Critical Questions for Big Data: Provocations for a Cultural, Technological, and Scholarly Phenomenon," Information, Communication, & Society, vol. 15, no. 5, pp. 662-679, 2012.

[10] D. Weitzner, "We Can’t Wait: Obama Administration Calls for
A Consumer Privacy Bill of Rights for the Digital Age," The White House, President Barack Obama, 23 February 2012. [Online]. Available: https://obamawhitehouse.archives.gov/blog/2012/02/23/we-can-t-wait-obama-administration-calls-consumer-privacy-bill-rights-digital-age. [Accessed 18 October 2019].

[11] IEEE Standards Association, "IEEE Announces Standards Development Project to Address Transparency of Autonomous Systems," IEEE Standards Association, 2016.

[12] K. Davis and D. Patterson, Ethics of Big Data, Beijing, Cambridge, Farnham, Koln, Sebastopol, Tokyo: O'Reilly, 2012, p. 22.

[13] D. Smith, "Five principles for research ethics: Cover your bases with these ethical strategies," Monitor on Psychology, vol. 34, no. 1, p. 56, January 2003.

[14] W. D. Smith, "Encyclopedia Britannica," 2018. [Online]. Available: https://www.britannica.com/biography/Hippocrates#accordion-article-history. [Accessed 05 January 2019].

[15] Critical Thinker Academy, "What are Moral Values?," How to Build a Compelling Moral Argument, 2018. [Online]. Available: https://criticalthinkeracademy.com/courses/moral-arguments/lectures/659294. [Accessed 22 January 2019].

[16] K. Davis and D. Patterson, Ethics of Big Data, Beijing, Cambridge, Farnham, Koln, Sebastopol, Tokyo: O'Reilly, 2012, p. 57.

[17] "Google Code of Conduct," Alphabet Investor Relations, 31 July 2018. [Online]. Available: https://abc.xyz/investor/other/google-code-of-conduct.html. [Accessed 26 October 2019].

[18] B. Walsh, "Ian McEwan on His New Novel and Ethics in the Age of A.I.," May 20, 2019. [Online]. Available: https://onezero.medium.com/ian-mcewan-on-his-new-novel-and-ethics-in-the-age-of-a-i-f30ec47bac72. [Accessed 31 October 2019].

[19] M. J. Quinn, Ethics for the Information Age, 7th ed., Pearson Education, 2016, 544 pages.