Data Science Ethics: Case Studies

Sami Wurm
16 min read · Feb 23, 2021

CSCI 240

Case Studies

Case Study One

Question 1.1: What ethically significant harms, as defined in Part One, might Fred and Tamara have suffered as a result of their loan denial? (Make your answers as full as possible; identify as many kinds of possible harm done to their significant life interests as you can think of).

As described in Part One, I believe that Tamara or Fred might say that the algorithm “[left] my body physically intact but my reputation, savings, or liberty destroyed.” I believe that their loan denial stripped them of their autonomy, first and foremost. They had excellent credit, background, set-up, etc… to make their dreams come true, yet all of the work they put in was fruitless because the algorithm cast it aside. Furthermore, the algorithm put their privacy and security at risk by pulling countless invasive facts from their lives that they had no knowledge of and did not consent to sharing. This also raises an ethical question of transparency and fairness, as no one working with Tamara and Fred could explain what the algorithm was considering in its decision-making or whether its considerations were fair.

Question 1.2: What sort of ethically significant benefits, as defined in Part One, could come from banks using a big-data driven system to evaluate loan applications?

In using big-data driven systems, banks could potentially increase economic efficiency, personalization and predictive accuracy, and human understanding. By linking together a multitude of qualities and traits of people who are likely to pay off a loan, these systems could uncover new patterns and connections that we were previously unaware of. Furthermore, by creating an ‘accurate algorithm’ that predicts who is likely to pay off a loan, where a loan is likely to be paid off, etc., banks can save time and money by avoiding loans that will not be repaid. Finally, thanks to personalization, these systems will benefit applicants who fit the profile of people the algorithm believes are likely to pay off a loan, since its predictive accuracy is geared toward them.

Question 1.3: Beyond the impacts on Fred and Tamara’s lives, what broader harms to society could result from the widespread use of this particular loan evaluation process?

Beyond the impacts on Fred and Tamara, these systems could cause widespread harm to society, such as class immobility, systemic racism, systemic sexism, and ableism. If these systems strictly adhere to rules that mark people with a certain racial background/sex/disability/medical history/class as ‘high-risk,’ then these groups will ultimately be rendered immobile in society, unable to break their glass ceilings and gain money/status despite their unique histories and qualities.

Question 1.4: Could the harms you listed in 1.1 and 1.3 have been anticipated by the loan officer, the bank’s managers, and/or the software system’s designers and marketers? Should they have been anticipated, and why or why not?

I believe that these harms could definitely have been anticipated by the software system’s designers and marketers. I don’t know that the loan officers or bank managers knew what was going on, but I believe that they should have. I also feel that these issues are avoidable by simply not having the algorithm view unrelated characteristics like race/sex/medical history. An algorithm cannot hold biases based on information it is never given, and simply put, I don’t see the relevance of these facts, outside of an economic profile, to a decision to give out a loan.

Question 1.5: What measures could the loan officer, the bank’s managers, or the employees of the software company have taken to lessen or prevent those harms?

Other than changing the information that the system has access to, the people involved in this loan deal could have ameliorated the situation by implementing/demanding more transparency in the system. This way, after a decision was made, they could see what went into the algorithm’s decision-making process and course-correct or tweak the algorithm if it behaved arbitrarily or problematically.
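As one illustration of what that kind of transparency could look like, here is a minimal sketch of inspecting which features pushed a single application toward denial. This is not the system from the case study; the dataset, feature names, and logistic-regression model are all stand-ins I invented for the example.

```python
# Minimal sketch: explain one loan decision by breaking a linear model's score
# into per-feature contributions relative to an average applicant.
# All data and feature names below are hypothetical stand-ins.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 500
X = pd.DataFrame({
    "credit_score": rng.normal(680, 50, n),
    "income_thousands": rng.normal(60, 15, n),
    "debt_to_income": rng.uniform(0.1, 0.6, n),
})
# Hypothetical repayment outcomes loosely tied to the features.
y = ((X["credit_score"] - 680) / 50 + (X["income_thousands"] - 60) / 15
     - 2 * (X["debt_to_income"] - 0.35) + rng.normal(0, 0.5, n) > 0).astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)

# For one applicant, show how each feature moved the score away from the
# average applicant -- the kind of explanation a loan officer could review.
applicant = X.iloc[0]
contributions = (model.coef_[0] * (applicant - X.mean())).sort_values()
print(contributions)
print("approval probability:", model.predict_proba(X.iloc[[0]])[0, 1])
```

Even a simple breakdown like this would let the bank see whether the decision turned on something arbitrary or problematic.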

Case Study Two

Question 2.1: Of the eight types of ethical challenges for data practitioners that we listed in Part Two, which two types are most relevant to the Facebook emotional contagion study? Briefly explain your answer.

  1. ETHICAL CHALLENGES IN APPROPRIATE DATA COLLECTION AND USE and
  2. UNDERSTANDING PERSONAL, SOCIAL, AND BUSINESS IMPACTS OF DATA PRACTICE

I believe that these two types of ethical challenges for data practitioners are most relevant to the FB emotional contagion study because 1. FB did not make users aware of, or obtain their consent to, participating in the study; they relied on vague wording in a user agreement that is rarely read. And 2. FB did not consider or understand the social impacts of their study on individuals who are minors or who suffer from mental illness/predisposition to emotional distress.

Question 2.2: Were Facebook’s users justified and reasonable in reacting negatively to the news of the study? Was the study ethical? Why or why not?

YES! The study was unethical. The study caused mass harm to users’ health that is immeasurable and cannot be taken back or ameliorated. The users did not consent and have no clue how or why their data is being shared, or how they can learn more about or retain proprietary control over their own information. FB had no team or individual ready to take responsibility for the harm that their unethical behavior caused and instead contested it. Furthermore, the study caused mass controversy and users felt taken advantage of, which is a sign of unethical abuse in itself. The study stripped users of autonomy over their emotions and that, to me, is unforgivable.

Question 2.3: To what extent should those involved in the Facebook study have anticipated that the study might be ethically controversial, causing a flood of damaging media coverage and angry public commentary? If the negative reaction should have been anticipated by Facebook researchers and management, why do you think it wasn’t?

I feel that they definitely should have anticipated the study would be controversial. I am honestly shocked that they did not. I think that it wasn’t anticipated because of their lack of HUMAN ACCOUNTABILITY IN DATA PRACTICES AND SYSTEMS. They clearly have no team specifically designated to deal with ethical issues and think about the different ways that their data will affect various populations. Thus, they may have fallen victim to the “problem of many hands” where everyone was contributing to different parts of this study, but no one was considering the overall effects.

Question 2.4: Describe 2 or 3 things Facebook could have done differently, to acquire the benefits of the study in a less harmful, less reputationally damaging, and more ethical way.

  1. Be transparent. Ask users to participate first. Offer some form of compensation/appreciation for participating.
  2. Provide accessible information about how/why/what information is being taken, used, and shared from users.
  3. Reflect on the ethical/long-term harm that may be done to participants. Reflect on what problem the answers from this data will actually solve. Ask: Hm, why are we actually doing this? Is it worth it?

Question 2.5: Who is morally accountable for any harms caused by the study? Within a large organization like Facebook, how should responsibility for preventing unethical data conduct be distributed, and why might that be a challenge to figure out?

That is a challenge to figure out because SO many people contribute to the problem and are complicit in the development/production of a study like this. However, I believe that Zuckerberg should be held accountable. With a huge study like this, which could (and did) result in public backlash toward the entire company, Zuckerberg should have caught it, if only for his own selfish purposes, and put a stop to it. However, that is only under their current working model. In an ideal world, there would be an ETHICS TEAM in charge of this who would take responsibility for ethical failures like this one. Any financial repercussions should be taken out of the company’s profit.

Case Study Three

Question 2.6: Of the eight types of ethical challenges for data practitioners that we listed in Part Two, which types are most relevant to the word embedding study? Briefly explain your answer.

  1. VALIDATION AND TESTING OF DATA MODELS & ANALYTICS and
  2. IDENTIFYING AND ADDRESSING ETHICALLY HARMFUL DATA BIAS

In the word embedding study, the developers did not consider the ethical harms that could be caused by inadequate validation and testing of their model before deploying it for widespread use, and they did not consider the consequences of applying it toward unjust outcomes. Furthermore, they failed to acknowledge at all the unjust human biases that contributed to their model, and they did not distinguish between the forms of bias that they did and did not want to influence their study. Overall, they did not consider how their tool could perpetuate and exacerbate harmful human biases.

Question 2.7: What ethical concerns should data practitioners have when relying on word embedding tools in natural language processing tasks and other big data applications? To say it in another way, what ethical questions should such practitioners ask themselves when using such tools?

What stereotypes and linguistic biases should be accounted for?

Are there differences in dialects/location of use that we should consider?

Are we only representing a certain economic/educational class in our word vectors?

Are we only representing a certain race/religion/ethnicity/nationality in our word vectors?

Is our language team diverse?

Question 2.8: Some researchers have designed ‘debiasing techniques’ to address the solution to the problem of biased word embeddings. (Bolukbasi 2016) Such techniques quantify the harmful biases, and then use algorithms to reduce or cancel out the harmful biases that would otherwise appear and be amplified by the word embeddings. Can you think of any significant trade offs or risks of this solution? Can you suggest any other possible solutions or ways to reduce the ethical harms of such biases?

This approach is very subjective. The algorithms that the researchers use to quantify the harm of biases could, themselves, be biased, and then only the biases that those researchers are worried about are taken into account. Instead, I would suggest excluding biased sources from the development of the word tools and perhaps training them on more basic language tools. Or, could we exclude race and sex from the algorithm? I’m not sure of the perfect answer. Or of the necessity of this tool in our world at this moment.
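To make the idea concrete, below is a minimal numpy sketch of the projection-style ‘debiasing’ the question describes (in the spirit of Bolukbasi et al. 2016): estimate a gender direction, measure a word’s component along it, then subtract that component out. The tiny 4-dimensional vectors are made-up stand-ins, not real embeddings, and choosing which direction counts as ‘harmful’ is exactly the subjective step noted above.

```python
# Minimal numpy sketch of projection-based debiasing: estimate a gender
# direction, measure a word's component along it, then subtract it out.
# The tiny 4-d vectors below are made-up stand-ins, not real embeddings.
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

he = np.array([0.8, 0.1, 0.3, 0.0])
she = np.array([-0.8, 0.1, 0.3, 0.0])
engineer = np.array([0.5, 0.7, 0.2, 0.1])  # leans toward 'he' in this toy space

# 1. Estimate a gender direction from a definitional pair.
gender_dir = (he - she) / np.linalg.norm(he - she)

# 2. Quantify bias as the cosine between a 'neutral' word and that direction.
print("bias before:", cosine(engineer, gender_dir))

# 3. Neutralize: remove the component of the word along the gender direction.
engineer_debiased = engineer - (engineer @ gender_dir) * gender_dir
print("bias after:", cosine(engineer_debiased, gender_dir))  # ~0
```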

Question 2.9: Identify four different uses/applications of data in which racial or gender biases in word embeddings might cause significant ethical harms, then briefly describe the specific harms that might be caused in each of the four applications, and who they might affect.

  1. If women are not associated with job/salary terms, then an algorithm could exclude a woman from the competitive market for a job and strip her of her career. This perpetuates systemic sexism and stereotypes of women not working.
  2. If certain races are not associated with certain tax brackets/neighborhoods, then individuals could be turned down for housing in certain areas or for certain real estate. This would perpetuate systemic racism and redlining.
  3. If a tool expects different, traditionally EXTRA-nice wording from women, it could rate customer interactions with women as worse when they speak the same way a man would, labeling her ‘aggressive’ or ‘bitchy’ where it would label a man ‘a leader’ or ‘confident.’ This prevents women from standing up for themselves and acting in their best interest (a sketch of a simple swap test for this kind of harm follows this list).
  4. If a tool assumes individuals of a certain race like specific, biased ad content, then those individuals will be pigeonholed into the content they consume and unable to discover and pursue all of their possible interests. This prevents our society and its members from growing and progressing.
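Here is a minimal sketch of one way to audit for the harm in item 3: a ‘swap test’ that checks whether a scoring model rates otherwise identical sentences differently when only the gendered words change. The score_interaction function is a made-up placeholder standing in for whatever real model would be under audit.

```python
# Minimal 'swap test' sketch: score pairs of sentences that differ only in
# gendered words and flag any gap. `score_interaction` is a placeholder for
# the real model under audit (e.g., a customer-interaction rating system).
def score_interaction(text: str) -> float:
    # Stand-in scorer; in practice this would call the deployed model.
    return 1.0 - 0.1 * text.lower().count("demand")

TEMPLATES = [
    "{p} asked the team to move the deadline.",
    "{p} said the policy was unacceptable and demanded a fix.",
]

for template in TEMPLATES:
    scores = {p: score_interaction(template.format(p=p)) for p in ("He", "She")}
    gap = abs(scores["He"] - scores["She"])
    print(f"He={scores['He']:.2f}  She={scores['She']:.2f}  gap={gap:.2f}  {template}")
    # A nonzero gap means the model reacts to gender alone -- a red flag
    # worth investigating before deployment.
```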

Question 2.10: Bias appears not only in language datasets but in image data. In 2016, a site called beauty.ai, supported by Microsoft, Nvidia and other sponsors, launched an online ‘beauty contest’ which solicited approximately 6000 selfies from 100 countries around the world. Of the entrants, 75% were of white and of European descent. Contestants were judged on factors such as facial symmetry, lack of blemishes and wrinkles, and how young the subjects looked for their age group. But of the 44 winners picked by a ‘robot jury’ (i.e., by beauty-detecting algorithms trained by data scientists), only 2% (1 winner) had dark skin, leading to media stories about the ‘racist’ algorithms driving the contest. How might the bias have gotten into the algorithms built to judge the contest, if we assume that the data scientists did not intend a racist outcome?

Historical depictions of beauty are white. Our society depicts important figures, paintings, and prophets (Jesus) as white. Even if the developers did not intend for the algorithm to behave this way, the human bias that lighter skin is more beautiful was bound to be embedded in the algorithm if not consciously corrected for. Furthermore, perhaps some of the ‘objective’ beautiful features that we talk about, such as symmetry, a young-looking face, etc…, are more often white features. I would like to know how they decided upon the standards against which participants were judged.

Case Study Four

Question 3.3: What specific, significant harms to members of the public did the researchers’ actions risk? List as many types of harm as you can think of.

The researchers’ actions risked many harms including but not limited to:

  1. Outing those in the LGBTQIA+ community and potentially putting them in danger
  2. Exposing personal opinions such as political beliefs to one’s family/community where it may not be safe for that to be known
  3. Exposing people’s religious beliefs and affiliations, putting them at risk of targeted crimes
  4. Individuals’ locations and ages being easily tracked by potential predators
  5. Exposing people’s personal opinions/histories in ways that could hurt their chances of being employed
  6. Exposing people’s drug use, putting their job prospects at risk or making them targets
  7. Exposing people for cheating
  8. Exposing people for lying about educational/life background
  9. Violation of NDA info?
  10. Violation of users’ trust in OkCupid as a platform

Question 3.4: How should those potential harms have been evaluated alongside the prospective benefits of the research claimed by the study’s authors? Could the benefits hoped for by the authors have been significant enough to justify the risks of harm you identified above in 3.3?

I believe that the study’s authors had selfish intentions. The purpose of this study, to analyze “the relationship of cognitive ability to religious beliefs and political interest/participation,” could have been pursued in a tremendous variety of ways, most of which do not require using members of a dating app. Individuals who disclose information about themselves on a dating app clearly do not expect that their vulnerable answers (meant for potential love interests) will be consumed at a large scale by others not in the dating pool. Furthermore, none of the participants were notified of their participation in the study. As mentioned above, there is a huge list of potential, quite dangerous harms that could be posed to the victims of this study, and those harms should have been weighed carefully because the participants are the main stakeholders. Especially if the study was meant to focus on the cognitive-religious-political relationship among OkCupid users: they are the main stakeholders in a study built on their data and about them, so their needs should have been taken into account. The purpose of this study seems to serve the interests of the researchers rather than any actual benefit to or change in society, so I don’t believe the hoped-for benefits could possibly outweigh the risks of harm.

Question 3.5: List the various stakeholders involved in the OkCupid case, and for each type of stakeholder you listed, identify what was at stake for them in this episode. Be sure your list is as complete as you can make it, including all possible affected stakeholders.

  1. Researcher: standing in scientific community, grades, personal interest was at stake
  2. Research team: membership in team, job, grades at stake
  3. OkCupid: community trust, community guidelines, public image, respectability at stake
  4. Users: safety, privacy, wellbeing, mental health, jobs, grades, autonomy at stake
  5. Non-Users (the rest of the public): access to this knowledge/interest at stake

Question 3.6: The researchers’ actions potentially affected tens of thousands of people. Would the members of the public whose data were exposed by the researchers be justified in feeling abused, violated, or otherwise unethically treated by the study’s authors, even though they have never had a personal interaction with the authors? If those feelings are justified, does this show that the study’s authors had an ethical obligation to those members of the public that they failed to respect?

Yes! Absolutely. These feelings are justified because the information disclosed to OkCupid clearly makes these users identifiable even when their names are not in the dataset. They were publicly exposed and stripped of their privacy and of their autonomy to choose what parts of themselves to share with the world. They also had no knowledge of, or say in, the study and were not compensated at all. The authors failed to respect these users’ human dignity and life interests, thereby failing their ethical obligation to the public.

Question 3.7: The lead author repeatedly defended the study on the grounds that the data was technically public (since it was made accessible by the data subjects to other OkCupid users). The author’s implication here is that no individual OkCupid user could have reasonably objected to their data being viewed by any other individual OkCupid user, so, the authors might argue, how could they reasonably object to what the authors did with it? How would you evaluate that argument? Does it make an ethical difference that the authors accessed the data in a very different way, to a far greater extent, with highly specialized tools, and for a very different purpose than an ‘ordinary’ OkCupid user?

I am sure that the OkCupid user agreement does not state that you are forfeiting your information for public use. Submitting personal facts to a group within society (one where you can share things that you may not share in your family/work/day-to-day environment) is categorically different from sharing personal facts openly on the internet with everyone you know. This is especially true for the information you choose to share with individuals you hope to have a romantic/sexual relationship with: that is obviously a different level of intimacy than the parts of yourself that you share openly with everyone. Each group and individual in society is impacted by our information in different ways. Consciously or subconsciously, people take that into account when they choose what to share on a dating app versus on their Facebook timeline or on their LinkedIn. I think that this is an absurdly weak, naive argument, and frankly, that it makes this head researcher look like an asshole.

Question 3.8: The authors clearly did anticipate some criticism of their conduct as unethical, and indeed they received an overwhelming amount of public criticism, quickly and widely. How meaningful is that public criticism? To what extent are big data practitioners answerable to the public for their conduct, or can data practitioners justifiably ignore the public’s critical response to what they do? Explain your answer.

That criticism is extremely meaningful and informs future researchers about the boundaries of acceptable behavior toward the public. If there is some scenario in which big data practitioners privately use public information for the explicit benefit of society, and in which they do not harm participants or need to ask for permission to use their info, then I think they can ignore criticism of the ethics of using big data (but not criticism of the implications or validity of their data if it is arbitrary or problematic). However, if practitioners are wronging or harming individuals in their practice, then, just like any doctor/lawyer/artist violating a code of conduct, they should have to answer for themselves in a court of law.

Question 3.9: As a follow up to Question 3.7, how meaningful is it that much of the criticism of the researchers’ conduct came from a range of well-established data professionals and researchers, including members of professional societies for social science research, the profession to which the study’s authors presumably aspired? How should a data practitioner want to be judged by his or her peers or prospective professional colleagues? Should the evaluation of our conduct by our professional peers and colleagues hold special sway over us, and if so, why?

I think that this is very meaningful! It is hard to receive this type of criticism from participants or non-technical public observers because they largely won’t be aware of the study or its implications. A data practitioner should strive to live up to the ethical standards of the field and of those who know very well what it means to work ethically. The critique of informed, diverse colleagues who work in our field or in the field of ethics should certainly sway us and push us to be better, just as we should strive to be the type of practitioner who pushes others to be better.

Question 3.10: A Danish programmer, Oliver Nordbjerg, specifically designed the data scraping software for the study, though he was not a co-author of the study himself. What ethical obligations did he have in the case? Should he have agreed to design a tool for this study? To what extent, if any, does he share in the ethical responsibility for any harms to the public that resulted?

If he knew how his software was intended to be used, he is just as responsible for its use as the authors of the study. A talented engineer who can create software like this should surely be able to build other software, or find employment elsewhere, where he is not putting the users of his software in harm’s way.

Question 3.11 How do you think the OkCupid study likely impacted the reputations and professional prospects of the researchers, and of the designer of the scraping software?

I would hope that this study negatively affected the professional prospects and reputations of the researchers and of the designer of this scraping software. I certainly would not want them on my design or working team if they felt good about this project, unless they publicly exhibited regret, growth, and active change. I don’t know if, realistically, they did, but I hope that in the future people like them do face professional repercussions.

Case Study Five

Question 5.5: Identify the 5 most significant ethical issues/questions raised by this study.

  1. Not considering the human lives and interests behind the data: They created an algorithm that posed a threat to the individuals whose pictures were used. The stakeholders stood to gain nothing and lose a lot.
  2. Lack of Focus on Downstream Risks and Uses of Data/Missing the Forest for the Trees: They did not consider how evil regimes/individuals could use the algorithm or dataset harmfully in the future to out LGBTQIA+ individuals.
  3. Did not Invite Diverse Stakeholder Input: They had a very biased data set, excluding huge populations of individuals (100% white).
  4. Did not Promote Values of Transparency, Autonomy, and Trustworthiness: They would not release their algorithm.
  5. Did not Establish Chains of Ethical Responsibility and Accountability: They fought claims and failed to acknowledge problematic aspects of the study rather than taking responsibility.

Question 5.6: Identify 3 ethical best practices listed in Part Five that seem to you to be closely related to the issues you identified in Q5.5, and to their potential remedies.

  1. Not considering the human lives and interests behind the data: They should have reevaluated the conditional good of big data and assessed whether it was really for the good of society for their algorithm to be created. Do we really need software that guesses people’s sexual preferences based on their pictures? Why?
  2. Lack of Focus on Downstream Risks and Uses of Data/Missing the Forest for the Trees: In not assessing the downstream risks of their dataset/idea, they put many at risk. Again, they should have evaluated whether their algorithm needed to be created. If they still decided to go forward, they should have released a statement very explicitly making it clear who/what the algorithm was for, whom it was to be used by, and how effective it was.
  3. Did not Invite Diverse Stakeholder Input: They undoubtedly should have included diverse racial/gender identifying individuals in their dataset. This was a gross oversight.
