Generative AI tools have a real environmental impact: because these tools require massive amounts of computing power to train and run, their parent companies consume huge amounts of energy and fresh water. This also means that every individual prompt has a real cost in energy, water, materials, and money. Google recently released an environmental report with details on how they offset these concerns - keep in mind this applies ONLY to Google's AI products, which include products that don't use generative AI.
There is also a human impact that disproportionately affects the global south. For example, some of the training for OpenAI's ChatGPT was outsourced to Kenyan workers paid sub-minimum wages. OpenAI has also recently requested $7 trillion in funding to increase the production of GPUs used for its AI tools, an amount of money that could have major impacts on the global economy.
One of the major potential positive impacts of AI tools is to level the playing field for people of various abilities and skill levels. For example, ChatGPT does a great job translating text, making text written by non-native English speakers flow better, and OCRing text from images. The disability community (this article is written by a Tufts professor!) is optimistic about the potential for generative AI tools to make some tasks easier for some people. However, tools are not accessible by default and must be built mindfully to be accessible for all - it's not uncommon for chatbots to be inaccessible to screen readers, for example.
Shared under a Creative Commons Attribution 4.0 International license.
Sensitive Personal Identifiable Information (PII), including HIPAA-, FERPA-, and GDPR-protected data, should NEVER be input into any AI tool unless you know that it is a secure in-house system.
In general, text that is input into an AI system is retained and may be analyzed by other humans as part of a quality control check, or fed back into the AI model to train future iterations of it. If you would not want another person reading what you write, do not put it in a chatbot.
The data used to train an AI model is also not kept secure; it is possible to jailbreak some chatbots and force them to reveal portions of their training data, which could include sensitive PII.
In the United States, AI-generated materials cannot be copyrighted, because a human creator is required for material to hold copyright. When, as is often the case, humans and generative AI tools collaborate to create materials, this can lead to situations where some parts of a work are copyrighted and some are not. For example, the comic book Zarya of the Dawn was written by humans, but contains AI-generated art. In this case, the writing is covered by copyright, but the art is not.
Fair use refers to a doctrine in the United States that allows for the use of copyrighted materials without permission in some circumstances. It still isn't clear whether the creation and use of generative AI tools falls under fair use. One common standpoint, which is the official stance of the American Library Association, is that using copyrighted materials to train generative AI models is fair use, but that the models outputting copyrighted material is not.
Unfortunately, because of how the current generation of generative AI models has been trained, it may be very difficult or impossible to keep them from regurgitating copyrighted materials from their training data. In fact, larger, more complex models are more likely to repeat their training data verbatim than smaller, less complex models. Image generators in particular are notorious for creating images of copyrighted characters and likenesses, but this has been an issue with text-generating tools, too. The New York Times has sued OpenAI over breach of copyright, as have groups of both novelists and non-fiction writers.
Not all creative works are copyrighted - many are in the public domain. Works can be in the public domain because their copyright was waived, because they weren't copyrightable, or because a certain amount of time has passed since their creation. Public domain works are free to copy, share, and remix in a way that copyrighted materials aren't - this is why a website can share the entire text of Moby Dick with no legal repercussions. Generative AI tools do not seem to understand which works in their training corpora are copyrighted vs. public domain. This means that they sometimes refuse to answer questions on copyright grounds, even when the work in question isn't copyrighted!
Hallucinations, or false or misleading information produced by generative AI tools, continue to be a problem that plagues chatbots and other AI tools. Hallucinations may be an unavoidable consequence of how generative AI is programmed, and, as of this writing, there is no such thing as a "hallucination-free" AI tool. For more information about hallucinations and how to reduce them, check out the Hallucinations page on this guide.
Because generative AI tools sometimes regurgitate copyrighted material, using generative AI can set you up for accidental plagiarism, or accusations of plagiarism. There is a deeper issue at play, though - AI tools cannot cite their sources. They can give you information, but they do not know where that information comes from. You can ask them to give you sources, but the citations they provide (when they are real) are never guaranteed to be the actual source of the information, or even related to any information the AI has provided.
Using AI tools for even cursory research can make it impossible to cite the original sources of our ideas (even citing the chatbot won't fix this - the ideas did not originate from the AI, but from its training material). One of the most important reasons we cite sources in academic literature (including classwork!) is to acknowledge the work of others who came before us. When we use AI chatbots instead, we break the chain of knowledge, and the work of acknowledging other scholars becomes much more difficult. Failure to credit others for their work, even when you found the ideas in an AI chatbot and cite that bot, still constitutes plagiarism.
Generative AI is mostly trained on data found online, including text-rich websites like Wikipedia and Reddit, as well as sites that host images and video, like DeviantArt and YouTube. This raises a host of ethical issues, including issues of copyright, consent, and attribution. It also leads to tools that are biased in the same ways as these websites. Wikipedia, for example, has a strong Western bias, with the majority of its editors being men from the US and Europe. AI tools tend to amplify patterns in their training data, so any biases in the training pool will be amplified in the AI's outputs.
Because of the various issues with using AI tools, including the risk of plagiarism, bias, hallucinations, and copyright snafus, it is important to acknowledge when you use AI tools to create, supplement, or modify your work. When creating a statement to declare the use of an AI tool, consider including the following information: