To find papers whose authors likely used an LLM but forgot to delete the pleasantries the chatbot produced at the beginning of its response, copy the next line (including the double quotes) into the Google Scholar search box, click "search," and limit the date range to "since 2023":
"certainly, here is" -chatgpt -llm
There are two different issues with the reproducibility of AI tools in research. First, the output of a generative AI tool is not reproducible: the same prompt can produce different responses each time. This can be a problem if you are using chatbots or image generators to help with writing or figures. Second, for researchers who are building generative AI tools, the tools themselves are sometimes not reproducible, and other researchers cannot replicate their results.
One common type of AI research in biomedicine is building AI tools to diagnose disease. These proposed models usually examine medical images, like X-rays or MRI scans, for indications of diseases such as cancer, COVID, or dental caries. However, many of these studies are not reproducible. In one case, an AI built to detect COVID in lung X-rays would even "detect" COVID in images of random noise.
Part of the reason these studies are not reproducible is that researchers do not share their source code or training data. Access to both is necessary for reproducibility: a tool that works well in one environment may perform differently in another, and other researchers cannot tell whether that is happening without being able to see how the tool was built.
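To make this concrete, here is a rough sketch (in Python, with a hypothetical model-loading step; none of this code comes from the studies described above) of the kind of sanity check an independent researcher could only run if the original code and trained model were shared: feed the classifier images of pure random noise and see how often it still "detects" disease.

import numpy as np

def noise_sanity_check(model, n_images=100, image_shape=(224, 224, 1), threshold=0.5, seed=0):
    """Return the fraction of pure-noise images the model flags as disease-positive."""
    rng = np.random.default_rng(seed)  # fixed seed so the check itself is reproducible
    noise = rng.random((n_images, *image_shape), dtype=np.float32)  # batch of random-noise "images"
    scores = np.asarray(model.predict(noise)).ravel()  # assumed interface: one probability per image
    return float((scores >= threshold).mean())

# Hypothetical usage, assuming the authors had shared their trained model:
# model = load_trained_model("covid_xray_classifier")  # stand-in for whatever loading code they publish
# print(f"{noise_sanity_check(model):.0%} of noise images were flagged as COVID-positive")
# A well-behaved classifier should flag close to 0%; the model described above did not.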
Sharing both data and software is a crucial step of research data management (RDM). For RDM help, contact rdm@tufts.edu.
Publishers have various policies when it comes to using generative AI to help produce papers. Before you use an LLM or image generator to help you with a paper, look up the policies of any journal you plan to send the manuscript to - try Googling "[journal name] policy on generative AI," or ask your liaison librarian for more help. Usually, journal policies fall into one of two camps: AI use is either prohibited outright or allowed only if it is clearly disclosed in the manuscript.
Generally, a generative AI tool cannot be named as an author on a research paper.
If you do use AI tools for your research paper, you must check every output for accuracy. Even papers that pass peer review can be retracted later due to authors misusing AI to create text or figures, or even to fabricate data. You also don't want to end up on Retraction Watch's list of articles with evidence of ChatGPT writing.
Some publishers, but not all, also have policies for using generative AI in peer review. Whether a journal has a policy or not, it is generally recommended that you never use generative AI tools to assist with peer review. Many AI tools use your prompts to train their models, and authors do not give permission for their work to be used this way when they submit to journals. Also, remember that anything you enter into an AI tool is generally not securely stored! Would you want your draft manuscripts widely shared around the Internet?
In addition, the comments that tools like ChatGPT create for peer review just aren't very good. If you're a peer reviewer, think about why you are doing it - if the point of peer review is to share robust, helpful comments to improve the scientific literature, LLM reviews aren't doing that.
(That being said, some AI tools can be useful for reviewing your own work before you send it to journals, as long as you understand that any work you input into the tool may be used by the company that owns it.)
Because generative AI tools sometimes regurgitate copyrighted material, using generative AI can set you up for accidental plagiarism, or accusations of plagiarism. There is a deeper issue at play, though: AI tools cannot cite their sources. They can give you information, but they do not know where that information comes from. You can ask them for sources, but the citations they provide (when they are real at all) are not guaranteed to be the actual source of the information, or even related to anything the AI has told you.
Using AI tools for even cursory research can make it impossible to cite the original sources of our ideas (even citing the chatbot won't fix this - the ideas did not originate with the AI, but with its training material). One of the most important reasons we cite sources in academic literature (including classwork!) is to acknowledge the work of others who came before us. When we use AI chatbots instead, we break that chain of knowledge, and acknowledging other scholars becomes much more difficult. Failing to credit others for their work, even when you found the ideas in an AI chatbot and cite that bot, still constitutes plagiarism.
Sensitive Personal Identifiable Information (PII), including HIPAA-, FERPA-, and GDPR-protected data, should NEVER be input into any AI tool unless you know that it is a secure in-house system.
In general, text that is entered into an AI system is retained and may be analyzed by other humans as part of quality control, or fed back into the model to train future iterations. If you would not want another person reading what you write, do not put it in a chatbot.
The training data behind these models is not kept secure either; some chatbots can be jailbroken into revealing portions of the data they were trained on, which could include sensitive PII.
The use of AI tools in academic writing (including class assignments!) should always be properly declared and cited. Citation manuals differ on whether the AI is treated as an author and whether the text of the prompt should be included in the citation. Keep in mind that citing chatbot use is not sufficient to prevent accusations of plagiarism.