The Precariousness of Trusting AI in Professional Settings
Ben Taylor (he/him) // Crew Writer
Andrei Gueco (he/him) // Crew Illustrator
As LLMs (large language models) like ChatGPT continue to make their way into all sectors of our lives, a strange phenomenon known as AI hallucination has begun to occur. An AI hallucination, in this context, is when an LLM confidently presents false, often fabricated, information as fact. These mishaps can have major repercussions, especially when AI is used in situations with real-world consequences.
One such instance occurred in early 2024 at the B.C. Supreme Court. In a high-profile divorce case, lawyer Chong Ke cited two cases as evidence that her client should be able to take his children to China. The only problem: the two cases didn't exist. Ke, having turned to ChatGPT, had presented false evidence in court that, had it gone unnoticed, could have had a considerable impact on the lives of the children. This was not the only incident involving AI in the courtroom. A year earlier, New York-based lawyer Steven Schwartz was sanctioned for submitting hallucinated case citations in a personal injury suit. While the hallucinations were caught in both cases, the potential for catastrophe is apparent.
An article from MIT's AI research hub sheds some light on the causes of AI hallucinations. The biggest reason AI can invent false data and present it as real lies in where generative AI gets its data in the first place: the internet. Because these systems are trained on the internet, the data they draw from is not always accurate, hence the hallucinations. Another reason is the way an LLM arrives at an answer to a user's question. LLMs work like a hyperintelligent autocomplete, essentially using a massive pool of information to predict a reasonable answer, not necessarily a truthful one. Moreover, researchers at OpenAI suggest that "guessing when uncertain improves test performance" and that LLMs tend to hallucinate because they are designed to be good test takers. This means the use of LLMs in critical situations poses a serious risk, yet many professionals are still putting their trust in ChatGPT. In any context, professional or otherwise, the use of potentially made-up data is irresponsible, but unless new legislation is introduced, the expedience of LLMs will likely prove too attractive to dissuade their use.
Lawyers are not the only professionals being tempted by the precarious convenience of AI; doctors have also begun to use a new AI tool called Whisper. Developed by OpenAI, the creators of ChatGPT, Whisper is automatic speech recognition software used to create transcripts of doctor-patient interactions. However, according to an article from PBS, Whisper has begun hallucinating text, raising the possibility of misdiagnosis among various other complications. Hospitals are adopting the tool to increase efficiency, but the risk it presents is evident: basing medical care on hallucinated doctor-patient interactions is a recipe for disaster. OpenAI has stated that it recommends manual verification of transcripts in "high risk domains," but with a lack of regulation, it is doubtful hospitals will forgo the efficiency and ease that Whisper provides.
AI simply isn't ready to be used in situations with stakes as high as a courtroom or a hospital, since human analysis is still required to separate fact from fiction. However, as these programs continue to evolve, it becomes difficult to imagine a world where AI isn't used in most professional settings, regardless of the risk it presents.