Since the unveiling of OpenAI’s ChatGPT in late November 2022, the large language model (LLM) has been the source of endless speculation – experts have predicted that the technology will do everything from doing away with English classes in high schools to displacing white-collar workers to hijacking democracy. The program has caused consternation within the education sector in particular, as school systems across the country have banned its use, prompted by fears it will become a tool for widespread cheating. Given the complexity of the program, the concerns are understandable: This week, ChatGPT passed both the United States Medical Licensing Exam (USMLE) and a final exam in a Wharton School MBA course.
Given the program’s ability to pass a high-level course at a prestigious university, can it successfully earn a student admission to a school of similar caliber? Can ChatGPT create an admissions essay that stands out in the competitive world of Ivy League admissions?
Analysis of the program’s performance on the USMLE and the MBA exam provides some useful insight into what ChatGPT can and cannot do within the college admissions landscape. In both cases, perhaps the most remarkable aspect of the program’s performance was its ability to respond to case studies, drawing and explaining conclusions that did not come directly from its dataset inputs. LLMs, of which ChatGPT is perhaps the most advanced, use a huge amount of language input to generate and predict text based on probability. ChatGPT is particularly notable for its impressive ability to mimic the nuances of human speech and generate responses that are highly informed by context.
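The probability-driven principle at work here can be illustrated with a toy sketch. To be clear, this is not ChatGPT’s actual architecture (which is a transformer trained on vast corpora); it is a minimal bigram model, invented here for illustration, that picks each next word in proportion to how often it followed the previous one in its training text:

```python
import random
from collections import defaultdict

# Toy training text, hypothetical and tiny compared to a real LLM corpus.
corpus = "the program can generate text the program can predict text".split()

# Count how often each word follows each other word.
counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_word(word):
    """Sample the next word in proportion to observed frequency."""
    words, freqs = zip(*counts[word].items())
    return random.choices(words, weights=freqs)[0]

# Generate a short continuation starting from "the".
word, output = "the", ["the"]
for _ in range(5):
    word = next_word(word)
    output.append(word)
print(" ".join(output))
```

The model has no understanding of what it emits; it simply continues text in statistically plausible ways. That, at vastly greater scale and sophistication, is the distinction the rest of this piece turns on.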
Both the USMLE and the MBA exam contain nuanced scenario questions, often with difficult or misleading wording—in other words, questions and problems that seemingly require deductive reasoning. Despite the dense and tricky nature of the questions, ChatGPT was able to provide well-founded and largely correct answers. Christian Terwiesch, co-director of the Mack Institute for Innovation Management and the researcher who administered the MBA exam, writes that ChatGPT “does a great job on fundamental operational management and process analysis questions, including those based on case studies. Not only are the answers correct, but also the explanation is excellent.” Similarly, the researchers who analyzed the program’s results on the USMLE found that “AI-generated responses also provided significant insight, role modeling a deductive reasoning process valuable to human learners… There was at least one significant insight present in about 90% of the results. ChatGPT therefore has the partial ability to teach medicine by bringing up new and non-obvious concepts that may not be in the learner’s sphere of consciousness.” While previous iterations of LLMs drew answers from data entered directly into the programs, ChatGPT was able to draw on information adjacent to the information entered, giving the appearance of independent reasoning.
However, appearance is the critical term. While it is an incredible achievement that the program can respond to nuances in the presented scenarios – nuances we usually associate with human reasoning – using inferences to relate the scenarios to adjacent information in the dataset is not the same as human reasoning.
So what does this mean for the college admissions essay?
While the program is well equipped to pass even complex exams, it cannot substitute for innovation, creativity, and critical thinking—traits that students are expected to demonstrate in the college admissions essay. The most successful essays clearly convey a student’s unique personality. As TSTIME’s Emma Whitford has demonstrated, with clear instructions ChatGPT can produce a serviceable admissions essay, but the examples it generates lack the grit and imagination that stand out at top schools.
The University of Chicago’s “uncommon essay” prompts further illustrate ChatGPT’s inability to reason like a human. The directions require students to approach topics with ingenuity and originality. One of the most infamous past prompts simply asked students to “find x.” When I put this prompt directly into ChatGPT, the program gave a standard explanation: “x is a variable that can represent any number or mathematical expression. It is usually used in equations to represent the unknown quantity to solve for.”
Even with more specific instructions, the program failed to come up with a unique or truly creative answer. When I instructed the program to “write a creative essay for the ‘Find x’ prompt”, it generated this essay:
It was the summer of my senior year of high school and I was about to discover something extraordinary. I had spent the past few months studying math and physics, absorbing every detail with an eagerness that only a curious mind can appreciate. One day my teacher assigned us to “find x” – an equation that could unlock the answers to some complex questions.
At first I was overwhelmed by the task. I felt like I was looking for a needle in a haystack – and the stakes were high. I knew I had to find x if I wanted to fulfill my dream of becoming a successful scientist. With a determination that rivaled that of a lion, I set to work.
I read and reread my textbooks, pouring every equation and concept with a magnifying glass. I experimented with different approaches, encountered dead ends and wrong turns, but never gave up. I was determined to find x.
Finally, after weeks of hard work and dedication, I had a breakthrough. I had an epiphany – I had found x! It was an equation that had the potential to revolutionize the world of math and physics.
I was ecstatic. I was so proud of myself that I never gave up, even when the odds were against. I had done the impossible – I had found x!
This experience has taught me that perseverance, hard work and a little bit of luck can help you achieve anything. Even if you feel like you’re looking for a needle in a haystack, never give up – you might just find x.
While the prose reads like an excited high school student’s, the details of the essay itself leave a lot to be desired. In fact, ChatGPT completely bypasses the prompt by describing the method of finding x instead of actually finding x—or at least defining it.
Of course, a student’s essay, like ChatGPT’s output, is only as good as its inputs. Prestigious colleges increasingly pay attention to student activities and how those activities form a cohesive and compelling picture of who the student is and what they are passionate about. A great essay will tie all those activities together into a holistic narrative with enthusiasm and creativity, but no essay—no matter how well written—can artificially produce four years of meaningful engagement with one’s interests.
Understanding ChatGPT’s limitations is almost as important as recognizing its strengths. While the program has much to offer students in a variety of areas, it is not a substitute for a student’s own voice and reasoning. Advances in the world of artificial intelligence should push colleges to increasingly create essay prompts that encourage critical thinking and originality and continue to emphasize students’ activities and demonstrable interests as a primary consideration in the admissions process.