A new study from MIT and other researchers suggests that using ChatGPT to help write essays may impair memory, reduce engagement, and erode a student’s sense of ownership over their work.
The researchers conducted a months-long study involving 54 students from five universities in the Boston area. Participants were randomly assigned to one of three groups: those who used ChatGPT (LLM group), those who used traditional web search (Search group), and those who relied only on their own knowledge (Brain-only group) to write essays.
“In this study, we demonstrate the pressing matter of a likely decrease in learning skills based on the results of our study,” the authors said. “The use of LLM had a measurable impact on participants, and while the benefits were initially apparent … the LLM group’s participants performed worse than their counterparts in the Brain-only group at all levels: neural, linguistic, scoring.”
But Hidenori Tanaka, lead scientist for NTT’s physics of AI research group, said the MIT study doesn’t necessarily signal a decline in human intelligence.
Take the example of using generative AI for writing. “We are evolving to reallocate brain power from tasks that may no longer be required in the future,” Tanaka contended. “AI is catalyzing a cognitive evolution, prompting the brain to rewire itself and reallocate mental resources in response to new demands. That does not necessarily mean cognitive decline is occurring.”
Tanaka said humans have historically offloaded mental tasks after adopting new technologies. A case in point is GPS: people have stopped memorizing driving routes and abandoned paper maps. “The ‘freed up’ space in the brain that is used for distractions or passive consumption can be used for more creative and strategic thinking – and that would be a net positive,” he said.
Wharton professor Ethan Mollick pointed to other research findings that run contrary to the MIT study. “Does that mean that AI always hurts learning? Not at all! While it is still early, we have increasing evidence that, when used with teacher guidance and good prompting based on sound pedagogical principles, AI can greatly improve learning outcomes,” he wrote in his Substack newsletter, One Useful Thing.
Mollick pointed to the following studies as evidence:
- A randomized, controlled World Bank study showed that using a GPT-4 tutor with teacher guidance in a six-week after-school program in Nigeria had “more than twice the effect of some of the most effective interventions in education” at very low cost.
- A Harvard experiment in a large physics class found a well-prompted AI tutor outperformed active classes in learning outcomes.
- A study done in a programming class at Stanford found use of ChatGPT led to higher exam grades.
- A Malaysian study found AI used in conjunction with teacher guidance and solid pedagogy led to more learning.
“Ultimately, it is how you use AI, rather than use of AI at all, that determines whether it helps or hurts your brain when learning,” Mollick concluded.
LLM use changes neural patterns
In the paper, the researchers measured brain activity using EEG headsets during essay writing to understand how these tools affect cognition. They also used natural language processing (NLP) tools to analyze the essays and interviewed participants afterward. A fourth, follow-up session swapped the conditions: the LLM group had to write without assistance, and the Brain-only group used ChatGPT for the first time.
The results suggest that relying on ChatGPT consistently led to lower cognitive effort and diminished memory performance. “The LLM group also fell behind in their ability to quote from the essays they wrote just minutes prior,” the study said. By contrast, participants who had written their earlier essays without any digital help demonstrated stronger memory and higher perceived ownership.
The researchers found that the longer someone used an AI writing assistant, the more their neural patterns changed. “In session 4, LLM-to-Brain participants showed weaker neural connectivity and under-engagement of alpha and beta networks,” they wrote. Having previously relied on ChatGPT, these participants’ brains didn’t fully “reboot” when asked to write without AI.
Meanwhile, the Brain-to-LLM participants – those who only used ChatGPT in the final session – showed stronger memory recall and broader brain activation.
Despite this, essays written with ChatGPT were often judged higher in quality by both human and AI evaluators. Yet the LLM group was less likely to remember what they had written or to feel the essays reflected their own thinking. In the interviews, some participants called ChatGPT’s writing “robotic” and said they felt they needed to edit the responses to sound more human.
“Impaired perceived ownership” and a reduced ability to quote were consistently observed in the LLM group, the researchers found. In session one, 83.3% of ChatGPT users could not quote their own writing, compared to only 11.1% in the Search and Brain-only groups.
The study also looked at the linguistic structure of the essays. Those written with ChatGPT shared similar patterns, including use of certain repeated phrases and topic structures. This lack of variety stood in contrast to the Brain-only group, whose essays showed wider topical and lexical diversity.
While the Search group also leaned on external resources, its participants were more selective and showed higher integration of information. “The Search Engine group had strong ownership, but lesser than the Brain-only group,” the study noted.