Date: 2024-07-01

A new study of changing vocabulary in academic papers finds that certain words are becoming more common.

Was artificial intelligence behind that email you just got? Did ChatGPT draft your office’s new memo?

Answering novel questions like those is tricky, but a new study from researchers at Northwestern University and Germany’s University of Tübingen suggests there might be a way to tell.

AI text generation has exploded as large language models–or LLMs, the software underpinning ChatGPT, Google’s Gemini, and other consumer chatbots–have grown more sophisticated and popular in the past few years. And it’s already shaking up workplaces: research conducted by Slack indicates that one out of every four desk workers has already tried using AI at work. Now, a scientific paper released on the preprint research repository arXiv, titled “Delving Into ChatGPT Usage In Academic Writing Through Excess Vocabulary,” offers a method to help differentiate AI-generated text from something a human wrote.

The researchers studied changes in vocabulary across 14 million academic article abstracts published between 2010 and 2024. Their results showed how the emergence of LLMs “led to an abrupt increase in the frequency of certain style words,” they wrote, adding: “Our analysis based on excess words usage suggests that at least 10% of 2024 abstracts were processed with LLMs.”

The linguistic changes that followed the mainstreaming of the AI software “were unprecedented in both quality and quantity,” the researchers wrote in the article, which was released in mid-June. So what words did the AI seem to be overusing? Some of the terms that the researchers found “strong excess usage” of in 2024 academic papers included “delves,” “showcasing,” “underscores,” “potential,” “findings” and “crucial”–although almost 300 other words also became abruptly more popular.
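To make the idea of "excess usage" concrete, here is a minimal Python sketch. It assumes that a word's expected 2024 frequency can be projected by drawing a straight line through two pre-ChatGPT years, and that "excess" is the gap between the observed and projected values. This article does not spell out the researchers' exact procedure, so the base years, the function names, and the toy numbers below are all illustrative assumptions, not figures from the study.

```python
def expected_frequency(freq_by_year, base_years, target_year):
    """Project a word's frequency to target_year by extending the straight
    line through two earlier years (an assumed, simplified baseline)."""
    y0, y1 = base_years
    f0, f1 = freq_by_year[y0], freq_by_year[y1]
    slope = (f1 - f0) / (y1 - y0)
    projected = f1 + slope * (target_year - y1)
    return max(projected, 0.0)  # a frequency cannot be negative


def excess_usage(freq_by_year, base_years=(2021, 2022), target_year=2024):
    """Return (gap, ratio) between observed and projected frequency.

    gap   = observed - expected  (highlights common words that rose)
    ratio = observed / expected  (highlights rare words that rose)
    """
    observed = freq_by_year[target_year]
    expected = expected_frequency(freq_by_year, base_years, target_year)
    gap = observed - expected
    ratio = observed / expected if expected > 0 else float("inf")
    return gap, ratio


# Hypothetical numbers: the share of abstracts containing each word.
toy_frequencies = {
    "delves": {2021: 0.0002, 2022: 0.0003, 2024: 0.0050},
    "potential": {2021: 0.280, 2022: 0.285, 2024: 0.330},
}

for word, freqs in toy_frequencies.items():
    gap, ratio = excess_usage(freqs)
    print(f"{word}: gap={gap:.4f}, ratio={ratio:.2f}")
```

Reporting both a gap and a ratio is a deliberate choice in this sketch: a rare word like "delves" can jump many times over while barely moving in absolute terms, whereas a common word like "potential" can gain several percentage points with only a modest ratio.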

Of course, there are other reasons these words might have surged in the last few years. But the scientists note that in the past, words that saw this sort of sudden jump in usage tended to be content-specific nouns–such as “ebola” in 2015 and “zika” in 2017, as well as “coronavirus,” “lockdown,” and “pandemic” between 2020 and 2022–rather than the kind of stylistic adjectives and verbs whose usage rose after AI hit the mainstream.