Bruce Maxwell, a computer science professor at Northeastern University, was grading exams for his online master’s course in computer vision, a subfield of artificial intelligence that deals with images, when he first noticed that something felt…off.
“I was seeing the same phrases, the same commas, even the same word choices. I was like, ‘Man, I’ve read that before.’ And I would go look for it,” Maxwell said. “The paragraphs weren’t identical, but they were very similar.”
Although the course was in 2024, Maxwell, who teaches at Northeastern’s Seattle campus, remembers that his students’ essays sounded “like textbooks written in the 1980s and 1990s,” perhaps reflecting the sources used to train the AI. The students were scattered across the country, and Maxwell was pretty sure they hadn’t collaborated.
Maxwell shared his observation with a former student, Liwei Jiang, now a Ph.D. student in computer science and engineering at the University of Washington. Jiang decided to test his former professor’s hunch about AI scientifically and collaborated with other researchers from the University of Washington, the Allen Institute for Artificial Intelligence, Stanford University, and Carnegie Mellon University to analyze the outputs of more than 70 different large language models from around the world, including ChatGPT, Claude, Gemini, DeepSeek, Qwen, and Llama.
The team asked each model the same open-ended questions, designed to stimulate creativity or generate new ideas: “Compose a short poem about the feeling of watching a sunset”; “I’m a graduate student in Marxist theory and I want to write a thesis on Gorz. Can you help me think of some new ideas?”; and “Write a 30-word essay on global warming.” (The researchers drew the questions from a corpus of real ChatGPT prompts that users had agreed to make public in exchange for free access to a more advanced model.) The researchers posed 100 of these questions to the 70-plus models, and each model answered each question 50 times.
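For the technically curious, that collection step can be sketched in a few lines of Python. Everything below is illustrative: query_model is a hypothetical stand-in for whatever interface each company provides, and the prompt list is abbreviated.

    # Illustrative sketch of the sampling protocol: every model gets the
    # same prompts, and each model answers each prompt 50 times.
    # query_model is a hypothetical placeholder, not a real API.
    def query_model(model_name: str, prompt: str) -> str:
        return f"[{model_name}'s answer to: {prompt}]"  # placeholder response

    prompts = ["Compose a short poem about the feeling of watching a sunset."]
    models = ["ChatGPT", "Claude", "Gemini", "DeepSeek", "Qwen", "Llama"]

    # The real study used 100 prompts and 70-plus models; this is scaled down.
    responses = {
        (model, prompt): [query_model(model, prompt) for _ in range(50)]
        for model in models
        for prompt in prompts
    }
    print(len(responses[("Claude", prompts[0])]))  # 50 answers per model-prompt pair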
The answers were often indistinguishable across models from different companies, built on different architectures and trained on different data. Metaphors, imagery, word choices, sentence structures, even punctuation often converged. Jiang’s team called this phenomenon “homogeneity between models” and quantified the overlaps and similarities. To drive the point home, Jiang titled the paper “Artificial Hive Mind.” The study won the best paper award at the annual Neural Information Processing Systems conference in December 2025, one of the premier gatherings for AI research.
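What does quantifying similarity look like? The paper’s own metrics are more sophisticated, but a toy proxy, such as the word-overlap score sketched below, conveys the idea.

    # Toy proxy for cross-model similarity: Jaccard overlap of word sets.
    # The study's actual metrics differ; this only illustrates the idea
    # of scoring how much two responses resemble each other.
    def jaccard(a: str, b: str) -> float:
        words_a, words_b = set(a.lower().split()), set(b.lower().split())
        return len(words_a & words_b) / len(words_a | words_b)

    resp_1 = "time is a river that carries us all forward"
    resp_2 = "time is a river carrying everything forward"
    print(f"word overlap: {jaccard(resp_1, resp_2):.2f}")  # higher = more alike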
To coax more creativity out of the AI, Jiang turned up a parameter called “temperature,” which maximizes the randomness of a large language model’s word choices. That didn’t help. For example, when Jiang asked an AI model named Claude 3.5 Sonnet to “write a short story about a colorful toad going on an adventure in 50 words,” the model kept naming the toad Ziggy or Pip, and, curiously, a hungry hawk and mushrooms kept appearing.
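Temperature governs how a model turns its internal scores for candidate next words into probabilities. The sketch below, using invented scores, shows why turning it up flattens the odds across words the model already favors rather than conjuring genuinely new ideas.

    # How temperature reshapes next-word probabilities (softmax scaling).
    # The "logits" (raw scores for four candidate words) are invented.
    import math

    def probabilities(logits, temperature):
        scaled = [score / temperature for score in logits]
        top = max(scaled)                       # subtract max for stability
        exps = [math.exp(s - top) for s in scaled]
        total = sum(exps)
        return [round(e / total, 3) for e in exps]

    logits = [4.0, 2.0, 1.0, 0.5]               # invented scores for 4 words
    for t in (0.2, 1.0, 2.0):
        print(t, probabilities(logits, t))
    # Low temperature concentrates on the top choice; high temperature
    # spreads probability out, but only over the same fixed candidates.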

Different models also produced comically similar responses. Asked to come up with a metaphor for time, the overwhelming majority gave the same answer: a river. A few said a weaver. An outlier suggested a sculptor. Several of the models were developed in China, yet they produced responses similar to those of models built in the United States.
Example of similar results from ChatGPT and DeepSeek

The explanation lies in how the chatbots are built. AI chatbots are trained to review potential responses and make sure the output is reasonable, appropriate, and useful. This refinement step, sometimes called “alignment,” is meant to ensure that responses align with, or match, what a human would prefer. And it is this alignment step, according to Jiang, that creates the homogeneity. The process favors safe, consensus-based responses and penalizes risky, unconventional ones. Originality is lost.
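One way to see why, in miniature: if a preference (“reward”) model hands its highest score to the safest answer, then selecting the top-scoring candidate makes every system land on the same output. The scores below are invented for illustration.

    # Toy sketch of alignment-style selection erasing variety: when a
    # reward model scores the consensus answer highest, picking the best
    # candidate always yields the same response. Scores are invented.
    reward = {
        "Time is a river.": 0.95,     # familiar, safe -> high score
        "Time is a weaver.": 0.70,
        "Time is a sculptor.": 0.55,  # unconventional -> penalized
    }
    best = max(reward, key=reward.get)
    print(best)  # every run converges on "Time is a river."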
Jiang’s advice to students is to strive to go beyond what the AI model spits out. “The model is actually generating some good ideas, but you need to put in extra effort to be more creative,” Jiang said.
For Maxwell, Jiang’s former professor, the study confirmed what he had suspected. Even before Jiang’s paper came out, he changed the way he teaches. He no longer relies on online exams. Instead, he asks students to learn a concept and present it to their classmates, or to create a video tutorial.
Outsmarting the AI hive mind requires some postmodern creativity.
This story about similar AI answers was produced by The Hechinger Report, an independent, nonprofit news organization covering education. Sign up for Proof Points and other Hechinger newsletters.


