May 10, 2026
Generative AI Reaches Average Human Creativity, but Elite Human Minds Maintain Superiority in Landmark Study

Can generative artificial intelligence systems like ChatGPT genuinely create original ideas? A new study led by Professor Karim Jerbi from the Department of Psychology at the Université de Montréal, with participation from renowned AI researcher Yoshua Bengio, takes on that question at an unprecedented scale. The research, published in Scientific Reports (Nature Portfolio) on January 21, 2026, represents the largest direct comparison ever conducted between human creativity and the creativity of large language models (LLMs), revealing a significant shift in the landscape of artificial intelligence capabilities. While generative AI systems have now reached a level where they can outperform the average human on certain creativity measures, the most creative people still demonstrate a clear and consistent advantage over even the strongest AI models, suggesting a future of augmented human creativity rather than outright replacement.

Unprecedented Scale: Benchmarking Human and AI Creative Minds

The study brought together scientists from Université de Montréal, Université Concordia, University of Toronto Mississauga, Mila (Quebec AI Institute), and Google DeepMind, creating a robust interdisciplinary framework for evaluation. Professor Karim Jerbi led this ambitious project, with postdoctoral researcher Antoine Bellemare-Pépin (Université de Montréal) and PhD candidate François Lespinasse (Université Concordia) serving as co-first authors. Crucially, the research team also included Yoshua Bengio, founder of Mila and LoiZéro, and a pioneer of deep learning—the foundational technology powering modern AI systems like ChatGPT. This collaboration underscores the seriousness and scientific rigor applied to a question that has long captivated both the scientific community and the general public: how does artificial intelligence measure up against the unique spark of human ingenuity?

To conduct this large-scale comparison, researchers evaluated several leading large language models, including prominent iterations of ChatGPT, Claude, Gemini, and others, against the creative output of more than 100,000 human participants. This extensive dataset provides an unparalleled basis for drawing conclusions about the current state of AI creativity. The findings highlight a clear turning point in AI’s developmental trajectory: some AI systems, notably GPT-4, exceeded average human scores on tasks specifically designed to measure divergent linguistic creativity.

"Our study shows that some AI systems based on large language models can now outperform average human creativity on well-defined tasks," explains Professor Karim Jerbi. "This result may be surprising—even unsettling—but our study also highlights an equally important observation: even the best AI systems still fall short of the levels reached by the most creative humans." This nuanced perspective is critical in understanding the present and future roles of AI in creative endeavors.

Further analysis by the co-first authors, Bellemare-Pépin and Lespinasse, revealed a striking pattern: while some AI models now outperform the average person, peak creativity remains firmly human. In fact, when researchers examined the most creative half of participants, their average scores surpassed those of every AI model tested. The gap grew even larger among the top 10 percent of the most creative individuals, solidifying the notion that extraordinary human creativity retains its unique edge.

"We developed a rigorous framework that allows us to compare human and AI creativity using the same tools, based on data from more than 100,000 participants, in collaboration with Jay Olson from the University of Toronto," says Professor Karim Jerbi, who is also an associate professor at Mila, emphasizing the meticulous methodology employed.

The Metrics of Imagination: How Creativity Is Measured

To evaluate creativity fairly across humans and machines, the research team employed multiple methods, primarily focusing on the Divergent Association Task (DAT). The DAT is a widely used psychological test that measures divergent creativity—the capacity to generate a diverse array of original ideas from a single prompt. This task is particularly relevant as divergent thinking is a cornerstone of many creative processes, from problem-solving to artistic expression.

Created by study co-author Jay Olson, the DAT asks participants, whether human or AI, to list ten words that are as unrelated in meaning as possible. For instance, a highly creative response might include words like "galaxy, fork, freedom, algae, harmonica, quantum, nostalgia, velvet, hurricane, photosynthesis." The unrelatedness of these words indicates a broad associative network and the ability to break free from conventional thought patterns. Performance on the DAT has been consistently linked to results on other established creativity tests used in writing, idea generation, and creative problem solving, underscoring its validity as a proxy for broader creative aptitude. Although the task is language-based, it goes well beyond mere vocabulary; it engages deeper cognitive processes involved in creative thinking across various domains. The DAT also offers practical advantages, as it takes only two to four minutes to complete and can be accessed online by the general public, facilitating the collection of a vast human dataset.
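The intuition behind DAT scoring can be sketched in a few lines of code: the less semantically similar the words are to one another, the higher the score. The sketch below uses made-up toy vectors purely for illustration (the published DAT uses real word embeddings such as GloVe, and scales the mean pairwise cosine distance to a 0–200 range); it is not the study's actual scoring code.

```python
# Sketch of DAT-style scoring: mean pairwise semantic distance between
# words, scaled by 100 (assumption: the real task uses GloVe word
# embeddings; the toy vectors below are illustrative only).
import math
from itertools import combinations

# Hypothetical low-dimensional "embeddings" for demonstration.
TOY_EMBEDDINGS = {
    "cat":    [0.9, 0.1, 0.0],
    "dog":    [0.8, 0.2, 0.1],
    "galaxy": [0.0, 0.9, 0.4],
    "fork":   [0.1, 0.0, 0.9],
}

def cosine_distance(u, v):
    """1 - cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def dat_score(words, embeddings=TOY_EMBEDDINGS):
    """Mean pairwise cosine distance x 100, the DAT convention."""
    vecs = [embeddings[w] for w in words]
    pairs = list(combinations(vecs, 2))
    return 100.0 * sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)

# Closely related words score low; unrelated words score higher.
print(dat_score(["cat", "dog"]) < dat_score(["galaxy", "fork"]))
```

With real embeddings, a list like "galaxy, fork, freedom, algae, …" would score far above a list of near-synonyms, which is exactly what the task rewards.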

Beyond these foundational word association tasks, the researchers extended their investigation to more complex and realistic creative activities. They compared AI systems and human participants on creative writing challenges such as composing haiku (a short, three-line poetic form requiring conciseness and evocative imagery), writing movie plot summaries, and producing short stories. These tasks demand not only divergent thinking but also coherence, narrative structure, and emotional resonance. The results from these more complex challenges followed a familiar pattern: while AI systems sometimes exceeded the performance of average humans, the most skilled human creators consistently delivered stronger, more original, and more compelling work, reinforcing the persistent advantage of top human talent.

The Evolution of AI Creativity: A Brief Timeline

The journey of artificial intelligence in mimicking and generating creative outputs has evolved significantly over decades. Early AI systems, often rule-based, could perform tasks that seemed creative, like composing music or generating simple poetry, but their output was largely deterministic and lacked genuine novelty. The 1960s saw programs like ELIZA, which mimicked conversational therapy, and early algorithmic art generators.

The 1980s and 90s introduced neural networks, laying the groundwork for more sophisticated pattern recognition. However, it wasn’t until the advent of deep learning in the 2010s, spearheaded by researchers like Yoshua Bengio, that generative AI began to truly blossom. Key milestones include:

  • Generative Adversarial Networks (GANs, 2014): Introduced by Ian Goodfellow, GANs enabled AI to generate highly realistic images, pushing the boundaries of what machines could "create."
  • Transformer Architecture (2017): Developed by Google Brain, this architecture revolutionized natural language processing, allowing models to understand and generate human-like text with unprecedented fluency.
  • GPT Series (2018-present): OpenAI’s Generative Pre-trained Transformer models, starting with GPT-1 and culminating in advanced versions like GPT-4, demonstrated increasingly sophisticated text generation, summarization, and creative writing capabilities, eventually leading to the widespread public adoption of ChatGPT.

This rapid progression set the stage for the Université de Montréal study, which meticulously measures where these advanced LLMs stand in relation to human creativity. The study’s publication in 2026 marks a crucial point, offering empirical data on the performance ceiling of current generative AI in creative domains. It moves beyond anecdotal evidence to provide a robust scientific comparison, highlighting that while AI’s general creative output has dramatically improved, the highest echelons of human originality remain beyond its reach.

Can AI Creativity Be Adjusted? The Role of Temperature and Prompt Engineering

These findings raised another important question: Is AI creativity fixed, or can it be shaped and controlled? The study provides compelling evidence that creativity in AI is indeed adjustable, primarily by manipulating technical settings and the way instructions are crafted.

One critical parameter identified is the model’s "temperature." This setting controls how predictable or adventurous the generated responses are. At lower temperature settings, AI produces safer, more conventional, and often more coherent outputs, sticking closely to learned patterns. Conversely, at higher temperatures, responses become more varied, less predictable, and more exploratory, allowing the system to deviate from familiar ideas and generate truly novel associations. This fine-tuning capability implies that AI can be directed to produce outputs ranging from functional and conventional to wild and avant-garde, depending on the user’s creative intent.
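Mechanically, temperature works by rescaling the model's next-token scores before they are converted into probabilities. The minimal sketch below shows this with invented logits (not from any real model): dividing by a small temperature sharpens the distribution toward the "safe" top token, while a large temperature flattens it, giving unlikely tokens a real chance of being sampled.

```python
# Minimal sketch of how "temperature" reshapes a language model's
# next-token distribution. The logits are illustrative, not taken
# from any real model.
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, sharpened or flattened
    by the temperature setting."""
    scaled = [score / temperature for score in logits]
    peak = max(scaled)                       # subtract max for stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 2.0, 1.0, 0.5]                # one "safe" token dominates

cold = softmax_with_temperature(logits, 0.2)  # near-deterministic
hot = softmax_with_temperature(logits, 2.0)   # flatter, more exploratory

# Low temperature concentrates probability on the top token;
# high temperature spreads it toward less likely tokens.
print(round(cold[0], 3), round(hot[0], 3))
```

At temperature 0.2 the top token absorbs essentially all the probability mass, whereas at 2.0 the tail tokens become plausible choices, which is the statistical root of the "safer versus more adventurous" behavior described above.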

The researchers also found that AI creativity is strongly influenced by how instructions are written—a practice now commonly known as "prompt engineering." For example, prompts that encourage models to consider the origins and structure of words (their etymology) lead to more unexpected associations and higher creativity scores on tasks like the DAT. This discovery underscores a fundamental principle: AI creativity, for all its power, remains deeply reliant on human guidance. The quality and specificity of the prompts and the intelligent adjustment of parameters like temperature are central to unlocking and directing AI’s creative potential, making human-AI interaction an indispensable part of the generative creative process. This aspect highlights a future where "AI whisperers" or expert prompt engineers could become highly valued creative professionals.
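The prompt-engineering idea can be made concrete with a small sketch: the same base DAT instruction, with and without an etymology cue appended. The wording below is hypothetical and illustrative; it is not the study's actual prompt text.

```python
# Hypothetical sketch of an etymology-cued prompt versus a plain one.
# Both strings are illustrative, not the study's actual prompts.

BASE_INSTRUCTION = (
    "List 10 words that are as different from each other as possible, "
    "in all meanings and uses of the words."
)

def build_dat_prompt(etymology_cue: bool = False) -> str:
    """Assemble a DAT prompt, optionally nudging the model to reason
    about word origins and structure (etymology) before answering."""
    prompt = BASE_INSTRUCTION
    if etymology_cue:
        prompt += (
            " Before answering, consider the etymology of candidate words "
            "and favor words with unrelated roots and origins."
        )
    return prompt

print(build_dat_prompt(etymology_cue=True))
```

In the study's framing, feeding the cued variant to a model yielded more unexpected word associations and higher DAT scores than the plain instruction alone.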

Implications: A New Era of Augmented Creativity, Not Replacement

The study offers a balanced and pragmatic perspective on widespread fears that artificial intelligence could replace creative professionals en masse. While AI systems can now match or even exceed average human creativity on certain well-defined tasks, they still operate within clear limitations and rely heavily on human direction, context, and refinement. This suggests a future not of job displacement for creative roles, but of profound transformation and augmentation.

"Even though AI can now reach human-level creativity on certain tests, we need to move beyond this misleading sense of competition," says Professor Karim Jerbi. "Generative AI has above all become an extremely powerful tool in the service of human creativity: it will not replace creators, but profoundly transform how they imagine, explore, and create—for those who choose to use it."

This perspective aligns with a growing consensus among industry analysts and forward-thinking artists. Instead of signaling the end of creative careers, the findings suggest a future where AI serves as a powerful creative assistant. By rapidly generating diverse ideas, exploring unconventional avenues, and automating repetitive creative tasks, AI may help amplify human imagination rather than replace it.

  • Economic Impact: This shift could redefine creative industries from advertising and graphic design to writing and music composition. Businesses might leverage AI to accelerate brainstorming, generate multiple design iterations, or produce personalized content at scale. New job roles could emerge, focusing on AI supervision, ethical content generation, and prompt engineering, while existing creative roles would evolve to incorporate AI tools.
  • Societal Impact: The accessibility of powerful creative tools could democratize creativity, allowing individuals without specialized training to explore artistic expression. However, this also raises critical questions about authorship, intellectual property, and the potential for AI-generated content to dilute original human works or spread misinformation if not properly managed.
  • Artistic Impact: For artists, writers, and musicians, AI could become a tireless collaborator. Imagine a novelist using AI to generate countless plot variations or character backstories, a designer exploring hundreds of logo concepts in minutes, or a musician experimenting with novel melodic structures. The human element would then focus on selection, refinement, and imbuing the final product with unique vision and emotional depth—areas where AI still falls short.

Leading figures in the AI ethics community and cultural critics have frequently raised concerns about the "soul" of AI creativity, arguing that true originality stems from human experience, consciousness, and intent. While AI can simulate these, it does not feel or understand in the human sense. This study, while not delving into philosophical debates about consciousness, provides concrete data that reinforces the unique value of the human mind at its peak. It underscores that while AI excels at pattern recognition and generation, the highest forms of innovative, paradigm-shifting creativity remain a distinctly human domain.

"By directly confronting human and machine capabilities, studies like ours push us to rethink what we mean by creativity," concludes Professor Karim Jerbi. This ongoing re-evaluation is crucial as society navigates the rapid advancements in artificial intelligence. The research from Université de Montréal, published in Scientific Reports, serves as a vital benchmark, confirming that while AI is an increasingly powerful creative force, the human capacity for extraordinary originality continues to lead the way, setting the stage for an exciting era of collaboration and innovation.
