Generative AI Systems Match Average Human Creativity, While Top Human Talent Maintains Significant Edge

A groundbreaking study led by Professor Karim Jerbi from the Department of Psychology at the Université de Montréal, featuring the participation of renowned AI pioneer Yoshua Bengio, has delivered a comprehensive and unprecedented direct comparison between human creativity and that of large language models (LLMs). Published in Scientific Reports (Nature Portfolio), the research involved over 100,000 human participants, establishing it as the largest-scale investigation into this evolving frontier. The findings reveal a pivotal moment in the development of artificial intelligence: generative AI systems have now advanced to a level where they can outperform the average human on specific measures of creativity. Crucially, however, the study simultaneously affirms that the most creative individuals within the human population retain a distinct and consistent advantage over even the most sophisticated AI models.

The Unprecedented Scale of Inquiry into Digital Ingenuity

The question of whether machines can truly be creative has long been a subject of philosophical debate and scientific inquiry. Historically, creativity was considered a uniquely human attribute, deeply intertwined with consciousness, emotion, and lived experience. The advent of generative artificial intelligence, particularly large language models like OpenAI’s ChatGPT, Google’s Gemini, and Anthropic’s Claude, has challenged these traditional notions, producing outputs ranging from compelling prose and poetry to intricate code and artistic visuals. However, empirical, large-scale comparisons between human and machine creativity have remained scarce, often limited by scope or methodology.

This new study addresses that gap with remarkable breadth and rigor. Under the leadership of Professor Karim Jerbi, an associate professor at Mila – Quebec AI Institute, the research brought together a multidisciplinary team from Université de Montréal, Université Concordia, University of Toronto Mississauga, and Google DeepMind. The involvement of Yoshua Bengio, a Turing Award laureate and a foundational figure in deep learning, underscores the scientific weight and ambition of the project. Their objective was not merely to observe AI’s creative outputs but to systematically measure and compare them against a vast human benchmark, employing standardized psychological tools. The sheer number of human participants—exceeding 100,000—provides an unparalleled dataset for robust statistical analysis, lending significant credibility to the study’s conclusions.

Methodology: Decoding Creativity Across Humans and Machines

To ensure a fair and direct comparison between humans and machines, the research team meticulously designed a methodological framework utilizing established psychometric tests. The primary instrument for evaluating divergent linguistic creativity was the Divergent Association Task (DAT), co-created by study co-author Jay Olson from the University of Toronto. The DAT is a widely recognized psychological test that assesses an individual’s ability to generate diverse and original ideas from a single prompt, a key component of creative thinking.

In the DAT, participants, whether human or AI, are asked to list ten words that are as semantically unrelated to each other as possible. The task measures not how many ideas are produced but how far apart the words are in meaning and how far they stray from common associations. For instance, a highly creative response might include a list like "galaxy, fork, freedom, algae, harmonica, quantum, nostalgia, velvet, hurricane, photosynthesis." Such a combination demonstrates an ability to break free from conventional thought patterns and explore disparate conceptual spaces. The DAT’s strength lies in pairing a simple format with the ability to tap into broader cognitive processes involved in creative ideation across various domains, not just vocabulary. Moreover, its brief completion time (two to four minutes) and online accessibility facilitated the unprecedented scale of human participation.
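The article does not spell out how a DAT response is scored. As a rough illustration only, the sketch below assumes the commonly described approach: represent each word as a vector, average the pairwise cosine distances, and scale by 100. The tiny three-dimensional `TOY_EMBEDDINGS` table, the `dat_style_score` helper, and the seven-word cutoff are assumptions made for this example, not the study's actual pipeline or data.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Return 1 - cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def dat_style_score(words, embeddings, max_words=7):
    """Average pairwise cosine distance (x100) over the first valid words.

    A word is 'valid' here if it has a vector and has not been seen already.
    The max_words=7 cutoff is an assumption of this sketch.
    """
    valid = []
    for word in words:
        key = word.lower()
        if key in embeddings and key not in valid:
            valid.append(key)
    valid = valid[:max_words]
    if len(valid) < 2:
        raise ValueError("need at least two valid words to score")
    distances = [
        cosine_distance(embeddings[a], embeddings[b])
        for a, b in combinations(valid, 2)
    ]
    return 100.0 * sum(distances) / len(distances)


# Hypothetical toy vectors; a real scorer would load pretrained word embeddings.
TOY_EMBEDDINGS = {
    "cat": [1.0, 0.0, 0.0],
    "dog": [0.9, 0.1, 0.0],     # deliberately close to "cat"
    "galaxy": [0.0, 1.0, 0.0],  # orthogonal to "cat"
    "fork": [0.0, 0.0, 1.0],    # orthogonal to both
}

related = dat_style_score(["cat", "dog"], TOY_EMBEDDINGS)
unrelated = dat_style_score(["cat", "galaxy", "fork"], TOY_EMBEDDINGS)
print(f"related words:   {related:.2f}")
print(f"unrelated words: {unrelated:.2f}")
```

Under these toy vectors, the list of near-synonyms scores close to zero while the mutually orthogonal words reach the maximum, which is the intuition behind rewarding semantically distant word lists.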

Beyond the DAT, the researchers extended their evaluation to more complex and realistic creative tasks. They challenged both AI systems and human participants to engage in creative writing exercises, including composing haiku (a concise, three-line poetic form), crafting movie plot summaries, and writing short stories. These tasks moved beyond simple word association to assess narrative coherence, imaginative development, and the ability to evoke emotion or convey complex ideas—facets traditionally considered hallmarks of human creativity. The consistent application of these diverse metrics across both human and AI subjects provided a holistic view of creative capabilities.

The Turning Point: AI’s Ascent to Average Human Levels

The core findings from the DAT revealed a significant milestone in AI development. The researchers evaluated several leading large language models, including iterations of ChatGPT (like GPT-4), Claude, and Gemini, against the performance of their vast human participant pool. The results demonstrated that certain AI systems, notably GPT-4, successfully exceeded the average human scores on tasks designed to measure divergent linguistic creativity. This outcome suggests a fundamental shift in AI’s capacity to generate novel and varied ideas, moving beyond mere data regurgitation or pattern recognition.

Professor Karim Jerbi articulated the gravity of this finding: "Our study shows that some AI systems based on large language models can now outperform average human creativity on well-defined tasks. This result may be surprising — even unsettling." Indeed, for many, the idea of a machine matching or even surpassing human ingenuity in an area as nuanced as creativity might challenge deeply held beliefs about human exceptionalism. The rapid progression of AI, particularly in the last decade, from rule-based systems to highly adaptable generative models, has led to this inflection point, demonstrating a sophisticated ability to synthesize, associate, and produce outputs that exhibit a level of originality previously confined to human intellect.

The Enduring Human Edge: Peak Creativity Undisturbed

Despite AI’s impressive ascent, the study delivered an equally compelling and reassuring observation for human proponents: the highest echelons of creativity remain firmly within the human domain. As Professor Jerbi further emphasized, "even the best AI systems still fall short of the levels reached by the most creative humans." This finding was corroborated by further analysis conducted by co-first authors Antoine Bellemare-Pépin (Université de Montréal) and François Lespinasse (Université Concordia).

When the researchers segmented the human participant data, they discovered a striking pattern. The average scores of the most creative half of the human participants consistently surpassed those of every AI model tested. This gap widened even further when examining the top 10 percent of the most creative individuals, whose performance demonstrated a clear and significant advantage over even the most advanced AI systems. This suggests that while AI can replicate and even improve upon average creative output, the spark of genius, the truly groundbreaking idea, or the profoundly moving artistic expression still emanates from human consciousness. It points to a qualitative difference in the depth, nuance, and perhaps the underlying motivation of human creativity that current AI models have yet to emulate.
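The segmentation described above can be sketched in a few lines. The scores below are synthetic placeholders chosen only to illustrate the comparison; they are not values from the study, and the helper names `top_fraction_mean` and `compare_to_humans` are hypothetical.

```python
from statistics import mean


def top_fraction_mean(scores, fraction):
    """Mean of the highest-scoring `fraction` of the scores (at least one)."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(scores, reverse=True)
    count = max(1, int(len(ranked) * fraction))
    return mean(ranked[:count])


def compare_to_humans(model_score, human_scores):
    """Check a model score against three human reference points."""
    return {
        "beats_overall_mean": model_score > mean(human_scores),
        "beats_top_half_mean": model_score > top_fraction_mean(human_scores, 0.5),
        "beats_top_10pct_mean": model_score > top_fraction_mean(human_scores, 0.1),
    }


# Synthetic example data, NOT results from the paper.
HUMAN_SCORES = [60, 65, 70, 72, 75, 78, 80, 82, 85, 90]
MODEL_SCORE = 77

result = compare_to_humans(MODEL_SCORE, HUMAN_SCORES)
print(result)
```

With these made-up numbers the model clears the overall human mean but not the mean of the top half or the top 10 percent, mirroring the qualitative pattern the researchers report.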

This pattern extended beyond the linguistic association tasks to the more complex creative writing challenges. While AI systems occasionally outperformed average humans in composing haiku, drafting movie plots, or writing short stories, the most skilled human creators consistently delivered work that was not only technically proficient but also richer in originality, emotional depth, and narrative sophistication. This reinforces the notion that true human creativity often involves a blend of cognitive abilities, emotional intelligence, and lived experiences that AI, in its current form, cannot fully replicate.

The Nuance of AI Creativity: Tunable Parameters and Human Guidance

One of the fascinating insights gleaned from the study concerns the malleable nature of AI creativity. The research demonstrated that AI’s creative output is not fixed but can be significantly adjusted by manipulating technical settings, most notably the model’s "temperature" parameter. In the context of large language models, temperature controls the randomness and variability of the generated output.

At lower temperature settings, AI models tend to produce more predictable, conventional, and "safe" responses, adhering closely to the patterns they have learned from their training data. As the temperature is increased, the models become more adventurous, generating responses that are more varied, less predictable, and more exploratory. This allows the system to deviate further from familiar ideas and venture into more novel conceptual territory, thereby increasing its perceived creativity. This discovery offers a powerful lever for developers and users to adjust AI’s output for specific creative objectives, whether seeking reliable and consistent outputs or aiming for groundbreaking and unexpected results.
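To make the effect of temperature concrete, here is a minimal sketch of temperature-scaled softmax sampling, the standard way this parameter is usually applied when a language model picks its next token. The four-word vocabulary and the logit values are invented for illustration and do not come from any model evaluated in the study.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    largest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - largest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(tokens, logits, temperature, rng):
    """Draw one token according to the temperature-adjusted distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]


# Hypothetical next-token candidates and scores.
TOKENS = ["cat", "dog", "quasar", "velvet"]
LOGITS = [4.0, 3.5, 1.0, 0.5]

low = softmax_with_temperature(LOGITS, 0.5)   # sharper: favours the top token
high = softmax_with_temperature(LOGITS, 2.0)  # flatter: rare tokens gain mass

for token, p_low, p_high in zip(TOKENS, low, high):
    print(f"{token:>7}: T=0.5 -> {p_low:.3f}   T=2.0 -> {p_high:.3f}")

rng = random.Random(0)
print("sampled at T=2.0:", sample_token(TOKENS, LOGITS, 2.0, rng))
```

Raising the temperature flattens the distribution, so unlikely words such as "quasar" are chosen more often, which is one plausible reason higher settings yield more varied and exploratory output.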

Beyond internal parameters, the study highlighted the critical role of human guidance in shaping AI creativity. The way instructions are formulated—known as "prompt engineering"—exerts a strong influence on the models’ creative performance. For example, prompts that encouraged the AI models to consider word origins and structure through etymology led to more unexpected associations and consequently, higher creativity scores on the DAT. This finding underscores a crucial partnership: AI creativity, far from being autonomous, is heavily dependent on the quality and specificity of human input. This interaction and the iterative process of prompting become central to unlocking AI’s creative potential, positioning humans as orchestrators and collaborators rather than mere observers.

A Historical Perspective on AI and Creativity

The journey of artificial intelligence from its conceptualization in the mid-20th century to the sophisticated generative models of today has been marked by continuous advancements and shifting paradigms. Early AI systems, often symbolic and rule-based, struggled with tasks requiring intuition or creativity. The advent of machine learning, and subsequently deep learning, in the late 20th and early 21st centuries, brought about a revolution. Neural networks, trained on vast datasets, began to exhibit capabilities previously thought impossible for machines, from image recognition to natural language processing.

The development of transformer architectures and the subsequent rise of large language models such as GPT-3 and its successors represented another monumental leap. These models, trained on trillions of words of text, developed an uncanny ability to understand context, generate coherent text, and even mimic various writing styles. Yet questions persisted regarding the true "originality" of their output, with many critics arguing that AI merely recombines existing data without genuine understanding or intention.

This Université de Montréal study represents a significant empirical milestone in this historical narrative. It moves beyond theoretical discussions to provide concrete, quantitative evidence of AI’s creative capabilities. By directly comparing AI against human performance on standardized tests, it offers a tangible measure of how far AI has come, placing it firmly within the realm of creative agents, albeit with distinct characteristics and limitations compared to humans. The involvement of pioneers like Yoshua Bengio, whose work laid much of the groundwork for modern deep learning, further solidifies the study’s position as a landmark in the ongoing evolution of AI research.

Implications for the Creative Economy, Education, and Society

The findings of this study carry profound implications for various sectors, particularly the creative economy and education. The long-standing fear that artificial intelligence could replace creative professionals—artists, writers, designers, musicians—has often been a source of anxiety. While AI systems can now match or exceed average human creativity on certain tasks, Professor Jerbi’s perspective offers a balanced and hopeful outlook.

"Even though AI can now reach human-level creativity on certain tests, we need to move beyond this misleading sense of competition," says Professor Karim Jerbi. "Generative AI has above all become an extremely powerful tool in the service of human creativity: it will not replace creators, but profoundly transform how they imagine, explore, and create — for those who choose to use it." This perspective champions a future of augmentation rather than replacement. AI could serve as an invaluable creative assistant, capable of generating rapid ideations, exploring diverse stylistic variations, or even overcoming creative blocks. A writer might use an LLM to brainstorm plot twists, a designer to generate logo variations, or a musician to experiment with harmonic progressions. This collaborative paradigm could amplify human imagination, enabling creators to push boundaries and explore new artistic territories with unprecedented efficiency.

In the realm of education, these findings necessitate a reevaluation of curricula and pedagogical approaches. Future generations of students will need to be proficient not only in traditional creative skills but also in the art of human-AI collaboration. Understanding how to effectively prompt AI, interpret its outputs, and refine them with human insight will become crucial competencies. Educators might focus on fostering critical thinking, ethical considerations surrounding AI-generated content, and the unique human capacities that AI cannot replicate, such as emotional depth, contextual understanding, and intentionality.

The broader societal implications also warrant consideration. As AI-generated content becomes more prevalent, questions of authorship, intellectual property, and authenticity will become increasingly complex. Distinguishing between human-created and AI-created content, particularly as AI sophistication grows, will pose new challenges. Moreover, the ethical imperative to use AI responsibly, avoiding bias perpetuation or the generation of harmful content, remains paramount.

Rethinking Creativity Itself in the Age of AI

Ultimately, this study compels us to fundamentally rethink our understanding of creativity. As Professor Jerbi concludes, "By directly confronting human and machine capabilities, studies like ours push us to rethink what we mean by creativity." Is creativity solely about generating novel ideas, or does it also encompass intention, consciousness, and the unique human experience of inspiration? If an AI can produce a poem indistinguishable from one written by a human, does it possess creativity in the same sense?

The study’s findings suggest that creativity is perhaps not a monolithic trait but a multifaceted construct. AI demonstrates a powerful capacity for divergent thinking and idea generation based on statistical learning and pattern recognition. Human creativity, especially at its peak, appears to integrate these cognitive abilities with intuition, emotional resonance, and a deeper understanding of the human condition. The future of creativity may therefore lie not in a competition between humans and machines, but in a symbiotic relationship where each augments the other, leading to new forms of artistic expression and problem-solving yet unimagined.

About the Study

The paper, titled "Divergent creativity in humans and large language models," was published in Scientific Reports on January 21, 2026. This landmark research was a collaborative effort, drawing expertise from leading institutions including Université de Montréal, Université Concordia, University of Toronto Mississauga, Mila (Quebec AI Institute), and Google DeepMind. The study was spearheaded by Professor Karim Jerbi, with Antoine Bellemare-Pépin from Université de Montréal and François Lespinasse from Université Concordia serving as co-first authors. A significant contributor to the research team was Yoshua Bengio, the esteemed founder of Mila and LoiZéro, whose pioneering work in deep learning has been instrumental in the technological advancements underpinning modern AI systems such as ChatGPT.
