AI's Creative Leap: Landmark Study Reveals Generative Systems Outperform Average Humans, But Peak Creativity Remains Firmly Human

A new study led by Professor Karim Jerbi of the Department of Psychology at the Université de Montréal, with AI pioneer Yoshua Bengio among its co-authors, offers the largest direct comparison to date between the creative capacities of human beings and those of large language models (LLMs). The findings, published in Scientific Reports (part of the Nature Portfolio), indicate that generative AI systems have reached an inflection point: they can now outperform the average human on specific creativity measures. The study also finds, however, that the most creative individuals still hold a clear and consistent advantage over even the most advanced AI models.

The Ascent of AI: Reaching Average Human Creative Benchmarks

The research team evaluated several leading large language models, including OpenAI’s ChatGPT (specifically GPT-4), Anthropic’s Claude, and Google’s Gemini, among others, and compared their performance against data from more than 100,000 human participants. The central finding marks a turning point in AI development: some systems, GPT-4 in particular, exceeded average human scores on tasks designed to gauge divergent linguistic creativity, that is, the capacity to generate novel and varied ideas.

Professor Karim Jerbi articulated the dual nature of these findings, stating, "Our study shows that some AI systems based on large language models can now outperform average human creativity on well-defined tasks. This result may be surprising—even unsettling—but our study also highlights an equally important observation: even the best AI systems still fall short of the levels reached by the most creative humans." This nuanced perspective underscores the complexity of defining and measuring creativity, acknowledging AI’s impressive progress while firmly establishing the enduring superiority of human ingenuity at its highest echelons.

Further analysis by the study’s co-first authors, postdoctoral researcher Antoine Bellemare-Pépin of the Université de Montréal and PhD candidate François Lespinasse of Concordia University, revealed a striking and consistent pattern. While some AI models have surpassed the creative output of the average person, the highest levels of creative achievement remain a human domain. AI does well at broad idea generation but has yet to match the novelty that characterizes the most creative people.

Indeed, when the researchers focused their attention on the most creatively adept half of the human participants, their average scores consistently eclipsed those of every single AI model subjected to testing. This disparity became even more pronounced and statistically significant when the analysis zeroed in on the top 10 percent of the most creative individuals, demonstrating a substantial and persistent gap that AI has yet to bridge. "We developed a rigorous framework that allows us to compare human and AI creativity using the same tools, based on data from more than 100,000 participants, in collaboration with Jay Olson from the University of Toronto," Professor Jerbi, who is also an associate professor at Mila, elaborated, emphasizing the methodological integrity that underpins these compelling conclusions.
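The subgroup comparison described above can be sketched in a few lines. This is only an illustration of the arithmetic, not the study's analysis code: the human scores are synthetic random numbers, the model score is a hypothetical value, and the function name `subgroup_means` is invented here.

```python
import random
import statistics


def subgroup_means(scores):
    """Mean score for all participants, the top half, and the top 10 percent."""
    ranked = sorted(scores, reverse=True)
    n = len(ranked)
    top_half = ranked[: max(1, n // 2)]
    top_decile = ranked[: max(1, n // 10)]
    return {
        "all": statistics.mean(ranked),
        "top_half": statistics.mean(top_half),
        "top_10_percent": statistics.mean(top_decile),
    }


# Synthetic data for illustration only; these are NOT the study's numbers.
rng = random.Random(0)
human_scores = [rng.gauss(78.0, 6.0) for _ in range(10_000)]
model_score = 80.0  # hypothetical score for a single AI model

means = subgroup_means(human_scores)
for group, mean in means.items():
    verdict = "above" if model_score > mean else "below"
    print(f"{group:>15}: human mean {mean:.1f}, model is {verdict}")
```

The point of the sketch is that a single model score can sit above the overall human mean and still fall below the mean of the top half or top tenth, which is the pattern the researchers report.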

Chronology of AI’s Creative Evolution and the Study’s Genesis

The path toward AI systems capable of what might be termed "creativity" has been a long one, running from rudimentary rule-based programs in the mid-20th century to today's neural networks. Early attempts at computational creativity often involved algorithms generating music or poetry from predefined rules, producing outputs that were technically correct but largely devoid of originality or emotional resonance. The paradigm shift began in the early 2010s with the rise of deep learning, a subfield of machine learning loosely inspired by the structure and function of the human brain. Pioneers like Yoshua Bengio, a co-author of this study and a founder of Mila (Quebec AI Institute), were instrumental in developing foundational techniques, including neural language models and the attention mechanisms on which transformers, the architecture behind modern large language models, were later built.

The release of models like GPT-3 in 2020 marked a qualitative leap, demonstrating unprecedented fluency and coherence in text generation. More advanced iterations and competing models from other technology companies followed, culminating in the public phenomenon of ChatGPT in late 2022. This rapid growth in capability naturally raised questions about AI’s potential in creative domains, beyond content generation and toward genuine ideation.

The study, titled "Divergent creativity in humans and large language models," was published in Scientific Reports on January 21, 2026. It grew out of an increasingly urgent need to test these emerging capabilities empirically. As generative AI became more pervasive, researchers recognized that large-scale, methodologically sound comparisons were needed to move beyond anecdotal evidence and theoretical speculation. The research provides a benchmark for future investigations into the evolving relationship between human and artificial intelligence. The collaboration among Université de Montréal, Concordia University, University of Toronto Mississauga, Mila (Quebec AI Institute), and Google DeepMind reflects the interdisciplinary, multi-institutional effort such questions require.

Methodology: Unpacking How Creativity is Measured in Humans and Machines

To evaluate humans and machines on equal terms, the research team relied on established psychological tests. The primary instrument was the Divergent Association Task (DAT), a validated psychological test of divergent creativity: the ability to generate a broad range of unique, original, and diverse ideas from a single prompt or starting point, a fundamental component of innovative thinking.

The DAT, created by study co-author Jay Olson of the University of Toronto, poses a simple challenge: participants, whether human or AI, are asked to list ten words that are as unrelated in meaning as possible. The task's strength is that the semantic distance between the listed words can be quantified, giving a numerical proxy for originality and the breadth of associative thinking. For instance, a highly creative response might include a disparate collection of words such as "galaxy, fork, freedom, algae, harmonica, quantum, nostalgia, velvet, hurricane, photosynthesis." The conceptual distance between these terms indicates a high degree of divergent thinking, moving beyond obvious or common associations.
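The idea of scoring a word list by semantic distance can be sketched as the average pairwise cosine distance between word vectors, scaled by 100. The code below is a minimal illustration under stated assumptions, not the study's scoring pipeline: a real scorer would use pretrained word embeddings, whereas the three-dimensional vectors in `TOY_EMBEDDINGS` are invented for this example, and the function name `dat_style_score` is hypothetical.

```python
import itertools
import math


def cosine_distance(u, v):
    """1 minus the cosine similarity of two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def dat_style_score(words, embeddings):
    """Mean pairwise cosine distance between the words, scaled by 100."""
    vectors = [embeddings[w] for w in words]
    pairs = list(itertools.combinations(vectors, 2))
    return 100.0 * sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


# Toy 3-dimensional vectors, invented for illustration only.
# Real word embeddings have hundreds of dimensions.
TOY_EMBEDDINGS = {
    "cat": (1.0, 0.1, 0.0),
    "dog": (0.9, 0.2, 0.0),
    "kitten": (1.0, 0.0, 0.1),
    "galaxy": (0.0, 1.0, 0.0),
    "fork": (0.0, 0.0, 1.0),
}

related = dat_style_score(["cat", "dog", "kitten"], TOY_EMBEDDINGS)
unrelated = dat_style_score(["cat", "galaxy", "fork"], TOY_EMBEDDINGS)
print(f"related words:   {related:.1f}")
print(f"unrelated words: {unrelated:.1f}")
```

Under this scheme, a list of near-synonyms scores close to 0 while a list of words pointing in unrelated directions scores close to 100, which is why a more scattered list is read as more divergent.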

Performance on the DAT has been consistently shown to correlate strongly with results on other well-established creativity tests across various domains, including written expression, general idea generation, and complex creative problem-solving scenarios. While the task is inherently language-based, its demands extend far beyond mere vocabulary recall. It actively engages broader cognitive processes critical to creative thinking, such as conceptual blending, remote association, and the ability to break free from conventional thought patterns. Furthermore, the DAT offers significant practical advantages, requiring only two to four minutes to complete and being readily accessible online to a broad public audience, facilitating the collection of the massive dataset required for this study.

Beyond the DAT, the researchers asked a follow-up question: would AI’s success on this relatively simple word task carry over to more complex and ecologically valid creative activities? To find out, they compared AI systems with human participants on a series of creative writing challenges: composing haiku (a traditional Japanese poetic form with a 5, 7, 5 syllable structure across three lines), writing succinct movie plot summaries, and creating short stories. These tasks demand not only divergent ideation but also narrative coherence, stylistic flair, and an understanding of human emotion and context, elements traditionally considered exclusive to human creativity.

The results from these more complex tasks mirrored the pattern observed with the DAT. While AI systems occasionally managed to surpass the performance of average humans, particularly in terms of sheer output volume or grammatical correctness, the most skilled and imaginative human creators consistently delivered work that was not only stronger in quality but also demonstrably more original, emotionally resonant, and conceptually sophisticated. This reinforced the notion that while AI can generate plausible creative outputs, it often lacks the deeper understanding, lived experience, and unique perspective that define true human artistic expression.

The Adjustable Nature of AI Creativity: Temperature and Prompt Engineering

The study also asked whether AI creativity is a fixed characteristic or something that can be deliberately shaped. The findings show that it is highly adjustable and can be significantly modulated by changing technical settings, most notably the model’s "temperature" parameter.

The temperature parameter in large language models serves as a control mechanism for the predictability and adventurousness of the generated responses. At lower temperature settings (e.g., closer to 0), the AI model operates with a higher degree of determinism, favoring the most probable and conventional next words or ideas based on its training data. This leads to outputs that are generally safer, more predictable, and often more coherent, but less likely to be truly novel or surprising. Conversely, when the temperature is increased (e.g., closer to 1 or higher), the model introduces a greater element of randomness and exploratory behavior. This allows the AI to venture beyond familiar or statistically probable ideas, leading to responses that are more varied, less predictable, and often more experimental or "out-of-the-box." The researchers found that optimizing this temperature setting was crucial for eliciting higher creativity scores from the AI models on the divergent tasks.
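The effect of temperature can be illustrated with the standard temperature-scaled softmax, in which a model's raw scores (logits) are divided by the temperature before being turned into probabilities. This is a minimal sketch of that general technique, not code from the study or from any particular model; the token list and logit values are invented for illustration.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(tokens, logits, temperature, rng):
    """Draw one token at random according to the temperature-scaled probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]


# Hypothetical candidate words and scores, invented for illustration.
TOKENS = ["dog", "cat", "harmonica", "algae"]
LOGITS = [4.0, 3.5, 1.0, 0.5]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(LOGITS, t)
    print(t, [round(p, 3) for p in probs])

rng = random.Random(0)
print([sample_token(TOKENS, LOGITS, 2.0, rng) for _ in range(5)])
```

At a low temperature almost all of the probability mass lands on the highest-scoring word, while at a higher temperature the unlikely words get a real chance of being sampled. A temperature of exactly 0 cannot be used in this formula; systems that accept it typically treat it as always picking the most probable word.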

Beyond internal technical parameters, the study highlighted another strong influence on AI creativity: the wording of human instructions, often referred to as "prompt engineering." How a prompt is formulated can substantially change the AI’s output. For example, the research team found that prompts explicitly encouraging models to draw on etymology, that is, word origins and structure, led to a significant increase in unexpected associations and, consequently, higher creativity scores. Guiding the AI to consider the roots and evolution of language appears to surface less conventional connections within the model’s knowledge base. These results suggest that AI creativity is not autonomous but depends heavily on human guidance and interaction, making the quality of human input a central component of the AI-driven creative process.

Implications for the Future: AI as a Creative Amplifier, Not a Replacement

The findings offer a balanced perspective on widespread fears that artificial intelligence could ultimately supplant human creative professionals. While AI systems can now match or even exceed average human creativity on certain well-defined tasks, the research also highlights their persistent limitations and their reliance on human direction and context.

Professor Karim Jerbi articulated this crucial distinction, urging a shift in perspective: "Even though AI can now reach human-level creativity on certain tests, we need to move beyond this misleading sense of competition. Generative AI has above all become an extremely powerful tool in the service of human creativity: it will not replace creators, but profoundly transform how they imagine, explore, and create—for those who choose to use it." This statement reframes the narrative from one of displacement to one of augmentation and collaboration.

Rather than portending the obsolescence of creative careers, the study’s conclusions strongly suggest a future where AI functions as an indispensable creative assistant. By efficiently generating a multitude of ideas, exploring diverse conceptual pathways, and quickly iterating on variations, AI can significantly expand the ideational landscape available to human creators. This capacity to rapidly prototype and explore novel avenues may help amplify human imagination, freeing up creators to focus on the more nuanced, emotionally intelligent, and uniquely human aspects of artistic and intellectual endeavor, rather than getting bogged down in repetitive or less inspired tasks.

The involvement of Yoshua Bengio, whose work has fundamentally shaped modern deep learning, lends further weight to this perspective. Although he is not quoted on the point here, Bengio is a prominent proponent of ethical AI development, and it is reasonable to suppose that he too would view these systems as tools that extend human potential rather than diminish it, consistent with a view of AI as a collaborative intelligence that solves complex problems in partnership with humans.

Ultimately, this landmark research serves as a powerful catalyst for re-evaluating our understanding of creativity itself. "By directly confronting human and machine capabilities, studies like ours push us to rethink what we mean by creativity," Professor Jerbi concluded. As AI continues its rapid evolution, the ongoing dialogue between human ingenuity and artificial intelligence will not only redefine creative processes but also deepen our appreciation for the distinct and invaluable qualities of human imagination and consciousness. The future of creativity, it seems, is not one of AI versus humans, but rather one of AI with humans, forging new frontiers of innovation and expression.