Researchers tested AI against 100,000 humans on creativity

A landmark study spearheaded by Professor Karim Jerbi from the Department of Psychology at the Université de Montréal, with the notable involvement of AI luminary Yoshua Bengio, has unveiled unprecedented insights into the creative capabilities of generative artificial intelligence. Published in the esteemed journal Scientific Reports (Nature Portfolio), the research represents the most extensive direct comparison ever undertaken between human creativity and that of large language models (LLMs), fundamentally challenging long-held assumptions about the exclusive nature of human originality. The findings indicate a significant inflection point: generative AI systems can now surpass the average human on specific measures of creativity, yet the pinnacle of creative achievement remains firmly within the domain of the most gifted human minds.

The Genesis of an Unprecedented Inquiry: Pitting Human Ingenuity Against Algorithmic Innovation

The question of whether artificial intelligence can truly generate original ideas has long captivated scientists, philosophers, and the public alike. From the early days of computing, when machines could only execute predefined instructions, to the current era of sophisticated deep learning models, creativity has been widely regarded as a boundary that machines could not cross. This new study tests that assumption at unprecedented scale. Led by Professor Karim Jerbi, who is also an associate professor at Mila – the Quebec AI Institute – and involving researchers from Université Concordia, the University of Toronto Mississauga, and Google DeepMind, the collaborative effort sought to establish a rigorous benchmark. The sheer volume of data, involving comparisons between leading LLMs and over 100,000 human participants, underscores the ambitious scope of this investigation. The participation of Yoshua Bengio, a Turing Award laureate and a foundational figure in deep learning, further amplifies the study’s scientific weight and its potential to reshape the discourse around AI’s creative potential.

Defining and Measuring the Elusive Spark of Creativity

To facilitate a fair and objective comparison between human and machine creativity, the research team adopted a multi-faceted approach, central to which was the Divergent Association Task (DAT). Developed by study co-author Jay Olson from the University of Toronto, the DAT is a widely recognized psychological instrument designed to assess divergent creativity – the capacity to produce a broad array of novel and distinct ideas in response to a single prompt. Unlike traditional intelligence tests that seek singular correct answers, divergent tasks probe the flexibility and originality of thought, critical components of creative problem-solving.

The DAT specifically instructs participants, whether human or AI, to list ten words that are as semantically unrelated as possible. For instance, a highly creative response might juxtapose terms like "galaxy," "fork," "freedom," "algae," "harmonica," "quantum," "nostalgia," "velvet," "hurricane," and "photosynthesis." The task measures not merely vocabulary but the ability to traverse vast conceptual distances, forging connections where none are obvious, a hallmark of imaginative thinking. Performance on the DAT has been consistently correlated with success in other established creativity assessments, including those used in creative writing, brainstorming, and artistic ideation. Its linguistic basis allows for direct application to large language models, while its underlying cognitive demands reflect broader creative processes. Furthermore, its practical advantages – requiring only two to four minutes to complete and being readily accessible online – enabled the collection of data from an exceptionally large human cohort.
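The article does not spell out how a list of words becomes a score. As a rough sketch, one common way to quantify "semantically unrelated" is to average the pairwise cosine distance between word embeddings and scale the result to a 0-100 style range. Everything concrete below is an assumption made for illustration: the function names, the tiny three-dimensional toy vectors (a real scorer would load pretrained word embeddings), and the choice to score only the first seven valid words.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def dat_style_score(words, embeddings, n_scored=7):
    """Average pairwise cosine distance (x100) over the first valid words.

    `embeddings` maps word -> vector. Words missing from it are skipped.
    `n_scored` is an assumed cutoff, not a value taken from the article.
    """
    valid = [w for w in words if w in embeddings][:n_scored]
    if len(valid) < 2:
        raise ValueError("need at least two words with known embeddings")
    pairs = list(combinations(valid, 2))
    total = sum(cosine_distance(embeddings[a], embeddings[b]) for a, b in pairs)
    return 100.0 * total / len(pairs)


# Hypothetical toy vectors, invented for this sketch only.
toy_embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.0],
    "galaxy": [0.0, 0.1, 0.9],
    "fork": [0.1, 0.9, 0.1],
}

related = dat_style_score(["cat", "dog"], toy_embeddings)
unrelated = dat_style_score(["cat", "galaxy", "fork"], toy_embeddings)
print(f"related pair: {related:.1f}, unrelated trio: {unrelated:.1f}")
```

With these toy vectors, the closely related pair scores far lower than the unrelated trio, which is the behavior the task is meant to reward.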

AI’s Ascent: Outperforming the Average, Yet Still Trailing the Elite

The findings presented a clear turning point in the AI-creativity narrative. Researchers evaluated a spectrum of leading generative AI models, including iterations of ChatGPT (specifically GPT-4), Claude, Gemini, and others. When pitted against the scores of the more than 100,000 human participants on the DAT, a striking pattern emerged: certain AI systems, most notably GPT-4, consistently exceeded the average human scores on tasks measuring divergent linguistic creativity.

"Our study shows that some AI systems based on large language models can now outperform average human creativity on well-defined tasks," explained Professor Karim Jerbi, reflecting on the implications of this result. "This outcome may be surprising – even unsettling – for many, as it marks a significant milestone in AI’s development. However, our study also highlights an equally important observation: even the best AI systems still fall short of the levels reached by the most creative humans."

Further granular analysis by co-first authors Antoine Bellemare-Pépin, a postdoctoral researcher at Université de Montréal, and François Lespinasse, a PhD candidate at Université Concordia, solidified this nuanced picture. While the average person’s creative output could now be matched or even surpassed by advanced LLMs, the upper echelons of creativity remained an exclusively human domain. Specifically, when the research team isolated the most creative half of the human participants, their average scores consistently outstripped those of every AI model tested. This disparity became even more pronounced when examining the top 10 percent of the most creative individuals, whose innovative capacities demonstrated a clear and consistent advantage over even the strongest algorithmic contenders. This suggests that while AI can replicate and even generalize from vast datasets to produce novel combinations, the spark of genius, the truly groundbreaking leap, still resides primarily within the human mind.
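The three-way comparison described above (overall average, most creative half, top 10 percent) can be sketched in a few lines. This is only an illustration of the arithmetic, not the authors' analysis code: the function names are hypothetical and the scores are invented numbers, not data from the study.

```python
from statistics import mean


def top_fraction_mean(scores, fraction):
    """Mean of the highest-scoring `fraction` of `scores` (0 < fraction <= 1)."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(scores, reverse=True)
    k = max(1, round(len(ranked) * fraction))  # always keep at least one score
    return mean(ranked[:k])


def compare_to_humans(model_score, human_scores):
    """Check a single model score against three human reference points."""
    return {
        "beats_overall_mean": model_score > mean(human_scores),
        "beats_top_half_mean": model_score > top_fraction_mean(human_scores, 0.5),
        "beats_top_decile_mean": model_score > top_fraction_mean(human_scores, 0.1),
    }


# Invented scores for illustration only.
human_scores = [60, 65, 70, 75, 80, 85, 90, 95, 100, 105]
model_score = 84
print(compare_to_humans(model_score, human_scores))
```

In this made-up example the model clears the overall human mean but not the mean of the top half or the top decile, which is the same qualitative pattern the article reports.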

Beyond Wordplay: AI’s Performance in Complex Creative Endeavors

The study did not limit its scope to abstract word association. The researchers also sought to determine whether AI’s success on the DAT could translate to more complex and realistic creative challenges. To this end, they tasked both AI systems and human participants with composing haikus – a concise three-line poetic form demanding imagery and succinctness – writing movie plot summaries, and crafting short stories. These tasks require not only divergent thinking but also coherence, narrative structure, emotional resonance, and a deeper understanding of human experience.

The results from these higher-order creative tasks largely mirrored the pattern observed with the DAT. While AI systems occasionally produced outputs that rivaled or even surpassed the quality of work from average human participants, the most skilled and imaginative human creators consistently delivered stronger, more original, and more compelling narratives and poetic expressions. This reinforces the idea that while AI can emulate and even synthesize creative forms, the depth of human insight, emotional intelligence, and lived experience still confers a significant advantage in areas requiring profound artistic sensibility.

Fine-Tuning the Muse: The Adjustable Nature of AI Creativity

A crucial aspect of the study explored the malleability of AI creativity. Is an AI’s creative output a fixed trait, or can it be influenced and directed? The research demonstrated that AI’s creative potential is indeed adjustable, primarily through manipulating technical parameters such as the model’s "temperature." This parameter, often misunderstood, dictates the degree of randomness or predictability in the AI’s generated responses.

At lower temperature settings (e.g., 0.2-0.5), AI models tend to produce more conservative, conventional, and predictable outputs, often sticking closer to the most probable word sequences based on their training data. This leads to safe but less original content. Conversely, at higher temperatures (e.g., 0.8-1.0), the models are encouraged to take more "risks," exploring less probable word choices and conceptual associations. This results in responses that are more varied, adventurous, and exploratory, allowing the system to deviate significantly from familiar ideas and potentially generate more unexpected and creative outcomes.
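To make the effect of temperature concrete, here is a minimal sketch of how it is typically applied: the model's raw scores (logits) for candidate next words are divided by the temperature before being turned into probabilities with a softmax. The logits and function names below are hypothetical, and real systems layer other sampling controls on top of this, so treat it as an illustration of the principle rather than any particular model's implementation.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by `temperature`."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; higher means a more spread-out distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical logits for three candidate next words.
logits = [2.0, 1.0, 0.0]
for temperature in (0.2, 1.0):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs], round(entropy(probs), 3))
```

At the lower temperature nearly all probability mass lands on the top-scoring word, while at the higher temperature the less likely words keep a meaningful chance of being sampled, which is the "risk-taking" the paragraph above describes.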

Beyond internal parameters, the study highlighted the profound impact of human guidance through prompt engineering. The way instructions are formulated can dramatically shape AI’s creative output. For example, prompts that encouraged models to delve into the etymology of words – their origins and structural components – led to more unexpected associations and significantly higher creativity scores on the DAT. This finding underscores a critical point: AI creativity is not an autonomous function but is heavily dependent on the quality and specificity of human interaction. Effective prompting becomes a central, indeed indispensable, part of leveraging AI for creative processes, turning the human operator into a skilled conductor rather than a mere recipient of AI’s output.

The Shifting Landscape: From Competition to Collaboration

The study offers a nuanced and ultimately optimistic perspective on the widespread anxieties surrounding AI’s potential to displace human creative professionals. While the findings confirm that AI systems can now match or exceed average human creativity in certain, well-defined tasks, they also unequivocally demonstrate the clear limitations of current AI, particularly its reliance on human direction and its inability to reach the pinnacle of human ingenuity.

"Even though AI can now reach human-level creativity on certain tests, we need to move beyond this misleading sense of competition," urged Professor Karim Jerbi. His sentiment reflects a broader call within the AI community to reframe the relationship between humans and AI from one of rivalry to one of synergy. "Generative AI has above all become an extremely powerful tool in the service of human creativity: it will not replace creators, but profoundly transform how they imagine, explore, and create – for those who choose to use it."

This perspective suggests a future where AI functions as a powerful creative assistant, an intellectual sparring partner, or a tireless generator of preliminary ideas. By rapidly expanding the scope of possibilities, suggesting novel directions, or even handling the more laborious aspects of content generation, AI can amplify human imagination, freeing creators to focus on higher-level conceptualization, emotional depth, and truly original thought. Instead of signaling the demise of creative careers, the research points towards an evolution, where human expertise in critical thinking, aesthetic judgment, and the nuanced understanding of human experience becomes even more valuable in guiding and refining AI-generated content. Artists might use AI to generate thousands of initial sketches, writers to brainstorm plot twists, and designers to explore countless iterations, ultimately selecting and refining the most promising ideas with their uniquely human touch.

Broader Implications: Redefining Creativity in the Age of AI

The implications of this study extend far beyond the academic realm, touching upon philosophical, ethical, and economic considerations. The ability of machines to generate seemingly original content forces a re-evaluation of what "originality" truly means. Is it the act of producing something entirely new, or the ability to synthesize existing elements in novel and meaningful ways? If AI can perform the latter with increasing sophistication, where does the unique value of human creativity lie?

Furthermore, questions of authorship and intellectual property become more complex. If an AI generates a poem or a piece of music based on human prompts, who holds the copyright? What constitutes "human" creativity when a machine is an integral part of the creative process? These are not mere academic exercises but pressing legal and ethical dilemmas that society must address as AI becomes more integrated into creative industries.

Economically, while there may be concerns about job displacement in certain routine creative tasks, the study also hints at the emergence of new roles: AI prompt engineers, AI-assisted content curators, and professionals specializing in human-AI collaborative creative workflows. Education systems may need to adapt, focusing not just on traditional creative skills but also on the meta-skills required to effectively interact with and leverage AI tools.

Ultimately, the study invites a broader reflection on what creativity is. "By directly confronting human and machine capabilities, studies like ours push us to rethink what we mean by creativity," concludes Professor Karim Jerbi. This rethinking will be crucial for navigating a future where the lines between human and artificial ingenuity become increasingly blurred, even as the most creative humans continue to hold the top of the range.

About the Study and Its Esteemed Collaborators

The pivotal paper, titled "Divergent creativity in humans and large language models," was formally published in Scientific Reports on January 21, 2026. This extensive research effort was a testament to inter-institutional collaboration, bringing together leading scientists and institutions from across the globe. Key contributing entities included Université de Montréal, Université Concordia, the University of Toronto Mississauga, Mila (the Quebec AI Institute), and Google DeepMind, reflecting a diverse array of expertise in psychology, artificial intelligence, and cognitive science.

Professor Karim Jerbi served as the lead investigator, guiding the multidisciplinary team through this complex inquiry. The critical work of data analysis and initial interpretation was spearheaded by co-first authors Antoine Bellemare-Pépin from Université de Montréal and François Lespinasse from Université Concordia. A significant highlight of the research team was the inclusion of Yoshua Bengio, the visionary founder of Mila and LoiZéro. Bengio, a seminal figure in the development of deep learning – the underlying technology that powers contemporary AI systems like ChatGPT – brought unparalleled expertise and strategic insight to the study, underscoring its foundational importance in the ongoing dialogue about artificial intelligence and the future of human creativity.
