What comes after OpenAI’s GPT-5 or Google’s Gemini 2.5? A reasonable guess might be GPT-6 and Gemini 3 – improved models, but fundamentally more of the same. And that, according to Professor Søren Hauberg and Associate Professor Jes Frellsen from DTU Compute, is precisely the problem.
This week, they are co-organisers of the GenU – Generative models and uncertainty quantification workshop, which brings together a select group of the world’s foremost AI researchers in Copenhagen to discuss, among other things, how language models can take the next step forward.
Understanding the problem: Three questions for Søren Hauberg and Jes Frellsen
1: Why have we reached a methodological impasse with GPT-5 and similar language models?
A central issue is that current generative AI models – including GPT-5 – are built upon technologies and research breakthroughs from the late 2010s. The models we use for images and text still rely on foundational research from that period.
Although these models have been refined, scaled up, and made more powerful, there are signs that we’ve reached a point of diminishing returns. If one imagines the development as a curve that has risen sharply, it is now levelling off.
It is no longer sufficient to simply add more data or computing power – the methodological tools have reached their limits. There is therefore a need to rethink how we develop and train future models.
The underlying architecture of deep neural networks is also used in other domains, such as the development of new proteins and enzymes. While text models are trained on natural language to generate sentences, articles, or dialogue, protein models such as AlphaFold or ESMFold are trained on biological sequences and structural data to predict 3D structures or generate new amino acid sequences.
Even though improvements to the underlying architecture benefit language models, they do not significantly advance other research areas where generative AI is expected to play a crucial role in the future.
Many researchers suggest that we have “milked the current cow” and that it is time to seek new pastures.
2: Why doesn’t more computing power solve the problems of generative AI?
It may seem intuitive that simply increasing computing power will yield better results from generative AI models. However, this is a misconception. The problems lie not in the size of the computer, but in the very methods by which the models are built and trained.
A key issue is that today’s models tend to hallucinate – that is, they generate responses that sound plausible but are factually incorrect. This stems from the fact that the models are trained as compression systems: they attempt to identify patterns in large datasets, but they do not understand the world. They guess.
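To make the "compression and guessing" point concrete, here is a minimal, purely illustrative Python sketch. The prompt, candidate tokens, and probabilities are invented for illustration and do not come from any real model: a language model maps a context to a probability distribution over possible next tokens and samples from it, and nothing in that loop checks facts.

```python
import random

# Toy illustration (invented numbers, not any specific model): a language
# model assigns probabilities to candidate next tokens based purely on
# patterns it has compressed from its training data.
next_token_probs = {
    "Canberra": 0.60,  # the correct answer, common in reference text
    "Sydney":   0.35,  # wrong, but appears near "capital of Australia" often
    "Melbourne": 0.05, # also wrong, also plausible-sounding
}

def sample_next_token(probs):
    """Sample a continuation in proportion to its learned probability.
    Nothing here verifies whether the chosen token is factually correct."""
    tokens, weights = zip(*probs.items())
    return random.choices(tokens, weights=weights, k=1)[0]

context = "The capital of Australia is"
print(context, sample_next_token(next_token_probs))
```

Because a wrong but familiar phrase can carry real probability mass in ordinary text, a pattern-matching model will occasionally produce a fluent, confident, and incorrect answer. That is the guess.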
As generative models increasingly become primary sources of information – for example, in search engines or decision support systems – it becomes a societal problem if their answers cannot be trusted. Misinformation can spread rapidly, and it becomes essential to understand and manage uncertainty in AI models – both technically and communicatively.
Therefore, it is not enough to improve chatbots. We must develop models that can be used responsibly in everything from drug development to experimental design. This requires more than just larger computers – it demands new methods, new architectures, and a deeper understanding of how models interact with reality.
From a technological perspective, one might say that designing rye bread that keeps fresh for longer is the same kind of challenge as curing cancer. It sounds absurd, but both tasks require models that can understand and predict complex systems. And that requires more than computing power – it requires intelligence, responsibility, and innovation.
3: How can we rethink the future of generative AI – and who should help shape it?
Progress demands interdisciplinary innovation. Mathematicians must help define and measure uncertainty, computer scientists must develop new training methods, and engineers must rethink the architectures. It is not merely about improving chatbots – it is about creating models that can be used in everything from medicine to experimental design.
Workshops such as the upcoming one in Copenhagen bring together researchers from around the world to discuss, through a multidisciplinary approach, how we move forward with the development of generative AI – and perhaps set the direction for its future.
This is fundamental research, so the focus is not on how generative AI will be methodologically structured over the next five years – although that may seem a long time in a rapidly evolving field. Rather, we are looking 10 to 20 years ahead.
The problem is that we are becoming increasingly dependent on generative models. One can hardly perform a Google search without being confronted by a generative model. And if our primary means of information retrieval is built on compression rather than factual grounding, we risk becoming a less informed society. This is not (necessarily) due to malicious intent. It is simply a mathematical consequence of the way we have chosen to train these models.
So, what now? We must find a new way to train these models.
One possible path forward is for models to indicate how certain or uncertain their responses are. There are many statistical techniques that could, in principle, be used for this.
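As one hedged illustration of such a statistical technique (a toy sketch, not the workshop's agenda or any production system), the entropy of a model's own next-token distribution can act as a rough confidence signal: a peaked distribution suggests the model is fairly sure, a flat one suggests it is effectively guessing. The example distributions below are invented.

```python
import math

def predictive_entropy(probs):
    """Shannon entropy (in bits) of a next-token distribution.
    Higher entropy = probability mass spread out = more uncertainty."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Invented example distributions over candidate answer tokens.
confident = [0.96, 0.02, 0.01, 0.01]   # one answer dominates
uncertain = [0.30, 0.28, 0.22, 0.20]   # the model is effectively guessing

print(f"confident answer: {predictive_entropy(confident):.2f} bits")
print(f"uncertain answer: {predictive_entropy(uncertain):.2f} bits")
```

Such a measure only reflects the model's internal uncertainty about which words to emit next; turning signals like this into trustworthy statements about factual correctness is exactly the kind of open problem the workshop is meant to address.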
Today, these models are not fact machines – they are language machines. This means they produce semantic variations, swapping one word for another with similar meaning and creating slight differences in phrasing. In practice, a model may say that two plus two equals five simply because the wording is semantically plausible, not because it is correct. Producing factually accurate answers is far more difficult.
We do not yet have the answer. If we did, there would be no need to hold the workshop – we could simply build the solution.
We need mathematicians, engineers, and computer scientists to clean up the chaos and find new ways to train these models.
Read also:
Denmark gathers the AI elite to strengthen Europe's research environment, published in September 2022.