How can you trust an AI model if you do not understand how it reaches its conclusions?
Starting from that question, researchers from DTU and FOSS have worked on a joint project supported by DIREC – Digital Research Centre Denmark. The project explored how to develop reliable and explainable deep learning models for analysing highly variable data, as typically found in biological materials, using grain as a case study.
Today, manual visual inspection of harvested grain remains an important quality control procedure from field to table. But the method is time-consuming, and the result depends on the human eye. If the inspection is inadequate, it can be costly for producers and ultimately affect food safety.
That is why it makes sense to develop a more automated, image-based AI solution that is both robust and interpretable.
“We have examined some of the key challenges when applying AI explanations in the real world. If we let ten different AI explanation models analyse grain today, they would produce ten different results because they are based on many different assumptions,” says Postdoc Lenka Tětková.
“During the project, we gained a much better understanding of the problem. Now we can propose a method for using and evaluating deep learning explanations when analysing biological materials with high variation. We hope this will contribute to wider adoption of these methods in both academia and industry.”
The research formed part of her PhD project in the Cognitive Systems section at DTU Compute.
Explainability is key
Although AI models can to some extent identify grain defects such as the fungal disease pink fusarium or kernels damaged by the combine harvester (“skinned”), the project showed that models often focus on irrelevant details – such as shadows or background variations – when asked to explain their decisions.
So-called heatmaps, intended to show how the model “sees”, often highlight a mix of correct and incorrect areas. This makes it difficult for experts to assess whether the model truly understands the task.
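To make the idea of a heatmap concrete, the sketch below computes a simple gradient-based saliency map with PyTorch. The ResNet-18 backbone, the 224×224 input and the random image are illustrative placeholders for this example, not the models or data used in the DTU/FOSS project.

```python
# Minimal sketch of a gradient-based saliency heatmap ("vanilla gradients").
# Backbone, input size and the random image are stand-ins, not the project's setup.
import torch
import torchvision.models as models

model = models.resnet18(weights=None)  # stand-in classifier
model.eval()

image = torch.rand(1, 3, 224, 224, requires_grad=True)  # stand-in kernel image

logits = model(image)
score = logits[0, logits.argmax(dim=1).item()]  # score of the predicted class
score.backward()

# Per-pixel importance: largest absolute gradient across colour channels,
# normalised to [0, 1] so it can be overlaid on the image as a heatmap.
saliency = image.grad.abs().max(dim=1).values.squeeze(0)
saliency = saliency / (saliency.max() + 1e-8)
```

Comparing where such a map lights up with where an expert would look is essentially the check described here: does the model attend to the damaged part of the kernel, or to shadows and background?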
“We were genuinely surprised to see how sensitive AI models are – even tiny variations in data could significantly change the accuracy of the AI explanation model,” says Lenka Tětková.
Biological variation requires better datasets
One of the biggest challenges is that standard datasets such as ImageNet, used to train AI models and build explanation models, do not account for large biological variation. An ImageNet photo of a cat is easy for a model to decode: the image shows essentially just the cat, and the background can safely be ignored. But with grain, the analysis is far more complex because shadows, colours and natural processes create variation from kernel to kernel.
FOSS, which produces testing and analysis solutions for the agricultural and food industries, has itself grappled with the explanation problem. The company has large, realistic datasets for testing algorithms, which is why it joined the project.
Over several months, FOSS’s analysis department meticulously reviewed a large number of grain samples and marked damage and disease on kernel images to develop a new dataset.
Lenka Tětková used this dataset to test and compare AI explanation methods to see how well they work for image analysis of biologically variable materials.
“Collaboration with FOSS was crucial – without realistic data, the research would not have produced these new solutions,” says Lenka Tětková.
A new pipeline for research and industry
To address the explainability challenge, the researchers developed a new workflow – a guide that combines several explanation methods into one consolidated and more reliable heatmap.
The method is general and can be used to evaluate and select the best deep learning explanation model for tasks involving high biological variation. This makes it relevant not only for the grain industry but also for areas such as medicine and food safety.
“We are not saying which method is best, but we are saying that if you follow this workflow step by step and carry out a series of specific evaluations on your data, you will find out which method works best for you,” says Lenka Tětková.
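A minimal sketch of that idea, under assumed choices: each explanation method’s heatmap is normalised, the maps are averaged into a consensus, and every candidate is scored against an expert annotation mask. The overlap metric (IoU at a fixed threshold) and the method names are illustrative assumptions, not necessarily the evaluations used in the published workflow.

```python
# Hedged sketch: consolidate several explanation heatmaps and score each
# method (and the consensus) against an expert damage mask.
import numpy as np

def normalise(h: np.ndarray) -> np.ndarray:
    """Scale a heatmap to [0, 1] so different methods are comparable."""
    h = h - h.min()
    return h / (h.max() + 1e-8)

def iou(heatmap: np.ndarray, mask: np.ndarray, threshold: float = 0.5) -> float:
    """Overlap between the thresholded heatmap and the expert damage mask."""
    pred = heatmap >= threshold
    inter = np.logical_and(pred, mask).sum()
    union = np.logical_or(pred, mask).sum()
    return float(inter) / max(union, 1)

def consolidate(heatmaps: dict) -> np.ndarray:
    """Average normalised heatmaps from several methods into one consensus map."""
    return normalise(np.mean([normalise(h) for h in heatmaps.values()], axis=0))

# Example with random stand-in data; real inputs would be per-kernel images
# explained by methods such as gradients, Grad-CAM or LRP.
rng = np.random.default_rng(0)
heatmaps = {m: rng.random((224, 224)) for m in ["gradients", "gradcam", "lrp"]}
expert_mask = rng.random((224, 224)) > 0.9

scores = {m: iou(normalise(h), expert_mask) for m, h in heatmaps.items()}
scores["consensus"] = iou(consolidate(heatmaps), expert_mask)
print(scores)
```

The point of such an evaluation is not to crown a single winner in general, but to show, for a given dataset with high biological variation, which explanation method agrees best with expert judgement.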
The research results have just been published in PLOS ONE, where other researchers can explore the workflow and build on the methods.
Building stronger bridges between research and industry
Professor Lars Kai Hansen, Head of Section in Cognitive Systems at DTU Compute, is very pleased with the collaboration with FOSS:
“The project is a real-world use case where we take AI explanation models from the lab into practice. It is an excellent example of how research and industry can work together to solve complex, real-world challenges with AI.”
The project is part of the growing bridge between research and industry, supported by organisations such as DIREC.
“It is a great example of how there is not necessarily a big gap between basic research and applied research. Some aspects of such a project require going back to fundamental properties of how AI works to create explainability that is truly useful,” says Thomas Riisgaard, Director at DIREC.
Research – PLOS ONE: “Challenges in explaining deep learning models for data with biological variation”, Lenka Tětková (DTU Compute), Erik Schou Dreier and Robin Malm (FOSS), Lars Kai Hansen (DTU Compute)