Medical Data mining diagram
‘What are the most common reasons for an examination?’ Answer?
Courtesy of Empolis

Article • The potential insights are invaluable; we should not waste this source

Medical data mining

The treasure trove of healthcare data waiting to be explored in German hospitals is immense and could provide invaluable insights. But what about data security and privacy? Andreas Klüter, CTO of Empolis Information Management GmbH, a newcomer to healthcare IT, spoke with European Hospital about medical text mining and the need for a debate on ethics.

Andreas Klüter

Thirty years ago Empolis Information Management GmbH began working in smart data processing and service optimisation. Citing the company’s work with call centres as an example, Andreas Klüter, CTO of Empolis, said, ‘We developed software that provides decision trees for call centre staff to help them get straight to the customer’s problem and its solution. Our vision is that “no one must ever make wrong decisions again” and our new mission is derived from this vision: “utilise all information to provide the right recommendations”.’

‘Text mining and linguistics are the tools of our trade, in healthcare as well. We developed a solution that retrospectively analyses free-text medical reports against a number of criteria. We do this with the help of mature artificial intelligence technologies, such as deep learning or case-based reasoning. Our partner Smart Reporting contributes the clinical process know-how. We fused their know-how and our technology in their prototype module called Smart Radiology.

‘Now we can partially structure unstructured data. So far this works with existing reports that we analyse retrospectively. However, we are developing a prototype that, while findings are still being gathered, hints at which data might be missing for a complete or guideline-compliant diagnosis of a given pathology. This might help to achieve a much higher degree of standardisation in findings and clinical reports.’

‘Our analysis is based on approximately 150,000 anonymised reports, focusing on the 40,000 brain CTs included in these reports. Our aim was to determine the level of quality of the findings, to figure out whether certain trends are discernible and whether the different hospitals have different referral and requirement patterns for imaging procedures. However, we do not intend to conduct further studies.’ 

‘While we initially focused on brain CTs to create a knowledge model that allows us to analyse the data, we do plan to cover all anatomies, step by step. In addition we want to analyse the results of other imaging modalities such as MR scans.’ 

‘Data security and privacy are immensely important issues. Therefore only the study principals receive the results and they decide how the data will be used,’ he explained. ‘We can show trends, but it is not for us to decide whether a trend indicates a problem.’ 

‘We need this debate on artificial intelligence from the very beginning. However, in my opinion the computer cannot do everything better, and it won’t be able to, even though it can perform increasingly complex tasks.

Multi-stage text analysis
Courtesy of Empolis

‘There was a very telling experiment recently in which artificial intelligence was used to “train” a computer in Shakespearean language, and the computer was then asked to write a book. The result: the machine’s choice of words was indeed rather “Shakespearean”, but the text was completely devoid of meaning. That clearly shows the current stage of AI.

‘Having said that, there are advances, and we need to discuss how we are going to deal with the new insights and which approach we will choose. It’s a long process for a society to agree on a path, but this consensus is necessary and we have to embark on this journey now. To do nothing, I’m sure, is the wrong decision. 

‘The archives of German hospitals are full of text and image data waiting to be used, data that might really advance clinical research. The technological obstacles are surmountable today; the potential insights are invaluable. We should not waste this source.’


For over 20 years, Andreas Klüter, CTO of Empolis Information Management GmbH, has focused on developing systems for intelligent information processing. From 1994 on, at the German Research Centre for Artificial Intelligence, he was instrumental in realising “Verbmobil”, the world’s first research prototype for fully automated translation of spoken language. During his tenure as Head of Development at ORBIS, he gained in-depth knowledge of healthcare IT. As CTO of Empolis Information Management GmbH, Andreas Klüter is in charge of the company’s product portfolio and the eHealth business division.

