Matteo Palmonari on How Data Becomes Meaningful through AI
In this interview, the STELAR project welcomes Matteo Palmonari, Associate Professor at the University of Milano-Bicocca. His expertise in data management, integration, and AI – particularly in knowledge graphs, semantic annotations, and data linking – aligns strongly with STELAR’s mission to enhance data discovery and integration.
As a member of the consortium of the enRichMyData project – a sister project of STELAR – he brings valuable insights into innovative methods for improving data management and decision-making. In this discussion, he shares his perspective on the challenges and opportunities in this space.
Exploring Our Guest's Motivation and Background
Could you start by telling us a bit about your career journey and what inspired you to specialise in Artificial Intelligence (AI) and data management?
My career began with a PhD in symbolic AI (qualitative reasoning) within an AI lab. After completing my PhD, I joined a data management group, where I started working on data and semantic services. Initially, I focused on modelling tasks, such as building ontologies, before shifting my attention to matching and data annotation in the context of the semantic web and knowledge graphs. Over the past eight years, I have also explored NLP, particularly as a means to interlink data and build knowledge bases.
What excites you most about the potential of AI and data science to solve real-world challenges, and how has that shaped your professional path?
Much of my work has centred on projects where enriching and interlinking data unlocks access to vast amounts of integrated knowledge. This process adds significant value by transforming fragmented or limited information into a more cohesive and insightful whole. Recent AI advancements have introduced techniques that make data enrichment and integration simpler and, above all, have made it possible to interact with knowledge in natural language.

Understanding AI, Data Integration, and Knowledge Graphs
Could you explain your role in enRichMyData and the main objectives of the project, particularly in terms of data integration and linking?
I am the Scientific Manager of the project and was one of its core proposers. As a research unit, our main contributions focus on developing techniques for annotating, reconciling, and enriching tabular data – transforming them into knowledge graphs or supporting tabular data augmentation by exploiting knowledge graphs in the background.
What are some of the key technical challenges you have encountered in data management, and how have you addressed them in your work?
Some key challenges I have faced include:
- enriching data with third-party sources, which first need to be identified, assessed, and understood before use;
- interlinking data from diverse sources;
- using graph data models proficiently.
More broadly, I have observed that many companies lack the resources to experiment with new solutions and develop a long-term technological strategy.
How do knowledge graphs and semantic annotations contribute to improving data discovery and integration, and what impact can they have on decision-making?
As someone who has long advocated for the use of knowledge graphs and semantic annotations, I may be a bit biased in answering this question. However, I see two key ways in which they enhance data discovery and integration.
- First, they promote an entity-centric approach to data management, where entity identifiers serve as keys to unlocking heterogeneous data associated with those entities. Once you connect to an entity, you gain access to all relevant information linked to it. We introduced the link & extend pattern to support data enrichment: you link your data using shared identifiers – such as those provided by a knowledge graph – and then leverage these identifiers to retrieve additional data (a minimal sketch of this pattern follows this list). This process, which relies on knowledge graphs in the background, is valuable even when there is no need to structure or publish the data in graph format.
- Second, they enable data to be organised into graphs, capturing the core relational knowledge that underpins much of analytical and associative thinking. Growing evidence suggests that this graph-based structure not only facilitates AI-driven tasks but also improves the development of natural language interfaces for interacting with data.
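To make the link & extend pattern concrete, here is a minimal sketch, assuming Python with the requests library and Wikidata as the background knowledge graph; the helper names link_entity and extend_entity and the top-candidate linking heuristic are illustrative only, not the actual tooling used in enRichMyData or STELAR.

```python
import requests

WIKIDATA_API = "https://www.wikidata.org/w/api.php"
WIKIDATA_SPARQL = "https://query.wikidata.org/sparql"


def link_entity(label: str) -> str | None:
    """Link step: look up a candidate Wikidata identifier for a cell value."""
    params = {"action": "wbsearchentities", "search": label,
              "language": "en", "format": "json"}
    hits = requests.get(WIKIDATA_API, params=params, timeout=30).json().get("search", [])
    # Naive heuristic: take the top-ranked candidate; real pipelines rank and verify.
    return hits[0]["id"] if hits else None


def extend_entity(qid: str, prop: str) -> list[str]:
    """Extend step: use the identifier to retrieve additional values (e.g. P17 = country)."""
    query = f"""
    SELECT ?valueLabel WHERE {{
      wd:{qid} wdt:{prop} ?value .
      SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
    }}"""
    resp = requests.get(WIKIDATA_SPARQL,
                        params={"query": query, "format": "json"},
                        headers={"User-Agent": "link-and-extend-sketch/0.1"},
                        timeout=30).json()
    return [b["valueLabel"]["value"] for b in resp["results"]["bindings"]]


if __name__ == "__main__":
    # Enrich a table column of city names with a new "country" column.
    for city in ["Milan", "Athens"]:
        qid = link_entity(city)                                # link: shared identifier
        countries = extend_entity(qid, "P17") if qid else []   # extend: fetch extra data
        print(city, qid, countries)
```

Note that the data being enriched never has to be stored as a graph itself: the knowledge graph only acts as the shared identifier space in the background, exactly as described above.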

In your opinion, what is the most important aspect of data linking, especially when working with large, complex datasets such as those in the STELAR and enRichMyData projects?
Three key aspects need to be balanced:
- Scale – Data linking is not just about algorithmic accuracy; it also requires handling large volumes of data efficiently.
- Human-in-the-loop – In many real-world scenarios, a subset of links may need to be verified by humans to ensure accuracy.
- Generalisation and adaptability – Each data linking problem has unique characteristics, so it is essential for algorithms to be both broadly applicable across different cases and adaptable with little to no training data.
These aspects are particularly interesting because LLM-based approaches and more traditional methods may offer complementary strengths in tackling them.
Could you give us an example of how AI and Machine Learning have been applied to improve data management or processing workflows in your projects?
In developing entity linking algorithms for tabular data, we observed two key patterns.
- First, neural networks excel at learning how to combine different matching features that are traditionally used for linking (a simplified sketch of this feature-combination idea follows this list).
- Second, LLMs demonstrate remarkable entity disambiguation skills, largely due to their extensive parametric knowledge. This allows them to reconcile records in cases where traditional machine learning approaches struggle – especially when there are insufficient explicit signals, such as disambiguating company names without additional contextual information (however, when signals are abundant, well-trained feature-based ML algorithms may still do better than very large models).
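As a rough illustration of the first pattern, the sketch below uses a plain logistic regression (standing in for the neural combiners mentioned above) to learn how to weight a few traditional matching features. The feature set, the hand-labelled pairs, and the helper name features are illustrative assumptions rather than the models actually developed in the project.

```python
from difflib import SequenceMatcher

import numpy as np
from sklearn.linear_model import LogisticRegression


def features(mention: str, candidate: str) -> list[float]:
    """Turn a (table cell, candidate entity label) pair into matching features."""
    m, c = mention.lower(), candidate.lower()
    return [
        SequenceMatcher(None, m, c).ratio(),                            # character similarity
        len(set(m.split()) & set(c.split())) / max(len(c.split()), 1),  # token overlap
        float(m == c),                                                  # exact match
    ]


# Tiny hand-labelled training set: 1 = correct link, 0 = wrong link.
pairs = [
    ("acme corp", "ACME Corporation", 1),
    ("acme corp", "Acme Markets", 0),
    ("milano", "Milan", 1),
    ("milano", "Milwaukee", 0),
]
X = np.array([features(a, b) for a, b, _ in pairs])
y = np.array([label for _, _, label in pairs])

# The learned model combines the individual features into a single matching score.
model = LogisticRegression().fit(X, y)
print(model.predict_proba([features("acme co.", "ACME Corporation")])[0, 1])
```

When explicit signals like these are abundant, such feature-based models remain strong; the cases where LLMs shine are those where the features alone cannot disambiguate and background knowledge about the entities is needed.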
The Future of AI: What Lies Ahead
Finally, looking ahead, what are your thoughts on the future of data integration, AI, and knowledge graphs? How do you see these technologies evolving in the coming years?
I believe that language models, particularly LLMs, will play a key role in enhancing data integration by leveraging knowledge graphs as an abstraction layer. Knowledge graphs and data integration technologies complement generative AI exceptionally well. While our ability to process unstructured data has improved significantly, many critical domains will still require knowledge to be structured in formats that can be reliably accessed and modified – even when generated with AI. At the same time, there is growing evidence that generative AI is driving an increasing demand for knowledge graphs and techniques for building structured knowledge bases that AI agents can effectively consume.
Conclusion
The insights shared by Matteo Palmonari highlight the evolving landscape of AI-driven data management and integration. His work with knowledge graphs and LLMs shows how AI improves data enrichment, structure, and interaction. As AI and data integration continue to advance, projects like enRichMyData and STELAR contribute to better data discovery and decision-making, bridging the gap between unstructured and structured data for practical applications.
Follow our Blog and LinkedIn page for future interviews with data professionals.