Prof. Dragan Boskovic in Spotlight for STELAR: On the Importance of Data Formats in ICT
In our very first episode of the STELAR project’s Data Stories 360° podcast, we have prepared a set of questions for Prof. Dragan Boskovic – senior data technology expert and innovator who holds roles as a Clinical Professor at the W. P. Carey Business School, Arizona State University, and as the Founder of “Fondacija VizLore Labs.”
Julia Mitrovic, Dissemination and Communication Manager at Foodscale Hub, one of the partners in the STELAR project, interviewed our first guest, as Prof. Boskovic's expertise aligns closely with what we are doing here in the STELAR project.
Stick around to hear our insightful conversation, where we will explore questions such as how data management principles can be applied to improve agricultural processes, as well as the challenges encountered when dealing with different data formats.
STELAR in Conversation With Dragan Boskovic
Before we move on to more specific questions, could you tell me how you got started in the ICT industry and what inspired you to pursue a career in technology and innovation, especially considering its increasing popularity?
I need to go back and look into my long-term memory because it's been a long time since I started working with data. However, I can trace my first exposure to data back to the late 1980s, when I was doing some research work for the Institute for Medicinal Plant Research "Dr Josif Pančić".
They wanted to determine the optimal harvesting season for some of the plants they were growing and the most efficient extraction method for the desired substances. For this purpose, they had an experimental field near Pančevo in Vojvodina, Serbia. We collected samples from this field on different days and used them to extract the substances. The objective was to identify the most influential variable in the equation. We employed a method called the Latin square, which enabled us to plan our experiments in a structured way, ensuring confidence in the data we collected before applying the analysis.
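The Latin square design Prof. Boskovic mentions ensures that each treatment appears exactly once in every row and every column, so two factors (say, harvest day and plot) can be varied without one treatment being confounded with another. A minimal sketch, using a standard cyclic construction and made-up treatment labels:

```python
# Minimal sketch of a Latin square design: each treatment (e.g. an
# extraction method) appears exactly once per row (e.g. harvest day)
# and once per column (e.g. field plot).
def latin_square(treatments):
    n = len(treatments)
    # Cyclic construction: shift the treatment list by one per row.
    return [[treatments[(row + col) % n] for col in range(n)] for row in range(n)]

design = latin_square(["A", "B", "C", "D"])  # four hypothetical extraction methods
for day, row in enumerate(design, start=1):
    print(f"Harvest day {day}: {row}")
```

With the assignments balanced this way, the effect of harvest day, plot, and extraction method can each be estimated from the same compact set of experiments.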
It was interesting that back then, it was also the era of personal computers, which were just starting to catch on. So, it was one of the first works that I had done both in data and using one of the IBM PCs, which, although not very powerful, was still very capable of solving this issue. So, it’s not exactly 40 years ago, but almost 40 years ago, I had my first exposure to data science.
Seeing how you truly innovated, especially in that experimental field all the way back then, let's continue with our data-related topics. How can the data management principles derived from your experience in the ICT industry be applied to enhance agricultural processes?
I think data nowadays is very different from what it was back when I started, and we really need to think of data in the context of the three Vs. Data has three attributes – velocity, volume and variety – and that is what forms the main characteristic, or the context, for modern data management.
You need to be able to handle data coming at you at very high speeds and in large volumes, with all kinds of data – some of it useful and most of it not. So you need to really filter all of that in real time. I think nowadays it doesn't feel like we suffer from a lack of data – it is more a question of which applications make sense in the context of agriculture.
So even in the example that I initially gave, it can be all about improving your yield on the basis of thinking about what field is the best for what crop, what fertiliser or how much fertiliser needs to be applied, and then determining the watering practices or irrigation application. So, the entire lifecycle of the specific crop can be monitored. You can instrument the field and the lab for data collection or use machinery for data collection. Then, gather the data needed for analysis that is relevant to solving your specific application.
I would also add that it’s not only about analysis; it’s also about action or control. You need to have this feedback loop. You need to get the data to gain insights into what is really happening. Then, based on the insights and data analytics, you need to have the logic in place to do certain things in order to accomplish whatever objectives you have.
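The sense-analyse-act feedback loop described above can be sketched in a few lines. This is a toy illustration only: the moisture readings, the target band, and the action names are all invented for the example, not any real device API.

```python
# A toy sense-analyse-act loop: a (hypothetical) soil-moisture reading
# is turned into an insight, and the insight into an action.
MOISTURE_TARGET = (30.0, 45.0)  # acceptable volumetric moisture, in %

def decide_action(reading: float) -> str:
    low, high = MOISTURE_TARGET
    if reading < low:
        return "irrigate"   # insight: field too dry -> act
    if reading > high:
        return "pause"      # insight: field too wet -> hold off irrigation
    return "no_action"      # within the target band

# Simulated readings standing in for live sensor data.
for reading in [22.5, 37.0, 51.2]:
    print(reading, "->", decide_action(reading))
```

A real deployment would close the loop by sending the chosen action back to the irrigation controller, which is exactly the feedback path Prof. Boskovic argues is needed alongside analysis.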

Before we get into the feedback loop regarding the data, we often come across the challenge of having data in different formats. This is something the STELAR project is focusing on – overcoming the problem of large data sets that arrive in different formats. How can we overcome the challenge of different data formats collected through modern techniques such as precision farming? Do you have some insights on that?
I think the problem is that this is man-made, and being man-made also means that it can be solved by us. Nevertheless, we need to consider the initial conditions that led data formats to diverge in the first place. We need to understand the motivation of different people or different companies to go with proprietary data formats. Is it a lack of knowledge, or do they not care about interoperability? Do they think they are collecting data only for themselves, or did they have some engineering constraint and had to optimise the data formats to work within it?
The engineering constraint might be “This is how much local memory I have to store the data”, or ”This is the frequency by which I can sample the data and transfer it somewhere else”, or ”This is the data format required by my application that I’m not in charge of”. So there are many different reasons why we end up with different data formats.
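One common way to cope with the divergent formats described above is to write a small adapter per vendor that maps each proprietary payload onto one common record. Both vendor formats below are invented for illustration; only the pattern (per-source parsers converging on a shared schema) is the point.

```python
import json

# Sketch: two hypothetical vendors report the same temperature reading
# in different formats; each parser maps its format onto a common record.
def parse_vendor_a(line: str) -> dict:
    # Vendor A (made up): compact text record "sensor_id;unix_ts;temp_c"
    sensor_id, ts, temp = line.split(";")
    return {"sensor": sensor_id, "timestamp": int(ts), "temp_c": float(temp)}

def parse_vendor_b(payload: str) -> dict:
    # Vendor B (made up): verbose JSON reporting Fahrenheit
    obj = json.loads(payload)
    return {
        "sensor": obj["device"]["id"],
        "timestamp": obj["time"],
        "temp_c": round((obj["tempF"] - 32) * 5 / 9, 2),
    }

a = parse_vendor_a("field-7;1700000000;21.5")
b = parse_vendor_b('{"device": {"id": "field-8"}, "time": 1700000060, "tempF": 70.7}')
print(a)
print(b)
```

Once both sources emit the same record shape (and the same units), downstream analytics no longer needs to know which vendor produced a reading.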
In my opinion, those will always exist. So if you think that you are collecting data only for yourself, for your application, or for your company, you really do not want to invest in interoperability or data formats that don’t bring anything in return to you.
What incentives should companies and people have in order to start working on interoperability? We can again cite some of them. Now that we are entering the age of AI, you will never have enough data. So you cannot ever hope to have all the data you need yourself. You need to make sure that you are able to consume even the data that was not initially produced or gathered by you.
What does it mean – do we need to rely on other people or on data exchanges, and is that then the main incentive for thinking about how to find out the best data format for the specific application? Is it the application, is it the industry, and who is going to dictate the data format?
Again, having seen quite a lot of engineering initiatives, I can say I was one of the few people initially engaged in standardising digital radio and digital communications. I needed data structures, data formats, and the ability to exchange data between different devices. Back then, it was about how to exchange data between Nokia, Motorola, Ericsson, and Alcatel. So you need to agree on specific data structures and data formats. Standards are something you need to have if you are really interested in pushing interoperability firmly into a specific industry.

I was wondering if it then comes down to the fact that we need to see the user in a broader sense, or to keep in mind that the data may need to be reused by a broader group of users than we may have initially thought of, especially if we strive towards an open data space.
I think, again, you can approach this from different aspects. It can be business needs or it can be user needs. Agriculture is an interesting vertical because you would have a business interest, but you would also have a very strong user interest. User interest will be asking for openness and for interoperability, and business is always trying to protect its own investments and IPs, and sometimes it comes at the expense of interoperability.
So, to have the businesses engage and work on interoperability, you need to have either regulations in place – that is, the need for standards – or the business motivation might be that if each of us has different data formats, our market is going to be very fragmented. And if we have a very fragmented market, we are not able to extract full business potential out of it.
On the user side, I think users like to have transparency. So, I would like to know where this food came from and what was the harvesting season. Is the farm using some of the practices that I approve or disapprove of? So, how do you then bring all those interests together into data management practices that will enable this data interoperability? That is very challenging, Julia, and I don’t have a single answer. So, it is both a business push, a regulatory push, and a user push too to have interoperability.
So, it needs to come from different corners of this ecosystem in order to have interoperability in place, and it comes only if everybody has a very clear motivation why they’re doing it.
That's a very good point about interoperability needing to come from different points of view, and I'm guessing it's also about balance. It's great that we got into interoperability, so I wanted to delve into the relationship between interoperability and compatibility between different data formats when it comes to Agri-Food Systems. What are the main points that somebody should pay attention to regarding the relation between these two?
I think interoperability and compatibility might not be exactly the same thing, at least in my mind. With interoperability, it is more that I can assemble or build the data collection equipment, let's say, in a way that I can combine different devices, and they know how to talk to each other and exchange information.
Interoperability mainly comes in this context of having a solution that is working with specific data formats or has been designed for the specific data format, and you can basically unplug one part and plug another part into that system, and it still works. So, when I say part, it means it can be replacing a sensor from manufacturer X with a sensor from manufacturer Z, and still have it work.
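The "unplug one part, plug in another" idea can be illustrated with a shared interface: any sensor that satisfies the interface can be swapped in without touching the pipeline code. Both vendor classes here are hypothetical stand-ins, not real drivers.

```python
from typing import Protocol

# Sketch of device-level interoperability: the pipeline depends only on
# this interface, not on any particular manufacturer.
class MoistureSensor(Protocol):
    def read_percent(self) -> float: ...

class VendorXSensor:
    def read_percent(self) -> float:
        return 33.0  # stand-in for a real driver call

class VendorZSensor:
    def read_percent(self) -> float:
        return 34.5  # different hardware, same interface

def log_reading(sensor: MoistureSensor) -> float:
    value = sensor.read_percent()
    print(f"moisture: {value}%")
    return value

# Replacing manufacturer X with manufacturer Z needs no pipeline changes.
log_reading(VendorXSensor())
log_reading(VendorZSensor())
```

Agreeing on such an interface (and on the data format it carries) is effectively what a standard does for an entire industry rather than for one codebase.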
When it comes to compatibility, I think it is more on the application level. So, it is more like, “Hey, I have this data, and it is stored in a specific way, it is managed in a specific way. Is my application now compatible with that data engineering structure that can make good use of that thing?” So again, I see interoperability mainly on the device instrumentation of the data collection, data processing.
Compatibility is more about how a specific application can access the data, and whether what it gets from that specific structure is useful to it. And again, we can go back to those 3Vs. Some applications will need to access data very fast. So is your database structured and engineered in a way that my application can get a lot of data very quickly to analyse and do what it's supposed to do, or is it slow and sluggish, so that my application cannot really make use of the data, even though the data seems useful on the surface?
Conclusion
This conversation will be released in two parts, so stay tuned for episode 2 where we will delve into other topics. Catch up on episode 1 here and follow our Blog and LinkedIn page.