Prof. Dragan Boskovic in Spotlight for STELAR
In our second episode of the STELAR project’s Data Stories 360° podcast, Julia Mitrovic, the Dissemination and Communication Manager from the Foodscale Hub, continues with the second part of her conversation with Prof. Dragan Boskovic.
As introduced in the first episode, he is a Clinical Professor at the W. P. Carey School of Business, Arizona State University, and the Founder of “Fondacija VizLore Labs.”
In this episode, we tackle the evergreen question of data privacy and security. Additionally, we offer some good advice for those in the agricultural business or those who want to start an agricultural business on how to ensure data accuracy and data quality.
Answering the Evergreen Question on Data Privacy and Security
I have to ask something that comes up in every data-oriented interview, and for good reason: privacy. What measures should we take to address privacy concerns and ensure that data is protected when we implement data management systems in agriculture?
We need to understand who the stakeholders are. We have multiple stakeholders: consumers, producers and different businesses that are involved in the food chain or supply chain. Then we have the regulators themselves: different agencies that are very concerned about the data, whether it is about food quality or something else. So we need to understand who these stakeholders are and what their primary concerns are.
If I’m collecting data on you, such as what food you are buying, whether you are looking for a specific brand, or whether you are more interested in specific agricultural practices or the domain where the food comes from, then we have to make sure that we are really protecting you as a user and your privacy.
There are definitely methods and regulations for doing that. You need to ensure that, even if a data breach occurs in that space, the data cannot be linked to any identity that could point to you as the source of that content.
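One common way to keep leaked records from pointing back to a person is to store a keyed hash (a pseudonym) instead of the raw identity. The sketch below is an illustration only, not a method described in the interview; the key, the field names and the helper functions are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would live in a key store, not in code.
SECRET_KEY = b"replace-with-a-real-secret"


def pseudonymise(consumer_id: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed hash that stands in for the real consumer identity."""
    return hmac.new(key, consumer_id.encode("utf-8"), hashlib.sha256).hexdigest()


def to_stored_record(consumer_id: str, product: str, brand: str) -> dict:
    """Build the record that is actually stored; the raw identity is left out."""
    return {
        "consumer": pseudonymise(consumer_id),
        "product": product,
        "brand": brand,
    }


record = to_stored_record("julia@example.com", "cheese", "Farm A")
```

The same consumer always maps to the same pseudonym, so purchase patterns can still be analysed, but the stored record no longer contains the identity itself.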
But if I am a farmer-producer, then I definitely want some identity to be captured because that is my brand. Especially if I think I’m good at producing, let’s say, wine or cheese or whatever, I want you to be able to verify that the product came from my farm, that it was manufactured and produced according to my practices, and so on. I want you to know that it met all the standards imposed by regulators.
So, we need to understand where the information comes from and what the motivation of those stakeholders is. Are they asking for privacy, or are they asking more for openness? All of that needs to be captured within the data engineering that is built for the specific application or solution.
Now that you have mentioned agricultural businesses, I wanted to touch not only on how agricultural businesses see the privacy concerns and data protection, which you have just spoken about, but also to ask if you have some advice for agricultural businesses. For those who want to go into agricultural businesses, how can they ensure data quality and accuracy, especially when we are talking about large data sets? Is this connected to the question of privacy?
Yes, that is a good point. You are basically asking what the guiding principles would be when designing a greenfield solution for the agricultural sector, that is, a new solution that starts from a blank sheet of paper.
Again, we need to go back to all those things we discussed: the 3Vs, interoperability versus compatibility, and we also need to understand data privacy versus openness. How should that system be designed so that different stakeholders feel good about working and sharing data within that ecosystem?
If we are designing things from scratch, we definitely need to identify what the primary applications for the data are going to be. That’s number one. If I know the applications, I would know what data types or variables I need to collect to make this application useful. This means I need to have all the inputs to get good insights and provide the right answers.
If I know what variables I need to get, I need to ask myself where I can get them from. Is it something I need to get from my farm? Is it something I need to get from the machinery I’m using, or is it something I need to get from public sources that are already available, such as weather data or information on pesticide levels allowed in certain markets? Some data is already there, and we just need to access it.
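The chain from application to variables to sources can be written down as a small lookup before any system is built. The following is a minimal sketch under assumed names; the application, the variables and the source labels are hypothetical examples, not taken from the interview.

```python
# Hypothetical application and variable names, for illustration only.
REQUIRED_VARIABLES = {
    "irrigation_scheduling": ["soil_moisture", "rainfall_forecast", "pump_status"],
}

# Where each variable could come from: own farm, machinery, or a public source.
VARIABLE_SOURCES = {
    "soil_moisture": "farm_sensor",
    "pump_status": "machinery",
    "rainfall_forecast": "public_weather_service",
}


def plan_sources(application: str) -> dict:
    """Group the variables an application needs by the source that provides them."""
    plan: dict = {}
    for variable in REQUIRED_VARIABLES[application]:
        # Variables with no known source are flagged rather than silently dropped.
        source = VARIABLE_SOURCES.get(variable, "unknown")
        plan.setdefault(source, []).append(variable)
    return plan


plan = plan_sources("irrigation_scheduling")
```

Anything that lands under "unknown" is a gap in the data map that has to be closed before the solution is designed.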
Once we have a good map of where the data will be coming from, then we can start designing our solution. We also need to ask ourselves if this is going to be a static system or a dynamic one. Am I going to introduce new applications? Maybe I’m just starting with one, but I have a few others in mind. For practical reasons, I might start with one first and scale up later. All this needs to be well thought through before we start designing for scalability.
I would advise that if you need to collect data from your own field, you should find equipment or sensors that produce data in standardised formats useful for a wide range of applications. You should avoid equipment or data formats that lock you in, forcing you to buy everything else from them to make it work.
Think about interoperability and standards when designing your solution. Consider compatibility at the level of your applications because some of the data will come from outside your ecosystem. Your solution needs to know how to access external data and integrate it with the data you are collecting yourself.
Then we come to the privacy versus openness issue. You need to understand the requirements of your applications. Does an application require a lot of data to be ingested quickly? What is the needed response time? Do you need to react within seconds or milliseconds for critical operations, or is it something that can wait until the evening or the next day, like opening irrigation valves? You need a good understanding of what each application needs and how it manipulates the data to provide useful information.
To summarise: you need a plan. You need to know why you are doing this, where you can access the data, how you can access it, and where you are going in terms of scalability: is the system static or dynamic? Then tackle the issue of application needs to balance privacy and openness.
The last thing is to address privacy versus openness. Part of the data needs to be private, part of it open. Your data engineering should reflect this, not just in terms of who can access the data but also how it is protected, whether it is encrypted, and who controls access.
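A simple way to make "part private, part open" concrete is a field-level policy that states which stakeholder role may read which field. This is a sketch under assumptions: the roles, field names and policy below are hypothetical, and encryption and key management are deliberately left out.

```python
# Hypothetical field-level policy: which roles may read which fields.
FIELD_POLICY = {
    "producer_name": {"consumer", "regulator", "producer"},     # open: part of the brand
    "farming_practice": {"consumer", "regulator", "producer"},  # open
    "pesticide_log": {"regulator", "producer"},                 # restricted
    "sale_price": {"producer"},                                 # business-sensitive
}


def view_for(role: str, record: dict) -> dict:
    """Return only the fields the given role is allowed to see."""
    return {
        name: value
        for name, value in record.items()
        if role in FIELD_POLICY.get(name, set())  # unknown fields default to private
    }


batch = {
    "producer_name": "Farm A",
    "farming_practice": "organic",
    "pesticide_log": "none applied",
    "sale_price": 4.2,
}
consumer_view = view_for("consumer", batch)
regulator_view = view_for("regulator", batch)
```

Defaulting unknown fields to private means a newly added field is never exposed by accident; it becomes visible only once the policy names it.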
You need specific processes to ensure data security, privacy, and openness, so that the information reaches the stakeholders who need it.
Looking into the future, what are the most promising opportunities for innovation in data management within the agricultural industry?
This is very close to my heart because I have been doing research for a long time, and I’m trained to think ahead. We spoke about integrating all those interests and requirements into a solution that balances them to give each stakeholder what they need. With the advances in technology, it is becoming easier to implement solutions that balance these requirements simultaneously.
Having someone ask for openness while another asks for privacy within the same solution was not easy to implement in the past, but going forward, I, as a consumer, would like to see each product I buy have a digital passport.
If I scan a QR code, it should give me all the information about where it came from, what the farming practices were, who the producer was, and everything I might be interested in as a consumer. As a farmer, when I buy new equipment, I need to ensure it collects data on my farming practices because that’s what consumers are asking for.
I need to know if the equipment provides digital data formats that lend themselves to building applications and collecting necessary data. A tractor is just one piece of equipment; many different extensions need to collect data, so I need a digital model of the tractor from the data acquisition perspective.
I need to know how the data flows, what the data formats are, and what additional data collection points can be plugged into that. Let’s call this the digital data acquisition model. The same goes for other equipment like water pumps. I need to know how to control it, what data I need to send, and in what format. Now we come to having these data models of the equipment and needing to collect data from the field in real time, at certain frequencies, in order to build the digital passport of the products. Those three things are what we call the digital twin.
A digital twin is a model of the process updated in real time with data from machinery or the field, giving information on the product or process. Then how would you protect the data within that flow?
I’m very much inclined towards utilising blockchain, with different ledgers and different encryption methods. How would you share the data between stakeholders who might not be interested in sharing very sensitive data among themselves but still have an interest in sharing some results from the applications? By sharing those results from the applications, you can further improve the application, which is going to benefit everyone within that ecosystem.
Moving forward, we need to think of products having digital passports, equipment capable of real-time data collection, and digital models of that equipment for data acquisition and actuation. We need to plug this into secure data structures that handle openness, privacy protection, and business sensitivity while providing users with digital passports for full confidence in what they are buying.
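The core idea behind a ledger can be shown with a hash chain: each entry stores the hash of the previous one, so shared results cannot be changed later without detection. The sketch below is a simplified, single-party illustration, not a blockchain implementation and not the speaker's system; the function names and payload fields are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(payload: dict, previous_hash: str) -> str:
    """Hash an entry's payload together with the hash of the entry before it."""
    body = json.dumps({"payload": payload, "prev": previous_hash}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(ledger: list, payload: dict) -> None:
    """Add a shared result to the ledger, linked to the previous entry."""
    previous_hash = ledger[-1]["hash"] if ledger else GENESIS_HASH
    ledger.append({
        "payload": payload,
        "prev": previous_hash,
        "hash": entry_hash(payload, previous_hash),
    })


def is_valid(ledger: list) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    previous_hash = GENESIS_HASH
    for entry in ledger:
        if entry["prev"] != previous_hash:
            return False
        if entry["hash"] != entry_hash(entry["payload"], previous_hash):
            return False
        previous_hash = entry["hash"]
    return True


ledger: list = []
# Only aggregated application results are shared, not the sensitive raw data.
append(ledger, {"farm": "A", "result": "yield forecast", "value": 7.1})
append(ledger, {"farm": "B", "result": "yield forecast", "value": 6.4})
```

A real deployment would add distribution across parties, signatures and encryption on top of this; the chain only makes tampering evident.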
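The three ingredients named above (an equipment data model, real-time readings, and a product passport) can be sketched as plain data structures. This is a minimal illustration under assumed names; the classes, fields and example values are hypothetical and do not describe a specific product or standard.

```python
from dataclasses import dataclass, field


@dataclass
class EquipmentModel:
    """Data acquisition model: which data points a machine reports, and in what unit."""
    name: str
    data_points: dict  # data point name -> unit


@dataclass
class DigitalTwin:
    """Links a product to the equipment model and the readings collected for it."""
    product_id: str
    producer: str
    equipment: EquipmentModel
    readings: list = field(default_factory=list)

    def ingest(self, data_point: str, value: float, timestamp: str) -> None:
        """Accept a reading only if the equipment model declares that data point."""
        if data_point not in self.equipment.data_points:
            raise ValueError(f"{self.equipment.name} does not report {data_point}")
        self.readings.append({
            "point": data_point,
            "value": value,
            "unit": self.equipment.data_points[data_point],
            "at": timestamp,
        })

    def passport(self) -> dict:
        """Summary a consumer could retrieve, for example after scanning a QR code."""
        return {
            "product_id": self.product_id,
            "producer": self.producer,
            "equipment": self.equipment.name,
            "history": list(self.readings),
        }


pump = EquipmentModel("water pump", {"flow_rate": "l/min"})
twin = DigitalTwin("batch-001", "Farm A", pump)
twin.ingest("flow_rate", 42.0, "2024-05-01T06:00:00Z")
passport = twin.passport()
```

Rejecting readings the equipment model does not declare keeps the passport consistent with what the machinery can actually measure.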
Conclusion
This wraps up our second episode and our conversation with Professor Dragan Boskovic, but more STELAR episodes are coming soon.
Please make sure to follow our YouTube channel and our blog. See you soon!