EU Project STELAR starts: Transforming Agricultural Data Practices
Launched with a total budget of €5.7 million, the EU-funded STELAR project will design, develop, and evaluate an innovative Knowledge Lake Management System (KLMS) to make raw data AI-ready. The STELAR kick-off meeting was held at the end of September in Athens, Greece, hosted by the project coordinator, the “Athena” Research and Innovation Centre.
STELAR’s ambition is lofty: to follow a human-in-the-loop approach to ensure data of high accuracy and quality. It will do this by working with real-life use cases and gathering feedback and expertise. The project will run from 2022 to 2025 with a consortium of nine partners across five European countries.
STELAR's Knowledge Lake Management System (KLMS)
If data is to be considered an asset, it must be well managed and easy to report on. Of course, that’s the ideal scenario. Data is only as valuable as it is easy to use.
But in real life, data users often find themselves drowning in a data lake. In technical parlance, a data lake is a system or repository of data stored in its natural/raw format, with original files, including raw copies of source system data, sensor data, social data etc., as well as transformed data used for tasks such as reporting, visualisation, advanced analytics and machine learning.
Over the past years, a large number of national and international open data portals have been established. The EU open data portal lists more than 1.3M datasets, with more than 250K in the agrifood sector alone. However, search capabilities are rather limited and high-quality metadata is lacking, which limits the use of these datasets for AI-related tasks.
A new EU project wants to make swimming in this data lake easier. STELAR, which launched with a budget of €5.7 million, will design, develop, and evaluate an innovative Knowledge Lake Management System (KLMS) to empower users with data that is findable, accessible, interoperable and reusable, as well as high-quality and reliably labelled, hence making it AI-ready.
The Data is in the Details
STELAR’s goal is to design a knowledge framework that facilitates data profiling and quality management of raw data, empowering users to efficiently discover the right data for their needs in a timely manner.
To do so, STELAR’s knowledge lake will include (a) additional, more fine-grained metadata automatically extracted from the content; (b) various quality indicators, including domain-specific ones; and (c) data summaries, which can increase the energy efficiency of analytics over large data volumes.
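To give a feel for what these three ingredients can look like on a small table, here is a minimal sketch in Python. It is an illustration only and not STELAR's implementation: the function names, the sample rows, the column names and the choice of completeness as the quality indicator are all assumptions made for this example.

```python
from statistics import mean


def profile_column(name, values):
    """Profile one column: extracted metadata, a quality indicator, a summary.

    Hypothetical helper for illustration; not part of any STELAR tool.
    """
    # Treat None and empty strings as missing values.
    present = [v for v in values if v is not None and v != ""]

    # Try to read every present value as a number.
    numeric = []
    for v in present:
        try:
            numeric.append(float(v))
        except (TypeError, ValueError):
            pass
    is_numeric = bool(present) and len(numeric) == len(present)

    profile = {
        "column": name,
        # (a) metadata extracted from the content itself
        "inferred_type": "numeric" if is_numeric else "text",
        "distinct_values": len(set(present)),
        # (b) a simple quality indicator: share of non-missing values
        "completeness": len(present) / len(values) if values else 0.0,
    }
    if is_numeric:
        # (c) a compact summary that can answer simple questions
        # without rescanning the raw data
        profile["summary"] = {
            "min": min(numeric),
            "max": max(numeric),
            "mean": mean(numeric),
        }
    return profile


def profile_table(rows):
    """Profile every column of a table given as a list of dicts."""
    columns = sorted({key for row in rows for key in row})
    return [profile_column(c, [row.get(c) for row in rows]) for c in columns]


# Made-up sample data, for illustration only.
rows = [
    {"crop": "wheat", "yield_t_ha": "6.5"},
    {"crop": "maize", "yield_t_ha": "9.5"},
    {"crop": "wheat", "yield_t_ha": ""},
    {"crop": None, "yield_t_ha": "8.0"},
]
profiles = profile_table(rows)
for p in profiles:
    print(p)
```

A real system would add domain-specific indicators on top of generic ones like completeness; the sketch only shows how metadata, quality indicators and summaries can sit side by side in one profile record.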
The project also aims to reduce the manual effort required to sift through data by automating the configuration of workflows, to ensure efficiency and scalability when dealing with large entity collections, and to ascertain the robustness of the proposed techniques, in terms of both time efficiency and effectiveness, under various levels of noise.
In practice, the methodology works like this: First, the project will use real-world use cases in the agrifood data space. Second, the focus will be on agile prototyping, meaning that development cycles will be short and constantly evaluated by experts, with ample room for collaboration and flexibility.
Third, STELAR aims for state-of-the-art scientific outcomes: all developed tools will implement advanced techniques and algorithms, leveraging the extensive experience and expertise of the Consortium partners in these areas. Fourth, the project will carry out pilot testing and elicit feedback from the domain experts and stakeholders involved.
The use cases were chosen to cover different functionalities and aspects of the agrifood data space.
The first use case focuses on risk prevention in food supply chains: it will study how advanced data management technologies can be used to enable the generation of integrated, AI-ready datasets for food risk prediction.
To enhance early crop growth predictions, the second use case will combine data from different sources (e.g., operational multispectral and SAR satellite data as well as available hyperspectral data) to provide input for physical modeling and deep learning techniques that derive crop type, crop status and current crop growth. The goal is to predict crop development as early as the start of vegetation in spring.
The third use case uses data for timely, precise farming interventions: it integrates Earth observation data with farm-based technologies, selected and contextualized according to local specificities, to give farmers valuable data that supports land management and crop planning.
These pilots should offer the opportunity to validate and demonstrate outcomes in real scenarios involving users outside the project consortium, enabling feedback from a wider range of stakeholders and thus avoiding internal bias. STELAR will also make online demos of its tools and open-source code available to solicit additional feedback from external stakeholders.
Clean Data, Clear Data
To sum up, the project’s ambition is lofty: to follow a human-in-the-loop approach to ensure data of high accuracy and quality.
On the one hand, the data management tools to be developed will aim at increasing automation and reducing manual effort. On the other hand, these tools will be designed with domain expertise and background knowledge from multiple stakeholders and experts, so that they can successfully perform their respective tasks in real-world scenarios and settings.