DiSHACLed

Dissecting data workflows using SHACL

The DiSHACLed project aims to increase the efficiency of data intermediaries within the European data ecosystem. Today, discovering and integrating external datasets in a given business or research context is a mostly manual process. DiSHACLed will develop the framework, standards and tooling needed to replace that manual work with semi-automated algorithms.

Unlocking value with data technologies

The key to establishing semantic interoperability is the use of standardized data models when data are registered. To support the screening of datasets, each dataset can be described by its ‘shape’, i.e. the data structure it applies, using the Shapes Constraint Language (SHACL).

Until now, it has not been possible to search for datasets that adhere, fully or partially, to an expected minimal shape of data elements and relations. Given the fast-growing number of open datasets, this is a major hurdle: without machine support, it is hard to find suitable candidate datasets that could enrich a given dataset in a certain business domain, and hard to put that data to use efficiently in the business context.
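
To make "partially adhering to a minimal shape" concrete, the sketch below ranks datasets by the fraction of expected properties they provide. It is a deliberately simplified illustration and not the project's algorithm: the shape is reduced to a plain set of property names, and `shape_coverage`, `rank_datasets`, the 0.5 threshold and the example catalogue are all hypothetical. Real SHACL shapes also carry cardinality, datatype and other constraints that this sketch ignores.

```python
from typing import Dict, List, Set, Tuple


def shape_coverage(expected: Set[str], available: Set[str]) -> float:
    """Fraction of the expected properties that a dataset provides."""
    if not expected:
        return 1.0  # an empty shape is trivially satisfied
    return len(expected & available) / len(expected)


def rank_datasets(
    expected: Set[str],
    catalogue: Dict[str, Set[str]],
    threshold: float = 0.5,
) -> List[Tuple[str, float]]:
    """Return (dataset, coverage) pairs at or above the threshold, best first."""
    scored = [
        (name, shape_coverage(expected, props))
        for name, props in catalogue.items()
    ]
    kept = [(name, score) for name, score in scored if score >= threshold]
    # Sort by descending coverage, then by name for a stable order.
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical minimal shape: the properties expected on each record.
EXPECTED = {"schema:name", "schema:address", "schema:geo", "schema:openingHours"}

# Hypothetical catalogue: dataset name -> properties observed in that dataset.
CATALOGUE = {
    "shops": {"schema:name", "schema:address", "schema:geo", "schema:openingHours"},
    "museums": {"schema:name", "schema:address", "schema:geo"},
    "sensors": {"schema:geo", "sosa:observes"},
    "budgets": {"schema:amount"},
}

RANKED = rank_datasets(EXPECTED, CATALOGUE)
print(RANKED)
```

With these assumed inputs, a dataset that fully matches the shape ranks first, a dataset missing one of the four properties is still returned as a partial match, and datasets covering less than half of the shape are filtered out.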

Flanders, through initiatives like OSLO (Open Standards for Linking Organisations), has demonstrated strong leadership in semantic interoperability and data governance, establishing over 134 semantic standards that align with European vocabularies. These efforts have positioned Flanders and its Data Sharing Service Providers (DSSPs) as pioneers in using SHACL to define application profiles. Building on this foundation, they aim to turn their expertise into business value, in line with the goals of the EU’s Data Governance Act.

The DiSHACLed project aims to enhance data discovery, tool interoperability, and automated form generation within the European data ecosystem. Aligned with the DGA, the project strengthens Flemish DSSPs by harnessing SHACL to develop scalable, efficient solutions for data governance. By bringing together industry and research partners, DiSHACLed contributes to the broader European data technology ecosystem, advancing the next generation of data governance practices.

Key challenges and research objectives

DiSHACLed focuses on three major research goals related to data governance:

  1. Boosting data discovery
  • Developing algorithms to automate dataset discovery.
  • Improving recall rates in data portals, achieving up to 20 dataset discoveries per second.
  • Piloting implementations in at least two data portals to enhance search efficiency.
  2. Creating interoperable tooling
  • Automating the integration of different data processing tools across multiple vendors.
  • Creating a standardized approach to ensure seamless interoperability between systems.
  • Making at least two distinct data processing tools interoperable as part of the project demonstrator.
  3. Enabling automated form generation
  • Enabling the automatic generation of web forms for missing data.
  • Ensuring at least 80% of existing data types can be edited via auto-generated forms.
  • Improving developer efficiency by reducing manual form-building efforts.
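
As an illustration of the third goal, the sketch below derives a simple web form from a list of property constraints. It is a minimal sketch under stated assumptions, not the project's tooling: `PropertyConstraint` is a hypothetical, simplified stand-in for a SHACL property shape (mirroring `sh:path`, `sh:name`, `sh:datatype` and `sh:minCount`), and the datatype-to-input mapping in `INPUT_TYPES` is an assumption made for this example.

```python
from dataclasses import dataclass
from html import escape
from typing import List

# Assumed mapping from XSD datatypes to HTML input types; extend as needed.
INPUT_TYPES = {
    "xsd:string": "text",
    "xsd:integer": "number",
    "xsd:date": "date",
    "xsd:boolean": "checkbox",
}


@dataclass
class PropertyConstraint:
    """Hypothetical, simplified stand-in for a SHACL property shape."""
    path: str           # property name, cf. sh:path
    label: str          # human-readable name, cf. sh:name
    datatype: str       # expected datatype, cf. sh:datatype
    min_count: int = 0  # cf. sh:minCount; 1 or more makes the field required


def render_field(prop: PropertyConstraint) -> str:
    """Render one labelled <input> element for a property constraint."""
    input_type = INPUT_TYPES.get(prop.datatype, "text")  # unknown types fall back to text
    required = " required" if prop.min_count >= 1 else ""
    name = escape(prop.path, quote=True)
    return (
        f'<label>{escape(prop.label)} '
        f'<input type="{input_type}" name="{name}"{required}></label>'
    )


def render_form(props: List[PropertyConstraint]) -> str:
    """Render a <form> with one field per property constraint."""
    fields = "\n".join("  " + render_field(p) for p in props)
    return f"<form>\n{fields}\n</form>"


# Hypothetical shape for an organisation record.
SHAPE = [
    PropertyConstraint("schema:name", "Name", "xsd:string", min_count=1),
    PropertyConstraint("schema:foundingDate", "Founding date", "xsd:date"),
]

FORM_HTML = render_form(SHAPE)
print(FORM_HTML)
```

Because the form is derived from the constraints, adding a property to the shape adds a field without any hand-written form code, which is the kind of developer effort the goal above aims to reduce.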

Potential applications and impact

DiSHACLed has broad applications across various domains, including:

  • Government data sharing: Enhancing interoperability within the Flanders Smart Data Space.
  • Business intelligence: Streamlining data integration and discovery for enterprises.
  • Smart cities: Supporting urban data platforms like Urban Sense.
  • Research and academia: Facilitating data discovery and automation for large-scale research projects.

While the project focuses on technological advancements, it ensures compliance with data governance regulations. By promoting data interoperability and automation, DiSHACLed aims to reduce manual effort, increase data accessibility, and enhance trust in data sharing practices.

“DiSHACLed seeks to not only streamline manual processes but also contribute to the European datatech ecosystem by proposing efficient, scalable solutions with wide-reaching implications for data intermediaries, businesses, and citizens.”

DiSHACLed

DiSHACLed addresses critical challenges in data discovery, tool interoperability, and form automation, aiming to set a new standard for efficient data workflows.

DiSHACLed is an imec.icon research project funded by imec and Agentschap Innoveren & Ondernemen (VLAIO).

The project started on 01.03.2025 and is set to run until the end of February 2027.

Project information

Industry

  • Inuits
  • Redpencil
  • Sirus

Research

  • imec – IDLab Data Science Lab – UGent
  • imec – AI&Algorithms

Contact

  • Project lead: Johan Delaure, Redpencil
  • Research lead: Pieter Colpaert, imec – IDLab Data Science Lab – UGent
  • Proposal manager: Pieter Colpaert, imec – IDLab Data Science Lab – UGent
  • Innovation manager: Annelies Vandamme, imec.icon