Knowledge Graph Project

Driven by WARA M&L partner SEBx, the Knowledge Graph project is founded on the premise that knowledge graphs have great potential for organizing unstructured, highly heterogeneous data in a structured manner. A knowledge graph can provide a structured framework for data integration, unification, search, and analytics by placing the data in relevant context through interlinked descriptions of concepts, relationships, and events. This could benefit a wide range of industries and organizations. But there are challenges. Handcrafting high-quality knowledge graphs is both time-consuming and expensive, while fully automated construction may produce graphs of too low quality to be useful. In recent years, large language models (LLMs) have been used to extract structured knowledge from unstructured sources automatically, an approach that has shown great potential.
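
To make the idea of interlinked descriptions concrete, a knowledge graph is commonly represented as a set of (subject, predicate, object) triples. The following is a minimal sketch in Python, not the project's actual data model: the entities, the relation names, and the helper functions `build_index` and `neighbors` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical example facts, invented purely for illustration.
# Each triple reads as: (subject, predicate, object).
triples = [
    ("Acme Bank", "headquartered_in", "Stockholm"),
    ("Acme Bank", "offers", "Savings Account"),
    ("Savings Account", "is_a", "Financial Product"),
    ("Stockholm", "located_in", "Sweden"),
]


def build_index(triples):
    """Group outgoing edges by subject so lookups do not scan every triple."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))
    return index


def neighbors(index, entity):
    """Return the (predicate, object) pairs that start at the given entity."""
    return index.get(entity, [])


index = build_index(triples)
print(neighbors(index, "Acme Bank"))
# prints [('headquartered_in', 'Stockholm'), ('offers', 'Savings Account')]
```

Because the object of one triple can be the subject of another (here "Stockholm"), following edges step by step places each fact in the context of related facts, which is what the search and analytics use cases above rely on.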

During the fall semester of 2025, SEBx and KTH offered a project course on this topic, using the Media AI Hub. The students were divided into two groups: one focusing on LLM-based triplet extraction and the other on synthetic data generation. The course concluded with the students submitting reports on their results. These reports are intended to inform a technical white paper. In addition, open results, including the synthetic datasets, will be made available on the Media AI Hub for public use.