This project investigates the development of reliable and explainable evaluation metrics
for generative models, with a particular focus on applications in game development.
While state-of-the-art approaches such as GANs and diffusion models are capable of
generating increasingly high-quality content, their adoption in production pipelines is
limited by the lack of robust automated evaluation methods. This challenge is especially
pronounced in modalities central to game development, such as animation, sound
effects, and dialogue.
The research is carried out in collaboration between KTH’s Division of Robotics,
Perception and Learning and Electronic Arts’ SEED research group. Its aims are threefold: first,
to systematically assess and extend existing metrics across game-relevant domains; second,
to develop new domain-specific and multimodal metrics aligned with expert evaluation;
and third, to establish benchmarks that integrate automated measures with human ratings.
The resulting methods will provide both academia and industry with rigorous,
transparent, and human-aligned tools for evaluating generative models.
PhD student: Yifan Lu, KTH