An in-depth analysis of data reduction methods for sustainable deep learning

Published in Open Research Europe, 2024

In this paper, we present up to eight different methods to reduce the size of a tabular training dataset, and we develop a Python package to apply them. We also introduce a representativeness metric based on topology to measure the similarity between the reduced datasets and the full training dataset. Additionally, we develop a methodology to apply these data reduction methods to image datasets for object detection tasks.
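
As a rough illustration of what reducing a tabular training dataset can look like, the sketch below draws a class-stratified random subset, so the reduced set keeps approximately the class proportions of the full set. This is a generic baseline written for illustration only. It is not the API of the Python package described above, and the function name `stratified_reduce` and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def stratified_reduce(labels, fraction, seed=0):
    """Return sorted row indices of a class-stratified random subset.

    labels:   sequence with one class label per row of the dataset
    fraction: share of each class to keep, in (0, 1]
    seed:     seed for the random generator, for reproducibility
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    kept = []
    for indices in by_class.values():
        # Keep the requested share of each class, but never drop a class entirely.
        n_keep = max(1, round(fraction * len(indices)))
        kept.extend(rng.sample(indices, n_keep))

    return sorted(kept)


# Toy example: 100 rows, 80 of class "a" and 20 of class "b".
labels = ["a"] * 80 + ["b"] * 20
subset = stratified_reduce(labels, fraction=0.25)
print(len(subset), "rows kept out of", len(labels))
```

The returned indices can be used to select rows from any tabular container (a list of rows, a NumPy array, or a DataFrame). Keeping at least one row per class is a design choice of this sketch, made so that rare classes do not disappear from the reduced set.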

Recommended citation: Perera-Lago J, Toscano-Duran V, Paluzo-Hidalgo E et al. An in-depth analysis of data reduction methods for sustainable deep learning. Open Res Europe 2024, 4:101 (https://doi.org/10.12688/openreseurope.17554.2)