
We invite everyone to the talk "Data Quality for Deep Learning", given by Prof. Dr. Saúl Calderón Ramirez.

Date and time: Thursday, March 2, 14:00.
Location: room to be confirmed, in Pabellón 0+i.

Abstract:
Deep learning models usually need large amounts of labeled data, which becomes a concern in real-world applications. Labeling a dataset is known to be costly in time, money, and other resources. Different methods exploit small labeled datasets together with other types of data that have less costly labeling schemes (data augmentation, self-supervised learning, semi-supervised learning, etc.). For instance, a semi-supervised learning model (SSLM) uses both labeled and unlabeled datasets for training, improving overall performance when the labeled dataset is small. The unlabeled datasets may include data that is out-of-distribution with respect to the labeled data, which may affect the model's accuracy and future predictions. We introduce the importance of data quality metrics, especially considering that the future of deep learning models targets real-world applications such as healthcare. Concepts such as data quality metrics have normally been applied to structured data; however, they can also be applied to unstructured data (the datasets used to train deep learning models) in different types of learning settings.

Saúl Calderón Ramirez holds a Ph.D. in Computer Science (De Montfort University, United Kingdom) and a Magister Scientiae in Electrical Engineering with an emphasis on digital systems (Universidad de Costa Rica). Saúl is a professor at the Instituto Tecnológico de Costa Rica and coordinates the PAttern Recognition and MAchine Learning Group (PARMA-Group).