Léo Liberti, CNRS Research Director and Professor at École polytechnique, will visit IRT SystemX on 18 January from 2:00 pm to 3:30 pm (Building 862, Amphi 34) to give a seminar on "Distance Geometry in Data Science".
Abstract
Many problems in data science are addressed by mapping entities of various kinds to vectors in a Euclidean space of some dimension. Most of these methods (e.g. Multidimensional Scaling, Principal Component Analysis, K-means clustering, random projections) are based on the proximity of pairs of vectors. For the results of these methods to make sense when mapped back, the proximity of entities in the original problem must be well approximated in the Euclidean space setting. If proximity were known for each pair of original entities, this mapping would be a good example of isometric embedding. Usually, however, this is not the case, as data are partial, wrong and noisy. I shall survey some of the methods above from the point of view of Distance Geometry.
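As a toy illustration of the isometric embedding mentioned in the abstract (not taken from the seminar itself), the sketch below places three entities in the plane when all three pairwise distances are known exactly. The function name `embed_triangle` and the choice of three points in two dimensions are assumptions made for this example only; real data, as the abstract notes, are partial and noisy, which this sketch does not handle.

```python
import math

def embed_triangle(d01, d02, d12):
    """Hypothetical helper: return three points in the plane whose pairwise
    Euclidean distances are exactly d01, d02 and d12."""
    if d01 <= 0:
        raise ValueError("d01 must be positive")
    # The distances must satisfy the triangle inequality to be realizable.
    if d01 > d02 + d12 or d02 > d01 + d12 or d12 > d01 + d02:
        raise ValueError("distances violate the triangle inequality")
    # Fix point 0 at the origin and point 1 on the x-axis.
    # Point 2 then follows from the law of cosines.
    x = (d01 ** 2 + d02 ** 2 - d12 ** 2) / (2 * d01)
    y = math.sqrt(max(d02 ** 2 - x ** 2, 0.0))  # clamp rounding noise
    return [(0.0, 0.0), (d01, 0.0), (x, y)]

# Example: complete distance information for three entities.
points = embed_triangle(3.0, 4.0, 5.0)
recovered = [
    math.dist(points[0], points[1]),
    math.dist(points[0], points[2]),
    math.dist(points[1], points[2]),
]
```

Here `recovered` reproduces the input distances up to floating-point error, so the map from entities to points is an isometry. With missing or inconsistent distances no such exact placement need exist.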