Università della Svizzera italiana
INF
Informatics Seminar
Estimating the intrinsic dimension of discrete-metric spaces
Host: Prof. Antonietta Mira
Monday, 07.11.2022
USI Campus Est, room A1.05, Sector A // Online on Microsoft Teams
14:30-15:30
Iuri Macocco
International School for Advanced Studies (SISSA), Trieste, Italy
Abstract:
Real-world datasets characterized by discrete features are ubiquitous: from categorical surveys to clinical questionnaires, from unweighted networks to DNA sequences. Nevertheless, the most common unsupervised dimensionality reduction methods are designed for continuous spaces, and applying them to discrete spaces can lead to errors and biases. In the first part of the talk, I'll introduce an algorithm to infer the intrinsic dimension (ID) of datasets embedded in discrete spaces. I'll show its accuracy on benchmark datasets and apply it to a metagenomic dataset for species fingerprinting, finding a surprisingly small ID (of order 2), which suggests that evolutionary pressure acts on a low-dimensional manifold despite the high dimensionality of the sequence space. In the second part, I'll apply the ID estimator to unweighted networks and show how this information can be used to validate generating models or, possibly, to infer their parameters.
Biography:
Iuri Macocco is a fourth-year PhD student at the International School for Advanced Studies (SISSA), Trieste, Italy. His research focuses on using the intrinsic dimension to analyze and characterize datasets that are naturally described by discrete metrics. He will present work carried out under the supervision of Prof. A. Laio (SISSA) and J. Grilli (ICTP):
https://arxiv.org/abs/2207.09688