\begin{abstract}%
%  Diffusion condensation is a sequence of multiscale representations of a dataset that aims to encode meaningful abstractions. It is obtained by iteratively applying a diffusion operator to the current representation. Hence, diffusion condensation is a time-inhomogeneous process created by a cascade of operators. In this work, we propose a theoretical analysis of this process. We mainly focus on two topics: the convergence and the evolving geometry of the process. We study the convergence of the process from a geometric and a spectral perspective. In both cases, we define a family of diffusion operators that guarantee the convergence to a single point. Our spectral results are of particular interest since most of the literature is focused on homogeneous processes. To understand the evolution of the geometry of the dataset, we rely on topological data analysis. In particular, we differentiate between an ambient and an intrinsic analysis. We define an intrinsic diffusion filtration that arises from the condensation process, and use the intrinsic diffusion homology to summarize the overall topological activity during condensation. We base our ambient analysis on the persistent homology of each representation, hence creating a sequence of persistence diagrams. Furthermore, we present results of both the ambient and intrinsic analyses on toy datasets. Lastly, we show the equivalence between diffusion condensation and instances of hierarchical clustering algorithms. Our work establishes theoretical properties of diffusion condensation, and it strengthens the link between topology and data analysis. \todo{237 words, Max 250}
%\end{abstract}