\begin{abstract}
In this paper, we are concerned with estimating the joint probability
of random variables $X$ and $Y$, given $N$ independent observation blocks
$(\zb x^i,\zb y^i)$, $i=1,\ldots,N$,
each consisting of $M$ samples
$(\zb x^i,\zb y^i) = \big( (x^i_j, y^i_{\sigma^i(j)}) \big)_{j=1}^M$,
where $\sigma^i$ denotes an unknown permutation
of i.i.d.~sampled pairs
$(x^i_j,y_j^i)$, $j=1,\ldots,M$.
This means that the internal ordering
of the $M$ samples within an observation block is not known.
We derive a maximum-likelihood inference functional, propose a computationally tractable approximation, and analyze their properties. In particular, we prove a $\Gamma$-convergence
result showing that we can recover the true density
from empirical approximations as the number $N$ of blocks goes to infinity.
Using entropic optimal transport kernels, we model a class of hypothesis spaces of density functions over which the inference functional can be minimized.
This hypothesis class is particularly suited for
approximate inference of transfer operators from data.
We solve the resulting discrete minimization problem by
a modification of the EMML algorithm that takes additional transition probability
constraints into account, and we prove the convergence of this algorithm.
Proof-of-concept examples demonstrate the potential of our method.
\end{abstract}
