\begin{proof}
Fix $\theta \in \mathcal{P}(E)$.
Define the sets \[Q_0 := \{\hat{\mu} \in \mathcal{P}(E) : d_W(\hat{\mu},\theta) \leq r\}\]
and, for $i = 1,\dots,n-1$ and $x_1,\dots,x_i \in E$,
\[
Q_i(x_1,\dots,x_i) := \{\hat{\mu} \in \mathcal{P}(E) : d_W(\hat{\mu},\pi(x_i)) \leq r\}.
\]
We note that $M_n(\theta) = Q_0 \otimes Q_1 \otimes \dots \otimes Q_{n-1}$, where the right-hand side denotes the set of measures $\mu = K_0 \otimes K_1 \otimes \dots \otimes K_{n-1} \in \mathcal{P}(E^n)$ such that $K_0 \in Q_0$ and, for $i = 1,\dots,n-1$, $K_i : E^{i} \rightarrow \mathcal{P}(E)$ is a Borel measurable kernel with $K_i(x_1,\dots,x_i) \in Q_i(x_1,\dots,x_i)$ for $\mu$-almost all $(x_1,\dots,x_i)$.
Since for all $i = 1,\dots,n-1$ the set $\{(x_1,\dots,x_i,\hat{\mu}) \in E^i \times \mathcal{P}(E) : \hat{\mu} \in Q_i(x_1,\dots,x_i)\}$ is trivially Borel, a measurable selection argument (e.g.~\cite[Prop.~7.50]{bertsekas2004stochastic}) yields, for every $\nu \in \mathcal{P}(E^n)$,
\begin{align*}
\inf_{\hat{\mu} \in M_n(\theta)} R(\nu,\hat{\mu}) &= \inf_{K_0\otimes \dots \otimes K_{n-1} \in Q_0 \otimes \dots \otimes Q_{n-1}} \sum_{i=0}^{n-1} \int_{E^n} R(\nu_{i,i+1}(x_1,\dots,x_i),K_i(x_1,\dots,x_i)) \,\nu(dx_1,\dots,dx_n) \\
&\stackrel{(*)}{=} \sum_{i=0}^{n-1} \int_{E^n} \inf_{\hat{\mu} \in Q_i(x_1,\dots,x_i)} R(\nu_{i,i+1}(x_1,\dots,x_i),\hat{\mu}) \,\nu(dx_1,\dots,dx_n)\\
&= \beta(\nu_{0,1},\theta) + \sum_{i=1}^{n-1} \int_{E^n} \beta(\nu_{i,i+1}(x_1,\dots,x_i),\pi(x_i)) \,\nu(dx_1,\dots,dx_n)\\
&= \beta_n^{\theta}(\nu),
\end{align*}
where step $(*)$ is made rigorous by an inductive argument; see the proofs of \cite[Lemma 4.4]{bartl2016exponential} and \cite[Prop.~5.2]{lacker2016non}.
\end{proof}
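
To illustrate the decomposition in the proof above, the case $n = 2$ can be written out explicitly. This is only a sketch under two assumptions read off from the displayed chain of equalities: that $\beta(\nu',\theta') = \inf\{R(\nu',\hat{\mu}) : d_W(\hat{\mu},\theta') \leq r\}$, and that $\nu_{0,1}$ denotes the first marginal of $\nu$ while $\nu_{1,2}(x_1)$ denotes the conditional law of the second coordinate given the first. Under these assumptions,
\begin{align*}
\inf_{\hat{\mu} \in M_2(\theta)} R(\nu,\hat{\mu})
&= \inf_{K_0 \otimes K_1 \in Q_0 \otimes Q_1} \Big( R(\nu_{0,1},K_0) + \int_{E^2} R(\nu_{1,2}(x_1),K_1(x_1)) \,\nu(dx_1,dx_2) \Big) \\
&= \inf_{K_0 \in Q_0} R(\nu_{0,1},K_0) + \int_{E^2} \inf_{\hat{\mu} \in Q_1(x_1)} R(\nu_{1,2}(x_1),\hat{\mu}) \,\nu(dx_1,dx_2) \\
&= \beta(\nu_{0,1},\theta) + \int_{E^2} \beta(\nu_{1,2}(x_1),\pi(x_1)) \,\nu(dx_1,dx_2) \\
&= \beta_2^{\theta}(\nu).
\end{align*}
The first term separates because $K_0$ does not depend on $x_1$, and the infimum over kernels $K_1$ can be moved inside the integral by choosing a measurable (near-)minimizer in $Q_1(x_1)$ for each $x_1$, which is the role of the measurable selection argument.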