794bcb6788560580.tex
1: \begin{abstract}
2: Recent years have witnessed a surge of interest in integrating high-dimensional data captured by multisource sensors, driven by the impressive success of neural networks in the integration of multimodal data. 
3: However, the integration of heterogeneous multimodal data poses a significant challenge, as confounders introduce unwanted variability and bias, leading to suboptimal performance of multimodal models. 
4: In this paper, we propose a novel batch normalizer using regularization, called RegBN, that effectively addresses the confounding effects and enables efficient training and testing of multimodal models on diverse and heterogeneous data. RegBN leverages the underlying dependencies between different sources of data and enforces independence, thereby facilitating convergence of neural networks and enhancing the learning of multimodal representations. 
5: The proposed method generalizes well across multiple modalities and eliminates the need for learnable parameters, simplifying training and inference. 
6: We demonstrate the effectiveness of RegBN on seven databases from five research areas, encompassing diverse modalities such as language, audio, image, video, tabular, and 3D MRI. RegBN is openly available at --- and welcomes community feedback to further improve multimodal modeling on heterogeneous data.
7: \end{abstract}
8: