\begin{abstract}
    Cryo-electron microscopy (cryo-EM) is an increasingly popular experimental technique for estimating the 3D structure of macromolecular complexes, such as proteins, from 2D images.
    These images are notoriously noisy, and the pose of the structure in each image is unknown \textit{a priori}.
    Ab-initio 3D reconstruction from 2D images therefore entails estimating the pose of each image in addition to the structure.
    In this work, we propose a new approach to this problem.
    We first adopt a multi-head architecture as a pose encoder to infer multiple plausible poses per image in an amortized fashion.
    This approach mitigates the high uncertainty in pose estimation by encouraging exploration of pose space early in reconstruction.
    Once uncertainty is reduced, we refine poses in an auto-decoding fashion.
    In particular, we initialize each image's pose with its most likely estimate and iteratively update it using stochastic gradient descent (SGD).
    Through evaluation on synthetic datasets, we demonstrate that our method handles multi-modal pose distributions during the amortized inference stage, while the later, more flexible stage of direct pose optimization yields faster and more accurate pose convergence than baselines.
    Finally, on experimental data, we show that our approach is faster than the state-of-the-art cryoAI method and achieves higher-resolution reconstructions.
\end{abstract}