\begin{abstract}
    Phylogenetic trees elucidate evolutionary relationships among species, but phylogenetic inference remains challenging because it must jointly estimate continuous parameters (branch lengths) and discrete parameters (tree topology).
    Traditional Markov chain Monte Carlo methods suffer from slow convergence and high computational cost. Existing variational inference methods require pre-generated topologies and typically treat tree topology and branch lengths independently, so they may overlook critical sequence features, which limits their accuracy and flexibility.
    We propose {PhyloGen}, a novel method that leverages a pre-trained genomic language model to generate and optimize phylogenetic trees without depending on evolutionary models or aligned sequence constraints. {PhyloGen} treats phylogenetic inference as a conditionally constrained \textbf{tree structure generation} problem and jointly optimizes tree topology and branch lengths through three core modules: (i) Feature Extraction, (ii) PhyloTree Construction, and (iii) PhyloTree Structure Modeling.
    We also introduce a Scoring Function that guides the model toward more stable gradient descent.
    We demonstrate the effectiveness and robustness of {PhyloGen} on eight real-world benchmark datasets. Visualization results confirm that {PhyloGen} provides deeper insights into phylogenetic relationships.
\end{abstract}