\begin{abstract}
Phylogenetic trees elucidate evolutionary relationships among species, but phylogenetic inference remains challenging because it must combine continuous parameters (branch lengths) with discrete parameters (tree topology).
Traditional Markov Chain Monte Carlo methods converge slowly and are computationally expensive. Existing Variational Inference methods require pre-generated topologies and typically treat tree structures and branch lengths independently; as a result, they may overlook critical sequence features, which limits their accuracy and flexibility.
We propose {PhyloGen}, a novel method that leverages a pre-trained genomic language model to generate and optimize phylogenetic trees without depending on evolutionary models or aligned sequence constraints. {PhyloGen} treats phylogenetic inference as a conditionally constrained \textbf{tree structure generation} problem and jointly optimizes tree topology and branch lengths through three core modules: (i) Feature Extraction, (ii) PhyloTree Construction, and (iii) PhyloTree Structure Modeling.
In addition, we introduce a Scoring Function that guides the model toward more stable gradient descent.
We demonstrate the effectiveness and robustness of {PhyloGen} on eight real-world benchmark datasets. Visualization results confirm that {PhyloGen} provides deeper insights into phylogenetic relationships.
\end{abstract}
