9a2be4cbab78ea7d.tex
1: \begin{abstract}
2: 
3: Training Large Language Models (LLMs) efficiently at scale presents a formidable challenge, driven by their ever-increasing computational demands and the need for enhanced performance. In this work, we introduce \texttt{Liger-Kernel}, an open-source set of Triton kernels developed specifically for LLM training. Using kernel optimization techniques such as kernel operation fusion and input chunking, our kernels achieve, on average, a 20\% increase in training throughput and a 60\% reduction in GPU memory usage for popular LLMs compared with HuggingFace implementations. In addition, \texttt{Liger-Kernel} is designed with modularity, accessibility, and adaptability in mind, catering to both casual and expert users. Comprehensive benchmarks and integration tests are built in to ensure compatibility, performance, correctness, and convergence across diverse computing environments and model architectures.
4: The source code is available under a permissive license at \href{https://github.com/linkedin/Liger-Kernel}{https://github.com/linkedin/Liger-Kernel}.
5: \end{abstract}
6: