\begin{abstract}
This paper introduces an efficient and generic framework for finite-element simulations under implicit time integration.
The framework is compatible with generic constitutive models, and its fast matrix assembly method exploits the fact that the system matrices are built deterministically as long as the mesh topology remains constant.
Exploiting the sparsity pattern of the assembled system yields significant optimizations in the assembly stage.
As a result, established GPU-based parallelization techniques can be applied directly to the assembled system.
Moreover, an asynchronous Cholesky preconditioning scheme is used to improve the convergence of the system solver.
On this basis, a GPU-based Cholesky preconditioner is developed, which significantly reduces the data transfer between the CPU and the GPU during the solving stage.
We evaluate the performance of our method with different mesh elements and hyperelastic models, and compare it with typical CPU-based and GPU-based approaches.
\end{abstract}