
\begin{abstract}

We evaluate AI-assisted generative capabilities on fundamental numerical kernels in high-performance computing (HPC), including AXPY, GEMV, GEMM, SpMV, Jacobi Stencil, and CG. We test the generated kernel codes for a variety of language-supported programming models, including (1) C\texttt{++} (e.g., OpenMP [including offload], OpenACC, Kokkos, SYCL, CUDA, and HIP), (2) Fortran (e.g., OpenMP [including offload] and OpenACC), (3) Python (e.g., NumPy, Numba, CuPy, and PyCUDA), and (4) Julia (e.g., Threads, CUDA.jl, AMDGPU.jl, and KernelAbstractions.jl).

We use the GitHub Copilot capabilities powered by OpenAI Codex available in Visual Studio Code as of April 2023 to generate a large number of implementations given simple \texttt{<kernel> + <programming model> + <optional hints>} prompt variants. To quantify and compare the results, we propose a proficiency metric based on the initial 10 suggestions given for each prompt.

Results suggest that the OpenAI Codex outputs for C\texttt{++} correlate with the adoption and maturity of programming models. For example, OpenMP and CUDA score highly, whereas HIP is still lacking.

We found that prompts from either a targeted language such as Fortran or the more general-purpose Python can benefit from adding code keywords, while Julia prompts perform acceptably well for the language's mature programming models (e.g., Threads and CUDA.jl).

We expect these benchmarks to provide a point of reference for each programming model's community.

Overall, understanding the convergence of large language models, AI, and HPC is crucial due to its rapidly evolving nature and how it is redefining human-computer interactions.

\end{abstract}
