\documentclass{article}
\usepackage{amsmath,amsfonts,amsthm}
\usepackage{graphicx,url,calc}

\usepackage[lined,ruled,vlined]{algorithm2e}

%\usepackage{algorithmic}
\usepackage{multirow}
\usepackage{rotating}
\usepackage{xspace}

\newtheorem{thm}{Theorem}[section]
\newtheorem{cnj}[thm]{Conjecture}
\newtheorem{lem}[thm]{Lemma}
\newtheorem{cor}[thm]{Corollary}
\newtheorem{prop}[thm]{Proposition}
\newtheorem{defi}[thm]{Definition}
\newtheorem{rem}[thm]{Remark}
\newtheorem{pty}[thm]{Property}

\newcommand{\di}{\displaystyle}
\newcommand{\GF}[1]{\ensuremath{\mathtt {GF}(#1)}}
\newcommand{\GO}{{\cal O}}
\newcommand{\pF}[1]{\leavevmode
\kern.1em\raise.0ex \hbox{\Z}\kern-.1em /\kern-.15em\lower.3ex
\hbox{#1}\lower.3ex \hbox{\Z}}


\newcommand{\Zp}{\leavevmode\kern.1em\raise.0ex \hbox{\ensuremath{\mathbb{Z}}}\kern-.1em /\kern-.15em\lower.3ex\hbox{p}\mbox{\ensuremath{\mathbb{Z}}}\xspace}
%\def\Zp{{\ensuremath{{\mathbb{Z}_p}}}}
\newcommand{\Z}{\ensuremath{\mathbb{Z}}\xspace}
%\def\Zp{{\mathbb{F}_q}}

\newcommand{\linbox}{{\sc LinBox}}
\newcommand{\MM}{\ensuremath{\text{MM}}\xspace}
\newcommand{\TRMM}{\ensuremath{\text{TRMM}}\xspace}
%\newcommand{\LSP}{{\ensuremath{\text{LSP}}}}
\newcommand{\LQUP}{\ensuremath{\text{LQUP}}\xspace}
\newcommand{\TRSM}{\ensuremath{\text{TRSM}}\xspace}
\newcommand{\LTL}{\ensuremath{\text{LTL}}\xspace}
\newcommand{\UTUT}{\ensuremath{\text{UTUT}}\xspace}
\newcommand{\UTLT}{\ensuremath{\text{UTLT}}\xspace}
\newcommand{\INVT}{\ensuremath{\text{INVT}}\xspace}
\newcommand{\SymAAT}{\ensuremath{\text{SymAAT}}\xspace}
\newcommand{\AAT}{\ensuremath{\text{AAT}}\xspace}
\newcommand{\ADD}{\ensuremath{\text{ADD}}\xspace}
%\newcommand{\lsp}{\LSP}
\newcommand{\lup}{\ensuremath{\text{LUP}}\xspace}
\newcommand{\lud}{\ensuremath{\text{LUdivine}}\xspace}
\newcommand{\lqup}{\ensuremath{\text{LQUP}}\xspace}
\newcommand{\turbo}{\ensuremath{\text{TURBO}}\xspace}
\newcommand{\fgemm}{\texttt{fgemm}\xspace}
\newcommand{\tu}{\turbo}
\newcommand{\trsm}{\texttt{trsm}\xspace}
\newcommand{\dtrsm}{\texttt{dtrsm}\xspace}
\newcommand{\ftrsm}{\texttt{ftrsm}\xspace}
%% \newcommand{\ltrsm}{{{\tt ULeft-Trsm}}}
%% \newcommand{\ultrsm}{{{\tt ULeft-Trsm}}}
%% \newcommand{\lltrsm}{{{\tt LLeft-Trsm}}}
%% \newcommand{\urtrsm}{{{\tt URight-Trsm}}}
%% \newcommand{\lrtrsm}{{{\tt LRight-Trsm}}}
\newcommand{\dbl}{\texttt{double} }

\newcommand{\til}{\lower 2pt\hbox{\small${}^\sim$}}
%

\newcommand{\lCeil}{\left\lceil}
\newcommand{\rCeil}{\right\rceil}
%
\usepackage{fancyhdr}
\pagestyle{fancy}
%\markboth{Jean-Guillaume Dumas et al.}{Prime Field Linear Algebra}
\renewcommand{\leftmark}{J.-G. Dumas, P. Giorgi, C. Pernet}
\renewcommand{\rightmark}{Prime Field Linear Algebra}


\title{Dense Linear Algebra over Word-Size Prime Fields: the FFLAS and
FFPACK packages\footnote{This material is based on work supported in
part by the Institut de Math\'ematiques Appliqu\'ees de Grenoble,
project IMAG-AHA. This work was mostly done while the second author was a postdoctoral fellow of the Symbolic Computation Group, D.R. Cheriton School
of Computer Science, University of Waterloo, Canada.}}
% \footnote{
% This material is based on work supported in part by the Institut de
% Math\'ematiques Appliqu\'ees de Grenoble, project IMAG-AHA.
% This work was mostly done while the second author was a postdoctoral fellow of the Symbolic Computation Group, D.R. Cheriton School
% of Computer Science, University of Waterloo, Canada.\\
% Author's addresses: J.-G. Dumas, Laboratoire de Mod\'elisation et de Calcul, Universit\'e de Grenoble, 51, rue des Math\'ematiques BP 53 IMAG-LMC,
% 38041 Grenoble, France; email: jean-guillaume.dumas@imag.fr,
% %P. Giorgi, Symbolic Computation Group, D.R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada; email: pgiorgi@uwaterloo.ca.
% P. Giorgi, Laboratoire LP2A, Universit\'e de Perpignan Via Domitia, 52 avenue
% Paul Alduy, F-66860 Perpignan Cedex, France. email: pascal.giorgi@univ-perp.fr,
% C. Pernet, Dept. of Mathematics, University of Washington, Box 354350 Seattle, WA, 98195-4350, USA.
% }
%}

\urldef\jgdemail\url{Jean-Guillaume.Dumas@imag.fr}
\urldef\pgemail\url{Pascal.Giorgi@lirmm.fr}
\urldef\cpemail\url{Clement.Pernet@imag.fr}

\author{Jean-Guillaume Dumas\footnote{Laboratoire Jean Kuntzmann, UMR
CNRS 5224, 51, rue des Math\'ematiques BP 53 IMAG-LMC,
F38041 Grenoble, France; \jgdemail}\\ Universit\'e de Grenoble
\and Pascal Giorgi\footnote{Laboratoire d'Informatique de Robotique
et de Micro\'electronique de Montpellier, UMR CNRS 5506; \pgemail}\\ Universit\'e de Montpellier
\and Cl\'ement Pernet\footnote{MOAIS (INRIA Rh\^one-Alpes / CNRS LIG
Laboratoire d'Informatique de Grenoble); \cpemail}\\ Universit\'e de Grenoble}

% \category{G.4}{Mathematical Software}{Algorithm design and analysis}
% \category{F.2.1}{Analysis of Algorithms and Problem Complexity}{Numerical Algorithms and Problems}[computations in finite fields.]
% \terms{Algorithms, Experimentation, Performance.}


\begin{document}
\maketitle

\begin{center}{\bf Abstract}\\[10pt]
%\begin{abstract}
\begin{minipage}{\textwidth*5/6}\small
In the past two decades, major efforts have been made to reduce
exact (e.g.\ integer, rational, polynomial) linear algebra problems
to matrix multiplication in order to provide algorithms with optimal asymptotic complexity.
To provide efficient implementations of such algorithms, one needs to be careful with the underlying arithmetic.
It is well known that modular techniques such as the Chinese remainder algorithm or $p$-adic lifting allow
very good practical performance, especially when word-size arithmetic is used.
Therefore, finite field arithmetic becomes an important core for efficient exact linear algebra libraries.
In this paper, we study high-performance implementations of basic linear algebra routines
over word-size prime fields, in particular matrix multiplication; our goal is to provide an exact alternative to the numerical BLAS library.
We show that this is made possible by a careful combination of numerical computations and asymptotically faster algorithms.
Our kernel has several symbolic linear algebra applications enabled by diverse
matrix multiplication reductions: symbolic triangularization,
system solving, determinant, and matrix inverse implementations are thus studied.
%\end{abstract}
\end{minipage}\end{center}

{\bf Keywords}: Word-size prime fields; BLAS level 1-2-3; {Linear Algebra
Package}; Winograd's symbolic Matrix Multiplication; Matrix
Factorization; Exact Determinant; Exact Inverse.

\newpage
{\small
\tableofcontents
}
% Part 1: Introduction
\input{intro}

% Part 2: arithmetic and dotprod
%\input{dotprod}

% New Part 2: preliminaries + finite field arithmetic
\input{preliminaries}


% Part 3: matrix product: fflas+wino
\input{fflas}
%
\input{trsm}
%

% Part 4: elimination: ffpack + 5/6 + inv + ...
\input{ffpack}

%
\input{concl}

\appendix

%\input{appwino_short.tex}
\input{winobound}

\bibliographystyle{plain}
\addcontentsline{toc}{section}{References}
\bibliography{jgdbibl}

\end{document}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%% Local Variables:
%%% mode: latex
%%% TeX-master: "dlaff"
%%% End: