1: \documentclass[times,10pt,twocolumn]{article}

2: \usepackage{latex8}

3: \usepackage{times}

4: \usepackage{amsmath,amsthm,amsfonts,amssymb,amscd}

5: \usepackage{fancyhdr}

6: \input epsf

7:

8: \newtheorem{lemma}{Lemma}

9: \newtheorem{theorem}[lemma]{Theorem}

10: \newtheorem{corollary}[lemma]{Corollary}

11: \newtheorem{definition}{Definition}

12: \newtheorem{proposition}[lemma]{Proposition}

13: \newtheorem{question}{Question}

14: \numberwithin{lemma}{section}

15: \numberwithin{definition}{section}

16: \numberwithin{question}{section}

17: \newcommand\C{{\mathbb{C}}}

18: \newcommand\R{{\mathbb{R}}}

19: \newcommand\Quat{{\mathbb{H}}}

20: \newcommand\M{{\mathbb{M}}}

21: \newcommand\F{{\mathbb{F}}}

22: \newcommand\Z{{\mathbb{Z}}}

23: \newcommand\Tr{{\mathop\textup{Tr }}}

24:

25: \setcounter{page}{438}

26: \pagestyle{fancy}

27: \fancyhead{}

28: \renewcommand{\headrulewidth}{0pt}

29: \fancyfoot[CE,CO]{\thepage}

30:

31: \begin{document}

32:

33: \title{A Group-theoretic Approach to Fast Matrix Multiplication}

34:

35: \author{Henry Cohn\\

36: Microsoft Research\\

37: One Microsoft Way\\

38: Redmond, WA 98052-6399\\

39: cohn@microsoft.com\\

40: \and

41: Christopher Umans\\

42: Department of Computer Science\\

43: California Institute of Technology\\

44: Pasadena, CA 91125\\

45: umans@cs.caltech.edu\\

46: }

47:

48: \maketitle

49:

50: \begin{abstract}

51: We develop a new, group-theoretic approach to bounding the

52: exponent of matrix multiplication. There are two components to

53: this approach: (1) identifying groups $G$ that admit a certain

54: type of embedding of matrix multiplication into the group algebra

55: $\C[G]$, and (2) controlling the dimensions of the irreducible

56: representations of such groups. We present machinery and examples

57: to support (1), including a proof that certain families of groups

58: of order $n^{2 + o(1)}$ support $n \times n$ matrix

59: multiplication, a necessary condition for the approach to yield

60: exponent $2$.  Although we cannot yet completely achieve both (1)

61: and (2), we hope that it may be possible, and we suggest potential

62: routes to that result using the constructions in this paper.

63: \end{abstract}

64:

65: \Section{Introduction}

66:

67: \thispagestyle{fancy}

68:

69: \fancyfoot[LE,LO]{\ \\ \parbox{6.875in}{\scriptsize{\ \\ Copyright

70: \copyright\ 2003 IEEE. Reprinted from Proceedings of the 44th

71: Annual Symposium on Foundations of Computer Science. This

72: material is posted here with permission of the IEEE.  Such

73: permission of the IEEE does not in any way imply IEEE endorsement

74: of any of Cornell University's products or services.  Internal or

75: personal use of this material is permitted.  However, permission

76: to reprint/republish this material for advertising or promotional

77: purposes or for creating new collective works for resale or

78: redistribution must be obtained from the IEEE by writing to

79: pubs-permissions@ieee.org.  By choosing to view this document,

80: you agree to all provisions of the copyright laws protecting

81: it.}}}

82:

83: Strassen \cite{S} made the startling discovery that one can

84: multiply two $n \times n$ matrices in only $O(n^{2.81})$ field

85: operations, compared with $2n^3$ for the standard algorithm.  This

86: immediately raises the question of the exponent of matrix

87: multiplication: what is the smallest number $\omega$ such that for

88: each $\varepsilon>0$, matrix multiplication can be carried out in

89: at most $O(n^{\omega+\varepsilon})$ operations? Clearly $\omega

90: \ge 2$.  It is widely believed that $\omega=2$, but the best bound

91: known is $\omega < 2.38$, due to Coppersmith and Winograd

92: \cite{CW}, following a sequence of improvements to Strassen's

93: original algorithm (see \cite[p.~420]{BCS} for the history). It

94: is known that all the standard linear algebra problems (for

95: example, computing determinants, solving systems of equations,

96: inverting matrices, computing LUP decompositions---see Chapter~16

97: of \cite{BCS}) have the same exponent as matrix multiplication,

98: which makes $\omega$ a fundamental number for understanding

99: algorithmic linear algebra. In addition, there are non-algebraic

100: algorithms whose complexity is expressed in terms of $\omega$

101: (see, e.g., Section~16.9 in \cite{BCS}).

102:

103: Several fairly elaborate techniques for bounding $\omega$ are

104: known, but since 1990 nobody has been able to improve on them. In

105: this paper:

106: \begin{itemize}

107: \item We develop a new approach to bounding $\omega$ that imports the

108: problem into the domain of group theory and representation theory.

109: The approach is relatively simple and almost entirely separate

110: {}from the existing machinery built up since Strassen's original

111: algorithm.

112:

113: \item We demonstrate the feasibility of the group theory aspect of the

114: approach by identifying a family of groups for which a parameter

115: that mirrors $\omega$ approaches 2. We also exhibit techniques

116: for bounding this critical parameter and prove non-trivial bounds

117: for a number of diverse groups and group families.

118:

119: \item We pose a question in representation theory (Question~\ref{fundamentalq}

120: below) that represents a potential barrier to directly obtaining

121: non-trivial bounds on $\omega$ using this approach. We do not

122: know the answer to this question. A positive answer would

123: illuminate a path that might lead to $\omega = 2$ using the

124: techniques that we present in this paper.

125: \end{itemize}

126:

127: Our approach is reminiscent of a question asked by Coppersmith

128: and Winograd (in Section~11 of \cite{CW}) about avoiding ``three

129: disjoint equivoluminous subsets'' in abelian groups, which would

130: lead to $\omega=2$ if it has a positive answer. However, our

131: technique is completely different, and our framework seems to

132: have more algebraic structure to make use of (whereas theirs is

133: more combinatorial).

134:

135: \SubSection{Analogy with fast polynomial multiplication}

136:

137: There is a close analogy between the framework we propose in this

138: paper and the well-known algorithm for multiplying two polynomials

139: of degree less than $n$ in $O(n\log n)$ operations using the Fast Fourier

140: Transform (FFT). In this section we elucidate this analogy to give

141: a high-level description of our technique.

142:

143: Suppose we wish to multiply the polynomials $A(x) = \sum_{i =

144: 0}^{n-1} a_i x^i$ and $B(x) = \sum_{i = 0}^{n-1} b_i x^i$. The

145: naive way to do this is to compute $n^2$ products of the form

146: $a_ib_j$, and from these the $2n-1$ coefficients of the product

147: polynomial $A(x)\cdot B(x)$. Of course a far better algorithm is

148: possible; we describe it below in language that easily translates

149: into our framework for matrix multiplication.

150:

151: \fancyfoot[CE,CO]{\thepage}

152: \fancyfoot[LE,LO]{}

153:

154: Let $G$ be a group and let $\C[G]$ be the group algebra---that is,

155: every element of $\C[G]$ is a formal sum $\sum_{g \in G} a_g g$

156: with $a_g \in \C$, and the product of two such elements is

157: $$

158: \left(\sum_{g \in G}a_g g \right) \cdot \left ( \vphantom{\sum_{g

159: \in G}a_g g} \sum_{h \in G}b_h h \right ) = \sum_{f \in G}\left

160: (\sum_{gh = f} a_gb_{h} \right ) f.

161: $$

162:

163: We often identify the element $\sum_{g \in G} a_g g$ with the

164: vector of its coefficients. If $G$ is the cyclic group of order

165: $m$, then the product of two elements $a = (a_g)_{g \in G}$ and

166: $b = (b_g)_{g \in G}$ is a {\em cyclic convolution} of the

167: vectors $a$ and $b$. The important observation is that a cyclic

168: convolution is almost what is needed to compute the coefficients

169: of the product polynomial $A(x)\cdot B(x)$---the only problem is

170: that it wraps around.  To avoid this problem, we embed $A(x)$ and

171: $B(x)$ as elements $\bar{A}, \bar{B} \in \C[G]$ as follows: Let

172: $z$ be a generator of $G$, which we assume to be a cyclic group

173: of order $m > 2n-1$, and define

174: \[

175: \bar{A} = \sum_{i = 0}^{n-1}a_iz^i \qquad \textup{and}\qquad

176: \bar{B} = \sum_{i = 0}^{n-1}b_iz^i.

177: \]

178: Since the group size $m$ is large enough to avoid wrapping

179: around, we can read off the coefficients of the product

180: polynomial from the element $\bar{A}\bar{B} \in \C[G]$: the

181: coefficient of $x^i$ in $A(x)B(x)$ is the coefficient of the

182: group element $z^i$ in $\bar{A}\bar{B}$. This is a wordy account

183: of a so-far simple correspondence, but the payoff is near. The

184: {\em Discrete Fourier Transform} (DFT) for $\C[G]$ is an

185: invertible linear transformation $D:\C[G] \rightarrow \C^{|G|}$,

186: which turns multiplication in $\C[G]$ into pointwise

187: multiplication of vectors in $\C^{|G|}$. We can therefore compute

188: the product $\bar{A}\bar{B}$ by first computing $D(\bar{A})$ and

189: $D(\bar{B})$ and then computing the inverse DFT of their

190: pointwise product. Thus, using the $O(m \log m)$ Fast Fourier

191: Transform algorithm, we can perform multiplication in $\C[G]$

192: (and therefore polynomial multiplication, via the embedding

193: above) in $O(m \log m)$ operations.
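
As a small illustration of the embedding (the specific sizes are chosen here only for concreteness), take $n = 2$ and $m = 4$, so that $z^4 = 1$.  Then
$$
\bar{A}\bar{B} = (a_0 + a_1 z)(b_0 + b_1 z) = a_0b_0 + (a_0b_1 + a_1b_0)\,z + a_1b_1\, z^2,
$$
and the coefficients of $1$, $z$, and $z^2$ are exactly the coefficients of $A(x)B(x)$.  If the group were instead too small, say $m = 2$, then $z^2 = 1$ and the product would collapse to
$$
(a_0b_0 + a_1b_1) + (a_0b_1 + a_1b_0)\,z,
$$
so the coefficients of $x^0$ and $x^2$ could no longer be separated.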

194:

195: One of the main results of the present paper is that {\em matrix

196: multiplication can be embedded into group algebra multiplication

197: in an analogous way}. The embedding is not as simple as the

198: embedding of polynomial multiplication, but it has a natural and

199: clean description in terms of a property of subsets of $G$ (which

200: we often take to be subgroups). In particular, if $S, T$, and $U$

201: are subsets of $G$ and $A = (a_{s, t})_{s \in S, t \in T}$ and $B

202: = (b_{t, u})_{t \in T, u \in U}$ are $|S| \times |T|$ and $|T|

203: \times |U|$ matrices, respectively, then we define

204: \[\bar{A} = \sum a_{s, t}s^{-1}t \qquad\textup{and}\qquad \bar{B}

205: = \sum b_{t, u} t^{-1}u.\] If $S, T, U$ satisfy the {\em triple

206: product property} (see Definition~\ref{definition:realize}), then

207: we can read off the entries of the product matrix $AB$ from

208: $\bar{A}\bar{B} \in \C[G]$: entry $(AB)_{s, u}$ is simply the

209: coefficient of the group element $s^{-1}u$.

210:

211: In the case of polynomial multiplication, the simplicity of the

212: embedding obscures the fact that if $G$ is too large (e.g., if

213: $|G| = n^2$ rather than $O(n)$), then the benefit of the entire

214: scheme is destroyed. Avoiding this pitfall turns out to be the

215: main challenge in the new setting. We wish to embed matrix

216: multiplication into a group algebra over a {\em small} group $G$,

217: as the size of $G$ is a lower bound on the complexity of

218: multiplication in $\C[G]$. It is not surprising, for example,

219: that $n \times n$ matrix multiplication can be embedded into the

220: group algebra of a group of order $n^3$. We show that abelian

221: groups cannot beat $n^3$ and {\em we identify families of

222: non-abelian groups of size $n^{2 + o(1)}$ that admit such an

223: embedding.}

224:

225: It might seem that this result together with the above trick for

226: performing group algebra multiplication (i.e., taking the DFT,

227: multiplying in the Fourier domain, and transforming back) would

228: imply that $\omega = 2$. There are, however, two complications

229: introduced by the fact that we are forced to work with non-abelian

230: groups. The first is that we know of fast algorithms to compute

231: the DFT only for limited classes of non-abelian groups (see

232: Section~13.5 in \cite{BCS}). However, the DFT is linear, and

233: because of the recursive structure of divide-and-conquer matrix

234: multiplication algorithms, linear transformations applied before

235: and after the recursive step are ``free.'' For example, in

236: Strassen's original matrix multiplication algorithm, the number

237: of matrix additions and scalar multiplications in the recursive

238: step does not affect the bound on $\omega$. So this potential

239: complication is in fact no problem at all.

240:

241: The second complication is that for $\C[G]$ when $G$ is

242: non-abelian, multiplication in the Fourier domain is {\em not}

243: simply pointwise multiplication of vectors in $\C^{|G|}$. Instead

244: it is {\em block-diagonal matrix multiplication}, where the

245: dimensions of the blocks are the dimensions of the irreducible

246: representations of $G$. We thus obtain a reduction of $n \times n$

247: matrix multiplication to a number of smaller matrix

248: multiplications of varying sizes, which gives rise to an

249: inequality involving the exponent $\omega$ of matrix

250: multiplication. If the size of $G$ were exactly $n^2$, then this

251: inequality would imply that $\omega = 2$. However, the smallest

252: one can make $|G|$ is $n^{2 + o(1)}$, and then the question of

253: whether the inequality implies $\omega = 2$ turns on the

254: representation theory of $G$. We show that when $|G| = n^{2 +

255: o(1)}$, even slight control over the dimension of the largest

256: irreducible representation is sufficient to achieve $\omega = 2$.

257: Some control is necessary to avoid trivialities such as reducing

258: to an even larger matrix multiplication problem. We can achieve

259: that much control; the issue of whether it is possible to achieve

260: more control is the subject of Question~\ref{fundamentalq}.

261:

262: \SubSection{Outline}

263:

264: Following some preliminaries below,

265: Sections~\ref{section:realizing} through~\ref{section:bounds} are

266: devoted to outlining our approach. In

267: Sections~\ref{section:linear} and~\ref{variety}, we show that a

268: variety of different types of groups support matrix multiplication

269: within our framework, and in the process demonstrate a number of

270: useful proof techniques. Section~\ref{section:linear} highlights

271: linear groups, whose representation theory makes them especially

272: attractive for our purposes. Section~\ref{section:Lie} describes

273: a parallel with Lie groups and gives a construction that suggests

274: that finite linear groups may indeed be a fruitful line of

275: inquiry. In Section~\ref{section:wreath} we consider wreath

276: product constructions, and in Section~\ref{section:direct} we use

277: the combinatorial notion of Sperner capacity to demonstrate the

278: surprising fact that the $k$-fold direct product of a group may

279: support $n^k \times n^k$ matrix multiplication even when the group

280: itself fails to support $n \times n$ matrix multiplication. This

281: suggests a potential route to answering

282: Question~\ref{fundamentalq} in the affirmative. We end by

283: mentioning some open problems and variants of our overall

284: approach in Section~\ref{section:conclusions}.

285:

286: \SubSection{Preliminaries}

287:

288: Let $\langle n, m, p \rangle$ denote the structural tensor for

289: rectangular matrix multiplication of $n \times m$ by $m \times p$

290: matrices, and let $R$ denote the tensor rank function. (See

291: \cite{BCS} for background on matrix multiplication and tensors.

292: We will use this material only in the proof of

293: Theorem~\ref{theorem:bound}.)  We will typically work over the

294: field of complex numbers; if we use another field $F$, we will

295: write $\langle n, m, p \rangle_F$. As usual $\omega$ will denote

296: the exponent of matrix multiplication over $\C$.

297:

298: We will use the following basic fact from representation theory:

299: the group algebra $\C[G]$ of a finite group $G$ decomposes as the

300: direct product

301: $$

302: \C[G] \cong  \C^{d_1 \times d_1} \times \dots \times \C^{d_k

303: \times d_k}

304: $$

305: of matrix algebras of orders $d_1,\dots,d_k$.  These numbers are

306: called the character degrees of $G$, or the dimensions of the

307: irreducible representations.  It follows from computing the

308: dimensions of both sides that $|G| = \sum_i d_i^2$.  See

309: \cite{JL} and \cite{H} for background on representation theory.
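
Two standard examples, not taken from the text above, may help fix the notation.  For the cyclic group $C_m$ every irreducible representation is one-dimensional, so $k = m$ and $d_1 = \dots = d_m = 1$.  For the symmetric group $S_3$ the character degrees are $1$, $1$, and $2$, and indeed
$$
|S_3| = 6 = 1^2 + 1^2 + 2^2.
$$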

310:

311: \Section{Realizing matrix multiplication via groups}

312: \label{section:realizing}

313:

314: In this section we describe the embedding of matrix multiplication

315: into group algebra multiplication, and we identify a property of

316: groups $G$ that implies that the group algebra of $G$ admits such

317: an embedding. If $S$ is a subset of a group, let $Q(S)$ denote

318: the right quotient set of $S$, i.e.,

319: $$

320: Q(S) = \{s_1 s_2^{-1} : s_1,s_2 \in S\}.

321: $$

322:

323: \begin{definition}

324: \label{definition:realize} A group $G$ {\em realizes} $\langle

325: n_1,n_2,n_3 \rangle$ if there are subsets $S_1,S_2,S_3 \subseteq

326: G$ such that $|S_i| = n_i$, and for $q_i \in Q(S_i)$, if

327: $$

328: q_1q_2q_3 = 1

329: $$

330: then $q_1=q_2=q_3=1$. We call this condition on $S_1,S_2,S_3$ the

331: {\em triple product property}. If we wish to emphasize the

332: specific subsets, we say that $G$ {\em realizes $\langle

333: n_1,n_2,n_3 \rangle$ through} $S_1,S_2,S_3$.

334: \end{definition}

335:

336: In most of our examples, matrix multiplication will be realized

337: through subgroups $H_1$, $H_2$, $H_3$ of $G$, rather than

338: arbitrary subsets.  In that case, the triple product property is

339: especially simple, because $Q(H_i) = H_i$: it states that if

340: $h_1h_2h_3=1$ with $h_i \in H_i$, then $h_1=h_2=h_3=1$.  An

341: equivalent formulation replaces $h_1h_2h_3=1$ with $h_1h_2=h_3$.

342:

343: Perhaps the simplest example comes from the product $C_{n} \times

344: C_m \times C_p$ of cyclic groups, which clearly realizes $\langle

345: n,m,p \rangle$ through $C_n \times \{1\} \times \{1\}$, $\{1\}

346: \times C_m \times \{1\}$, and $\{1\} \times \{1\} \times C_p$.  We

347: will see a number of less trivial examples shortly.

348:

349: \begin{lemma}

350: \label{lemma:permute} If $G$ realizes $\langle

351: n_1,n_2,n_3\rangle$, then it does so for every permutation of

352: $n_1,n_2,n_3$.

353: \end{lemma}

354:

355: \begin{proof}

356: Suppose $G$ realizes $\langle n_1,n_2,n_3\rangle$ through

357: $S_1,S_2,S_3$, and suppose $s_i,s_i' \in S_i$. We need to show

358: that the order in which $1$, $2$, and $3$ appear in the equation

359: $$

360: s_1's_1^{-1} s_2's_2^{-1} s_3's_3^{-1} = 1

361: $$

362: is irrelevant.  Conjugating by $s_1' s_1^{-1}$ shows that it is

363: equivalent to

364: $$

365: s_2's_2^{-1} s_3's_3^{-1} s_1's_1^{-1} = 1,

366: $$

367: so we can perform a cyclic shift.  To get a transposition, we

368: take the inverse of the initial equation, which yields

369: $$

370: s_3s_3'^{-1} s_2s_2'^{-1} s_1s_1'^{-1} = 1,

371: $$

372: i.e., a transposition of $1$ with $3$ (the roles of $s$ and $s'$

373: have been reversed, but that is irrelevant).  These two

374: permutations generate all permutations of $\{1,2,3\}$.

375: \end{proof}

376:

377: \begin{lemma}

378: \label{lemma:shortexact} If $N$ is a normal subgroup of $G$ that

379: realizes $\langle n_1, n_2, n_3\rangle$ and $G/N$ realizes

380: $\langle m_1, m_2, m_3\rangle$, then $G$ realizes $\langle

381: n_1m_1, n_2m_2, n_3m_3\rangle$.

382: \end{lemma}

383:

384: \begin{proof}

385: Suppose $N$ realizes $\langle n_1, n_2, n_3\rangle$ through

386: $S_1,S_2,S_3$, and suppose $T_1,T_2,T_3$ are lifts to $G$ of the

387: three subsets of $G/N$ that realize $\langle m_1, m_2,

388: m_3\rangle$. Then we claim that $G$ realizes $\langle n_1m_1,

389: n_2m_2, n_3m_3\rangle$ through the pointwise products

390: $S_1T_1,S_2T_2,S_3T_3$.  We need to check that for $s_i,s_i' \in

391: S_i$ and $t_i,t_i' \in T_i$,

392: $$

393: (s_1't_1')(s_1t_1)^{-1} (s_2't_2')(s_2t_2)^{-1}

394: (s_3't_3')(s_3t_3)^{-1} = 1

395: $$

396: iff $s_i=s_i'$ and $t_i=t_i'$ for all $i$.  If we reduce this

397: equation modulo $N$, we find that $t_i=t_i'$ modulo $N$, and

398: hence also in $G$.  The equation in $G$ then becomes

399: $$

400: s_1's_1^{-1} s_2's_2^{-1} s_3's_3^{-1} = 1,

401: $$

402: {}from which we deduce $s_i=s_i'$, as desired.

403: \end{proof}

404:

405: One useful special case of Lemma~\ref{lemma:shortexact} is that if

406: $G_1$ realizes $\langle n_1, m_1, p_1\rangle$ and $G_2$ realizes

407: $\langle n_2, m_2, p_2\rangle$, then $G_1 \times G_2$ realizes

408: $\langle n_1n_2, m_1m_2, p_1p_2\rangle$.

409:

410: Our first theorem describes the embedding of matrix multiplication

411: into group algebra multiplication:

412:

413: \begin{theorem}

414: \label{theorem:reduction} Let $F$ be any field.  If $G$ realizes

415: $\langle n,m,p \rangle$, then the number of field operations

416: required to multiply $n \times m$ with $m \times p$ matrices over

417: $F$ is at most the number of operations required to multiply two

418: elements of $F[G]$.  Furthermore, $\langle n,m,p\rangle_F \le

419: F[G]$.

420: \end{theorem}

421:

422: For the definition of the restriction relation $\le$ in the last

423: sentence, see Section~14.3 of \cite{BCS}.

424:

425: \begin{proof}

426: Let $G$ realize $\langle n,m,p \rangle$ through subsets $S,T,U$.

427: Suppose $A$ is an $n \times m$ matrix, and $B$ is an $m \times p$

428: matrix.  We will index the rows and columns of $A$ with the sets

429: $S$ and $T$, respectively, those of $B$ with $T$ and $U$, and

430: those of $AB$ with $S$ and $U$.

431:

432: Consider the product

433: $$

434: \left(\sum_{s\in S, t \in T} A_{st} s^{-1} t\right)

435: \left(\sum_{t' \in T, u \in U} B_{t'u} t'^{-1} u\right)

436: $$

437: in the group algebra.  We have

438: $$

439: (s^{-1} t) (t'^{-1} u) = s'^{-1} u'

440: $$

441: iff $s=s'$, $t=t'$, and $u=u'$, so the coefficient of $s^{-1} u$

442: in the product is

443: $$

444: \sum_{t \in T} A_{st} B_{tu} = (AB)_{su}.

445: $$

446: Thus, one can simply read off the matrix product from the group

447: algebra product by looking at the coefficients of $s^{-1} u$ with

448: $s \in S, u \in U$, and the assertions in the theorem statement

449: follow.

450: \end{proof}
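
The equivalence used in the proof can be unpacked as follows; this is only a sketch of the step that the proof leaves implicit.  Multiplying the equation $(s^{-1} t)(t'^{-1} u) = s'^{-1} u'$ on the left by $s'$ and on the right by $u'^{-1}$ shows that it is equivalent to
$$
(s' s^{-1})\,(t\, t'^{-1})\,(u\, u'^{-1}) = 1.
$$
The three factors lie in $Q(S)$, $Q(T)$, and $Q(U)$, respectively, so the triple product property forces each of them to equal $1$, i.e., $s=s'$, $t=t'$, and $u=u'$.  The converse direction is immediate.  Consequently, in the expanded product the term $A_{st}B_{t'u}$ contributes to the coefficient of $s^{-1}u$ only when $t = t'$, and no term with different row or column indices contributes to that coefficient.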

451:

452: \Section{The pseudo-exponent}

453:

454: The {\em pseudo-exponent} of a group measures the quality of the

455: embedding afforded by Theorem~\ref{theorem:reduction} in a single,

456: well-behaved parameter, which in some ways mirrors the exponent

457: $\omega$ of matrix multiplication.

458:

459: \begin{definition}

460: The pseudo-exponent $\alpha(G)$ of a non-trivial finite group $G$

461: is the minimum of

462: $$

463: \frac{3 \log |G|}{\log nmp}

464: $$

465: over all $n,m,p$ (not all $1$) such that $G$ realizes $\langle

466: n,m,p \rangle$.  The pseudo-exponent of the trivial group is $3$.

467: \end{definition}

468:

469: When it is clear from the context which group is intended, we

470: often write $\alpha$ instead of $\alpha(G)$.  Note that in the

471: special case that $G$ realizes $\langle n, n, n \rangle$, its

472: pseudo-exponent satisfies $\alpha \le \log_n |G|$.  In general,

473: if $G$ realizes $\langle n,m,p \rangle$, then

474: $$

475: \alpha \le \log_{\sqrt[3]{nmp}} |G|.

476: $$

477:

478: \begin{lemma}

479: The pseudo-exponent of a finite group $G$ is always greater than

480: $2$ and at most $3$. If $G$ is abelian, then it is exactly $3$.

481: \end{lemma}

482:

483: \begin{proof}

484: The upper bound of $3$ is trivial: use the subgroups $H_1=H_2 =

485: \{1\}$ and $H_3=G$.

486:

487: For the lower bounds, suppose $G$ realizes $\langle n_1,n_2,n_3

488: \rangle$ (with $n_1n_2n_3>1$) through subsets $S_1,S_2,S_3$.  It

489: follows from the definition of realization that the map $(x,y)

490: \mapsto x^{-1} y$ is injective on $S_1 \times S_2$ and its image

491: intersects the quotient set $Q(S_3)$ only in the identity.  Thus,

492: $|G| \ge n_1n_2$, and $|G| > n_1n_2$ unless $n_3=1$.  Similarly,

493: $|G| \ge n_2n_3$ with equality only if $n_1=1$, and $|G| \ge

494: n_1n_3$ with equality only if $n_2=1$. Thus, $|G|^3 >

495: (n_1n_2n_3)^2$, so $\alpha(G) > 2$.

496:

497: If $G$ is abelian, then the product map $S_1 \times S_2 \times S_3

498: \to G$ must be injective, so $|G| \ge n_1n_2n_3$ and $\alpha(G)

499: \ge 3$.

500: \end{proof}
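
The two counting steps in the proof can be spelled out as follows (a sketch of the omitted details).  Equality in all three inequalities $|G| \ge n_1n_2$, $|G| \ge n_2n_3$, $|G| \ge n_1n_3$ would force $n_1=n_2=n_3=1$, which is excluded, so at least one of them is strict.  Multiplying them gives
$$
|G|^3 > (n_1n_2)(n_2n_3)(n_1n_3) = (n_1n_2n_3)^2,
\qquad\textup{hence}\qquad
\frac{3\log |G|}{\log n_1n_2n_3} > 2.
$$
For the abelian case, suppose $s_1s_2s_3 = s_1's_2's_3'$ with $s_i,s_i' \in S_i$.  Since $G$ is commutative, this can be rearranged to
$$
(s_1's_1^{-1})\,(s_2's_2^{-1})\,(s_3's_3^{-1}) = 1,
$$
and the triple product property yields $s_i = s_i'$ for all $i$, which is the claimed injectivity.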

501:

502: The pseudo-exponent is well-behaved with respect to group

503: extensions:

504:

505: \begin{lemma}

506: \label{lemma:upperbound} If $N$ is a normal subgroup of $G$, then

507: $\alpha(G) \le \max(\alpha(N),\alpha(G/N))$.

508: \end{lemma}

509:

510: \begin{proof}

511: Suppose $N$ realizes $\langle n_1, n_2, n_3\rangle$ and $G/N$

512: realizes $\langle m_1, m_2, m_3\rangle$.  Then

513: Lemma~\ref{lemma:shortexact} implies that the pseudo-exponent of

514: $G$ is at most

515: $$

516: \frac{3\log |G|}{\log n_1m_1n_2m_2n_3m_3} = \frac{3\log |N| +

517: 3\log |G/N| }{\log n_1n_2n_3 + \log m_1m_2m_3},

518: $$

519: which is bounded above by the larger of

520: $$

521: \frac{3\log |N|}{\log n_1n_2n_3} \qquad \textup{and}\qquad

522: \frac{3\log |G/N|}{\log m_1m_2m_3},

523: $$

524: as desired.

525: \end{proof}
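
The final step of the proof is an instance of the elementary mediant inequality; the following sketch assumes that both $N$ and $G/N$ are non-trivial, so that the denominators are positive.  Write $x_1 = 3\log|N|$, $y_1 = \log n_1n_2n_3$, $x_2 = 3\log|G/N|$, $y_2 = \log m_1m_2m_3$, and let $M = \max(x_1/y_1,\, x_2/y_2)$.  Then $x_1 \le M y_1$ and $x_2 \le M y_2$, so
$$
\frac{x_1 + x_2}{y_1 + y_2} \le \frac{M y_1 + M y_2}{y_1 + y_2} = M.
$$
Choosing realizations that attain $\alpha(N)$ and $\alpha(G/N)$ gives $M = \max(\alpha(N),\alpha(G/N))$.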

526:

527: Non-abelian groups can have pseudo-exponent less than $3$.  The

528: smallest example is the symmetric group $S_3$ on $3$ elements. It

529: realizes $\langle 2, 2, 2 \rangle$ through its three subgroups of

530: order $2$, so it has pseudo-exponent at most $\log_2 6$ (and one

531: can check that it is exactly $\log_2 6$). Next, we generalize

532: this construction to show that it is possible to come arbitrarily

533: close to pseudo-exponent $2$, as follows.

534:

535: Given a triangular array of points in the plane, as in

536: Figure~\ref{fig-triangle}, we consider the group of permutations

537: of the points, together with three subgroups, one for each side

538: of the triangle.  Each subgroup permutes the set of points on

539: each line parallel to its side of the triangle.  The proof of

540: Theorem~\ref{theorem:pseudoexponent2}, while not phrased in

541: geometric terms, shows that these subgroups satisfy the triple

542: product property.

543:

544: \begin{figure}

545: \begin{center}

546: \leavevmode \epsfbox[-2 -2 82 92]{triangle.ps}

547: \end{center}

548: \caption{A triangular array of points.} \label{fig-triangle}

549: \end{figure}

550:

551: \begin{theorem}

552: \label{theorem:pseudoexponent2} The pseudo-exponent of

553: $S_{n(n+1)/2}$ is at most

554: $$

555: 2 +\frac{2-\log 2}{\log n} + O\left(\frac{1}{(\log n)^2}\right).

556: $$

557: \end{theorem}

558:

559: \begin{proof}

560: There are $n(n+1)/2$ triples $(a,b,c)$ with $a,b,c \ge 0$ and

561: $a+b+c=n-1$.  We view $S_{n(n+1)/2}$ as the group of permutations

562: of these triples.  Let $H_i$ be the subgroup that fixes the $i$-th

563: coordinate.  The size of this subgroup is $1!2!\dots n!$, so the

564: pseudo-exponent bound is

565: $$

566: \frac{\log \bigl((n(n+1)/2)!\bigr)}{\log (1!\,2!\cdots n!)} = 2 + \frac{2-\log

567: 2}{\log n} + O\left(\frac{1}{(\log n)^2}\right),

568: $$

569: assuming these subgroups satisfy the triple product property.  For

570: that, we need to prove that if $h_1h_2h_3=1$ with $h_i \in H_i$,

571: then $h_1=h_2=h_3=1$.

572:

573: Suppose $h_1h_2h_3=1$ with $h_i \in H_i$.   We will order the

574: triples lexicographically, so that $(0,0,n-1)$ is the smallest

575: triple and $(n-1,0,0)$ is the largest, and prove by induction

576: using this ordering that $h_1$, $h_2$, and $h_3$ fix every triple.

577:

578: Suppose all triples smaller than $(a,b,c)$ are fixed by each of

579: $h_1, h_2, h_3$ (in the base case, the set of such triples is

580: empty). The permutation $h_3$ cannot send $(a,b,c)$ to a smaller

581: triple, since all smaller triples are fixed points, so $h_3$ must

582: send it to $(a+i,b-i,c)$ with $i \ge 0$.  Then $h_2$ sends that

583: to $(a+i+j,b-i,c-j)$ for some $j$. The only way $h_1$ can return

584: to $(a,b,c)$ is if $i+j=0$, so that must be the case. However,

585: $h_1$ fixes $(a,b-i,c+i)$ for $i>0$ (since such a triple is

586: smaller than $(a,b,c)$), so we must have $i=0$.  It follows that

587: $(a,b,c)$ is fixed by each of $h_1,h_2,h_3$, so by induction all

588: triples are fixed and hence $h_1=h_2=h_3=1$.

589: \end{proof}

590:

591: The same holds for all symmetric groups, since one can look at the

592: largest subgroup of the form $S_{n(n+1)/2}$.
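
Two small cases may make the construction in Theorem~\ref{theorem:pseudoexponent2} concrete; the numerical values are rounded and are offered only as an illustration.  For $n=2$ the triples are $(1,0,0)$, $(0,1,0)$, $(0,0,1)$, the group is $S_3$, and $H_1$ consists of the identity and the transposition exchanging $(0,1,0)$ and $(0,0,1)$ (similarly for $H_2$ and $H_3$).  Each $H_i$ has order $1!\,2! = 2$, and the bound is
$$
\frac{\log 3!}{\log (1!\,2!)} = \log_2 6 \approx 2.585,
$$
which recovers the example of $S_3$ given above.  For $n=3$ the group is $S_6$, each $H_i$ has order $1!\,2!\,3! = 12$, and the bound is
$$
\frac{\log 6!}{\log (1!\,2!\,3!)} = \frac{\log 720}{\log 12} \approx 2.648.
$$
Thus the bound need not decrease for small $n$; the approach to $2$ is an asymptotic statement.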

593:

594: \Section{Relating the pseudo-exponent to $\omega$}

595: \label{section:bounds}

596:

597: In this section we relate the pseudo-exponent $\alpha$ to the

598: exponent of matrix multiplication $\omega$.  As with many of the

599: results since Strassen's algorithm, our main theorems are stated

600: as bounds on $\omega$, rather than explicit algorithms, but of

601: course algorithms are implicit in the proofs.

602:

603: \begin{theorem}

604: \label{theorem:bound} Suppose $G$ has pseudo-exponent $\alpha$,

605: and the character degrees of $G$ are $\{d_i\}$. Then

606: $$

607: |G|^{\omega/\alpha} \le \sum_{i} d_i^{\omega}.

608: $$

609: \end{theorem}

610:

611: The intuition is simple: the problem of multiplying matrices of

612: size $|G|^{1/\alpha}$ reduces to multiplication in $\C[G]$, which

613: is equivalent to multiplying a collection of matrices of sizes

614: $d_i$.  These multiplications should take about $d_i^\omega$

615: operations, so $\sum_i d_i^\omega$ should be an approximate upper

616: bound for the number of operations required to multiply matrices

617: of size $|G|^{1/\alpha}$, i.e., roughly $|G|^{\omega/\alpha}$. It

618: is convenient that when one makes this idea precise, these crude

619: approximations become exact bounds.

620:

621: \begin{proof}

622: Suppose $G$ realizes $\langle n, m, p \rangle$ with $nmp =

623: |G|^{3/\alpha}$ (it follows from the definition of the

624: pseudo-exponent that $G$ realizes such a tensor). By

625: Theorem~\ref{theorem:reduction},

626: \begin{equation}

627: \label{equation:reduction} \langle n, m, p \rangle \le \C[G]

628: \simeq \bigoplus_i \langle d_i, d_i, d_i \rangle.

629: \end{equation}

630: We will need two facts about the rank of matrix multiplication:

631: for all $n',m',p'$,

632: $$

633: (n'm'p')^{\omega/3} \le R(\langle n', m', p' \rangle)

634: $$

635: (Proposition~15.5 in \cite{BCS}), and for each $\varepsilon>0$

636: there exists $C>0$ such that for all $k$,

637: $$

638: R(\langle k, k, k \rangle) \le C k^{\omega+\varepsilon}

639: $$

640: (Proposition~15.1 in \cite{BCS}).

641:

642: The $\ell$-th tensor power of \eqref{equation:reduction} is

643: $$

644: \langle n^\ell, m^\ell, p^\ell \rangle \le

645: \bigoplus_{i_1,\dots,i_\ell} \langle d_{i_1}\dots

646: d_{i_\ell},d_{i_1}\dots d_{i_\ell}, d_{i_1}\dots d_{i_\ell}

647: \rangle,

648: $$

649: if we use

650: $$

651: \langle n_1,m_1,p_1\rangle \otimes \langle n_2,m_2,p_2\rangle

652: \simeq \langle n_1n_2,m_1m_2,p_1p_2 \rangle.

653: $$

654: It follows from taking the rank of both sides that

655: $$

656: |G|^{\ell\omega/\alpha} \le C \left(\sum_i

657: d_i^{\omega+\varepsilon}\right)^\ell,

658: $$

659: and if we take the $\ell$-th root and let $\ell$ go to infinity,

660: then we deduce that

661: $$

662: |G|^{\omega/\alpha} \le \sum_{i} d_i^{\omega+\varepsilon}.

663: $$

664: Finally, because this inequality holds for all $\varepsilon>0$, it

665: must hold for $\varepsilon=0$ as well, by continuity.

666: \end{proof}
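
The step ``taking the rank of both sides'' can be expanded as follows.  This sketch uses two standard properties of tensor rank (see \cite{BCS}): rank does not increase under restriction, and the rank of a direct sum is at most the sum of the ranks.  Writing $D = d_{i_1}\cdots d_{i_\ell}$ as shorthand (notation introduced only for this remark), we get
\begin{eqnarray*}
|G|^{\ell\omega/\alpha} = (n^\ell m^\ell p^\ell)^{\omega/3}
&\le& R(\langle n^\ell, m^\ell, p^\ell \rangle)\\
&\le& \sum_{i_1,\dots,i_\ell} R(\langle D, D, D \rangle)\\
&\le& C \sum_{i_1,\dots,i_\ell} (d_{i_1}\cdots d_{i_\ell})^{\omega+\varepsilon}
= C \left(\sum_i d_i^{\omega+\varepsilon}\right)^{\ell}.
\end{eqnarray*}
Since $C$ does not depend on $\ell$, we have $C^{1/\ell} \to 1$ as $\ell \to \infty$, which gives the inequality stated in the proof.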

667:

668: Notice that if $\alpha(G)$ were $2$, then this theorem would imply

669: that $\omega=2$ (using $\sum_i d_i^2 = |G|$, the Cauchy-Schwarz

670: inequality, and the fact that every non-trivial group has at least

671: two irreducible representations). In general, though, we need to

672: control the character degrees of $G$. Since $\sum_i d_i^2 = |G|$ and the

673: trivial representation has degree $1$, every character degree of a non-trivial group is at most $(|G|-1)^{1/2}$; we

674: show below that an upper bound of $|G|^{1/2 - \varepsilon}$ for

675: fixed $\varepsilon > 0$ would be sufficient to obtain $\omega =

676: 2$ from a family of groups with pseudo-exponent approaching $2$

677: (and that even a much weaker bound suffices).

678:

679: We define $\gamma(G)$, or simply $\gamma$ when $G$ is clear from

680: the context, so that $|G|^{1/\gamma}$ is the maximum character

681: degree of $G$ ($\gamma(G)=\infty$ if $G$ is abelian). Ideally,

682: we would like the exponent of matrix multiplication $\omega$ to be

683: bounded above by the pseudo-exponent $\alpha$. The following

684: corollary shows that in the region near 2, this actually happens,

685: with a correction factor that depends on $\gamma$.

686:

687: \begin{corollary}

688: \label{corollary:upperbound} Let $G$ be a finite group. If

689: $\alpha(G) < \gamma(G)$, then

690: $$

691: \omega \le \alpha \left (\frac{\gamma - 2}{\gamma - \alpha} \right

692: ).

693: $$

694: \end{corollary}

695:

696: \begin{proof}

697: Let $\{d_i\}$ denote the character degrees.  Then by

698: Theorem~\ref{theorem:bound},

699: \begin{eqnarray*}

700: |G|^{\omega/\alpha} &\le& \sum_i d_i^{\omega -2}d_i^2\\

701: &\le& |G|^{(\omega - 2)/\gamma}\sum_i d_i^2\\

702: & = & |G|^{1+(\omega - 2)/\gamma},

703: \end{eqnarray*}

704: which implies $\omega(1/\alpha-1/\gamma) \le 1-2/\gamma.$

705: Dividing by $1/\alpha - 1/\gamma$ (which is positive by

706: assumption) yields the stated result.

707: \end{proof}
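
To illustrate how the bound behaves, consider purely hypothetical values, not those of any group we know: if some group had $\alpha = 2.1$ and $\gamma = 3$, then Corollary~\ref{corollary:upperbound} would give
$$
\omega \le 2.1 \cdot \frac{3-2}{3-2.1} = \frac{2.1}{0.9} = 2.33\dots.
$$
Note also that for fixed $\gamma > 2$ the right-hand side is increasing in $\alpha$ on the range $\alpha < \gamma$, since
$$
\frac{d}{d\alpha}\, \frac{\alpha(\gamma-2)}{\gamma-\alpha} =
\frac{\gamma(\gamma-2)}{(\gamma-\alpha)^2} > 0,
$$
so any upper bound $\alpha' \ge \alpha(G)$ with $\alpha' < \gamma(G)$ may be used in place of $\alpha(G)$.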

708:

709: As with $\alpha(G)$, we have $\gamma(G)>2$ for all $G$, and

710: Corollary~\ref{corollary:upperbound} shows that our approach

711: amounts to a race between $\alpha(G)$ and $\gamma(G)$ to see

712: which approaches $2$ faster. The most attractive form of this

713: corollary is the following special case:

714:

715: \begin{corollary}

716: \label{cor:race} Suppose there exists a family $G_1,G_2,\dots$ of

717: finite groups such that $\alpha(G_i) = 2+o(1)$ as $i \to \infty$,

718: and furthermore $\alpha(G_i)-2 = o(\gamma(G_i)-2)$.  Then the

719: exponent of matrix multiplication is $2$.

720: \end{corollary}
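
One way to see how this follows from Corollary~\ref{corollary:upperbound} is the following sketch. Write $a_i = \alpha(G_i)-2$ and $c_i = \gamma(G_i)-2$ (notation introduced only for this remark). Since $a_i = o(c_i)$, we have $\alpha(G_i) < \gamma(G_i)$ for all sufficiently large $i$, and then
$$
\omega \le (2+a_i)\,\frac{c_i}{c_i-a_i} = \frac{2+a_i}{1-a_i/c_i},
$$
which tends to $2$ as $i \to \infty$ because $a_i \to 0$ and $a_i/c_i \to 0$. Together with $\omega \ge 2$, this gives $\omega = 2$.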

721:

722: These corollaries are weakenings of Theorem~\ref{theorem:bound},

723: the advantage being that they only require knowledge of

724: $\gamma(G)$, which is typically easier to work with than the

725: complete set of character degrees that is required for

726: Theorem~\ref{theorem:bound}.

727:

728: It is reasonable to ask whether the requirement $\alpha < \gamma$

729: which occurs in Corollary~\ref{corollary:upperbound} is

730: necessary. It turns out that it is, because if $\alpha \ge

731: \gamma$, then for all $\omega>0,$

732: $$

733: |G|^{\omega/\alpha} \le |G|^{\omega/\gamma} \le \sum_i d_i^\omega,

734: $$

735: where the second inequality holds because $|G|^{1/\gamma} = d_i$

736: for some $i$. Then the inequality in Theorem~\ref{theorem:bound}

737: holds even for $\omega = 3$. The necessity of $\alpha < \gamma$

738: makes perfect sense, because when it fails to hold, the approach

739: amounts to a reduction of matrix multiplication to several

740: instances, one of which is as large as the original instance. In

741: fact, the construction in the proof of

742: Theorem~\ref{theorem:pseudoexponent2} succumbs to this problem:

743: there we proved that $\alpha(S_{n(n+1)/2}) \le 2+O(1/\log n)$,

744: but it turns out that $\gamma(S_{n(n+1)/2}) = 2 + \Theta(1/(n\log

745: n))$ (see \cite{VK}).  However, there exist non-abelian groups

746: for which $\alpha < \gamma$ and $\alpha < 3$; one example is the

747: group in Proposition~\ref{proposition:order80} below.

748:

749: If we {\em do} have access to the complete set of character

750: degrees then there is a relatively simple condition to check to

751: determine whether the inequality in Theorem~\ref{theorem:bound}

752: yields a non-trivial bound on $\omega$. The condition is that

753: $|G|^{3/\alpha} > \sum_i{d_i^3}$. To see this observe that the

754: inequality in Theorem~\ref{theorem:bound} is equivalent to

755: \begin{equation} \label{eqn:log}

756: \frac{\omega}{\alpha} \log |G| \le \log \sum_i d_i^\omega.

757: \end{equation}

758: The right-hand side is convex as a function of $\omega$, and the

759: left-hand side is linear. Furthermore, as $\omega \to \infty$,

760: the right-hand side is asymptotic to

761: $$

762: \frac{\omega}{\gamma} \log |G|,

763: $$

764: which is smaller than the left-hand side when $\alpha < \gamma$

765: (which is the non-trivial case).  Therefore \eqref{eqn:log} gives

766: no information about $\omega$ in the interval $[2,3]$ unless it

767: rules out $\omega=3$, which is equivalent to the above stated

768: condition. We do not have examples of groups meeting this

769: condition.

770:

771: We are thus led to pose the following question in representation

772: theory:

773:

774: \begin{question}

775: \label{fundamentalq} Does there exist a finite group that realizes

776: $\langle n,m,p \rangle$ and has character degrees $\{d_i\}$ such

777: that

778: $$

779: nmp > \sum_i d_i^3?

780: $$

781: \end{question}
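
As a sanity check on the condition, consider the group $G = C_5 \ltimes \F_{16}$ of Proposition~\ref{proposition:order80}, which realizes $\langle 5,5,8 \rangle$. The character degrees used here are not stated elsewhere in this paper; we take them from the standard description of the characters of a Frobenius group with kernel $\F_{16}$ and complement $C_5$, namely five characters of degree $1$ and three of degree $5$ (consistent with $5 \cdot 1^2 + 3 \cdot 5^2 = 80$). Then
$$
nmp = 5 \cdot 5 \cdot 8 = 200 \quad\textup{while}\quad \sum_i d_i^3 = 5 \cdot 1 + 3 \cdot 125 = 380,
$$
so this particular realization does not meet the condition.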

782:

783: It is possible that there is a theorem in representation theory

784: that implies that the answer to this question is ``no.'' In that

785: case the approach we have outlined cannot be used directly to

786: obtain bounds on $\omega$; however, even in this case there are

787: variants of our approach that would not be ruled out (see, e.g.,

788: Subsection~\ref{section:extensions}). On the other hand, a

789: positive answer might point the direction to a proof that $\omega

790: = 2$ using our approach: it would seem strange if the best bound

791: groups could prove were some constant strictly between~$2$

792: and~$3$, and the condition in Corollary~\ref{cor:race} for

793: $\omega=2$ feels very natural.

794:

795: \Section{Linear groups} \label{section:linear}

796:

797: Matrix groups over finite fields are an important class of finite

798: groups. They are especially attractive for our purposes because

799: their representation sizes, as measured by $\gamma$, are well

800: behaved. We will focus on the case of $SL_n(\F_q)$ for simplicity,

801: although we see no reason why it should perform better than other

802: linear groups.  If $n>1$ is held fixed, $\gamma(SL_n(\F_q))$

803: approaches $2+2/n$ as $q$ tends to infinity (which can be deduced

804: from \cite{Green}, according to a private

805: communication from G.\ Lusztig). Thus, if one could prove that

806: $\alpha(SL_n(\F_q)) = 2+o(1)$ for some fixed $n$, then

807: Corollary~\ref{corollary:upperbound} would imply $\omega=2$. Even

808: if one lets $n$ grow, one might still hope that $\alpha$ would

809: tend to $2$ faster than $\gamma$.  We cannot prove that $\alpha$

810: even approaches $2$ at all as $n,q \to \infty$, but comparison

811: with Theorem~\ref{theorem:slnLie} below suggests that it does. In

812: this section we concentrate on the case of $SL_2(\F_q)$.

813:

814: For later reference, we collect here the character degrees of

815: $SL_2(\F_q)$:

816: \begin{table}[h]

817: \begin{tabular}{c|c|c}

818: Degree & Multiplicity ($q$ odd) & Multiplicity ($q$ even)\\ \hline

819: $q+1$ & $(q-3)/2$ & $(q-2)/2$\\

820: $q$ & $1$ & $1$\\

821: $q-1$ & $(q-1)/2$ & $q/2$\\

822: $(q+1)/2$ & $2$ & $0$\\

823: $(q-1)/2$ & $2$ & $0$\\

824: $1$ & $1$ & $1$

825: \end{tabular}

826: \end{table}

827:

828: (See Exercise~28.2 and its solution in \cite{JL} for $q$ even, and

829: \cite{LR} for $q$ odd, but note that \cite{LR} has a typo in the

830: multiplicity for degree $q+1$ at the bottom of the first column

831: on page~122.)
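
As a consistency check (not needed for what follows), the squares of the degrees in each column sum to the group order $q^3-q$. For $q$ even,
$$
\frac{q-2}{2}(q+1)^2 + q^2 + \frac{q}{2}(q-1)^2 + 1 = q^3-q.
$$
For $q$ odd, the four characters of degrees $(q \pm 1)/2$ contribute $\frac{1}{2}(q+1)^2 + \frac{1}{2}(q-1)^2$, so that
$$
\frac{q-3}{2}(q+1)^2 + q^2 + \frac{q-1}{2}(q-1)^2 + \frac{(q+1)^2}{2} + \frac{(q-1)^2}{2} + 1
$$
reduces to the same expression as in the even case and again equals $q^3-q$. For example, when $q=3$ the degrees are $3,2,2,2,1,1,1$, and $9+4+4+4+1+1+1 = 24 = 3^3-3$.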

832:

833: \begin{proposition}

834: \label{proposition:sl2} The group $SL_2(\F_q)$ of order $q^3-q$

835: realizes $\langle q, q, q \rangle$.

836: \end{proposition}

837:

838: Unfortunately, this pseudo-exponent bound tends to $3$ as $q \to

839: \infty$, but at least it is always strictly better than $3$.  (We

840: can also prove similarly that $\alpha(SL_n(\F_q)) < 3$.)

841:

842: \begin{proof}

843: Consider the three unipotent subgroups

844: $$

845: H_1 = \left\{ \left(

846: \begin{array}{cc}

847: 1 & x\\

848: 0 & 1 \end{array} \right) : x \in \F_q\right\},

849: $$

850: $$

851: H_2 = \left\{ \left(

852: \begin{array}{cc}

853: 1 & 0\\

854: y & 1 \end{array} \right) : y \in \F_q\right\},

855: $$

856: and

857: $$

858: H_3 = \left\{ \left(

859: \begin{array}{cc}

860: 1+z & z\\

861: -z & 1-z \end{array} \right) : z \in \F_q\right\}.

862: $$

863: We need to check that for $h_i \in H_i$, if $h_1h_2=h_3$, then

864: $h_1=h_2=h_3=1$. To check that, we multiply to get

865: $$

866: \left(

867: \begin{array}{cc} 1 & x\\ 0 & 1

868: \end{array} \right)\left( \begin{array}{cc} 1 & 0\\ y & 1

869: \end{array} \right) = \left( \begin{array}{cc}

870: 1+xy & x\\

871: y & 1 \end{array} \right).

872: $$

873: That can be of the form

874: $$

875: \left(

876: \begin{array}{cc}

877: 1+z & z\\

878: -z & 1-z \end{array} \right)

879: $$

880: only if $x=y=z=0$, as desired.

881: \end{proof}
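
In explicit terms (a worked restatement, using the same convention as in Proposition~\ref{proposition:order80}, where a realization of $\langle n,m,p \rangle$ gives the bound $3\log|G|/\log(nmp)$), the proposition yields
$$
\alpha(SL_2(\F_q)) \le \frac{3\log(q^3-q)}{\log q^3} = 3 + \frac{\log(1-q^{-2})}{\log q}.
$$
The correction term is negative for every $q$ and tends to $0$ as $q \to \infty$. For $q=2,3,4$ the bound is approximately $2.58$, $2.89$, and $2.95$, respectively.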

882:

883: One might hope that $SL_n(\F_q)$ realizes

884: $$

885: \langle q^{n(n-1)/2}, q^{n(n-1)/2}, q^{n(n-1)/2}\rangle

886: $$

887: through three conjugates of the group of upper-triangular matrices

888: with $1$'s on the diagonal. However, that fails for $q=2$ and

889: $n=3$, according to calculations using the computer program GAP

890: (see~\cite{GAP}); furthermore, no subgroups of these orders work

891: for $q=2$ and $n=3$.

892:

893: \begin{proposition}

894: \label{proposition:sl2fq2} The group $SL_2(\F_{q^2})$ of order

895: $q^6-q^2$ realizes $\langle q^2, q^2, q^3-q \rangle$.

896: \end{proposition}

897:

898: \begin{proof}

899: Let $x \mapsto \bar x$ denote the Frobenius automorphism of

900: $\F_{q^2}$ over $\F_q$.  The three subgroups we will use are

901: $$

902: H_1 = \left\{ \left(

903: \begin{array}{cc}

904: 1 & x\\

905: 0 & 1 \end{array} \right) : x \in \F_{q^2}\right\},

906: $$

907: $$

908: H_2 = \left\{ \left(

909: \begin{array}{cc}

910: 1 & 0\\

911: y & 1 \end{array} \right) : y \in \F_{q^2}\right\},

912: $$

913: and

914: \begin{eqnarray*}

915: H_3 &=& SU_2(\F_q)\\

916: &=& \left\{ \left(

917: \begin{array}{cc}

918: a & b\\

919: -\bar b & \bar a \end{array} \right) : a,b \in \F_{q^2}, a\bar a

920: + b\bar b = 1\right\}.

921: \end{eqnarray*}

922: Note that to check that $|H_3| = q^3-q$, one just needs to count

923: solutions to $a\bar a + b\bar b = 1$. For a fixed $b$ with $b\bar

924: b \ne 1$, there are $q+1$ corresponding choices of $a$ that work;

925: if $b\bar b =1$, then $a=0$.  There are $(q^2-1)-(q+1)$ non-zero

926: choices of $b$ with $b \bar b \ne 1$ (to which we must add

927: $b=0$), and $q+1$ with $b\bar b = 1$.  Thus, there are

928: $(q^2-q-1)(q+1)+(q+1) = q^3-q$ elements of $H_3$.

929:

930: As in the previous proof, checking the triple product property

931: amounts to checking that

932: $$

933: \left( \begin{array}{cc}

934: 1+xy & x\\

935: y & 1 \end{array} \right) = \left(

936: \begin{array}{cc}

937: a & b\\

938: -\bar b & \bar a \end{array} \right)

939: $$

940: implies $x=y=b=0$ and $a=1$, which is a trivial calculation.

941: \end{proof}
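
For completeness, the calculation can be sketched as follows. Comparing lower right entries gives $\bar a = 1$, hence $a = 1$, and comparing upper left entries then gives $xy = 0$. The off-diagonal entries give $b = x$ and $y = -\bar b = -\bar x$, so
$$
0 = xy = -x\bar x = -x^{q+1},
$$
which forces $x = 0$ and therefore $b = y = 0$.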

942:

943: Proposition~\ref{proposition:sl2fq2} proves that

944: $$

945: \liminf_{q \to \infty} \alpha(SL_2(\F_q)) \le 18/7,

946: $$

947: which is substantially better than~$3$ but still not near $2$.

948: Using Theorem~\ref{theorem:bound} and the character degrees of

949: $SL_2(\F_q)$, one can show that if

950: $$

951: \liminf_{q \to \infty} \alpha(SL_2(\F_q)) < 9/4,

952: $$

953: then Question~\ref{fundamentalq} has a positive answer.
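
Both constants can be traced as follows (a sketch of the arithmetic only). For the first, Proposition~\ref{proposition:sl2fq2} gives, for fields of square order,
$$
\alpha(SL_2(\F_{q^2})) \le \frac{3\log(q^6-q^2)}{\log\big(q^2 \cdot q^2 \cdot (q^3-q)\big)} \longrightarrow \frac{3 \cdot 6}{7} = \frac{18}{7}
$$
as $q \to \infty$. For the second, the table of character degrees shows that $SL_2(\F_q)$ has about $q$ characters of degree about $q$, so $\sum_i d_i^3 = q^{4+o(1)}$, while $|SL_2(\F_q)| = q^{3+o(1)}$. The condition $|G|^{3/\alpha} > \sum_i d_i^3$ therefore reads
$$
q^{9/\alpha + o(1)} > q^{4+o(1)},
$$
which holds for large $q$ whenever $\alpha$ stays bounded below $9/4$.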

954:

955: \Section{Lie groups} \label{section:Lie}

956:

957: In the category of Lie groups, one can set up a theory parallel to

958: that of the previous sections. We do not know how to use it to

959: bound the exponent of matrix multiplication (because of course

960: Lie groups of positive dimension are infinite). However, we have

961: had more luck constructing examples using Lie groups than with

962: finite linear groups, and this success seems a good reason to be

963: optimistic about matrix groups over finite fields. All examples

964: involving Lie groups can be skipped by a reader who cares only

965: about finite groups and matrix multiplication.

966:

967: Recall that $Q(S)$ denotes the right quotient set of $S$.

968:

969: \begin{definition}

970: \label{definition:Liepseudoexponent}  Let $G$ be a Lie group,

971: with submanifolds $M_1, M_2, M_3$ such that for $q_i \in Q(M_i)$,

972: if $q_1q_2q_3=1$ then $q_1=q_2=q_3=1$. We say that $G$ has {\em

973: Lie pseudo-exponent} at most

974: $$

975: \frac{\dim(G)}{(\dim(M_1)+\dim(M_2)+\dim(M_3))/3}.

976: $$

977: \end{definition}

978:

979: We usually take the submanifolds to be Lie subgroups.  If $G$ and

980: the three subgroups are algebraic groups defined over a number

981: field, then it is natural to ask what pseudo-exponent may be

982: achieved when one reduces modulo a prime ideal, to get a finite

983: quotient group. If the triple product property still holds, then

984: as the finite field size tends to infinity, the pseudo-exponent

985: bound of this finite group approaches the Lie pseudo-exponent.

986: However, the triple product property may not be preserved, as we

987: will show after the following theorem.

988:

989: \begin{theorem}

990: \label{theorem:slnLie} The group $SL_n(\R)$ has Lie

991: pseudo-exponent at most $2+2/n$.

992: \end{theorem}

993:

994: \begin{proof}

995: The three subgroups are the group $U$ of upper-triangular

996: matrices with $1$'s on the diagonal, the group $L$ of

997: lower-triangular matrices with $1$'s on the diagonal, and

998: $SO_n(\R)$.  Each subgroup has dimension $n(n-1)/2$, and

999: $SL_n(\R)$ has dimension $n^2-1$, so assuming the triple product

1000: property holds, the Lie pseudo-exponent is at most

1001: $$

1002: \frac{n^2-1}{n(n-1)/2} = 2+\frac{2}{n}.

1003: $$

1004:

1005: Let $M \in SO_n(\R)$, $A \in U$, and $B \in L$.  We wish to prove

1006: that if $MA=B$, then $M=A=B=I$.  Let $e_1,\dots,e_n$ be the

1007: standard basis of $\R^n$.  We will prove by induction on $i$ that

1008: $Me_i=e_i$.  Once we know that $M=I$, it follows that $A=B$, and

1009: thus $A=B=I$ because $U$ and $L$ are disjoint except for the

1010: identity.  ($A=B=I$ will also follow directly from the proof that

1011: $M=I$.)

1012:

1013: Let $A_i$ and $B_i$ denote the $i$-th columns of $A$ and $B$, and

1014: denote their $j$-th entries by $A_{ij}$ and $B_{ij}$.  Note that

1015: this indexing of rows and columns is opposite to the standard

1016: convention, but it will be more convenient in this proof.

1017: Because $MA=B$, we have

1018: $$

1019: M A_i = B_i.

1020: $$

1021:

1022: We start with the base case $i=1$.  Since $A$ is in $U$, we have

1023: $A_1 = e_1$.  Thus, $|B_1| = |MA_1| = |Me_1| = |e_1| = 1$, since

1024: $M$ is an orthogonal matrix.  Because $B_{11} = 1$, the only way

1025: $|B_1|$ can be $1$ is if $B_1=e_1$.  Thus, $Me_1 = e_1$.

1026:

1027: Now suppose that $Me_j = e_j$ for all $j<i$.  Because $A$ is in

1028: $U$,

1029: $$

1030: A_i = e_i + \sum_{j < i} A_{ij} e_j,

1031: $$

1032: and because $B$ is in $L$,

1033: $$

1034: B_i = e_i + \sum_{j > i} B_{ij} e_j.

1035: $$

1036: Now the induction hypothesis implies that

1037: $$

1038: B_i = MA_i = Me_i + \sum_{j < i} A_{ij} e_j,

1039: $$

1040: so

1041: $$

1042: Me_i = e_i +\sum_{j > i} B_{ij} e_j - \sum_{j < i} A_{ij}e_j.

1043: $$

1044: Since $M$ is orthogonal, $|Me_i|=|e_i|=1$.  The coefficient of

1045: $e_i$ in $Me_i$ is already $1$, so the other coefficients must be

1046: zero and thus $Me_i=e_i$, as desired.

1047: \end{proof}

1048:

1049: The same holds for $SL_n(\C)$ with $SO_n(\R)$ replaced by $SU_n$,

1050: but not by $SO_n(\C)$: the orthogonal matrix

1051: $$

1052: \left(\begin{array}{ccc} 1 & \frac{-1+i}{2} & \frac{1+i}{2}\\

1053: 1 & \frac{1+i}{2} & \frac{-1+i}{2}\\

1054: -i & 1 & 1

1055: \end{array}\right)

1056: $$

1057: equals

1058: $$

1059: \left(\begin{array}{ccc} 1 & 0 & 0\\

1060: 1 & 1 & 0\\

1061: -i & \frac{1-i}{2} & 1

1062: \end{array}\right)

1063: \left(\begin{array}{ccc} 1 & \frac{-1+i}{2} & \frac{1+i}{2}\\

1064: 0 & 1 & -1\\

1065: 0 & 0 & 1

1066: \end{array}\right).

1067: $$

1068: Of course the same obstacle arises over finite fields (a sum of

1069: non-zero squares may vanish).
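
To see concretely where the argument of Theorem~\ref{theorem:slnLie} breaks down, note that the first column of the orthogonal matrix displayed above is $(1,1,-i)$, and
$$
1^2 + 1^2 + (-i)^2 = 1 + 1 - 1 = 1.
$$
Thus the column has ``length'' $1$ with respect to the bilinear form and its first entry is $1$, yet its remaining entries are non-zero because $1^2 + (-i)^2 = 0$. Over $\R$, this cancellation is impossible, which is exactly the step used in the base case of the proof.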

1070:

1071: \Section{Additional examples} \label{variety}

1072:

1073: In this section we explore a variety of different types of groups,

1074: and prove non-trivial pseudo-exponent bounds for them. We hope

1075: that these examples (together with the ones we have already seen)

1076: will serve as something of a tool kit for constructing a group

1077: that might answer Question~\ref{fundamentalq}, and possibly even a

1078: family of groups that prove $\omega = 2$.

1079:

1080: \SubSection{Solvable groups} \label{section:solvable}

1081:

1082: Non-abelian simple (or almost simple) groups appear to be a

1083: fruitful source of groups with small pseudo-exponents.  However,

1084: solvable groups also do quite well. In this section, we will

1085: construct solvable groups that have Lie pseudo-exponent tending

1086: to $2$, and finite solvable groups with pseudo-exponent bounds of

1087: $2.5$ and $2.4811\dots$ (which, GAP tells us, is the best

1088: pseudo-exponent attained using three subgroups in any group of

1089: order up to~$100$).

1090:

1091: Let $F$ be a field, and $\langle,\rangle$ a symmetric bilinear

1092: form on $F^n$.  Define multiplication in

1093: $$

1094: G = \{(x,y,\alpha): x,y \in F^n, \alpha \in F \}

1095: $$

1096: via

1097: $$

1098: (x,y,\alpha) (u,v,\beta) = (x+u,y+v,\alpha+\beta+2\langle u, y

1099: \rangle),

1100: $$

1101: and define the three subgroups

1102: $$

1103: H_1 = \{(x,0,0): x \in F^n\},

1104: $$

1105: $$

1106: H_2 = \{(0,y,0): y \in F^n\},

1107: $$

1108: and

1109: $$

1110: H_3 = \{(z,z,\langle z,z \rangle): z \in F^n\}.

1111: $$

1112:

1113: \begin{proposition}

1114: If the only element $z \in F^n$ satisfying $\langle z,z \rangle =

1115: 0$ is $z=0$, then $H_1$, $H_2$, and $H_3$ satisfy the triple

1116: product property. \label{prop:inner}

1117: \end{proposition}

1118:

1119: \begin{proof}

1120: We simply need to check that $H_3$ avoids all elements of the form

1121: $(x,0,0)(0,y,0) = (x,y,0)$, except when $x=y=0$.  The only way

1122: such an element can be in $H_3$, i.e., of the form $(z,z,\langle

1123: z,z \rangle)$, is if $x=y=z$ and $\langle z,z \rangle = 0$.  That

1124: means $z=0$ and thus $x=y=0$, as desired.

1125: \end{proof}
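
The argument uses the fact that $H_3$ is a subgroup, which can be checked directly from the multiplication rule and the symmetry of the form:
$$
(z,z,\langle z,z \rangle)(w,w,\langle w,w \rangle) = (z+w,z+w,\langle z,z \rangle + \langle w,w \rangle + 2\langle w,z \rangle) = (z+w,z+w,\langle z+w,z+w \rangle).
$$
Similarly, the inverse of $(z,z,\langle z,z \rangle)$ is $(-z,-z,\langle z,z \rangle)$, which again lies in $H_3$ because $\langle -z,-z \rangle = \langle z,z \rangle$.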

1126:

1127: When $F = \R$, the group described above is a Heisenberg group,

1128: and we obtain the following bound:

1129:

1130: \begin{corollary}

1131: In the above framework, with $F = \R$, and $\langle,\rangle$ the

1132: standard inner product, the Lie group $G$ has Lie pseudo-exponent

1133: at most $2 + 1/n$.

1134: \end{corollary}

1135:

1136: \begin{proof}

1137: It is clear that Proposition \ref{prop:inner} is satisfied; the

1138: group dimension is $2n + 1$, and the three subgroups each have

1139: dimension $n$.

1140: \end{proof}

1141:

1142: When $F$ is a finite field, the group described above is an

1143: extraspecial group, and we obtain the following bound:

1144:

1145: \begin{corollary}

1146: In the above framework, with $F = \F_q$ of odd characteristic, $n

1147: = 2$, and $\langle x, y \rangle = x_1y_1 - wx_2y_2$ for some $w

1148: \in F$ that is not a square, the finite group $G$ has

1149: pseudo-exponent at most $2.5$.

1150: \end{corollary}

1151:

1152: Here, $x_i$ denotes the $i$-th coordinate of the vector $x$.

1153:

1154: \begin{proof}

1155: Note that $\langle z, z \rangle = 0$ implies $z_1^2 = w z_2^2$,

1156: which by our choice of $w$ can only happen when $z = 0$. Thus

1157: Proposition \ref{prop:inner} is satisfied. The group has order

1158: $q^5$, and the three subgroups have size $q^2$, leading to a

1159: pseudo-exponent bound of $2.5$ as claimed.

1160: \end{proof}
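
As a small illustration (the specific choice of parameters here is ours), take $q = 3$ and $w = 2$, which is not a square in $\F_3$ since the squares are $0$ and $1$. Then
$$
\langle z,z \rangle = z_1^2 - 2z_2^2 = z_1^2 + z_2^2 \quad \textup{in } \F_3,
$$
and since each square is $0$ or $1$, the sum vanishes only when $z_1 = z_2 = 0$. The group has order $3^5 = 243$, each subgroup has order $9$, and the resulting bound is
$$
\frac{3\log 243}{\log 9^3} = \frac{3 \cdot 5}{6} = 2.5.
$$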

1161:

1162: A slight variant of this construction works for even $q$ as well,

1163: but the pseudo-exponent bound is identical so we omit the details.

1164:

1165: One quite different example is the following Frobenius group of

1166: order $80$.  We found the group by a brute force search using GAP,

1167: and Michael Aschbacher supplied the following humanly

1168: understandable proof that it works.

1169:

1170: Let $C_5 \subset \F_{16}^\times$ be the unique subgroup of order

1171: $5$. Consider its semidirect product $G = C_5 \ltimes \F_{16}$

1172: with the additive group of $\F_{16}$, where multiplication is

1173: defined by

1174: $$

1175: (\alpha,x)(\beta,y) = (\alpha\beta, \beta x + y).

1176: $$

1177:

1178: \begin{proposition}

1179: \label{proposition:order80} The group $G = C_5 \ltimes \F_{16}$

1180: realizes $\langle 5, 5, 8\rangle$, and thus $\alpha(G) \le

1181: 3\log_{200}80 = 2.4811\dots$.

1182: \end{proposition}

1183:

1184: \begin{proof}

1185: Let

1186: $$

1187: H_1 = \{(\alpha,0) : \alpha \in C_5\}

1188: $$

1189: and

1190: $$

1191: H_2 = \{(\alpha, \alpha-1) : \alpha \in C_5\}

1192: $$

1193: (i.e., $H_2$ is $H_1$ conjugated by $(1,1)$). Let

1194: $$

1195: H_3 = \{(1,x) : x \in \F_{16}, \Tr x = 0 \},

1196: $$

1197: where $\textup{Tr}$ denotes the trace from $\F_{16}$ to $\F_2$.

1198: These groups satisfy $|H_1| = |H_2| = 5$ and $|H_3| = 8$.  All we

1199: need to check is the triple product property.

1200:

1201: We must verify that unless $\alpha$ and $\beta$ are both $1$, the

1202: product

1203: $$

1204: (\alpha,0)(\beta, \beta-1) = (\alpha\beta, \beta-1)

1205: $$

1206: is not in $H_3$.  For it to be in $H_3$, we must have $\alpha =

1207: \beta^{-1}$ and $\Tr (\beta-1) = 0$.  However,

1208: $$

1209: \Tr (\beta - 1) = \Tr \beta - \Tr 1 = \Tr \beta,

1210: $$

1211: and $\Tr \beta = 1$ for $\beta \in C_5\setminus\{1\}$ because the

1212: minimal polynomial over $\F_2$ of such a $\beta$ is

1213: $1+t+t^2+t^3+t^4$, whose $t^3$ coefficient is $1$.

1214: \end{proof}
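
The trace identity can also be verified by a direct computation. The trace from $\F_{16}$ to $\F_2$ is $\Tr \beta = \beta + \beta^2 + \beta^4 + \beta^8$, and for $\beta \in C_5$ we have $\beta^5 = 1$, hence $\beta^8 = \beta^3$. If $\beta \ne 1$, then $\beta^5 - 1 = (\beta-1)(1+\beta+\beta^2+\beta^3+\beta^4) = 0$ gives $1+\beta+\beta^2+\beta^3+\beta^4 = 0$, so
$$
\Tr \beta = \beta + \beta^2 + \beta^3 + \beta^4 = -1 = 1
$$
in characteristic $2$.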

1215:

1216: This proposition generalizes as follows (see \cite{Brown} for

1217: background on cohomology): Let $G$ be a group that acts on an

1218: abelian group $A$, $\theta : G \to A$ a $1$-cocycle, and $B

1219: \subseteq A$ a subgroup. If $\theta(g) \in B$ implies $g=1$ for

1220: all $g \in G$, then the semidirect product $G \ltimes A$ realizes

1221: $\langle |G|,|G|,|B|\rangle$ via the subgroups $G \times \{0\}$,

1222: $\{(g,\theta(g)) : g \in G\}$, and $\{1\} \times B$.  (In

1223: Proposition~\ref{proposition:order80}, the $1$-cocycle is a

1224: coboundary.) Unfortunately, we do not know any other good

1225: examples.

1226:

1227: Unlike the cases of extraspecial groups and matrix groups, we do

1228: not know how to generalize Proposition~\ref{proposition:order80}

1229: to achieve Lie pseudo-exponent arbitrarily near $2$. The best we

1230: know how to do is the following. Let $\Quat$ be the quaternions,

1231: and $U \subset \Quat^\times$ be the group of unit quaternions

1232: (which is isomorphic to $SU(2)$). Then within the semidirect

1233: product $U \ltimes \Quat$, the three subgroups $U \times \{0\}$,

1234: $\{(u,u-1) : u \in U\}$, and $\{(1,x): \Tr x = 0 \}$ satisfy the

1235: triple product property and prove that the Lie pseudo-exponent of

1236: $U \ltimes \Quat$ is at most $7/3$.

1237:

1238: \SubSection{Wreath products} \label{section:wreath}

1239:

1240: In this section we present another family of groups that achieves

1241: pseudo-exponent $2 + o(1)$. This family is described in terms of

1242: the wreath product: if $A$ is a group, then the wreath product $A

1243: \wr S_n$ is the semidirect product $S_n \ltimes A^n$, where $S_n$

1244: acts on $A^n$ by permuting the coordinates (and the

1245: multiplication is of course via $(\pi, u)(\pi',v) = (\pi\pi',

1246: \pi'u + v)$).

1247:

1248: \begin{theorem}

1249: Let $A$ be the cyclic group of order $2n$, and let $G_n = A\wr

1250: S_n$. Then

1251: $$

1252: \alpha(G_n) \le \gamma(G_n) = 2 + \frac{1+\log 2}{\log n} +

1253: O\left(\frac{1}{(\log n)^2}\right).

1254: $$

1255: \end{theorem}

1256:

1257: \begin{proof}

1258: We view $G_n$ as the semidirect product $S_n \ltimes A^n$, and

1259: will use the three subgroups

1260: \begin{eqnarray*}

1261: H_1 & = & \{(\pi, 0) : \pi \in S_n\}, \\

1262: H_2 & = & \{(\pi, \pi u - u) : \pi \in S_n\}, \quad\textup{and} \\

1263: H_3 & = & \{(\pi, \pi v - v) : \pi \in S_n\},

1264: \end{eqnarray*}

1265: where $u = (1, 2, \dots, n)$, and $v = (n, n-1, \dots, 1)$.

1266:

1267: As each subgroup has size $n!$ in a group of size $n!(2n)^n$,

1268: $$

1269: \alpha \le \frac{\log (n!(2n)^n)}{\log n!},

1270: $$

1271: assuming the triple product property holds. The largest character

1272: degree of $G_n$ is $|S_n| = n!$ (see Theorem~25.6 in \cite{H})

1273: and so $|G_n|^{1/\gamma} = n!$, which implies

1274: $$

1275: \gamma = \frac{\log (n!(2n)^n)}{\log n!}.

1276: $$

1277: By Stirling's formula,

1278: $$

1279: \frac{\log (n!(2n)^n)}{\log n!} = 2 + \frac{1+\log 2}{\log n} +

1280: O\left(\frac{1}{(\log n)^2}\right),

1281: $$

1282: so all that remains is to verify the triple product property.

1283:

1284: Suppose $h_1 = (\pi',0) \in H_1$ and $h_2 = (\pi, \pi u -u ) \in

1285: H_2$.  Their product is $(\pi'\pi,\pi u - u)$, and if it equals

1286: $h_3 = (\sigma, \sigma v - v) \in H_3$, then $\pi u - u = \sigma

1287: v - v$. The $i$-th coordinate of $\pi u - u$ is $\pi(i) - i$, and

1288: that of $\sigma v - v$ is $(n+1 - \sigma(i)) - (n+1-i) =

1289: i-\sigma(i)$. Thus, $h_1h_2=h_3$ implies $\pi(i)+\sigma(i) = 2i$

1290: for all $i$. This is an equation in $A$, and hence holds only

1291: modulo $2n$. However, $\pi(i)$, $\sigma(i)$, and $i$ are all in

1292: $\{1,\dots,n\}$, so the equation holds in the integers as well.

1293: Because $\pi(1)$ and $\sigma(1)$ are both at least $1$, we

1294: conclude from $\pi(1)+\sigma(1)=2$ that $\pi(1)=\sigma(1)=1$.

1295: Then $\pi(2)$ and $\sigma(2)$ must be at least $2$, and

1296: $\pi(2)+\sigma(2)=4$, so $\pi(2)=\sigma(2)=2$, etc.  We conclude

1297: that $\pi$ and $\sigma$ are both trivial, as is $\pi'$ because

1298: $\pi'\pi=\sigma$.  Thus, $h_1=h_2=h_3=1$, as desired.

1299: \end{proof}
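
The Stirling estimate used above can be spelled out as follows (a sketch, with natural logarithms). From $\log n! = n\log n - n + O(\log n)$ we get
$$
\frac{\log (n!(2n)^n)}{\log n!} = 1 + \frac{n(\log n + \log 2)}{n\log n - n + O(\log n)} = 1 + \frac{1 + \log 2/\log n}{1 - 1/\log n + O(1/n)},
$$
and expanding the denominator gives $2 + (1+\log 2)/\log n + O(1/(\log n)^2)$. Note that the bound is below $3$ only when $(2n)^n < (n!)^2$, so it is non-trivial only for fairly large $n$; for instance, $n = 10$ gives $1 + 10\log 20/\log 10! = 2.98\dots$.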

1300:

1301: This construction is an improvement over

1302: Theorem~\ref{theorem:pseudoexponent2}, because it achieves

1303: essentially the same pseudo-exponent bound, while at the same time

1304: $\alpha \le \gamma$. A more complicated variant of this

1305: construction achieves a comparable pseudo-exponent and has

1306: $\alpha < \gamma$.

1307:

1308: \SubSection{Direct products and the Sperner capacity}

1309: \label{section:direct}

1310:

1311: It is natural to attempt to improve the pseudo-exponent of a

1312: finite group $G$ by forming some group derived from it, such as a

1313: power $G^k$.  We know that $\gamma(G^k) = \gamma(G)$, so that

1314: parameter becomes no smaller.  Lemma~\ref{lemma:upperbound}

1315: implies that $\alpha(G^k) \le \alpha(G)$, and in this section we

1316: show that it is possible to achieve $\alpha(G^k) < \alpha(G)$.

1317:

1318: We will be led for the first time since

1319: Lemma~\ref{lemma:shortexact} to realize matrix multiplication

1320: through quotient sets that are {\em not} subgroups.

1321: Proposition~\ref{prop:dihedral} below proves that this

1322: complication is necessary to determine the pseudo-exponents of

1323: certain groups.

1324:

1325: Let $D_m$ be the dihedral group generated by $x$ and $y$, with the

1326: relations $y^2 = x^m = 1$ and $yxy = x^{-1}$.

1327:

1328: \begin{proposition}

1329: \label{prop:dihedral} For every $m$, $D_m$ realizes $\langle

1330: 2,2,2\lfloor m/3 \rfloor\rangle$, and hence $\alpha(D_m)<3$ for

1331: $m\ge 9$.  If $m$ is a prime greater than $3$, then no three

1332: subgroups prove $\alpha(D_m)<3$.

1333: \end{proposition}

1334:

1335: \begin{proof}

1336: Let $S_1 = \langle y \rangle$ be the subgroup generated by $y$,

1337: $S_2 = \langle yx^2 \rangle$, and $S_3 = \{x^{3k}, yx^{3k+1} : 0

1338: \le k < (m-2)/3\}$.  Then one can check by simple case analysis

1339: that $D_m$ realizes $\langle 2,2,2\lfloor m/3 \rfloor\rangle$

1340: through $S_1,S_2,S_3$.  Note that $S_3$ is a subgroup iff $m$ is a

1341: multiple of $3$.

1342:

1343: When $m$ is prime, all subgroups of $D_m$ have order $1$, $2$,

1344: $m$, or $2m$, and it is easy to rule out each case (except when

1345: $m=3$, in which case three subgroups of order~$2$ prove

1346: $\alpha(D_3)<3$).

1347: \end{proof}
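
The threshold $m \ge 9$ comes from the following arithmetic. Since $|D_m| = 2m$, the realization of $\langle 2,2,2\lfloor m/3 \rfloor \rangle$ gives
$$
\alpha(D_m) \le \frac{3\log 2m}{\log 8\lfloor m/3 \rfloor},
$$
which is less than $3$ exactly when $4\lfloor m/3 \rfloor > m$. Because $\lfloor m/3 \rfloor \ge (m-2)/3$, this holds whenever $4(m-2)/3 > m$, i.e., $m > 8$. For example, $m = 9$ gives $3\log 18/\log 24 = 2.72\dots$, whereas $m = 8$ gives $4\lfloor 8/3 \rfloor = 8$ and only the trivial bound $3$.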

1348:

1349: Proposition~\ref{prop:dihedral} is not optimal: $D_5$ realizes

1350: $\langle 2, 2, 3 \rangle$ through $\{1, y\}, \{1, yx\}, \{1, x^2,

1351: yx^4\}$.  However, we have checked using GAP that it is optimal

1352: for $m=4$, and thus $\alpha(D_4)=3$.

1353:

1354: We now use the combinatorial notion of {\em Sperner capacity} to

1355: show that $\alpha(D_4^k)<3$ for large $k$, despite the fact that

1356: $\alpha(D_4)=3$.

1357:

1358: \begin{proposition}

1359: If $S \subseteq (\Z/m\Z)^k$ is a subset in which no two distinct

1360: vectors differ by an element of $\{0,1\}^k$, then $D_m^k$ realizes

1361: $\langle 2^k, 2^k, |S|\rangle$.

1362: \end{proposition}

1363:

1364: \begin{proof}

1365: We identify $\Z/m\Z$ with the subgroup $\langle x \rangle

1366: \subseteq D_m$ (via $i \leftrightarrow x^i$), so that $S

1367: \subseteq \langle x \rangle^k \subseteq D_m^k$. The subgroups

1368: $\langle y \rangle$ and $\langle yx \rangle$ of $D_m$ have

1369: pointwise product $\langle y \rangle\langle yx \rangle =

1370: \{1,y,yx,x\}$.  Therefore the condition on differences of elements

1371: in $S$ implies that $\langle y \rangle^k$, $\langle yx

1372: \rangle^k$, and $S$ satisfy the triple product property, since

1373: $(\langle y \rangle^k \langle yx \rangle^k) \cap \langle x

1374: \rangle^k = \{1,x\}^k$, and $Q(S) \subseteq \langle x \rangle^k$

1375: meets $\{1,x\}^k$ only in the identity.

1376: \end{proof}

1377:

1378: The problem of making $S$ as large as possible has been studied

1379: before;  a generalization of this problem is known as the Sperner

1380: capacity of a directed graph \cite{Garg,Korn}.  It is known that

1381: $|S| \le (m-1)^k$ (see Theorem~1.2 in \cite{Alon}, which extends

1382: several earlier papers \cite{Blok, Cald}), and that

1383: $$

1384: |S| = (m-1)^{(1-o(1))k}

1385: $$

1386: can be achieved by the following construction:

1387:

1388: Assume that $(m-1)$ divides $k$, and take $S$ to be the set of all

1389: vectors in $(\Z/m\Z)^k$ with exactly $k/(m-1)$ occurrences of each

1390: element of $\{0,1,\dots,m-2\}$. Now suppose we have $u, v \in S$

1391: with $u - v \in \{0,1\}^k$.  For each coordinate $i$ such that

1392: $u_i=0$, we have $v_i \in \{0,m-1\}$ because $u_i-v_i \in

1393: \{0,1\}$, and thus $v_i=0$.  Then whenever $u_i=1$, it follows

1394: that $v_i=1$, because all $k/(m-1)$ cases in which $v_i=0$ have

1395: $u_i=0$ as well. Repeating this argument yields $u=v$, as desired.
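
For fixed $m$, the size of this set is a multinomial coefficient,
$$
|S| = \frac{k!}{\big((k/(m-1))!\big)^{m-1}},
$$
which is $(m-1)^{(1-o(1))k}$ as $k \to \infty$ by Stirling's formula. As a small illustration with $m = 4$ and $k = 3$, the set $S$ consists of the $6$ rearrangements of $(0,1,2)$, compared with the upper bound $3^3 = 27$; the $o(1)$ term is far from negligible for small $k$.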

1396:

1397: We conclude that direct products {\em can} help:

1398:

1399: \begin{corollary}

1400: We have $\alpha(D_4^k) \le (3 + o(1)) \log_{12}{8}$, which

1401: approaches $3\log_{12}{8} = 2.51\dots$ as $k \rightarrow \infty$.

1402: \end{corollary}

1403:

1404: This pseudo-exponent bound comes tantalizingly close to settling

1405: Question~\ref{fundamentalq}: if

1406: $$

1407: \liminf_{k \to \infty} \alpha(D_4^k) < 3 \log_{12} 8,

1408: $$

1409: then the answer to the question is ``yes,'' and our methods do in

1410: fact prove $\omega<3$ (at least).  The same holds in general for

1411: $D_{2n}$ (which has $n-1$ characters of degree $2$ and $4$ of

1412: degree $1$); the Sperner capacity construction proves that

1413: $\alpha(D_{2n}^k) \le (3+o(1)) \log_{8n-4} 4n$, and if

1414: $$

1415: \liminf_{k \to \infty} \alpha(D_{2n}^k) < 3\log_{8n-4} 4n,

1416: $$

1417: then the answer to Question~\ref{fundamentalq} is ``yes.''

1418:

1419: Also, note that Lemma~\ref{lemma:upperbound} implies that for all

1420: $G$,

1421: $$

1422: \liminf_{k \to \infty} \alpha(G^k) = \inf_{k \ge 1} \alpha(G^k).

1423: $$

1424: Thus, even if the answer to Question~\ref{fundamentalq} is ``no,''

1425: there are combinatorial consequences.  For example, knowing that

1426: $\alpha(D_{2n}^k) \ge 3\log_{8n-4} 4n$ for all $n$ and $k$ would

1427: give a new proof of the Sperner capacity bound $|S| \le (m-1)^k$

1428: above, in the case of even $m$.

1429:
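This implication can be spelled out as follows (a sketch, assuming that $\alpha$ is the minimum of $3\log|G|/\log(|S_1||S_2||S_3|)$ over triples satisfying the triple product property, and writing $m=2n$). Any $S \subseteq (\Z/m\Z)^k$ for which $u-v \in \{0,1\}^k$ forces $u=v$ yields, together with $\langle y \rangle^k$ and $\langle yx \rangle^k$, a triple in $D_{2n}^k$ with
$$
\alpha(D_{2n}^k) \le \frac{3\log (4n)^k}{\log\bigl(4^k |S|\bigr)}.
$$
If also $\alpha(D_{2n}^k) \ge 3\log_{8n-4} 4n = 3\log (4n)^k / \log (8n-4)^k$, then comparing denominators gives $4^k |S| \le (8n-4)^k$, that is,
$$
|S| \le (2n-1)^k = (m-1)^k.
$$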

1430: \Section{Concluding comments} \label{section:conclusions}

1431:

1432: \SubSection{Open questions}

1433:

1434: The most pressing question arising in this paper is

1435: Question~\ref{fundamentalq}, which represents a potential barrier

1436: to obtaining non-trivial bounds on $\omega$ using our

1437: techniques.  However, there are numerous other open questions that

1438: are relevant to Question~\ref{fundamentalq} and the ultimate goal

1439: of proving $\omega=2$.

1440:

1441: \paragraph{Matrix groups.}

1442: As pointed out in Section~\ref{section:linear}, matrix groups

1443: seem to be one of the most promising families of examples, but we

1444: still know very little about them.  Can our bounds for

1445: $\alpha(SL_2(\F_q))$ be improved?  We see no reason why they

1446: should be optimal.  Recall that beating $9/4$ asymptotically would

1447: settle Question~\ref{fundamentalq}.  We know even less about

1448: $SL_n(\F_q)$ (only that $\alpha(SL_n(\F_q))<3$), so any

1449: non-trivial construction would be of interest.  The only other

1450: finite matrix groups that we have studied are those closely

1451: connected to $SL_n$ (such as $PSL_n$ or $GL_n$), but there are a

1452: number of other families.  What can one say about the

1453: pseudo-exponents of the groups in these families?

1454:

1455: \paragraph{Quotient sets.}

1456: The examples in Subsection~\ref{section:direct} show that

1457: quotient sets sometimes outperform subgroups.  For which groups

1458: does this occur?  Are there general constructions of useful

1459: quotient sets other than via Sperner capacity?  Can they be used

1460: to improve our constructions for $S_n$ or the wreath product?

1461: What about matrix groups?

1462:

1463: \paragraph{Lie groups.}

1464: Can one use Lie groups to prove anything about $\omega$

1465: directly?  Do results on the Lie pseudo-exponent imply anything

1466: about the pseudo-exponents of related finite groups?  Compact Lie

1467: groups seem more closely analogous to finite groups than

1468: non-compact Lie groups are, so studying them might be

1469: illuminating.  (All of the Lie groups in this paper are

1470: non-compact.)

1471:

1472: \paragraph{Group extensions.}

1473: Extensions of groups with pseudo-exponent $3$ can have

1474: substantially smaller pseudo-exponents, as demonstrated by the

1475: solvable groups in Subsection~\ref{section:solvable}.  (Recall

1476: that solvable groups are formed from abelian groups by taking

1477: repeated extensions.)  Is there a general way to lower $\alpha$ or

1478: raise $\gamma$ by taking extensions?  As a first step, can one

1479: find a family of solvable groups with pseudo-exponents tending to

1480: $2$?

1481:

1482: \paragraph{Powers of groups.}

1483: The simplest case of group extensions is taking powers of a

1484: group.  Given $G$, what can one say about the \textit{asymptotic

1485: pseudo-exponent\/} $\inf_{k \ge 1} \alpha(G^k)$ of $G$?  As noted

1486: in Subsection~\ref{section:direct}, $\gamma(G^k)=\gamma(G)$, so

1487: if there exists a group such that $\inf_{k \ge 1} \alpha(G^k) =

1488: 2$, then $\omega=2$ by Corollary~\ref{cor:race}.

1489:

1490: \SubSection{Extensions} \label{section:extensions}

1491:

1492: It is natural to attempt to extend our methods in various ways.

1493: For example, one might try to obtain bounds on border ranks of

1494: tensors, perhaps by using deformations of group algebras. It is

1495: also reasonable to ask whether our approach (given its reliance

1496: on representation theory) works in finite characteristic, as well

1497: as over $\C$. As Theorem~\ref{theorem:reduction} indicates, one

1498: can just as easily embed matrix multiplication into $F[G]$ rather

1499: than $\C[G]$, where $F$ has characteristic $p$. As long as $p$

1500: does not divide $|G|$, the representation theory of $G$, and all

1501: other aspects of our approach, work out identically, assuming $F$

1502: is algebraically closed. Sch\"onhage has shown that the exponent

1503: of matrix multiplication over arbitrary fields depends only on the

1504: characteristic (see Corollary~15.18 in \cite{BCS}), so we lose

1505: nothing by requiring that $F$ be algebraically closed.

1506:

1507: We conclude by mentioning a particular variant of our approach

1508: that does not require any control of the character degrees, and

1509: thus may still be viable even if there is a negative answer to

1510: Question~\ref{fundamentalq}. This variant offers less structure to

1511: exploit and seems less attractive, but it uses similar ideas.

1512: Suppose we have distinct elements $x_{i, j}, y_{k, \ell} \in G$,

1513: for $1\le i \le n$, $1\le j,k \le m$, and $1\le \ell \le p$, such

1514: that

1515: \begin{equation}

1516: x_{i,j}y_{j, \ell} \sim x_{i', k}y_{k', \ell'} \; \Leftrightarrow

1517: \; i=i', k=k', \ell = \ell', \label{eq:conjugacy}

1518: \end{equation}

1519: where $\sim$ denotes conjugacy of elements. Then we embed matrix

1520: $A = (a_{i, j})$ as $\bar{A} = \sum_{i, j} a_{i, j}x_{i, j} \in

1521: \C[G]$, and matrix $B = (b_{k, \ell})$ as $\bar{B} = \sum_{k,

1522: \ell} b_{k, \ell}y_{k, \ell} \in \C[G]$.  We can pursue a similar

1523: strategy to compute $AB$. In this case, however, in the Fourier

1524: domain, we need only compute the {\em trace} of each of the

1525: matrix products in the block-diagonal matrix multiplication. That

1526: requires only $\sum_i d_i^2 = |G|$ multiplications, and so we can

1527: conclude that the rank of $\langle n,m,p \rangle$ is at most

1528: $|G|$.

1529:
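The multiplication count can be checked directly (a routine identity, recorded here for convenience, with $X$ and $Y$ as illustrative names for a pair of blocks). For $d \times d$ matrices $X$ and $Y$,
$$
\Tr(XY) = \sum_{s=1}^{d} \sum_{t=1}^{d} X_{s,t} Y_{t,s},
$$
which uses $d^2$ multiplications and does not require computing the full product $XY$. Summing over the irreducible representations of $G$, with dimensions $d_i$, gives the total $\sum_i d_i^2 = |G|$.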

1530: Let $G$ be a group with subsets $S_1, S_2$ and $S_3$ satisfying

1531: the triple product property. If we replace {\em conjugacy} with

1532: {\em equality} in \eqref{eq:conjugacy}, then it can be satisfied

1533: by taking $\{x_{i, j}\} = S_1 S_2^{-1}$ (where $i$ indexes $S_1$

1534: and $j$ indexes $S_2$) and $\{y_{k, \ell}\} = S_2 S_3^{-1}$ ($k$

1535: indexes $S_2$ and $\ell$ indexes $S_3$), so it is possible that

1536: the techniques we have developed in this paper could help with

1537: this variant as well, although in general we find it difficult to

1538: work with conjugacy constraints.

1539:

1540: \section*{Acknowledgements}

1541:

1542: We are grateful to Michael Aschbacher, Noam Elkies, Bobby

1543: Kleinberg, L\'aszl\'o Lov\'asz, Amin Shokrollahi, David Vogan, and

1544: Avi Wigderson for helpful discussions.

1545:

1546: \begin{thebibliography}{10}\setlength{\itemsep}{-1ex}\small

1547:

1548: \bibitem{Alon}

1549: N.~Alon.

1550: \newblock On the capacity of digraphs.

1551: \newblock {\em European J.\ Combinatorics}, 19:1--5, 1998.

1552:

1553: \bibitem{Blok}

1554: A.~Blokhuis.

1555: \newblock On the {Sperner} capacity of the cyclic triangle.

1556: \newblock {\em J.\ Algebraic Combinatorics}, 2:123--124, 1993.

1557:

1558: \bibitem{Brown}

1559: K.~S. Brown.

1560: \newblock {\em Cohomology of Groups}.

1561: \newblock Number~87 in Graduate Texts in Mathematics. Springer-Verlag, 1982.

1562:

1563: \bibitem{BCS}

1564: P.~{B\"urgisser}, M.~Clausen, and M.~A. Shokrollahi.

1565: \newblock {\em Algebraic Complexity Theory}, volume 315 of {\em Grundlehren der

1566:   mathematischen Wissenschaften}.

1567: \newblock Springer-Verlag, 1997.

1568:

1569: \bibitem{Cald}

1570: R.~Calderbank, P.~Frankl, R.~L. Graham, W.~Li, and L.~Shepp.

1571: \newblock The {Sperner} capacity of the cyclic triangle for linear and

1572:   nonlinear codes.

1573: \newblock {\em J.\ Algebraic Combinatorics}, 2:31--48, 1993.

1574:

1575: \bibitem{CW}

1576: D.~Coppersmith and S.~Winograd.

1577: \newblock Matrix multiplication via arithmetic progressions.

1578: \newblock {\em J.\ Symbolic Computation}, 9:251--280, 1990.

1579:

1580: \bibitem{GAP}

1581: The GAP~Group.

1582: \newblock {\em {GAP -- Groups, Algorithms, and Programming, Version 4.3}},

1583:   2002.

1584: \newblock \texttt{(http://www.gap-\break system.org)}.

1585:

1586: \bibitem{Garg}

1587: L.~Gargano, J.~{K\"orner}, and U.~Vaccaro.

1588: \newblock {Sperner} theorems on directed graphs and qualitative independence.

1589: \newblock {\em J.\ Combinatorial Theory Series A}, 61:173--192, 1992.

1590:

1591: \bibitem{Green}

1592: J.~A. Green.

1593: \newblock The characters of the finite general linear groups.

1594: \newblock {\em Transactions of the American Mathematical Society}, 80:402--447,

1595:   1955.

1596:

1597: \bibitem{H}

1598: B.~Huppert.

1599: \newblock {\em Character Theory of Finite Groups}.

1600: \newblock Number~25 in de Gruyter Expositions in Mathematics. Walter de

1601:   Gruyter, Berlin, 1998.

1602:

1603: \bibitem{JL}

1604: G.~James and M.~Liebeck.

1605: \newblock {\em Representations and Characters of Groups}.

1606: \newblock Cambridge University Press, Cambridge, second edition, 2001.

1607:

1608: \bibitem{Korn}

1609: J.~{K\"orner} and G.~Simonyi.

1610: \newblock A {Sperner}-type theorem and qualitative independence.

1611: \newblock {\em J.\ Combinatorial Theory Series A}, 59:90--103, 1992.

1612:

1613: \bibitem{LR}

1614: J.~Lafferty and D.~Rockmore.

1615: \newblock Fast {Fourier} analysis for {$SL_2$} over a finite field and related

1616:   numerical experiments.

1617: \newblock {\em Experimental Mathematics}, 1:115--139, 1992.

1618:

1619: \bibitem{S}

1620: V.~Strassen.

1621: \newblock Gaussian elimination is not optimal.

1622: \newblock {\em Numerische Mathematik}, 13:354--356, 1969.

1623:

1624: \bibitem{VK}

1625: A.~M. Vershik and S.~V. Kerov.

1626: \newblock Asymptotics of the largest and the typical dimensions of irreducible

1627:   representations of a symmetric group.

1628: \newblock {\em Functional Analysis and its Applications}, 19:21--31, 1985.

1629:

1630: \end{thebibliography}

1631:

1632: \end{document}

1633: