dac8dcf1fb59ed05.tex
1: \begin{abstract}
2:   Deep neural networks based on state space models (SSMs) are attracting much attention in sequence modeling 
3:   since their computational cost is significantly lower than that of Transformers. 
4:   While the capabilities of SSMs have been primarily investigated through experimental comparisons, 
5:   theoretical understanding of SSMs is still limited. 
6:   In particular, there is a lack of statistical and quantitative evaluation of whether SSMs can replace Transformers.
7:   In this paper, we theoretically explore for which tasks SSMs can be alternatives to Transformers 
8:   from the perspective of estimating sequence-to-sequence functions. 
9:   We consider the setting where the target function has direction-dependent smoothness
10:   and prove that SSMs can estimate such functions with the same convergence rate as Transformers.
11:   Additionally, we prove that SSMs can estimate the target function as well as Transformers can, even when its smoothness changes depending on the input sequence.
12:   Our results suggest that SSMs can replace Transformers when estimating functions in certain classes that appear in practice.
13: \end{abstract}
14: