\begin{abstract}
Deep neural networks based on state space models (SSMs) are attracting much attention in sequence modeling
since their computational cost is significantly smaller than that of Transformers.
While the capabilities of SSMs have been primarily investigated through experimental comparisons,
theoretical understanding of SSMs is still limited.
In particular, there is a lack of statistical and quantitative evaluation of whether SSMs can replace Transformers.
In this paper, we theoretically explore for which tasks SSMs can be alternatives to Transformers
from the perspective of estimating sequence-to-sequence functions.
We consider the setting where the target function has direction-dependent smoothness
and prove that SSMs can estimate such functions with the same convergence rate as Transformers.
Additionally, we prove that SSMs can estimate the target function as well as Transformers can, even if the smoothness changes depending on the input sequence.
Our results suggest that SSMs can replace Transformers when estimating functions in certain classes that appear in practice.
\end{abstract}
