\begin{abstract}
The rule of thumb that a sample size of \(n \geq 30\) is sufficient to
ensure valid inferences in regression analysis is frequently cited but
rarely scrutinized. This research note evaluates the lower bound for the
number of observations required for regression analysis by exploring how
distributional characteristics, such as skewness and kurtosis, influence
the convergence of t-values to the t-distribution in linear regression
models. An extensive simulation study involving over 22 billion
regression models examines a range of symmetric, platykurtic, and skewed
distributions, testing sample sizes from 4 to 10,000. The results reveal
that it is sufficient for either the dependent or the independent
variable to follow a symmetric distribution for the t-values to converge
to the t-distribution at sample sizes much smaller than \(n = 30\). This
runs contrary to previous guidance, which suggests that the error term
needs to be normally distributed for this convergence to occur at low
\(n\). In contrast, if both the dependent and independent variables are
highly skewed, the required sample size is substantially higher. In
cases of extreme skewness, even sample sizes of 10,000 do not ensure
convergence. These findings suggest that the \(n \geq 30\) rule is too
permissive in some cases but overly conservative in others, depending on
the underlying distributional characteristics. This study offers revised
guidelines for determining the minimum sample size necessary for valid
regression analysis.
\end{abstract}