\begin{abstract}
%$g(x) = \Ep [Y_{\eta_0} | X=x]$
This paper provides estimation and inference methods for a structural function, such as the Conditional Average Treatment Effect (CATE), based on modern machine learning (ML) tools. We assume that such a function can be represented as a conditional expectation $g(x) = \Ep [Y_{\eta_0} | X=x]$ of a signal $Y_{\eta_0}$, where $\eta_0$ is an unknown nuisance function. In addition to CATE, examples of such functions include the regression function with a Partially Missing Outcome and the Conditional Average Partial Derivative. We approximate $g(x)$ by a linear form $p(x)'\beta_0$, where $p(x)$ is a vector of approximating functions and $\beta_0$ is the Best Linear Predictor. Plugging the first-stage estimate $\hat{\eta}$ into the signal $Y_{\hat{\eta}}$, we estimate $\beta_0$ via an ordinary least squares regression of $Y_{\hat{\eta}}$ on $p(X)$. We deliver a high-quality estimate $p(x)'\hat{\beta}$ of the pseudo-target function $p(x)'\beta_0$ that features (a) a pointwise Gaussian approximation of $p(x_0)'\beta_0$ at a point $x_0$, (b) a simultaneous Gaussian approximation of $p(x)'\beta_0$ uniformly over $x$, and (c) an optimal rate of convergence of $p(x)'\hat{\beta}$ to $p(x)'\beta_0$ uniformly over $x$. When the misspecification error of the linear form decays sufficiently fast, these approximations automatically hold for the target function $g(x)$ rather than the pseudo-target $p(x)'\beta_0$. The first-stage nuisance parameter $\eta_0$ is allowed to be high-dimensional and is estimated by modern ML tools, such as neural networks, $\ell_1$-shrinkage estimators, and random forests. Using our method, we estimate the average price elasticity conditional on income using the data of \cite{YatchewNo} and provide uniform confidence bands for the target regression function.
\end{abstract}

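For concreteness, the two objects named in the abstract can be written out explicitly. This is a sketch under standard least-squares conventions, not a statement of the paper's exact assumptions: the sample size $N$, the observation index $i$, and the notation $Y_{i,\hat{\eta}}$ for the signal of observation $i$ evaluated at $\hat{\eta}$ are introduced here for illustration, and the matrices being inverted are assumed to be nonsingular. The vector $\beta_0$ is the coefficient of the population least-squares projection of $g(X)$ on $p(X)$,
\[
\beta_0 = \arg\min_{b} \Ep \big[ (g(X) - p(X)'b)^2 \big]
        = \big( \Ep [ p(X) p(X)' ] \big)^{-1} \Ep [ p(X) g(X) ],
\]
where, by the law of iterated expectations, $\Ep [ p(X) g(X) ] = \Ep [ p(X) Y_{\eta_0} ]$. The estimator replaces the expectations by sample sums and the unknown $\eta_0$ by the first-stage estimate $\hat{\eta}$:
\[
\hat{\beta} = \Big( \sum_{i=1}^{N} p(X_i) p(X_i)' \Big)^{-1} \sum_{i=1}^{N} p(X_i) Y_{i,\hat{\eta}} .
\]
The fitted function $p(x)'\hat{\beta}$ is then the estimate of the pseudo-target $p(x)'\beta_0$.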