\begin{abstract}
%$g(x) = \Ep [Y_{\eta_0} | X=x]$
This paper provides estimation and inference methods for a structural function, such as the Conditional Average Treatment Effect (CATE), based on modern machine learning (ML) tools. We assume that such a function can be represented as a conditional expectation $g(x) = \Ep [Y_{\eta_0} | X=x]$ of a signal $Y_{\eta_0}$, where $\eta_0$ is an unknown nuisance function. In addition to the CATE, examples of such functions include the regression function with a Partially Missing Outcome and the Conditional Average Partial Derivative. We approximate $g(x)$ by a linear form $p(x)'\beta_0$, where $p(x)$ is a vector of approximating functions and $\beta_0$ is the Best Linear Predictor. Plugging the first-stage estimate $\hat{\eta}$ into the signal to obtain $Y_{\hat{\eta}}$, we estimate $\beta_0$ by an ordinary least squares regression of $Y_{\hat{\eta}}$ on $p(X)$. We deliver a high-quality estimate $p(x)'\hat{\beta}$ of the pseudo-target function $p(x)'\beta_0$ that features (a) a pointwise Gaussian approximation of $p(x_0)'\beta_0$ at a point $x_0$, (b) a simultaneous Gaussian approximation of $p(x)'\beta_0$ uniformly over $x$, and (c) an optimal rate of convergence of $p(x)'\hat{\beta}$ to $p(x)'\beta_0$ uniformly over $x$. When the misspecification error of the linear form decays sufficiently fast, these approximations automatically hold for the target function $g(x)$ rather than for the pseudo-target $p(x)'\beta_0$. The first-stage nuisance parameter $\eta_0$ is allowed to be high-dimensional and is estimated by modern ML tools, such as neural networks, $\ell_1$-shrinkage estimators, and random forests. Using our method, we estimate the average price elasticity conditional on income using the data of \cite{YatchewNo} and provide uniform confidence bands for the target regression function.
\end{abstract}

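As a compact restatement of the two-stage procedure summarized in the abstract, the pseudo-target coefficient and its plug-in estimator can be sketched as follows. The sample size $N$, the observation index $i$, and the notation $Y_i(\hat{\eta})$ for the $i$-th realization of the estimated signal are illustrative assumptions that the abstract does not fix, and any sample splitting used to construct $\hat{\eta}$ is left implicit.
% Sketch only: N, i, and Y_i(\hat{\eta}) are assumed notation, not taken from the abstract.
\[
\beta_0 = \arg\min_{b} \, \Ep \big( g(X) - p(X)'b \big)^2 ,
\]
\[
\hat{\beta} = \Big( \frac{1}{N} \sum_{i=1}^{N} p(X_i) p(X_i)' \Big)^{-1} \frac{1}{N} \sum_{i=1}^{N} p(X_i) \, Y_i(\hat{\eta}) .
\]
Under this notation, the reported estimate of the pseudo-target function at a point $x$ is $p(x)'\hat{\beta}$.
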