7e5480ec27b60579.tex
1: \begin{abstract}%
2:     Node regression consists in predicting the value of a graph label at a node, given observations at the other nodes. 
3:     To gain some insight into the performance of various estimators for this task, 
4:     we perform a theoretical study in a context where the graph is random. 
5:     Specifically, we assume that the graph is generated by a Latent Position Model, where each node of the graph has a latent position, and the probability 
6:     that two nodes are connected depends on the distance between the latent positions of the two nodes. 
7: 
8:     In this context, we begin by studying the simplest possible estimator for node regression, 
9:     which consists in averaging the value of the label at all neighboring nodes. 
10:     We show that in Latent Position Models this estimator tends to a Nadaraya-Watson estimator in the latent space, 
11:     and that it attains the same rate of convergence as that estimator. 
12: 
13:     One issue with this standard estimator is that it averages over a region consisting of all neighbors of a node, 
13:     and, depending on the graph model, 
14:     this region may be too large or too small. 
16:     An alternative consists in first estimating the ``true'' distances between the latent positions, 
17:     then injecting these estimated distances into a classical Nadaraya-Watson estimator. 
18:     This enables averaging in regions either smaller or larger than the typical graph neighborhood.  
19:     We show that this method can achieve standard nonparametric rates in certain instances even when 
20:     the graph neighborhood is too large or too small. 
21:     \end{abstract}
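
% Illustrative sketch only. All notation below is assumed for exposition and is not taken from the abstract.
To fix ideas, the three estimators mentioned in the abstract can be sketched as follows. The notation is an assumption made here for illustration: $a_{ij} \in \{0,1\}$ denotes the adjacency indicator of nodes $i$ and $j$, $y_j$ the label observed at node $j$, $x_j$ the latent position of node $j$, $k$ the link function of the Latent Position Model, $K$ a smoothing kernel, $h > 0$ a bandwidth, and $\widehat{d}_{ij}$ an estimate of the latent distance $\|x_i - x_j\|$.

Under a Latent Position Model of this form, edges are drawn with probability
\[
    \Pr\left(a_{ij} = 1 \mid x_i, x_j\right) = k\left(\|x_i - x_j\|\right),
\]
and the neighbor-averaging estimator at node $i$ reads
\[
    \widehat{f}_{\mathrm{graph}}(i) = \frac{\sum_{j \neq i} a_{ij}\, y_j}{\sum_{j \neq i} a_{ij}} .
\]
Replacing each $a_{ij}$ by its conditional expectation $k(\|x_i - x_j\|)$ gives a weighted average of Nadaraya-Watson type in the latent space, with $k$ playing the role of the kernel:
\[
    \widehat{f}_{\mathrm{NW}}(x_i) = \frac{\sum_{j \neq i} k\left(\|x_i - x_j\|\right) y_j}{\sum_{j \neq i} k\left(\|x_i - x_j\|\right)} .
\]
The alternative based on estimated distances would then take the form
\[
    \widehat{f}_{h}(i) = \frac{\sum_{j \neq i} K\left(\widehat{d}_{ij}/h\right) y_j}{\sum_{j \neq i} K\left(\widehat{d}_{ij}/h\right)} ,
\]
where the bandwidth $h$ can be chosen independently of the typical size of a graph neighborhood.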
22: