\begin{abstract}%
Node regression consists in predicting the value of a graph label at a node, given observations at the other nodes.
To gain some insight into the performance of various estimators for this task,
we perform a theoretical study in a context where the graph is random.
Specifically, we assume that the graph is generated by a Latent Position Model, where each node of the graph has a latent position, and the probability
that two nodes are connected depends on the distance between the latent positions of the two nodes.

In this context, we begin by studying the simplest possible estimator for graph regression,
which consists in averaging the value of the label at all neighboring nodes.
We show that in Latent Position Models this estimator tends to a Nadaraya-Watson estimator in the latent space,
and that its rate of convergence is in fact the same.

One issue with this standard estimator is that it averages over a region consisting of all neighbors of a node,
and that, depending on the graph model,
this region may be too large or too small.
An alternative consists in first estimating the ``true'' distances between the latent positions,
then injecting these estimated distances into a classical Nadaraya-Watson estimator.
This enables averaging in regions either smaller or larger than the typical graph neighborhood.
We show that this method can achieve standard nonparametric rates in certain instances even when
the graph neighborhood is too large or too small.
\end{abstract}