\begin{abstract}%
    Node regression consists in predicting the value of a graph label at a node, given observations at the other nodes.
    To gain insight into the performance of various estimators for this task,
    we perform a theoretical study in a setting where the graph is random.
    Specifically, we assume that the graph is generated by a Latent Position Model, where each node of the graph has a latent position, and the probability
    that two nodes are connected depends on the distance between their latent positions.

    In this context, we begin by studying the simplest possible estimator for graph regression,
    which consists in averaging the value of the label over all neighboring nodes.
    We show that in Latent Position Models this estimator tends to a Nadaraya-Watson estimator in the latent space,
    and that its rate of convergence is in fact the same.

    One issue with this standard estimator is that it averages over a region consisting of all neighbors of a node,
    and, depending on the graph model,
    this region may be too large or too small.
    An alternative consists in first estimating the ``true'' distances between the latent positions,
    then injecting these estimated distances into a classical Nadaraya-Watson estimator.
    This enables averaging over regions either smaller or larger than the typical graph neighborhood.
    We show that this method can achieve standard nonparametric rates in certain instances even when
    the graph neighborhood is too large or too small.
    \end{abstract}
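
% Illustrative sketch only: the notation below is assumed for exposition and is not taken from the source.
For concreteness, the estimators described in the abstract can be sketched as follows.
The notation is an assumption made for illustration: $A \in \{0,1\}^{n \times n}$ denotes the adjacency matrix,
$y_j$ the label observed at node $j$, $x_j$ the latent position of node $j$,
$K$ a kernel, $h > 0$ a bandwidth, and $\hat{d}_{ij}$ an estimate of the latent distance $\|x_i - x_j\|$.
The neighbor-averaging estimator at node $i$ and a Nadaraya-Watson estimator in the latent space would then read
\[
    \hat{y}_i^{\mathrm{avg}} = \frac{\sum_{j \neq i} A_{ij}\, y_j}{\sum_{j \neq i} A_{ij}},
    \qquad
    \hat{y}_i^{\mathrm{NW}} = \frac{\sum_{j \neq i} K\bigl(\|x_i - x_j\| / h\bigr)\, y_j}{\sum_{j \neq i} K\bigl(\|x_i - x_j\| / h\bigr)}.
\]
If, as a simplifying assumption, the edges are drawn so that $\Pr(A_{ij} = 1 \mid x_i, x_j) = k\bigl(\|x_i - x_j\|\bigr)$ for some connection function $k$,
then replacing each $A_{ij}$ in $\hat{y}_i^{\mathrm{avg}}$ by its conditional mean gives a Nadaraya-Watson form with $k$ in the role of the kernel,
so that the effective bandwidth is dictated by the graph model rather than chosen freely.
A plug-in variant using estimated distances, in which $h$ can be tuned, would take the form
\[
    \hat{y}_i^{\mathrm{plug}} = \frac{\sum_{j \neq i} K\bigl(\hat{d}_{ij} / h\bigr)\, y_j}{\sum_{j \neq i} K\bigl(\hat{d}_{ij} / h\bigr)}.
\]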