Two Graph Representation Perspectives
When we look at nodes in a graph, there are fundamentally two ways to see them: by their position in the graph, or by their structural role. This distinction is at the heart of our discussion today.
1. Preliminaries
1.1. Adjacency Matrix and Adjacency Tensor
The adjacency matrix $\mathbf{A} \in \mathbb{A}^{n \times n}$ is a square matrix whose elements indicate whether pairs of vertices have a relationship in some appropriate space $\mathbb{A}$.
- In the simplest case, $\mathbb{A} = \{0,1\}$, and $A_{ij} = 1$ if there is a connection from node $i$ to $j$, and otherwise $A_{ij} = 0$ (indicating a non-edge).
- But we also allow $\mathbb{A} \subseteq \mathbb{R}^d$, $d \geq 1$. This makes the adjacency matrix into an adjacency tensor ${\bf A}$.
- Self-loops, i.e., the diagonal entries $\mathbf{A}_{ii}$, encode node attributes (self-relations).
- For an undirected graph, $\mathbf{A}$ is a symmetric matrix ($\mathbf{A}_{ij}=\mathbf{A}_{ji}$). For example, the graph below has adjacency matrix:
$$ \mathbf{A} = \begin{bmatrix} 0 & 1 & 0 & 0\\ 1 & 0 & 1 & 1\\ 0 & 1 & 0 & 1\\ 0 & 1 & 1 & 0 \end{bmatrix}$$
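As a quick sketch in plain NumPy (the same library used in the cells below), we can write this matrix down and check the claims above directly:
import numpy as np
# The example adjacency matrix above, with entries in {0,1} (rows 0..3 stand for v1..v4)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]])
print(np.array_equal(A, A.T))  # True: the graph is undirected, so A is symmetric
print(A.sum(axis=1))           # node degrees: [1 3 2 2]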
1.2. Permutation action on graphs
- Let $\mathbb{S}_n$ denote the permutation group of $n \geq 2$ elements.
- A group action is the application of (a representation of) a group element to some object.
- For instance, the group action "$\circ$" of permutation $\pi \in \mathbb{S}_n$ on the adjacency matrix is denoted $\pi \circ \mathbf{A}$ and jointly permutes the rows and columns of $\mathbf{A}$ according to the permutation $\pi$ (see the sketch after this list).
- We say ${\bf A}$ and $\pi \circ {\bf A}$ are isomorphic graphs, where $\pi \in \mathbb{S}_n$ is an element of the symmetric group on $n$ elements and "$\circ$" is the appropriate action (representation) over the input.
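Here is a minimal NumPy sketch of the action (one common convention; the permutation matrix $P$ below is just one way to realize "$\circ$"):
import numpy as np
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]])
pi = np.array([3, 0, 1, 2])        # a permutation of the node ids 0..3
P = np.eye(4)[pi]                  # the corresponding permutation matrix
A_pi = P @ A @ P.T                 # the action pi o A: rows and columns permuted jointly
print(np.array_equal(A_pi, A[pi][:, pi]))                 # True: same operation via fancy indexing
print(sorted(A.sum(axis=1)) == sorted(A_pi.sum(axis=1)))  # True: e.g., the degree multiset is preserved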
1.3. Node Isomorphism
Definition: Two nodes $u, v \in V$, $u \neq v$, are isomorphic (denoted $ u \simeq v $) if there exists a non-trivial permutation $\pi \in \mathbb{S}_n$ (i.e., not identity), such that $\pi \circ \mathbf{A} = \mathbf{A}$ and $\pi \circ u = v$, $\pi \circ v = u$.
- Where $\mathbb{S}_n$ is the symmetric group
- And $\pi \circ \mathbf{A}$ is the action of the permutation $\pi$ on adjacency matrix $\mathbf{A}$.
Consider this simple example from Zhang et al. (2021):
Here, $v_1 \simeq v_4$ and $v_2 \simeq v_3$. That is, swapping these pairs leaves the graph unchanged.
This symmetry creates a fundamental challenge for learning on graphs.
- If node ids are arbitrary, then isomorphic nodes are indistinguishable.
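Since the figure from Zhang et al. (2021) is not reproduced here, the sketch below uses a 4-node path $v_1 - v_2 - v_3 - v_4$ as a stand-in graph in which these isomorphisms hold, and brute-forces the definition over all permutations:
import itertools
import numpy as np
# Stand-in for the missing figure: a 4-node path (0-indexed below), where v1 ~ v4 and v2 ~ v3
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
def isomorphic_nodes(A, u, v):
    """Brute-force the definition: u ~ v iff some non-identity pi satisfies pi o A = A and swaps u and v."""
    n = A.shape[0]
    for perm in itertools.permutations(range(n)):
        pi = np.array(perm)
        if (pi != np.arange(n)).any() and np.array_equal(A[pi][:, pi], A) and pi[u] == v and pi[v] == u:
            return True
    return False
print(isomorphic_nodes(A, 0, 3))   # True:  v1 ~ v4
print(isomorphic_nodes(A, 1, 2))   # True:  v2 ~ v3
print(isomorphic_nodes(A, 0, 1))   # False: an endpoint is not isomorphic to an interior node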
2. Structural Node Representations
2.1. Structural Node Representation Definition
Node embeddings that are equivariant to permutations of the node ids are called structural node representations (Srinivasan & Ribeiro, 2020).
Assume we have a permutation-equivariant function $\text{GNN}: \mathbb{A}^{n\times n} \to \mathbb{R}^{n \times d}$, $d \geq 1$, that produces a representation for each node $v \in V$ of graph ${\bf A}$.
Let's consider our example graph again.
- As $\text{GNN}$ is equivariant to permutations of the vertex ids:
- That is, $\text{GNN}(\pi \circ {\bf A}) = \pi \circ \text{GNN}({\bf A}).$
- Let $Z_v = \text{GNN}({\bf A})_v$ be the permutation-invariant representation of node $v \in V$.
- In the figure, nodes with the same color have the same representation vector.
Node representations that are invariant to permutations are denoted structural node representations.
- Because nodes that are structurally equivalent in the graph must get the same representation.
To run this code you will need the file graph_plot.py.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import GCNConv
from torch_geometric.data import Data
import numpy as np
# Import our graph plotting library
import graph_plot as gp
gp.set_seed(42)
# Define the connections
connections = [
(7, 6), (7, 8), (8, 9), (9, 10), (10, 11),
(11, 6), (6, 0), (0, 1), (1, 2), (2, 3),
(3, 4), (4, 5), (5, 0), (5, 11)
]
# Create the graph and adjacency matrix
G, adjacency_matrix = gp.create_graph_from_connections(connections)
# Prepare data for PyTorch Geometric
# Create edge_index from connections
edge_index = []
for i, j in connections:
    edge_index.append([i, j])
    edge_index.append([j, i])  # Add both directions for undirected graph
edge_index = torch.tensor(edge_index, dtype=torch.long).t().contiguous()
###########################################
## PAY ATTENTION HERE: No node features
###########################################
# Create constant node features (all ones)
x = torch.ones((12, 1), dtype=torch.float)
# Create PyTorch Geometric Data object
data = Data(x=x, edge_index=edge_index)
# Define a GCN model using PyTorch Geometric
class GCN(nn.Module):
    def __init__(self, in_channels, hidden_channels, out_channels):
        super(GCN, self).__init__()
        self.conv1 = GCNConv(in_channels, hidden_channels)
        self.conv2 = GCNConv(hidden_channels, hidden_channels)
        self.conv3 = GCNConv(hidden_channels, hidden_channels)
        self.conv4 = GCNConv(hidden_channels, out_channels)

    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index)
        x = F.relu(x)
        x = F.dropout(x, p=0.1, training=self.training)
        x = self.conv2(x, edge_index)
        x = F.relu(x)
        x = F.dropout(x, p=0.1, training=self.training)
        x = self.conv3(x, edge_index)
        x = F.relu(x)
        x = F.dropout(x, p=0.1, training=self.training)
        x = self.conv4(x, edge_index)
        return x
# Initialize and run the model
model = GCN(in_channels=1, hidden_channels=8, out_channels=3)
model.eval() # Set to evaluation mode
with torch.no_grad():
    # Forward pass through the GCN to get node embeddings
    node_embeddings = model(data.x, data.edge_index).numpy()
# Get the layout positions
pos = gp.get_layout_positions('custom')
# Plot the graph with GCN embeddings
gp.plot_graph_with_embeddings(G, node_embeddings, pos,
title="Graph with GCN Embeddings (No Features)")
# Plot individual embedding dimensions
gp.plot_embedding_dimensions(G, node_embeddings, pos)
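As a quick numerical check of the claims above (a sketch that reuses model, data, and the objects defined in the cell above), relabeling the nodes with a random permutation should permute the output rows in the same way, and isomorphic nodes, e.g., nodes 0 and 6 (the two hexagons can be swapped), should receive numerically identical embeddings:
with torch.no_grad():
    out = model(data.x, data.edge_index)

    # Relabel the nodes: old id i becomes new id perm[i]; the all-ones features are unaffected
    perm = torch.randperm(12)
    out_perm = model(data.x, perm[data.edge_index])

    # Equivariance: the output rows are permuted exactly like the node ids
    print(torch.allclose(out_perm[perm], out, atol=1e-5))   # expected: True

    # Isomorphic nodes get the same structural representation
    print(torch.allclose(out[0], out[6], atol=1e-5))        # expected: True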
2.2. Expressiveness Limitations of Most-Expressive GNNs
Definition: Most-expressive node-equivariant GNNs
- Let $\text{GNN}_\mathbf{W}$ denote a Graph Neural Network with neural weights $\mathbf{W}$.
- We say $\text{GNN}_\mathbf{W}$ outputs most-expressive node representations if
- $\exists \mathbf{W}$ such that $\forall \mathbf{A}, \mathbf{A}' \in \mathbb{A}^{n \times n}$, $\forall \pi \in \mathbb{S}_n$: $\pi \circ \text{GNN}_\mathbf{W}({\bf A}) = \text{GNN}_\mathbf{W}({\bf A}') \iff \pi \circ {\bf A} = {\bf A}'$.
In other words, two nodes have the same representation if and only if they are structurally indistinguishable under the action of graph automorphisms.
- where $\text{Aut}(G) = \{\pi \in \mathbb{S}_n : \mathbf{A}_{u, v} = \mathbf{A}_{\pi \circ u, \pi \circ v}, \ \forall u, v \in V\}$ is the automorphism group of the graph.
Lemma (Srinivasan & Ribeiro, 2020): If $\text{GNN}_\mathbf{W}$ is a most-expressive node-equivariant GNN, then its node representations are not most expressive for edges.
Why: A most-expressive node-equivariant GNN can perfectly fit $P(Y_v | \mathbf{A},v)$ and $P(Y_u | \mathbf{A},u)$ in training, but unless $P(Y_v, Y_u | \mathbf{A},u,v) = P(Y_u | \mathbf{A},u)P(Y_v | \mathbf{A},v)$, it cannot perfectly fit joint node properties.
Proof: by construction. A minimal numerical illustration follows.
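To make the Lemma concrete, here is a small illustration (not the proof), reusing the GCN class and imports from the cell above. On a 6-cycle every node is isomorphic to every other node, so any node-equivariant GNN with constant input features gives all nodes the same embedding, and a pairwise decoder built only from those embeddings gives the edge (0,1) and the non-edge (0,3) identical scores:
# 6-cycle: 0-1-2-3-4-5-0
cycle_edges = [(i, (i + 1) % 6) for i in range(6)]
ei = torch.tensor([[i, j] for i, j in cycle_edges] +
                  [[j, i] for i, j in cycle_edges], dtype=torch.long).t().contiguous()
x6 = torch.ones((6, 1))

cycle_model = GCN(in_channels=1, hidden_channels=8, out_channels=3)
cycle_model.eval()
with torch.no_grad():
    z = cycle_model(x6, ei)

def score(u, v):
    # Any pairwise decoder that only sees the two node embeddings, e.g., a dot product
    return torch.dot(z[u], z[v]).item()

print(score(0, 1))   # (0,1) IS an edge
print(score(0, 3))   # (0,3) is NOT an edge -- yet the score is identical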
2.3. How to perform link prediction with GNNs?
- Break GNN node-equivariance by adding a special node feature to the "source" node in the link prediction
- Denoted as symmetry-breaking featurization in the literature
- GNN is no longer node-equivariant
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import GCNConv
from torch_geometric.data import Data
import numpy as np
# Import our graph plotting library
import graph_plot as gp
gp.set_seed(42)
# Define the connections
connections = [
(7, 6), (7, 8), (8, 9), (9, 10), (10, 11),
(11, 6), (6, 0), (0, 1), (1, 2), (2, 3),
(3, 4), (4, 5), (5, 0), (5, 11)
]
# Create the graph and adjacency matrix
G, adjacency_matrix = gp.create_graph_from_connections(connections)
# Prepare data for PyTorch Geometric
# Create edge_index from connections
edge_index = []
for i, j in connections:
    edge_index.append([i, j])
    edge_index.append([j, i])  # Add both directions for undirected graph
edge_index = torch.tensor(edge_index, dtype=torch.long).t().contiguous()
###########################################
## PAY ATTENTION HERE: Node 0 will have a different feature to break symmetries
###########################################
# Create node features: column 0 is constant (all ones), column 1 is an indicator
x = torch.ones((12, 2), dtype=torch.float)
x[:, 1] = 0
# Mark node 0 with a unique feature (BREAKS SYMMETRIES)
x[0, 1] = 1
##########################################
# Create PyTorch Geometric Data object
data = Data(x=x, edge_index=edge_index)
# Define a GCN model using PyTorch Geometric
class GCN(nn.Module):
    def __init__(self, in_channels, hidden_channels, out_channels):
        super(GCN, self).__init__()
        self.conv1 = GCNConv(in_channels, hidden_channels)
        self.conv2 = GCNConv(hidden_channels, hidden_channels)
        self.conv3 = GCNConv(hidden_channels, hidden_channels)
        self.conv4 = GCNConv(hidden_channels, out_channels)

    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index)
        x = F.relu(x)
        x = F.dropout(x, p=0.1, training=self.training)
        x = self.conv2(x, edge_index)
        x = F.relu(x)
        x = F.dropout(x, p=0.1, training=self.training)
        x = self.conv3(x, edge_index)
        x = F.relu(x)
        x = F.dropout(x, p=0.1, training=self.training)
        x = self.conv4(x, edge_index)
        return x
# Initialize and run the model
model = GCN(in_channels=2, hidden_channels=8, out_channels=3)
model.eval() # Set to evaluation mode
with torch.no_grad():
    # Forward pass through the GCN to get node embeddings
    node_embeddings = model(data.x, data.edge_index).numpy()
# Get the layout positions
pos = gp.get_layout_positions('custom')
# Plot the graph with GCN embeddings
gp.plot_graph_with_embeddings(G, node_embeddings, pos,
title="Graph with GCN Embeddings (Special Feature on Node \"0\")")
# Plot individual embedding dimensions
gp.plot_embedding_dimensions(G, node_embeddings, pos)
The above approach is known as:
Labeling Trick (Zhang et al., 2021)
- For link prediction between a source node $v \in V$ and other nodes in the graph, we can transform the problem by marking the source node, giving it a unique node feature.
This approach is particularly elegant.
By marking node $v$ with a unique feature, information about this source node propagates through the graph during the GNN message passing algorithm.
Each node's structural embedding now encodes its relationship to the source node.
When we predict the link between nodes $v$ and $u$, we mark $v$ as the source node and examine the GNN representation of $u$.
- This representation is no longer permutation-equivariant (because of the marking of node $v$) but contains information about the relationship between $u$ and $v$, even if $v$ is isomorphic to other nodes.
- This approach is equivariant to pairs of nodes: that is, if two pairs of nodes are "isomorphic" to another two pairs, then this procedure will give them the same pairwise representations.
This approach is widely used, but it is not quite theoretically correct.
- There is one issue (very hard to spot and not that important in practice), can you spot it?
2.4.1. Understanding symmetry-breaking for link prediction
Why do these approaches work? Let us get some deeper intuition:
With standard GNNs, isomorphic nodes $u \simeq v$ will have identical representations: $$\text{GNN}(\mathbf{A})_u = \text{GNN}(\mathbf{A})_v$$
When we add symmetry-breaking features, we transform the node features $X$ to $X'$ such that nodes $u$ and $v$ now have distinct features: $$X'_u \neq X'_v$$
The GNN now processes this augmented graph: $$\text{GNN}(\mathbf{A}, X')_u \neq \text{GNN}(\mathbf{A}, X')_v$$
For link prediction specifically, marking a source node $u$ creates embeddings that are contextual to that node: $$\text{GNN}(\mathbf{A}, X \oplus \mathbb{1}_u),$$ where $\mathbb{1}_u$ is a one-hot encoding of the source node id, and $\oplus$ represents feature concatenation.
The message passing process propagates this source information throughout the graph, creating node representations that encode relationships to the source node.
- This allows us to maintain the power of GNNs for capturing graph structure while adding the ability to create joint representations between two nodes.
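Below is a minimal sketch of this recipe as a reusable function, reusing the GCN class and edge_index from the cells above; pair_representation and the dot-product scorer are illustrative choices (and the model is untrained), not a prescribed API:
def pair_representation(gnn, edge_index, num_nodes, u, v):
    x = torch.ones((num_nodes, 2))
    x[:, 1] = 0
    x[u, 1] = 1                      # X (+) 1_u : the symmetry-breaking mark on the source
    with torch.no_grad():
        z = gnn(x, edge_index)
    return z[u], z[v]                # representations conditioned on u being the source

lp_gnn = GCN(in_channels=2, hidden_channels=8, out_channels=3)
lp_gnn.eval()

# A simple decoder turns the pair representation into a link score
for target in (5, 9):
    zu, zv = pair_representation(lp_gnn, edge_index, 12, u=0, v=target)
    print(target, torch.dot(zu, zv).item())
In practice, the GNN and the decoder would be trained jointly on observed links.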
2.5. Challenges of non-node-equivariant GNNs for node classification/regression
- If the GNN is not node-equivariant, then node classification/regression predictions are sensitive to the permutation $\pi \in \mathbb{S}_n$ of the node ids.
- For $u \simeq v$, we could have $\text{GNN}(\mathbf{A})_u \neq \text{GNN}(\mathbf{A})_v$.
- Consider a downstream classifier $f : \mathbb{R}^d \to \mathbb{Y}$, where $Y_v \in \mathbb{Y}$ is the target label of node $v \in V$.
- Then, $\exists f$ such that $f(\text{GNN}(\mathbf{A})_u) \neq f(\text{GNN}(\mathbf{A})_v)$, i.e., the classifier can give different predictions to isomorphic nodes, which should be treated identically.
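A small sketch of this failure mode, reusing node_embeddings (computed with the symmetry-breaking feature on node 0) from the cell above, and a hypothetical random linear classifier for illustration: nodes 1 and 7 are isomorphic in the unmarked graph (the two hexagons can be swapped), yet they now receive different embeddings, so a downstream classifier can give them different predictions:
z1, z7 = node_embeddings[1], node_embeddings[7]
print(np.allclose(z1, z7))                   # False: the marking of node 0 broke the symmetry

# A (hypothetical) downstream classifier f: a random linear score with a sign threshold
w = np.random.default_rng(0).normal(size=z1.shape)
print(int(w @ z1 > 0), int(w @ z7 > 0))      # the predictions for u ~ v may now disagree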
2.6. All graph tasks require equivariant representations at some level
More precisely: All $k$-node Tasks Require $k$-node Equivariances
Theorem (Task Equivariances) [summary]: The neural networks that can learn tasks over $k \in \{1,...,n\}$ nodes should have distinct equivariances depending on $k$.
Theorem (Task Equivariances): Let $G = (V, E, X)$ be an attributed graph, with $V = \{1,..., n\}$ as nodes, $E$ as edges, $X$ as node and edge attributes. Let $S_k \subseteq V$ be a set of $k$ nodes (w.l.o.g. $S_k = \{1,...,k\}$). Consider a random variable $Y_{S_k}$ encompassing the nodes in $S_k$ that we wish to learn with a neural network $f$ via supervised learning: $P(Y_{S_k} | S_k, G) = f(S_k, G)$. Then, $f$ must be described by two permutation groups: the normal subgroup $\mathbb{S}_k$ that defines the equivariances of $f$ related to the nodes $S_k$ in the task and the normal subgroup $\mathbb{S}_n\backslash\mathbb{S}_k$ which describes the invariance of $f$ to all remaining nodes in the graph.
Ribeiro, "A Mathematical Framework for Graph Foundation Models", in preparation.
Consequences (in the literature):
- GNNs designed for node classification/regression are rarely used for link prediction
- GNNs designed for link prediction are rarely used for node classification/regression
- GNNs designed for higher-order tasks are rarely used in lower-order tasks and vice-versa
3. Positional Node Representations (or, what about eigenvectors?)
3.1. Definition: Positional Node Representations
Definition (Positional node representations; Srinivasan & Ribeiro, 2020): Positional node representations of a graph $G$ are defined as joint samples of random variables $(\mathbf{Z}_i)_{i \in V} | \mathbf{A} \sim p(\cdot | \mathbf{A})$, $\mathbf{Z}_i \in \mathbb{R}^d$, $d \geq 1$, where $p(\cdot | \mathbf{A})$ is a $\mathbb{S}_n$-equivariant probability distribution on $\mathbf{A}$. That is, $$(\mathbf{Z}_1, \ldots, \mathbf{Z}_n) | \mathbf{A} \stackrel{d}{=} (\mathbf{Z}_{\pi 1}, \ldots, \mathbf{Z}_{\pi n}) | (\pi \circ \mathbf{A}), \quad \pi \in \mathbb{S}_n,$$ where $\stackrel{d}{=}$ means equivalent in distribution.
- Positional node representations can be permutation-sensitive.
- Strictly positional node representations are positional node representations that are not also structural node representations. These are permutation sensitive.
Theorem (informal) (Srinivasan & Ribeiro, 2020): The node representations $Z_1,\ldots,Z_n$ of a graph $G$ with $n$ nodes are called positional node representations if they are equivariant in distribution, that is, $\exists f$, s.t. $Z_1,\ldots,Z_n = f(\mathbf{A},\epsilon)$ and $f(\pi \circ \mathbf{A},\epsilon_\pi) = \pi \circ f(\mathbf{A},\epsilon_\pi)$, $\forall \pi \in \mathbb{S}_n$, with $\epsilon_\pi \sim \text{Unif}(0,1)$ that may depend on $\pi$.
3.1.1. A Probabilistic View of Positional Node Representations
Here's a perspective that changed how I think about graph embeddings:
Say we are using a graph whose node ids have been thoroughly shuffled: the adjacency matrix $\mathbf{A}$ we observe is actually the result of shuffling a "canonical" adjacency matrix $\mathbf{A}'$. This means $\mathbf{A} = \pi \circ \mathbf{A}'$, where $\pi \sim \text{Unif}(\mathbb{S}_n)$ is a uniformly random permutation.
A positional node representation is a function of this specific (random) arrangement of the node ids.
- Positional node representations are samples from a joint probability distribution $p(\mathbf{Z}_1, \dots, \mathbf{Z}_n | \mathbf{A}')$, where $\mathbf{Z}_i$ represents the positional representation of node $i$.
- These representations are random because $\pi \sim \text{Unif}(\mathbb{S}_n)$ is random.
3.2. Eigenvectors are positional node representations
Note (Srinivasan & Ribeiro, 2020): Eigenvectors give positional node representations.
E.g., when computing eigenvectors with the Lanczos algorithm, $\epsilon$ is the (random) initial Lanczos vector.
E.g., computing eigenvectors with QR decomposition (e.g., via the Gram–Schmidt process) will output different eigenvectors depending on the order of the nodes in $\mathbf{A}$.
- But if we randomly permute $\mathbf{A}$, we will get distinct eigenvectors from the upper-triangularization.
- The randomness is encoded in the arbitrary order of the nodes in $\mathbf{A}$.
import numpy as np
from scipy import linalg
import matplotlib.pyplot as plt
import networkx as nx
import graph_plot as gp
gp.set_seed(100)
# Define the connections
connections = [
(7, 6), (7, 8), (8, 9), (9, 10), (10, 11),
(11, 6), (6, 0), (0, 1), (1, 2), (2, 3),
(3, 4), (4, 5), (5, 0), (5, 11)
]
# Create the graph and adjacency matrix
G, adjacency_matrix = gp.create_graph_from_connections(connections)
# Compute the degree matrix
degree_matrix = np.diag(np.sum(adjacency_matrix, axis=1))
# Compute the Laplacian matrix (L = D - A)
laplacian_matrix = degree_matrix - adjacency_matrix
# Compute eigenvalues and eigenvectors of the Laplacian
eigenvalues, eigenvectors = linalg.eigh(laplacian_matrix)
# Sort eigenvalues by magnitude and get corresponding eigenvectors
idx = np.abs(eigenvalues).argsort()
eigenvalues = eigenvalues[idx]
eigenvectors = eigenvectors[:, idx]
# Get the last three eigenvectors (top eigenvectors corresponding to largest eigenvalues)
top_eigenvectors = eigenvectors[:, -3:]
# Get the layout positions
pos = gp.get_layout_positions('custom')
# Plot the graph with eigenvector-based colors
gp.plot_graph_with_embeddings(G, top_eigenvectors, pos,
title="Graph with Nodes Colored by Top 3 Eigenvectors")
gp.plot_embedding_dimensions(G, top_eigenvectors, pos)
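A short numerical check of this claim, as a self-contained NumPy sketch that rebuilds the same two-hexagon graph: relabel the nodes at random and recompute the Laplacian eigenvectors; in general they are not simply the row-permuted eigenvectors of the original ordering, since they can differ by sign flips and, within repeated eigenvalues, by basis rotations:
import numpy as np

rng = np.random.default_rng(0)
connections = [(7, 6), (7, 8), (8, 9), (9, 10), (10, 11),
               (11, 6), (6, 0), (0, 1), (1, 2), (2, 3),
               (3, 4), (4, 5), (5, 0), (5, 11)]
n = 12
A = np.zeros((n, n))
for i, j in connections:
    A[i, j] = A[j, i] = 1
L = np.diag(A.sum(axis=1)) - A

_, V = np.linalg.eigh(L)                     # eigenvectors under the original node order

pi = rng.permutation(n)                      # a random relabeling: A -> pi o A
L_pi = L[pi][:, pi]
_, V_pi = np.linalg.eigh(L_pi)

# If eigenvectors were structural, V_pi would equal the row-permuted V.
print(np.allclose(V_pi, V[pi]))                   # typically False (sign flips / basis rotations)
print(np.allclose(np.abs(V_pi), np.abs(V[pi])))   # True only when all eigenvalues are simple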
3.3. Connection between Positional and Structural Node Representations
Now, let's consider the marginal distribution for a single node: $$ p(\mathbf{Z}_v | \mathbf{A}) = \int_{\mathbb{R}^{d}} \cdots \int_{\mathbb{R}^{d}} p(\mathbf{Z}_1, \dots, \mathbf{Z}_n | \mathbf{A}) \, d\mathbf{Z}_{\{1,\dots,n\} \setminus \{v\}} $$ This integral might look intimidating, but it's simply asking: "What's the distribution of all positional representations of node $v$ in isolation?"
This is where the magic happens. For isomorphic nodes $u \simeq v$, we can prove that (Srinivasan & Ribeiro, 2020): $$p(\mathbf{Z}_u|\mathbf{A}) = p(\mathbf{Z}_v|\mathbf{A})$$
- In other words, isomorphic nodes have identical marginal distributions.
- That is, the marginal distribution is permutation-invariant.
- These marginal distributions characterize what we call structural node representations.
- Actually, a most-expressive structural node representation can be seen as sufficient information (sufficient statistics) for sampling from this distribution.
- Analogous to how the (mean, variance) vector $(\mu,\sigma^2)$ characterizes a Normal distribution.
3.3.1. Bag of Eigenvectors
- That is why, for any two isomorphic nodes $u \simeq v$, the set of possible eigenvectors is always the same.
- That is, the distribution of eigenvectors $P(\mathcal{Z}_u | \mathbf{A},u) = P(\mathcal{Z}_v | \mathbf{A},v) \iff u \simeq v$, where $\mathcal{Z}_u$ is the collection of all possible eigenvectors of $u \in V$.
- Corollary: A most-expressive GNN can fit $P(\mathcal{Z}_u | \mathbf{A},u)$.
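A small empirical sketch of this claim in plain NumPy: on a 6-cycle (where every node is isomorphic to every other node), repeatedly shuffle the node order, take one eigenvector of the shuffled adjacency matrix as a positional sample, and map each coordinate back to its canonical node; a crude summary of each node's marginal distribution comes out (approximately) identical, as predicted:
import numpy as np

rng = np.random.default_rng(0)
n = 6
A_canon = np.zeros((n, n))
for i in range(n):                            # canonical 6-cycle: all nodes are isomorphic
    A_canon[i, (i + 1) % n] = A_canon[(i + 1) % n, i] = 1

samples = {u: [] for u in range(n)}
for _ in range(2000):
    pi = rng.permutation(n)                   # observed graph: A = pi o A'
    A = A_canon[pi][:, pi]
    _, V = np.linalg.eigh(A)
    z = V[:, -2]                              # one (arbitrary) eigenvector of the shuffled graph
    inv = np.argsort(pi)
    for u in range(n):
        samples[u].append(z[inv[u]])          # the coordinate that landed on canonical node u

# A crude summary of each node's marginal distribution: all six values are ~equal
print([round(float(np.mean(np.abs(samples[u]))), 3) for u in range(n)])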
3.3.2. An Analogy: Bivariate Normals
To understand the limitations of marginal distributions, consider a bivariate normal:
$$(X_1, X_2) \sim \mathcal{N}\left(\boldsymbol{\mu}, \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix}\right)$$
The marginals are simply $X_1 \sim \mathcal{N}(\mu_1, 1)$ and $X_2 \sim \mathcal{N}(\mu_2, 1)$.
But if you observe only samples from the marginals, you cannot estimate joint characteristics of $(X_1,X_2)$, such as the correlation parameter $\rho$. This parameter exists only in the joint distribution.
In graph terms, this means structural node representations (parameters of the marginals) cannot capture relational properties that exist only in the joint distribution of node properties, such as predicting whether two nodes share an edge.
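A quick sketch of the analogy in NumPy: two bivariate normals with identical marginals but different $\rho$ cannot be told apart from marginal statistics alone; only the joint samples reveal $\rho$:
import numpy as np

rng = np.random.default_rng(0)
for rho in (0.0, 0.9):
    cov = np.array([[1.0, rho], [rho, 1.0]])
    x = rng.multivariate_normal(np.zeros(2), cov, size=50_000)
    print(f"rho={rho}: marginal means={x.mean(axis=0).round(2)}, "
          f"marginal stds={x.std(axis=0).round(2)}, "
          f"sample corr={np.corrcoef(x[:, 0], x[:, 1])[0, 1]:.2f}")
# The marginal statistics match across the two settings; only the joint reveals rho --
# just as structural (marginal) representations cannot reveal joint properties such as edges.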
3.3.3. Sign and Basis Invariant Neural Networks
Sign and Basis Invariant Networks (Lim et al., 2022) address the challenges of eigenvector sign and basis sensitivity through:
- Invariant Representations: Design neural network architectures that are invariant to sign flips and basis rotations in the eigenvectors.
- Equivariant Message Passing: Implement message passing operations that respect these invariances.
- Absolute Value and Squared Features: Using transformations of eigenvectors that are naturally invariant to sign, such as absolute values or squared features.
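A minimal toy sketch of the last idea in NumPy (the encoder phi and its weights W below are illustrative, not taken from Lim et al., 2022): features such as $|v|$ and $v^2$, or the SignNet-style combination $\phi(v) + \phi(-v)$, are unchanged when the eigensolver returns $-v$ instead of $v$:
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(1, 4))                # toy encoder weights (illustrative only)

v = rng.normal(size=8)                     # a toy eigenvector ...
v_flip = -v                                # ... and the equally valid sign-flipped copy

# (a) Elementwise sign-invariant features
print(np.allclose(np.abs(v), np.abs(v_flip)), np.allclose(v**2, v_flip**2))   # True True

# (b) SignNet-style invariance: phi(v) + phi(-v) is unchanged under a sign flip
phi = lambda x: np.tanh(x[:, None] @ W)    # per-node features, shape (8, 4)
f = lambda x: (phi(x) + phi(-x)).sum(axis=1)
print(np.allclose(f(v), f(v_flip)))        # True: the sign information is removed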
But a significant challenge remains: generalizing between different graphs.
Positional embeddings are specific to the topology of each graph.
The "meaning" of a particular position in one graph may not translate well to another graph.
This makes inductive and transfer learning tasks particularly challenging.
For example, the second eigenvector of a line graph is very different from the second eigenvector of a star graph, making it difficult to transfer knowledge between these structures.
- On the other hand, a GNN could learn parameters that create similar node representations for line and star graphs if the predictions need to be the same.
4. Advanced Topics: Graph Foundation Models
Can we pre-train a single graph model on all tasks and on different graph domains?
4.1. Holographic Node Representations (Universal Any-order Representations)
Beatrice Bevilacqua, Joshua Robinson, Jure Leskovec, Bruno Ribeiro, Holographic Node Representations: Pre-training Task-Agnostic Node Embeddings, ICLR 2025.
4.2. Double-Equivariance for Knowledge Graphs
Jincheng Zhou, Yucheng Zhang, Jianfei Gao, Yangze Zhou, Bruno Ribeiro, Double Equivariance for Inductive Link Prediction for Both New Nodes and New Relation Types, arXiv 2023
Corollary (Srinivasan & Ribeiro, 2020) (informal): While most-expressive structural node representations of individual nodes are not sufficient to perform link prediction, most-expressive joint structural representations (of a pair of nodes) are sufficient.
References
Srinivasan, B., & Ribeiro, B. (2020). On the Equivalence Between Positional Node Embeddings and Structural Graph Representations. ICLR 2020.
Zhang, M., Li, P., Xia, Y., Wang, K., & Jin, L. (2021). Labeling Trick: A Theory of Using Graph Neural Networks for Multi-Node Representation Learning. Advances in Neural Information Processing Systems (NeurIPS 2021).
Xu, K., Hu, W., Leskovec, J., & Jegelka, S. (2019). How Powerful are Graph Neural Networks? ICLR 2019.
Lim, D., Robinson, J., Zhao, L., Smidt, T., Sra, S., Maron, H., & Jegelka, S. (2022). Sign and Basis Invariant Networks for Spectral Graph Representation Learning. arXiv preprint arXiv:2202.13013.
Zhou, J., Zhang, Y., Gao, J., Zhou, Y., & Ribeiro, B. (2023). Double Equivariance for Inductive Link Prediction for Both New Nodes and New Relation Types. arXiv preprint.
Bevilacqua, B., Robinson, J., Leskovec, J., & Ribeiro, B. (2025). Holographic Node Representations: Pre-training Task-Agnostic Node Embeddings. ICLR 2025.