topobench.transforms.data_manipulations package#

Data manipulations module with automated exports.

class AddGPSEInformation(**kwargs)#

Bases: BaseTransform

A transform that uses PyG 2.7’s pretrained GPSE to add positional and structural information to the graph.

Parameters:
**kwargsoptional

Parameters for the transform.

__init__(**kwargs)#
aggregate_inter_nbhd(x_out_per_route)#

Aggregate the outputs of the GNN for each rank.

While the GNN takes care of intra-nbhd aggregation, this will take care of inter-nbhd aggregation. Default: sum.

Parameters:
x_out_per_routedict

The outputs of the GNN for each route.

Returns:
dict

The aggregated outputs of the GNN for each rank.

forward(data)#

Apply the transform to the input data.

Parameters:
datatorch_geometric.data.Data

The input data.

Returns:
torch_geometric.data.Data

The transformed data.

forward_interank(src_rank, dst_rank, nbhd_cache, data)#

Forward for cells where src_rank!=dst_rank.

Parameters:
src_rankint

Source rank of the transmitting cell.

dst_rankint

Destination rank of the transmitting cell.

nbhd_cachedict

Cache of the neighbourhood information.

datatorch_geometric.data.Data

The input data.

Returns:
data

The data object with messages passed.

forward_intrarank(src_rank, route_index, data)#

Forward for cells where src_rank==dst_rank.

Parameters:
src_rankint

Source rank of the transmitting cell.

route_indexint

The index of this particular message passing route.

datatorch_geometric.data.Data

The input data.

Returns:
data

The data object with messages passed.

get_nbhd_cache(params)#

Cache the nbhd information into a dict for the complex at hand.

Parameters:
paramsdict

The parameters of the batch, containing the complex.

Returns:
dict

The neighborhood cache.

interrank_boundary_index(boundary_index, n_dst_nodes)#

Recover lifted graph.

Edge-to-node boundary relationships of a graph with n_nodes and n_edges can be represented as up-adjacency node relations. There are n_nodes+n_edges nodes in this lifted graph. Designed to work for regular (edge-to-node and face-to-edge) boundary relationships.

Parameters:
x_srctorch.tensor

Source node features. Shape [n_src_nodes, n_features]. Should represent edge or face features.

boundary_indexlist of lists or list of tensors

List boundary_index[0] stores node ids in the boundary of edge stored in boundary_index[1]. List boundary_index[1] stores list of edges.

n_dst_nodesint

Number of destination nodes.

Returns:
edge_indexlist of lists

The edge_index[0][i] and edge_index[1][i] are the two nodes of edge i.

edge_attrtensor

Edge features are given by the features of the bounding node representing an edge. Shape [n_edges, n_features].
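As a rough illustration of the lifting described above, the sketch below uses plain Python lists instead of tensors. It assumes destination cells keep ids 0..n_dst_nodes-1 and source cell j receives id n_dst_nodes + j. The helper name lift_boundary_index and this id layout are assumptions made for illustration, not the library's implementation.

```python
def lift_boundary_index(boundary_index, n_dst_nodes):
    """Turn a boundary relation into the edges of a lifted graph.

    boundary_index[0][i] is a destination cell (e.g. a node) lying in the
    boundary of the source cell boundary_index[1][i] (e.g. an edge).
    Assumption: source cell j becomes lifted node n_dst_nodes + j.
    """
    dst_ids, src_ids = boundary_index
    lifted_src = [n_dst_nodes + j for j in src_ids]
    # One lifted edge per (source cell, boundary cell) incidence.
    return [lifted_src, list(dst_ids)]


# Path graph 0 - 1 - 2 with edges e0 = (0, 1) and e1 = (1, 2).
boundary_index = [[0, 1, 1, 2], [0, 0, 1, 1]]
edge_index = lift_boundary_index(boundary_index, n_dst_nodes=3)
n_lifted_nodes = 3 + 2  # n_nodes + n_edges
```

Here the two original edges become lifted nodes 3 and 4, each connected to its two endpoint nodes.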

interrank_expand(params, src_rank, dst_rank, nbhd_cache)#

Expand the complex into an interrank Hasse graph.

Parameters:
paramsdict

The parameters of the batch, containing the complex.

src_rankint

The source rank.

dst_rankint

The destination rank.

nbhd_cachedict

The neighborhood cache containing the expanded boundary index and edge attributes.

Returns:
torch_geometric.data.Data

The expanded batch of interrank Hasse graphs for this route.

intrarank_expand(params, src_rank, nbhd)#

Expand the complex into an intrarank Hasse graph.

Parameters:
paramsdict

The parameters of the batch, containing the complex.

src_rankint

The source rank.

nbhdstr

The neighborhood to use.

Returns:
torch_geometric.data.Data

The expanded batch of intrarank Hasse graphs for this route.

class BarycentricSubdivisionTransform(**kwargs)#

Bases: BaseTransform

A transform that performs barycentric subdivision on a simplicial complex.

The barycentric subdivision of a simplicial complex K is a new simplicial complex Sd(K) where each simplex in K is replaced by a collection of simplices, resulting in a finer triangulation of the underlying space.

Parameters:
**kwargsoptional

Parameters for the base transform.

__init__(**kwargs)#
forward(data)#

Apply the barycentric subdivision to the input data.

Parameters:
datatorch_geometric.data.Data

The input data, expected to contain a simplicial complex.

Returns:
torch_geometric.data.Data

The data with a subdivided simplicial complex.
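To make the definition of Sd(K) concrete, here is a minimal pure-Python sketch. It relies on the standard characterisation that the vertices of Sd(K) are the faces of K and its simplices are chains of faces ordered by strict inclusion. The function name and the tuple-based representation are illustrative assumptions and say nothing about how the transform stores the complex.

```python
from itertools import permutations


def barycentric_subdivision(maximal_simplices):
    """Return the maximal simplices of Sd(K) as chains of faces of K.

    Each maximal chain inside a maximal simplex is obtained by adding
    its vertices one at a time, so every ordering of the vertices
    yields exactly one chain.
    """
    subdivided = set()
    for simplex in maximal_simplices:
        for order in permutations(sorted(simplex)):
            # Build the chain {v0} < {v0, v1} < ... < full simplex.
            chain = tuple(
                tuple(sorted(order[: i + 1])) for i in range(len(order))
            )
            subdivided.add(chain)
    return sorted(subdivided)


triangle = [(0, 1, 2)]
sd = barycentric_subdivision(triangle)
# Vertices of Sd(K): every face of K that appears in some chain.
new_vertices = {face for chain in sd for face in chain}
```

For a single triangle this gives six smaller triangles on seven vertices (three original vertices, three edge barycentres, one face barycentre).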

class CalculateSimplicialCurvature(**kwargs)#

Bases: BaseTransform

A transform that calculates the simplicial curvature of the input graph.

Parameters:
**kwargsoptional

Parameters for the transform.

__init__(**kwargs)#
forward(data)#

Apply the transform to the input data.

Parameters:
datatorch_geometric.data.Data

The input data.

Returns:
torch_geometric.data.Data

The transformed data.

one_cell_curvature(data)#

Calculate the one cell curvature of the input data.

Parameters:
datatorch_geometric.data.Data

The input data.

Returns:
torch_geometric.data.Data

Data with the one cell curvature.

two_cell_curvature(data)#

Calculate the two cell curvature of the input data.

Parameters:
datatorch_geometric.data.Data

The input data.

Returns:
torch_geometric.data.Data

Data with the two cell curvature.

zero_cell_curvature(data)#

Calculate the zero cell curvature of the input data.

Parameters:
datatorch_geometric.data.Data

The input data.

Returns:
torch_geometric.data.Data

Data with the zero cell curvature.

class CombinedEncodings(encodings, parameters=None, **kwargs)#

Bases: BaseTransform

Combined Encodings transform.

Applies both Feature Encodings (FEs) and Positional/Structural Encodings (PSEs) to a graph. FEs are applied first since they use data.x as input, while PSEs only use graph structure.

Supported Feature Encodings (FEs):
  • “HKFE”: Heat Kernel Feature Encoding

  • “KHopFE”: K-hop Feature Encoding

  • “SheafConnLapPE”: Sheaf Connection Laplacian Positional Encoding

Supported Positional/Structural Encodings (PSEs):
  • “LapPE”: Laplacian Positional Encoding

  • “RWSE”: Random Walk Structural Encoding

  • “ElectrostaticPE”: Electrostatic Positional Encoding

  • “HKdiagSE”: Heat Kernel Diagonal Structural Encoding

Parameters:
encodingslist of str

List of encodings to apply. Can include any mix of FEs and PSEs. FEs will always be applied before PSEs regardless of order in list.

parametersdict, optional

Parameters for each encoding, keyed by encoding name.

**kwargsdict, optional

Additional keyword arguments.

__init__(encodings, parameters=None, **kwargs)#
forward(data)#

Apply the transform to the input data.

FEs are applied first (they use data.x as input), then PSEs (they only use graph structure).

Parameters:
datatorch_geometric.data.Data

The input data.

Returns:
torch_geometric.data.Data

The transformed data with added encodings.
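The ordering rule (FEs first, then PSEs, regardless of list order) can be sketched as follows. The name sets are copied from the lists above. The helper split_encodings and the choice to raise on unknown names are assumptions for illustration only.

```python
FE_NAMES = {"HKFE", "KHopFE", "SheafConnLapPE"}
PSE_NAMES = {"LapPE", "RWSE", "ElectrostaticPE", "HKdiagSE"}


def split_encodings(encodings):
    """Split a mixed list into (FEs, PSEs), preserving relative order."""
    unknown = [e for e in encodings if e not in FE_NAMES | PSE_NAMES]
    if unknown:
        # Assumption: unsupported names are rejected up front.
        raise ValueError(f"Unsupported encodings: {unknown}")
    fes = [e for e in encodings if e in FE_NAMES]
    pses = [e for e in encodings if e in PSE_NAMES]
    return fes, pses


fes, pses = split_encodings(["LapPE", "HKFE", "RWSE", "KHopFE"])
# FEs run first because they read data.x; PSEs only use graph structure.
application_order = fes + pses
```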

class CombinedFEs(encodings, parameters=None, **kwargs)#

Bases: BaseTransform

Combined FEs transform.

Applies one or more pre-defined feature encoding transforms (KHopFE, HKFE, SheafConnLapPE, PPRFE) to a graph, storing their outputs and optionally concatenating them to data.x.

Parameters:
encodingslist of str

List of feature encodings to apply. Supported values are “KHopFE” for K-hop Feature Encoding, “HKFE” for Heat Kernel Feature Encoding, “SheafConnLapPE” for Sheaf Connection Laplacian Positional Encoding, and “PPRFE” for Personalized Page Rank Feature Encoding.

parametersdict, optional

Additional parameters for the encoding transforms.

**kwargsdict, optional

Additional keyword arguments.

__init__(encodings, parameters=None, **kwargs)#
forward(data)#

Apply the transform to the input data.

All encodings are computed on the original features first, then those with concat_to_x=True are concatenated at the end. This ensures each encoding sees the original features, not features modified by previous encodings.

Parameters:
datatorch_geometric.data.Data

The input data.

Returns:
torch_geometric.data.Data

The transformed data with added feature encodings.
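The compute-then-concatenate behaviour described for forward can be pictured with a small stand-alone sketch. Plain lists stand in for tensors, and the two toy encoders, the helper name, and the concat_flags dictionary are hypothetical stand-ins for the real transforms and their concat_to_x settings.

```python
def apply_feature_encodings(x, encoders, concat_flags):
    """Compute every encoding from the original x, then concatenate.

    x: list of per-node feature rows.
    encoders: dict mapping a name to a function on feature rows.
    concat_flags: dict mapping a name to a bool (stand-in for concat_to_x).
    """
    # Step 1: every encoder sees the same, unmodified features.
    outputs = {name: fn(x) for name, fn in encoders.items()}
    # Step 2: append the flagged encodings column-wise at the end.
    new_x = [list(row) for row in x]
    for name, enc in outputs.items():
        if concat_flags.get(name, True):
            for row, extra in zip(new_x, enc):
                row.extend(extra)
    return new_x, outputs


x = [[1.0, 2.0], [3.0, 4.0]]
encoders = {
    "row_sum": lambda feats: [[sum(r)] for r in feats],
    "width": lambda feats: [[float(len(r))] for r in feats],
}
new_x, outputs = apply_feature_encodings(
    x, encoders, {"row_sum": True, "width": True}
)
```

The "width" encoder reports 2.0 for every node even though "row_sum" is concatenated as well, because both were computed before any concatenation took place.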

class CombinedPSEs(encodings, parameters=None, preprocessor_device=None, **kwargs)#

Bases: BaseTransform

Combined PSEs transform.

Applies one or more pre-defined positional or structural encoding transforms (LapPE, RWSE, ElectrostaticPE, HKdiagSE) to a graph, storing their outputs and optionally concatenating them to data.x.

Parameters:
encodingslist of str

List of structural encodings to apply. Supported values are “LapPE”, “RWSE”, “ElectrostaticPE”, and “HKdiagSE”.

parametersdict, optional

Additional parameters for the encoding transforms.

preprocessor_devicestr, optional

The overarching device to use for the combined transforms (e.g., ‘cpu’, ‘cuda’). If a specific encoding specifies its own device in parameters, that will take precedence. Default is None.

**kwargsdict, optional

Additional keyword arguments.

__init__(encodings, parameters=None, preprocessor_device=None, **kwargs)#
forward(data)#

Apply the transform to the input data.

Parameters:
datatorch_geometric.data.Data

The input data.

Returns:
torch_geometric.data.Data

The transformed data with added structural encodings.
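The device precedence rule for preprocessor_device might be resolved along these lines. This sketch assumes that parameters is keyed by encoding name and that a per-encoding override lives under a "device" key. Both are assumptions, as is the helper name resolve_device.

```python
def resolve_device(encoding, parameters, preprocessor_device=None):
    """Pick the device for one encoding.

    A "device" entry in that encoding's own parameters wins; otherwise
    the overarching preprocessor_device is used (which may be None).
    """
    own = (parameters or {}).get(encoding, {})
    return own.get("device", preprocessor_device)


parameters = {
    "LapPE": {"max_pe_dim": 8, "device": "cpu"},  # explicit override
    "RWSE": {"max_pe_dim": 16},                   # falls back
}
devices = {
    name: resolve_device(name, parameters, preprocessor_device="cuda")
    for name in ("LapPE", "RWSE")
}
```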

class DifferentGausFeatures(**kwargs)#

Bases: BaseTransform

A transform that generates different Gaussian features for all nodes.

Parameters:
**kwargsoptional

Additional arguments for the class. It should contain the following keys:
  • mean (float): The mean of the Gaussian distribution.

  • std (float): The standard deviation of the Gaussian distribution.

  • num_features (int): The number of features to generate. Defaults to -1, in which case the initial feature vector shape is used.

  • dimensions (list): The dimension numbers to generate features for.

__init__(**kwargs)#
forward(data)#

Apply the transform to the input data.

Parameters:
datatorch_geometric.data.Data

The input data.

Returns:
torch_geometric.data.Data

The transformed data.

class DifferentGausFeaturesSANN(**kwargs)#

Bases: BaseTransform

A transform that generates different Gaussian features for all nodes.

Parameters:
**kwargsoptional

Additional arguments for the class. It should contain the following keys:
  • mean (float): The mean of the Gaussian distribution.

  • std (float): The standard deviation of the Gaussian distribution.

  • num_features (int): The number of features to generate. Defaults to -1, in which case the initial feature vector shape is used.

  • dimensions (list): The dimension numbers to generate features for.

__init__(**kwargs)#
forward(data)#

Apply the transform to the input data.

Parameters:
datatorch_geometric.data.Data

The input data.

Returns:
torch_geometric.data.Data

The transformed data.

class DifferentZeroFeaturesSANN(**kwargs)#

Bases: BaseTransform

A transform that generates different Gaussian features for all nodes.

Parameters:
**kwargsoptional

Additional arguments for the class. It should contain the following keys:
  • mean (float): The mean of the Gaussian distribution.

  • std (float): The standard deviation of the Gaussian distribution.

  • num_features (int): The number of features to generate. Defaults to -1, in which case the initial feature vector shape is used.

  • dimensions (list): The dimension numbers to generate features for.

__init__(**kwargs)#
forward(data)#

Apply the transform to the input data.

Parameters:
datatorch_geometric.data.Data

The input data.

Returns:
torch_geometric.data.Data

The transformed data.

class ElectrostaticPE(concat_to_x=True, eps=1e-06, method='numpy', debug=False, **kwargs)#

Bases: BaseTransform

Electrostatic Positional Encoding (ElectrostaticPE) transform.

Parameters:
concat_to_xbool, optional

If True, concatenates the encodings with existing node features. Default is True.

epsfloat, optional

Small value to avoid division by zero. Default is 1e-6.

methodstr, optional

Computation method: “numpy” (CPU NumPy) or “gpu” (GPU PyTorch). Default is “numpy”.

debugbool, optional

If True, runs both methods and compares outputs. Default is False.

**kwargsdict

Additional arguments (not used).

__init__(concat_to_x=True, eps=1e-06, method='numpy', debug=False, **kwargs)#
forward(data)#

Compute the electrostatic positional encodings for the input graph.

Parameters:
datatorch_geometric.data.Data

Input graph data object.

Returns:
torch_geometric.data.Data

Graph data object with electrostatic positional encodings added.

class EqualGausFeatures(**kwargs)#

Bases: BaseTransform

A transform that generates equal Gaussian features for all nodes.

Parameters:
**kwargsoptional

Additional arguments for the class. It should contain the following keys:
  • mean (float): The mean of the Gaussian distribution.

  • std (float): The standard deviation of the Gaussian distribution.

  • num_features (int): The number of features to generate.

__init__(**kwargs)#
forward(data)#

Apply the transform to the input data.

Parameters:
datatorch_geometric.data.Data

The input data.

Returns:
torch_geometric.data.Data

The transformed data.
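One plausible reading of "equal" versus "different" Gaussian features is sketched below with the standard library: the equal variant draws a single vector and repeats it for every node, while the different variant draws a fresh vector per node. The function names, the seed argument, and this interpretation are assumptions for illustration.

```python
import random


def equal_gaus_features(num_nodes, mean, std, num_features, seed=0):
    """Draw one Gaussian vector and repeat it for every node."""
    rng = random.Random(seed)
    shared = [rng.gauss(mean, std) for _ in range(num_features)]
    return [list(shared) for _ in range(num_nodes)]


def different_gaus_features(num_nodes, mean, std, num_features, seed=0):
    """Draw an independent Gaussian vector for each node."""
    rng = random.Random(seed)
    return [
        [rng.gauss(mean, std) for _ in range(num_features)]
        for _ in range(num_nodes)
    ]


equal = equal_gaus_features(num_nodes=3, mean=0.0, std=1.0, num_features=4)
different = different_gaus_features(
    num_nodes=3, mean=0.0, std=1.0, num_features=4
)
```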

class GroupCombinatorialHomophily(**kwargs)#

Bases: BaseTransform

Calculates group combinatorial homophily of the input hypergraph.

This transformation implements the methodology from the paper: “Combinatorial Characterizations and Impossibilities for Higher-order Homophily”. It computes homophily metrics for hypergraphs by analyzing the relationship between node labels within hyperedges.

Parameters:
**kwargsdict, optional

Additional parameters for the transform.
  • top_k (int, default=3): Number of top hyperedge cardinalities to analyze.

Attributes:
typestr

Identifier for the transform type.

top_kint

Number of top hyperedge cardinalities to analyze.

__init__(**kwargs)#
calculate_D_matrix(H, labels, he_cardinalities, unique_labels, class_node_idxs)#

Calculate the degree matrices D and D_t for the hypergraph.

Parameters:
Htorch.Tensor

Dense incidence matrix of the hypergraph.

labelstorch.Tensor

Node labels.

he_cardinalitiestorch.Tensor

Cardinality of each hyperedge.

unique_labelsdict

Dictionary mapping labels to their counts.

class_node_idxsdict

Dictionary mapping labels to node indices.

Returns:
tuple[torch.Tensor, torch.Tensor]
  • D_t_class : Type-t degree distribution matrix for each class

  • D : Degree matrix counting same-label nodes in hyperedges

calculate_affinity_score(n_nodes, X_mod, t, k)#

Calculate affinity score.

Parameters:
n_nodesint

Total number of nodes.

X_modint

Total number of nodes in a class.

tint

Type-t degree.

kint

Max hyperedge cardinality.

Returns:
torch.Tensor

The affinity matrix.

calculate_baseline_matrix(he_cardinalities, unique_labels, class_node_idxs, count_labels, n_nodes)#

Calculate the baseline affinity matrix for comparison.

Parameters:
he_cardinalitiestorch.Tensor

Cardinality of each hyperedge.

unique_labelsdict

Dictionary mapping labels to their counts.

class_node_idxsdict

Dictionary mapping labels to node indices.

count_labelstorch.Tensor

Count of nodes for each label.

n_nodesint

Total number of nodes in the hypergraph.

Returns:
torch.Tensor

Baseline matrix containing expected affinity scores for each class and degree type.

forward(data)#

Apply the transform to the input data.

Parameters:
datatorch_geometric.data.Data

The input data.

Returns:
torch_geometric.data.Data

The transformed data.

class HKFE(kernel_param_HKFE, concat_to_x=True, aggregation='mean', method='approx', cheb_order=10, debug=False, **kwargs)#

Bases: BaseTransform

Heat Kernel Feature Encodings (HKFE) transform.

Parameters:
kernel_param_HKFEtuple of int

Tuple specifying the start and end diffusion times for the heat kernel.

concat_to_xbool, optional

If True, concatenates encodings with existing node features in data.x. Default is True.

aggregationstr, optional

Aggregation function to reduce over the feature dimension. Options: “mean”, “sum”, “max”, “min”. Default is “mean”.

methodstr, optional

Computation method: “exact” or “approx”. Default is “approx”.

cheb_orderint, optional

The order of the Chebyshev polynomial. Default is 10.

debugbool, optional

If True, runs both exact and approx methods, compares their outputs, and prints the timing and error metrics. Default is False.

**kwargsdict

Additional arguments (not used).

__init__(kernel_param_HKFE, concat_to_x=True, aggregation='mean', method='approx', cheb_order=10, debug=False, **kwargs)#
forward(data)#

Compute the HKFE for the input graph.

Parameters:
datatorch_geometric.data.Data

Input graph data object.

Returns:
torch_geometric.data.Data

Graph data object with HKFE added to data.x or data.HKFE.

class HKdiagSE(kernel_param_HKdiagSE, space_dim=0, include_eigenvalues=False, include_first=False, concat_to_x=True, method='fast', debug=False, **kwargs)#

Bases: BaseTransform

Heat Kernel Diagonal Structural Encoding (HKdiagSE) transform.

Parameters:
kernel_param_HKdiagSEtuple of int

Tuple specifying the start and end diffusion times for the heat kernel.

space_dimint, optional

Estimated dimensionality of the space. Used to correct the diffusion diagonal by a factor t^(space_dim/2). Default is 0 (no correction).

include_eigenvaluesbool, optional

If True, concatenates eigenvalues alongside eigenvectors. Default is False.

include_firstbool, optional

If False, removes eigenvectors corresponding to (near-)zero eigenvalues. Default is False.

concat_to_xbool, optional

If True, concatenates the encodings with existing node features. Default is True.

methodstr, optional

Computation method: “exact” (CPU NumPy + loop) or “fast” (GPU PyTorch + vectorized). Default is “fast”.

debugbool, optional

If True, runs both methods and prints error/timing metrics. Default is False.

**kwargsdict

Additional arguments (not used).

__init__(kernel_param_HKdiagSE, space_dim=0, include_eigenvalues=False, include_first=False, concat_to_x=True, method='fast', debug=False, **kwargs)#
forward(data)#

Compute the Heat Kernel Diagonal Structural Encodings for the input graph.

Parameters:
datatorch_geometric.data.Data

Input graph data object.

Returns:
torch_geometric.data.Data

Graph data object with HKdiagSE positional encodings added.

class HOPSE_PE_Information(**kwargs)#

Bases: BaseTransform

A transform that adds positional and structural information to the graph.

Parameters:
**kwargsoptional

Parameters for the transform.

__init__(**kwargs)#
aggregate_inter_nbhd(x_out_per_route)#

Aggregate the outputs of the GNN for each rank.

While the GNN takes care of intra-nbhd aggregation, this will take care of inter-nbhd aggregation. Default: sum.

Parameters:
x_out_per_routedict

The outputs of the GNN for each route.

Returns:
dict

The aggregated outputs of the GNN for each rank.

forward(data)#

Apply the transform to the input data.

Parameters:
datatorch_geometric.data.Data

The input data.

Returns:
torch_geometric.data.Data

The transformed data.

forward_interank(src_rank, dst_rank, nbhd_cache, data)#

Forward for cells where src_rank!=dst_rank.

Parameters:
src_rankint

Source rank of the transmitting cell.

dst_rankint

Destination rank of the transmitting cell.

nbhd_cachedict

Cache of the neighbourhood information.

datatorch_geometric.data.Data

The input data.

Returns:
data

The data object with messages passed.

forward_intrarank(src_rank, route_index, data)#

Forward for cells where src_rank==dst_rank.

Parameters:
src_rankint

Source rank of the transmitting cell.

route_indexint

The index of this particular message passing route.

datatorch_geometric.data.Data

The input data.

Returns:
data

The data object with messages passed.

get_nbhd_cache(params)#

Cache the nbhd information into a dict for the complex at hand.

Parameters:
paramsdict

The parameters of the batch, containing the complex.

Returns:
dict

The neighborhood cache.

interrank_boundary_index(boundary_index, n_dst_nodes)#

Recover lifted graph.

Edge-to-node boundary relationships of a graph with n_nodes and n_edges can be represented as up-adjacency node relations. There are n_nodes+n_edges nodes in this lifted graph. Designed to work for regular (edge-to-node and face-to-edge) boundary relationships.

Parameters:
x_srctorch.tensor

Source node features. Shape [n_src_nodes, n_features]. Should represent edge or face features.

boundary_indexlist of lists or list of tensors

List boundary_index[0] stores node ids in the boundary of edge stored in boundary_index[1]. List boundary_index[1] stores list of edges.

n_dst_nodesint

Number of destination nodes.

Returns:
edge_indexlist of lists

The edge_index[0][i] and edge_index[1][i] are the two nodes of edge i.

edge_attrtensor

Edge features are given by the features of the bounding node representing an edge. Shape [n_edges, n_features].

interrank_expand(params, src_rank, dst_rank, nbhd_cache)#

Expand the complex into an interrank Hasse graph.

Parameters:
paramsdict

The parameters of the batch, containing the complex.

src_rankint

The source rank.

dst_rankint

The destination rank.

nbhd_cachedict

The neighborhood cache containing the expanded boundary index and edge attributes.

Returns:
torch_geometric.data.Data

The expanded batch of interrank Hasse graphs for this route.

intrarank_expand(params, src_rank, nbhd)#

Expand the complex into an intrarank Hasse graph.

Parameters:
paramsdict

The parameters of the batch, containing the complex.

src_rankint

The source rank.

nbhdstr

The neighborhood to use.

Returns:
torch_geometric.data.Data

The expanded batch of intrarank Hasse graphs for this route.

class IdentityTransform(**kwargs)#

Bases: BaseTransform

An identity transform that does nothing to the input data.

Parameters:
**kwargsoptional

Parameters for the base transform.

__init__(**kwargs)#
forward(data)#

Apply the transform to the input data.

Parameters:
datatorch_geometric.data.Data

The input data.

Returns:
torch_geometric.data.Data

The same data.

class InfereKNNConnectivity(**kwargs)#

Bases: BaseTransform

Transform to infer point cloud connectivity.

The transform generates the k-nearest neighbor connectivity of the input point cloud.

Parameters:
**kwargsoptional

Parameters for the base transform.

__init__(**kwargs)#
forward(data)#

Apply the transform to the input data.

Parameters:
datatorch_geometric.data.Data

The input data.

Returns:
torch_geometric.data.Data

The transformed data.
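For intuition, k-nearest-neighbor connectivity on a small point cloud can be computed by brute force as below. This is a sketch in plain Python, not the transform's implementation. The function name, the Euclidean metric, and the edge direction (neighbor to point) are assumptions.

```python
import math


def knn_edges(points, k):
    """Return directed edges (neighbor -> point) to each point's k nearest."""
    sources, targets = [], []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        # Sort the remaining points by Euclidean distance to p.
        others.sort(key=lambda j: math.dist(p, points[j]))
        for j in others[:k]:
            sources.append(j)
            targets.append(i)
    return [sources, targets]


points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (5.5, 0.0)]
edge_index = knn_edges(points, k=1)
```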

class InfereRadiusConnectivity(**kwargs)#

Bases: BaseTransform

Class to infer point cloud connectivity.

The transform generates the radius connectivity of the input point cloud.

Parameters:
**kwargsoptional

Parameters for the base transform.

__init__(**kwargs)#
forward(data)#

Apply the transform to the input data.

Parameters:
datatorch_geometric.data.Data

The input data.

Returns:
torch_geometric.data.Data

The transformed data.
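Radius connectivity links every pair of points that lie within a given distance of each other. The brute-force sketch below illustrates the idea. The function name, the Euclidean metric, and the inclusive threshold are assumptions, not details taken from the transform.

```python
import math


def radius_edges(points, r):
    """Connect every ordered pair of distinct points at distance <= r."""
    sources, targets = [], []
    for i, p in enumerate(points):
        for j, q in enumerate(points):
            if i != j and math.dist(p, q) <= r:
                sources.append(i)
                targets.append(j)
    return [sources, targets]


points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (5.5, 0.0)]
edge_index = radius_edges(points, r=1.0)
```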

class KHopFE(max_hop, concat_to_x=True, aggregation='mean', method='sparse', debug=False, **kwargs)#

Bases: BaseTransform

K-hop Feature Encodings (KHopFE) transform.

Parameters:
max_hopint

The maximum hop neighbourhood.

concat_to_xbool, optional

If True, concatenates the encodings with existing node features in data.x. If data.x is None, creates it. Default is True.

aggregationstr, optional

Aggregation function to reduce over the feature dimension. Options: “mean”, “sum”, “max”, “min”. Default is “mean”.

methodstr, optional

Computation method: “dense” or “sparse”. Default is “sparse”.

debugbool, optional

If True, runs both methods and prints error/timing metrics. Default is False.

**kwargsdict

Additional arguments (not used).

__init__(max_hop, concat_to_x=True, aggregation='mean', method='sparse', debug=False, **kwargs)#
forward(data)#

Compute the K-hop feature encodings for the input graph.

Parameters:
datatorch_geometric.data.Data

Input graph data object.

Returns:
torch_geometric.data.Data

Graph data object with K-hop feature encodings added.

class KeepOnlyConnectedComponent(**kwargs)#

Bases: BaseTransform

Class to keep only the largest connected component of the input graph.

Parameters:
**kwargsoptional

Parameters for the base transform.

__init__(**kwargs)#
forward(data)#

Apply the transform to the input data.

Parameters:
datatorch_geometric.data.Data

The input data.

Returns:
torch_geometric.data.Data

The transformed data.
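Finding the largest connected component can be sketched with a breadth-first search over an edge list. The helper below treats edges as undirected and returns the node ids to keep. Its name and the list-based edge_index format are illustrative assumptions.

```python
from collections import deque


def largest_component(num_nodes, edge_index):
    """Return the sorted node ids of the largest connected component."""
    adjacency = [[] for _ in range(num_nodes)]
    for u, v in zip(*edge_index):
        adjacency[u].append(v)
        adjacency[v].append(u)

    seen = [False] * num_nodes
    best = []
    for start in range(num_nodes):
        if seen[start]:
            continue
        # Breadth-first search from an unvisited node.
        seen[start] = True
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbor in adjacency[node]:
                if not seen[neighbor]:
                    seen[neighbor] = True
                    queue.append(neighbor)
        if len(component) > len(best):
            best = component
    return sorted(best)


# Components: {0, 1, 2}, {3, 4}, and the isolated node 5.
edge_index = [[0, 1, 3], [1, 2, 4]]
kept = largest_component(6, edge_index)
```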

class KeepSelectedDataFields(**kwargs)#

Bases: BaseTransform

A transform that keeps only the selected fields of the input data.

Parameters:
**kwargsoptional

Parameters for the base transform.

__init__(**kwargs)#
forward(data)#

Apply the transform to the input data.

Parameters:
datatorch_geometric.data.Data

The input data.

Returns:
torch_geometric.data.Data

The transformed data.

class KeepSelectedTargetIndices(**kwargs)#

Bases: BaseTransform

A transform that keeps only the selected fields of the input data.

Parameters:
**kwargsoptional

Parameters for the base transform.

__init__(**kwargs)#
forward(data)#

Apply the transform to the input data.

Parameters:
datatorch_geometric.data.Data

The input data.

Returns:
torch_geometric.data.Data

The transformed data.

class LapPE(max_pe_dim, include_eigenvalues=False, include_first=False, concat_to_x=True, eps=1e-06, tolerance=0.001, method='gpu', debug=False, **kwargs)#

Bases: BaseTransform

Laplacian Positional Encoding (LapPE) transform.

Parameters:
max_pe_dimint

Maximum number of eigenvectors to use (dimensionality of the encoding).

include_eigenvaluesbool, optional

If True, concatenates eigenvalues alongside eigenvectors. Default is False.

include_firstbool, optional

If False, removes eigenvectors corresponding to (near-)zero eigenvalues. Default is False.

concat_to_xbool, optional

If True, concatenates the encodings with existing node features. Default is True.

epsfloat, optional

Small value to avoid division by zero. Default is 1e-6.

tolerancefloat, optional

Tolerance for the eigenvalue solver. Default is 0.001.

methodstr, optional

Computation method: “exact” (SciPy CPU) or “gpu” (PyTorch GPU). Default is “gpu”.

debugbool, optional

If True, runs both methods and prints error/timing metrics. Default is False.

**kwargsdict

Additional arguments (not used).

__init__(max_pe_dim, include_eigenvalues=False, include_first=False, concat_to_x=True, eps=1e-06, tolerance=0.001, method='gpu', debug=False, **kwargs)#
forward(data)#

Compute the Laplacian positional encodings for the input graph.

Parameters:
datatorch_geometric.data.Data

Input graph data object.

Returns:
torch_geometric.data.Data

Graph data object with Laplacian positional encodings added.

class MessagePassingHomophily(**kwargs)#

Bases: BaseTransform

Calculates message passing homophily of the input data.

This transformation implements the methodology from the paper: “Hypergraph Neural Networks through the Lens of Message Passing: A Common Perspective to Homophily and Architecture Design”. It computes homophily metrics for hypergraphs by analyzing the relationship between node labels within hyperedges.

Parameters:
**kwargsdict, optional

Additional parameters for the transform.
  • top_k (int, default=3): Number of top hyperedge cardinalities to analyze.

Attributes:
typestr

Identifier for the transform type.

top_kint

Number of top hyperedge cardinalities to analyze.

__init__(**kwargs)#
forward(data)#

Apply the transform to the input data.

Parameters:
datatorch_geometric.data.Data

The input data.

Returns:
torch_geometric.data.Data

The transformed data.

class NodeDegrees(**kwargs)#

Bases: BaseTransform

A transform that calculates the node degrees of the input graph.

Parameters:
**kwargsoptional

Parameters for the base transform.

__init__(**kwargs)#
calculate_node_degrees(data, field)#

Calculate the node degrees of the input data.

Parameters:
datatorch_geometric.data.Data

The input data.

fieldstr

The field to calculate the node degrees.

Returns:
torch_geometric.data.Data

The transformed data.

forward(data)#

Apply the transform to the input data.

Parameters:
datatorch_geometric.data.Data

The input data.

Returns:
torch_geometric.data.Data

The transformed data.
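As a small illustration of a degree computation, the sketch below counts how often each node appears as a source in an edge list. It assumes each undirected edge is stored in both directions, as is common for edge_index in PyG. The helper name is hypothetical, and the real transform may read a different field.

```python
from collections import Counter


def node_degrees(num_nodes, edge_index):
    """Count occurrences of each node among the edge sources."""
    counts = Counter(edge_index[0])
    return [counts.get(i, 0) for i in range(num_nodes)]


# Path graph 0 - 1 - 2, each edge stored in both directions.
edge_index = [[0, 1, 1, 2], [1, 0, 2, 1]]
degrees = node_degrees(3, edge_index)
```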

class NodeFeaturesToFloat(**kwargs)#

Bases: BaseTransform

A transform that converts the node features of the input graph to float.

Parameters:
**kwargsoptional

Parameters for the base transform.

__init__(**kwargs)#
forward(data)#

Apply the transform to the input data.

Parameters:
datatorch_geometric.data.Data

The input data.

Returns:
torch_geometric.data.Data

The transformed data.

class OneHotDegreeFeatures(max_degree, degrees_field, features_field, cat=False, **kwargs)#

Bases: BaseTransform

Class for one hot degree features transform.

A transform that adds the node degree as one hot encodings to the node features.

Parameters:
max_degreeint

The maximum degree of the graph.

degrees_fieldstr

The field containing the node degrees.

features_fieldstr

The field containing the node features.

catbool, optional

If set to True, the one hot encodings are concatenated to the node features (default: False).

**kwargsoptional

Additional arguments for the class.

__init__(max_degree, degrees_field, features_field, cat=False, **kwargs)#
forward(data)#

Apply the transform to the input data.

Parameters:
datatorch_geometric.data.Data

The input data.

Returns:
torch_geometric.data.Data

The transformed data.
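One-hot degree encoding and the effect of cat can be sketched as follows. The encoding width of max_degree + 1 (one slot per degree from 0 to max_degree) is an assumption here, as are the function name and the list-based inputs.

```python
def one_hot_degrees(degrees, max_degree, features=None, cat=False):
    """One-hot encode degrees; optionally append to existing features."""
    encoded = []
    for d in degrees:
        if not 0 <= d <= max_degree:
            raise ValueError(f"degree {d} outside [0, {max_degree}]")
        row = [0.0] * (max_degree + 1)
        row[d] = 1.0
        encoded.append(row)
    if cat and features is not None:
        # Keep the original features and append the one-hot columns.
        return [list(f) + e for f, e in zip(features, encoded)]
    return encoded


degrees = [1, 2, 1]
replaced = one_hot_degrees(degrees, max_degree=2)
concatenated = one_hot_degrees(
    degrees, max_degree=2, features=[[0.5], [0.6], [0.7]], cat=True
)
```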

class PPRFE(alpha_param_PPRFE, concat_to_x=True, aggregation='mean', self_loop=True, method='approx', appnp_K=20, debug=False, **kwargs)#

Bases: BaseTransform

Personalized Page Rank Feature Encodings (PPRFE) transform.

Parameters:
alpha_param_PPRFEtuple of float

Tuple specifying the start and end teleport probabilities (alpha values).

concat_to_xbool, optional

If True, concatenates the encodings with existing node features. Default is True.

aggregationstr, optional

Aggregation function to reduce over the feature dimension. Options: “mean”, “sum”, “max”, “min”. Default is “mean”.

self_loopbool, optional

If True, adds self-loops to the adjacency matrix. Default is True.

methodstr, optional

Computation method: “exact” or “approx”. Default is “approx”.

appnp_Kint, optional

Number of polynomial expansion terms (propagation steps) for the approx method. Higher means more global information but slower. Default is 20.

debugbool, optional

If True, runs both methods and prints error/timing metrics. Default is False.

**kwargsdict

Additional arguments (not used).

__init__(alpha_param_PPRFE, concat_to_x=True, aggregation='mean', self_loop=True, method='approx', appnp_K=20, debug=False, **kwargs)#
forward(data)#

Compute the PPR feature encodings for the input graph.

Parameters:
datatorch_geometric.data.Data

Input graph data object.

Returns:
torch_geometric.data.Data

Graph data object with PPR feature encodings added.
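The approx method's propagation steps can be pictured with the APPNP-style iteration Z <- (1 - alpha) * A_hat * Z + alpha * X, repeated a fixed number of times. The sketch below uses dense Python lists and a pre-normalized adjacency matrix. The function name and the choice of normalization are assumptions, and the real transform may differ in both.

```python
def appnp_propagate(adj_norm, x, alpha, num_steps):
    """Approximate personalized PageRank propagation of features x.

    adj_norm: dense normalized adjacency (list of rows).
    x: per-node feature rows; alpha: teleport probability.
    """
    n, f = len(x), len(x[0])
    z = [list(row) for row in x]
    for _ in range(num_steps):
        z = [
            [
                (1 - alpha) * sum(adj_norm[i][j] * z[j][c] for j in range(n))
                + alpha * x[i][c]
                for c in range(f)
            ]
            for i in range(n)
        ]
    return z


# Two connected nodes with self-loops, row-normalized.
adj_norm = [[0.5, 0.5], [0.5, 0.5]]
x = [[1.0], [0.0]]
z = appnp_propagate(adj_norm, x, alpha=0.5, num_steps=20)
```

On this toy graph the iteration settles at 0.75 and 0.25, which matches the closed form alpha * (I - (1 - alpha) * A_hat)^(-1) * X.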

class PrecomputeKHopFeatures(max_hop, complex_dim, use_initial_features, **kwargs)#

Bases: BaseTransform

Class for the k-hop neighbourhood feature precomputation transform.

A transform that computes an aggregation of injective transformations of the k-hop neighbourhood.

Parameters:
max_hopint

The maximum hop neighbourhood.

complex_dimint

The maximum dimension of the complex to evaluate.

use_initial_featuresbool

Whether to use the initial features as the 0-hop features.

**kwargsoptional

Additional arguments for the class.

__init__(max_hop, complex_dim, use_initial_features, **kwargs)#
forward(data)#

Apply the transform to the input data.

Parameters:
datatorch_geometric.data.Data

The input data.

Returns:
torch_geometric.data.Data

The transformed data.

class RWSE(max_pe_dim, concat_to_x=True, method='batched', batch_size=128, debug=False, **kwargs)#

Bases: BaseTransform

Random Walk Structural Encoding (RWSE) transform.

Parameters:
max_pe_dimint

Maximum walk length (number of RWSE dimensions).

concat_to_xbool, optional

If True, concatenates the encodings with existing node features. Default is True.

methodstr, optional

Computation method: “dense”, “sparse”, or “batched”. “dense” uses standard matrix multiplication (memory intensive). “sparse” uses pure sparse matrix multiplication (fastest, moderate memory). “batched” uses indicator diffusion (memory-bounded, slightly slower). Default is “batched”.

batch_sizeint, optional

Number of nodes to process simultaneously when using the “batched” method. Lower values use less memory but take slightly longer. Default is 128.

debugbool, optional

If True, runs all methods, catches OOM errors, and prints a detailed timing and peak VRAM memory footprint report. Default is False.

**kwargsdict

Additional arguments (not used).

__init__(max_pe_dim, concat_to_x=True, method='batched', batch_size=128, debug=False, **kwargs)#
forward(data)#

Compute the RWSE for the input graph.

Parameters:
datatorch_geometric.data.Data

Input graph data object.

Returns:
torch_geometric.data.Data

Graph data object with RWSE added to data.x or data.RWSE.
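
As a rough illustration of what the encoding contains, the sketch below follows the common definition of RWSE, in which dimension k of a node's encoding is the probability that a k-step random walk returns to that node. It mirrors the idea of the “dense” method only. The function name rwse_dense, the NumPy adjacency-matrix input, and starting at walk length 1 are assumptions made for this example, not the library's code.

```python
import numpy as np


def rwse_dense(adj, max_pe_dim):
    """Return-probability encoding of shape (num_nodes, max_pe_dim).

    Hypothetical helper: column k-1 holds diag(P^k), where P = D^{-1} A.
    """
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1, keepdims=True)
    # Row-normalise; isolated nodes (degree 0) keep an all-zero row.
    trans = np.divide(adj, deg, out=np.zeros_like(adj), where=deg > 0)
    walk = np.eye(adj.shape[0])
    cols = []
    for _ in range(max_pe_dim):
        walk = walk @ trans  # walk now holds P^k
        cols.append(np.diag(walk).copy())
    return np.stack(cols, axis=1)


# Path graph 0 - 1 - 2: only even-length walks can return to their start.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
pe = rwse_dense(path, max_pe_dim=3)
# Middle node: [0.0, 1.0, 0.0]; end nodes: [0.0, 0.5, 0.0]
```

The dense product costs O(n^2) memory per step, which is consistent with the “dense” method being described as memory intensive.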

class RenameFields(init_field_name, new_field_name, **kwargs)#

Bases: BaseTransform

A transform that renames specified fields in a torch_geometric.data.Data object.

Parameters:
init_field_namelist of str

List of original field names to be renamed.

new_field_namelist of str

List of new field names corresponding to init_field_name.

**kwargsdict, optional

Additional keyword arguments stored on the transform as self.parameters.

__init__(init_field_name, new_field_name, **kwargs)#
forward(data)#

Apply the transform to rename fields in the input data.

Parameters:
datatorch_geometric.data.Data

The input data.

Returns:
torch_geometric.data.Data

The modified data with renamed fields.
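
The pairing of init_field_name and new_field_name can be pictured with the sketch below. It is a stand-in that operates on a plain SimpleNamespace rather than a torch_geometric.data.Data object; the helper name rename_fields and the choice to delete the old attribute after copying are assumptions for illustration, not the library's implementation.

```python
from types import SimpleNamespace


def rename_fields(data, init_field_name, new_field_name):
    """Move each attribute named in init_field_name to the matching new name."""
    if len(init_field_name) != len(new_field_name):
        raise ValueError("init_field_name and new_field_name must have equal length")
    for old, new in zip(init_field_name, new_field_name):
        setattr(data, new, getattr(data, old))
        delattr(data, old)  # assumption: the old field is removed
    return data


sample = SimpleNamespace(x_0=[1.0, 2.0], y=[0])
sample = rename_fields(sample, ["x_0"], ["x"])
```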

class SelectDestinationEncodings(encodings, **kwargs)#

Bases: BaseTransform

Select destination node encodings from expanded graph data.

Used in interrank message passing where we expand the graph to include both source and destination nodes, compute encodings, then select only the encodings for destination nodes.

Parameters:
encodingslist of str

List of encoding names to select (e.g., ['HKFE', 'LapPE']).

**kwargsdict, optional

Additional keyword arguments.

__init__(encodings, **kwargs)#
forward(data, n_dst_nodes)#

Select encodings for destination nodes only.

Parameters:
datatorch_geometric.data.Data

The input data with encodings computed on expanded graph.

n_dst_nodesint

Number of destination nodes (first n_dst_nodes rows to keep).

Returns:
torch_geometric.data.Data

Data with encodings selected for destination nodes only.
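
The row selection described above amounts to slicing each named encoding to its first n_dst_nodes rows. The sketch below shows this on a plain dict of nested lists instead of a Data object with tensors; the helper name select_destination_rows and the decision to skip names that are absent are assumptions for this example.

```python
def select_destination_rows(data, encodings, n_dst_nodes):
    """Keep only the first n_dst_nodes rows of each named encoding."""
    for name in encodings:
        if name in data:  # assumption: missing encodings are skipped
            data[name] = data[name][:n_dst_nodes]
    return data


# Expanded graph with 4 nodes, of which the first 2 are destination nodes.
expanded = {
    "HKFE": [[0.1], [0.2], [0.3], [0.4]],
    "LapPE": [[1.0], [2.0], [3.0], [4.0]],
    "x": [[9], [9], [9], [9]],
}
result = select_destination_rows(expanded, ["HKFE", "LapPE"], n_dst_nodes=2)
```

Fields not listed in encodings (here "x") are left at their original length.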

class SelectDestinationFEs(encodings, **kwargs)#

Bases: BaseTransform

Select Destination Feature Encodings (FEs) transform.

Selects and retains only the FEs corresponding to the destination nodes of edges in data.edge_index.

Parameters:
encodingslist of str

List of encoding keys in data where the FEs are stored (e.g., ‘HKFE’, ‘KHopFE’).

**kwargsdict, optional

Additional keyword arguments.

__init__(encodings, **kwargs)#
forward(data, n_dst_nodes)#

Apply the transform to the input data.

Parameters:
datatorch_geometric.data.Data

The input data.

n_dst_nodesint

Number of destination nodes.

Returns:
torch_geometric.data.Data

The transformed data with selected FEs.

class SheafConnLapPE(max_pe_dim, stalk_dim=3, include_first=False, concat_to_x=True, eps=1e-06, **kwargs)#

Bases: BaseTransform

Sheaf Connection Laplacian Positional Encoding (SheafConnLapPE) transform.

Based on “Sheaf-based Positional Encodings for Graph Neural Networks” by He, Bodnar & Liò (NeurIPS 2023 Workshop / PMLR 2024). https://openreview.net/pdf?id=ZtAabWUPu3

The Connection Laplacian generalises the standard graph Laplacian by replacing each scalar off-diagonal entry (-1) with a d×d orthogonal restriction map — a rotation encoding the geometric alignment between the local node-feature neighbourhoods of the two endpoints.

For each edge (v, u), the algorithm:

  1. Runs local PCA on the 1-hop feature neighbourhood of v and u separately, yielding orthonormal bases B_v, B_u ∈ R^{p×d} that approximate the local tangent spaces T_{x_v}M and T_{x_u}M under the manifold assumption.

  2. Solves the orthogonal Procrustes problem to find the rotation O_{vu} ∈ O(d) that best maps B_v onto B_u (closed form: SVD of B_v^T B_u).

  3. Sets the off-diagonal block L_F[v, u] = -O_{vu}.
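
Steps 1 and 2 above can be sketched in NumPy as follows. This is a minimal illustration, not the library's implementation: the helper names local_pca_basis and procrustes_rotation are hypothetical, centring the neighbourhood before PCA is an assumption, and O is taken to minimise ||B_v O - B_u||_F (the text above only states that the closed form uses the SVD of B_v^T B_u).

```python
import numpy as np


def local_pca_basis(feats, d):
    """Top-d principal directions of an (m, p) feature neighbourhood, as (p, d)."""
    centred = feats - feats.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return vt[:d].T


def procrustes_rotation(b_v, b_u):
    """Orthogonal O minimising ||b_v @ O - b_u||_F, from the SVD of b_v.T @ b_u."""
    u, _, vt = np.linalg.svd(b_v.T @ b_u)
    return u @ vt


rng = np.random.default_rng(0)
b_v = local_pca_basis(rng.normal(size=(6, 4)), d=2)  # p = 4, d = 2

# Build b_u by rotating b_v with a known 2x2 rotation, then recover it.
theta = 0.3
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta), np.cos(theta)]])
b_u = b_v @ rot
o_vu = procrustes_rotation(b_v, b_u)
```

Because the columns of b_v are orthonormal, b_v.T @ b_u equals the known rotation here, so o_vu recovers it up to floating-point error.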

The resulting nd×nd block matrix L_F is symmetric positive semi-definite. Its k smallest non-trivial eigenvectors (each reshaped from nd to n×d) are concatenated column-wise to form a PE of total dimension k×d per node.
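
The reshaping and concatenation of eigenvectors can be sketched as below. The helper name eigvecs_to_pe is hypothetical, and the code assumes a node-major ordering in which entries v*d to v*d + d - 1 of each length-nd eigenvector belong to node v; the library's internal ordering is not stated here. Zero-padding follows the max_pe_dim description in the parameter list.

```python
import numpy as np


def eigvecs_to_pe(eigvecs, num_nodes, stalk_dim, max_pe_dim):
    """Turn (n*d, m) eigenvectors into an (n, max_pe_dim) positional encoding."""
    if max_pe_dim % stalk_dim != 0:
        raise ValueError("max_pe_dim must be divisible by stalk_dim")
    k = max_pe_dim // stalk_dim
    used = eigvecs[:, :k]  # at most k eigenvectors
    m = used.shape[1]
    # (n*d, m) -> (n, d, m) -> (n, m, d): one n x d block per eigenvector.
    blocks = used.reshape(num_nodes, stalk_dim, m).transpose(0, 2, 1)
    pe = blocks.reshape(num_nodes, m * stalk_dim)
    out = np.zeros((num_nodes, max_pe_dim))
    out[:, : pe.shape[1]] = pe  # zero-pad if fewer than k eigenvectors exist
    return out


# n = 2 nodes, d = 2, only 2 eigenvectors available but k = 6 // 2 = 3 requested.
vecs = np.arange(8, dtype=float).reshape(4, 2)
pe = eigvecs_to_pe(vecs, num_nodes=2, stalk_dim=2, max_pe_dim=6)
# pe[0] is [0, 2, 1, 3, 0, 0]; pe[1] is [4, 6, 5, 7, 0, 0]
```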

On homophilic edges (similar features) O_{vu} ≈ I and the Connection Laplacian closely resembles the standard Laplacian. On heterophilic edges O_{vu} is a non-trivial rotation, introducing cross-dimensional coupling that encodes semantic disagreement — information the standard Laplacian cannot represent.

Note

Feature dimension requirement: data.x must be present and data.x.shape[1] >= stalk_dim. The method assumes that node features lie near a stalk_dim-dimensional manifold; if feature_dim < stalk_dim this assumption is violated and the PCA basis would contain zero columns, making the Procrustes rotation degenerate and breaking the PSD property of L_F. A ValueError is raised in this case.

Isolated nodes: For isolated nodes (degree 0), the diagonal block of L_F is the zero matrix, making D^{-1/2} undefined. The normalisation substitutes 1.0 for these zero diagonal entries, which is equivalent to adding a unit self-loop for numerical purposes. The eigenvector components of isolated nodes are still well-defined (the corresponding rows of L_F remain all-zero), but their PE values will reflect their position in the global spectrum rather than local connectivity.

Parameters:
max_pe_dimint

Total output PE dimension. Must be divisible by stalk_dim. Internally, the number of eigenvectors used is k = max_pe_dim // stalk_dim, so the output shape is always [num_nodes, max_pe_dim] (zero-padded if fewer eigenvectors are available).

stalk_dimint, optional

Dimension d of each stalk / restriction map. Controls the rank of the local tangent-space approximation. Default is 3, as used in the paper experiments. Must be <= feature_dim of data.x.

include_firstbool, optional

If False (default), discards eigenvectors whose eigenvalue is below eps (the trivial zero-eigenvectors / global sections of the sheaf). If True, these eigenvectors are kept.

concat_to_xbool, optional

If True (default), concatenates the PE with data.x. If False, stores it as data.SheafConnLapPE instead.

epsfloat, optional

Threshold below which eigenvalues are considered trivial. Default 1e-6.

**kwargs

Additional keyword arguments (unused; reserved for future extensions).

__init__(max_pe_dim, stalk_dim=3, include_first=False, concat_to_x=True, eps=1e-06, **kwargs)#
forward(data)#

Compute and attach the ConnLap PE to a graph data object.

Parameters:
dataData

Input graph. data.x must be set and data.x.shape[1] >= stalk_dim.

Returns:
Data

Graph with PE concatenated to data.x (concat_to_x=True) or stored in data.SheafConnLapPE (concat_to_x=False).

Raises:
ValueError

If data.x is None, or if feature_dim < stalk_dim.

class dotdict#

Bases: dict

Dot.notation access to dictionary attributes.
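
A dict subclass with attribute-style access is commonly written as in the sketch below. The class is named DotDict here to mark it as an illustration; the library's own dotdict may differ in detail, for example in how missing keys are handled.

```python
class DotDict(dict):
    """dict whose keys can also be read, set, and deleted as attributes."""

    __getattr__ = dict.get          # missing keys yield None in this recipe
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__


cfg = DotDict(max_hop=2)
cfg.complex_dim = 3      # same as cfg["complex_dim"] = 3
hops = cfg.max_hop       # same as cfg["max_hop"]
```

One consequence of using dict.get for attribute lookup is that a misspelled key returns None instead of raising AttributeError.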

Submodules#