topobench.transforms.data_manipulations.all_encodings module#
Combined Encodings Transform (FEs + PSEs).
- class BaseTransform#
Bases:
ABC
An abstract base class for writing transforms.
Transforms are a general way to modify and customize Data or HeteroData objects, either by implicitly passing them as an argument to a Dataset, or by applying them explicitly to individual Data or HeteroData objects:

    import torch_geometric.transforms as T
    from torch_geometric.datasets import TUDataset

    transform = T.Compose([T.ToUndirected(), T.AddSelfLoops()])

    dataset = TUDataset(path, name='MUTAG', transform=transform)
    data = dataset[0]  # Implicitly transform data on every access.

    data = TUDataset(path, name='MUTAG')[0]
    data = transform(data)  # Explicitly transform data.

- abstractmethod forward(data)#
- class CombinedEncodings(encodings, parameters=None, **kwargs)#
Bases:
BaseTransform
Combined Encodings transform.
Applies both Feature Encodings (FEs) and Positional/Structural Encodings (PSEs) to a graph. FEs are applied first since they use data.x as input, while PSEs only use graph structure.
- Supported Feature Encodings (FEs):
“HKFE”: Heat Kernel Feature Encoding
“KHopFE”: K-hop Feature Encoding
“SheafConnLapPE”: Sheaf Connection Laplacian Positional Encoding
- Supported Positional/Structural Encodings (PSEs):
“LapPE”: Laplacian Positional Encoding
“RWSE”: Random Walk Structural Encoding
“ElectrostaticPE”: Electrostatic Positional Encoding
“HKdiagSE”: Heat Kernel Diagonal Structural Encoding
- Parameters:
- encodings : list of str
List of encodings to apply. Can include any mix of FEs and PSEs. FEs will always be applied before PSEs regardless of order in list.
- parameters : dict, optional
Parameters for each encoding, keyed by encoding name.
- **kwargs : dict, optional
Additional keyword arguments.
- __init__(encodings, parameters=None, **kwargs)#
- forward(data)#
Apply the transform to the input data.
FEs are applied first (they use data.x as input), then PSEs (they only use graph structure).
- Parameters:
- data : torch_geometric.data.Data
The input data.
- Returns:
- torch_geometric.data.Data
The transformed data with added encodings.
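The ordering rule described above (FEs before PSEs, regardless of list order) can be illustrated with a small sketch. This is not the library's implementation: the helper `order_encodings` is hypothetical, the encoding names are copied from the lists above, and rejecting unknown names is an assumption made only for this example.

```python
# Encoding names copied from the documentation above.
FEATURE_ENCODINGS = ("HKFE", "KHopFE", "SheafConnLapPE")
PSE_ENCODINGS = ("LapPE", "RWSE", "ElectrostaticPE", "HKdiagSE")


def order_encodings(encodings):
    """Hypothetical helper: return the encodings with all FEs before all PSEs.

    The relative order inside each group follows the input list.
    """
    known = FEATURE_ENCODINGS + PSE_ENCODINGS
    unknown = [name for name in encodings if name not in known]
    if unknown:
        # Assumption for this sketch: unknown names are an error.
        raise ValueError(f"Unknown encodings: {unknown}")
    fes = [name for name in encodings if name in FEATURE_ENCODINGS]
    pses = [name for name in encodings if name in PSE_ENCODINGS]
    return fes + pses


ordered = order_encodings(["LapPE", "HKFE", "RWSE", "KHopFE"])
print(ordered)  # ['HKFE', 'KHopFE', 'LapPE', 'RWSE']
```

Running FEs first matters because they read data.x, so their result would differ if structural encodings had already been attached to the node features.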
- class Data(x=None, edge_index=None, edge_attr=None, y=None, pos=None, time=None, **kwargs)#
Bases:
BaseData, FeatureStore, GraphStore
A data object describing a homogeneous graph. The data object can hold node-level, link-level and graph-level attributes. In general, Data tries to mimic the behavior of a regular Python dictionary. In addition, it provides useful functionality for analyzing graph structures, and provides basic PyTorch tensor functionalities. See here for the accompanying tutorial.

    from torch_geometric.data import Data

    data = Data(x=x, edge_index=edge_index, ...)

    # Add additional arguments to `data`:
    data.train_idx = torch.tensor([...], dtype=torch.long)
    data.test_mask = torch.tensor([...], dtype=torch.bool)

    # Analyzing the graph structure:
    data.num_nodes
    >>> 23

    data.is_directed()
    >>> False

    # PyTorch tensor functionality:
    data = data.pin_memory()
    data = data.to('cuda:0', non_blocking=True)

- Parameters:
x (torch.Tensor, optional) – Node feature matrix with shape [num_nodes, num_node_features]. (default: None)
edge_index (LongTensor, optional) – Graph connectivity in COO format with shape [2, num_edges]. (default: None)
edge_attr (torch.Tensor, optional) – Edge feature matrix with shape [num_edges, num_edge_features]. (default: None)
y (torch.Tensor, optional) – Graph-level or node-level ground-truth labels with arbitrary shape. (default: None)
pos (torch.Tensor, optional) – Node position matrix with shape [num_nodes, num_dimensions]. (default: None)
time (torch.Tensor, optional) – The timestamps for each event with shape [num_edges] or [num_nodes]. (default: None)
**kwargs (optional) – Additional attributes.
- classmethod from_dict(mapping)#
Creates a Data object from a dictionary.
- __init__(x=None, edge_index=None, edge_attr=None, y=None, pos=None, time=None, **kwargs)#
- connected_components()#
Extracts connected components of the graph using a union-find algorithm. The components are returned as a list of Data objects, where each object represents a connected component of the graph.

    data = Data()
    data.x = torch.tensor([[1.0], [2.0], [3.0], [4.0]])
    data.y = torch.tensor([[1.1], [2.1], [3.1], [4.1]])
    data.edge_index = torch.tensor(
        [[0, 1, 2, 3], [1, 0, 3, 2]], dtype=torch.long
    )

    components = data.connected_components()
    print(len(components))
    >>> 2

    print(components[0])
    >>> Data(x=[2, 1], y=[2, 1], edge_index=[2, 2])

- Returns:
A list of disconnected components.
- Return type:
List[Data]
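To make the union-find idea concrete, here is a minimal sketch in plain Python. It is not PyG's implementation: the function `union_find_components` is hypothetical, it works on plain lists instead of tensors, and it returns node index groups rather than Data objects.

```python
def union_find_components(num_nodes, edge_index):
    """Group node indices into connected components with union-find.

    `edge_index` follows the COO layout used above: a pair of lists
    [sources, destinations] of equal length.
    """
    parent = list(range(num_nodes))  # every node starts as its own root

    def find(v):
        # Walk up to the root, shortening the path along the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for src, dst in zip(edge_index[0], edge_index[1]):
        root_src, root_dst = find(src), find(dst)
        if root_src != root_dst:
            parent[root_dst] = root_src  # merge the two components

    groups = {}
    for v in range(num_nodes):
        groups.setdefault(find(v), []).append(v)
    return sorted(groups.values())


# Same connectivity as the example above: edges 0<->1 and 2<->3.
components = union_find_components(4, [[0, 1, 2, 3], [1, 0, 3, 2]])
print(components)  # [[0, 1], [2, 3]]
```

Each group of node indices corresponds to one component; the real method additionally slices node and edge attributes into a separate Data object per component.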
- debug()#
- edge_subgraph(subset)#
Returns the induced subgraph given by the edge indices subset. Will currently preserve all the nodes in the graph, even if they are isolated after subgraph computation.
- Parameters:
subset (LongTensor or BoolTensor) – The edges to keep.
- get_all_edge_attrs()#
Returns all registered edge attributes.
- get_all_tensor_attrs()#
Obtains all feature attributes stored in Data.
- stores_as(data)#
- subgraph(subset)#
Returns the induced subgraph given by the node indices subset.
- Parameters:
subset (LongTensor or BoolTensor) – The nodes to keep.
- to_dict()#
Returns a dictionary of stored key/value pairs.
- to_heterogeneous(node_type=None, edge_type=None, node_type_names=None, edge_type_names=None)#
Converts a Data object to a heterogeneous HeteroData object. For this, node and edge attributes are split according to the node-level and edge-level vectors node_type and edge_type, respectively. node_type_names and edge_type_names can be used to give meaningful node and edge type names, respectively. That is, the node_type 0 is given by node_type_names[0]. If the Data object was constructed via to_homogeneous(), the object can be reconstructed without any need to pass in additional arguments.
- Parameters:
node_type (torch.Tensor, optional) – A node-level vector denoting the type of each node. (default: None)
edge_type (torch.Tensor, optional) – An edge-level vector denoting the type of each edge. (default: None)
node_type_names (List[str], optional) – The names of node types. (default: None)
edge_type_names (List[Tuple[str, str, str]], optional) – The names of edge types. (default: None)
- to_namedtuple()#
Returns a NamedTuple of stored key/value pairs.
- update(data)#
Updates the data object with the elements from another data object. Added elements will override existing ones (in case of duplicates).
- validate(raise_on_error=True)#
Validates the correctness of the data.
- property num_features: int#
Returns the number of features per node in the graph. Alias for num_node_features.
- property num_nodes: int | None#
Returns the number of nodes in the graph.
Note
The number of nodes in the data object is automatically inferred when node-level attributes are present, e.g., data.x. In some cases, however, a graph may be given without any node-level attributes. PyG then guesses the number of nodes according to edge_index.max().item() + 1. However, if isolated nodes exist, this number may be incorrect, which can result in unexpected behavior. Thus, we recommend setting the number of nodes in your data object explicitly via data.num_nodes = .... You will be given a warning that requests you to do so.
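The pitfall in the note can be reproduced with plain Python lists. The sketch below only mirrors the edge_index.max() + 1 formula quoted above; it does not call PyG, and the five-node graph is a made-up example.

```python
# Hypothetical graph with 5 nodes: 0, 1, 2 form a cycle, 3 and 4 are isolated.
true_num_nodes = 5
edge_index = [[0, 1, 2], [1, 2, 0]]  # COO layout: [sources, destinations]

# Same rule as edge_index.max().item() + 1, written for nested lists.
inferred_num_nodes = max(max(row) for row in edge_index) + 1

print(inferred_num_nodes)  # 3
print(true_num_nodes - inferred_num_nodes)  # 2 isolated nodes are missed
```

Because the isolated nodes never appear in edge_index, no rule based on edge_index alone can recover them, which is why setting data.num_nodes explicitly is the safer choice.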
- class SelectDestinationEncodings(encodings, **kwargs)#
Bases:
BaseTransform
Select destination node encodings from expanded graph data.
Used in interrank message passing where we expand the graph to include both source and destination nodes, compute encodings, then select only the encodings for destination nodes.
- Parameters:
- encodings : list of str
List of encoding names to select (e.g., [‘HKFE’, ‘LapPE’]).
- **kwargs : dict, optional
Additional keyword arguments.
- __init__(encodings, **kwargs)#
- forward(data, n_dst_nodes)#
Select encodings for destination nodes only.
- Parameters:
- data : torch_geometric.data.Data
The input data with encodings computed on expanded graph.
- n_dst_nodes : int
Number of destination nodes (first n_dst_nodes rows to keep).
- Returns:
- torch_geometric.data.Data
Data with encodings selected for destination nodes only.
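The row selection described above can be sketched without any dependencies. The code below is an illustration under stated assumptions, not the library's implementation: `select_destination_rows` is a hypothetical helper, a plain dict stands in for the Data object, nested lists stand in for tensors, and storing each encoding under its own name is an assumption.

```python
def select_destination_rows(store, encodings, n_dst_nodes):
    """Hypothetical helper: keep the first n_dst_nodes rows of each encoding.

    Attributes not listed in `encodings` are passed through unchanged,
    and the input mapping is not modified.
    """
    selected = dict(store)  # shallow copy so the input stays intact
    for name in encodings:
        if name in selected:
            selected[name] = selected[name][:n_dst_nodes]
    return selected


# Expanded graph with 4 nodes, of which the first 2 are destination nodes.
expanded = {
    "HKFE": [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]],
    "LapPE": [[1.0], [2.0], [3.0], [4.0]],
    "x": [[9.0], [9.0], [9.0], [9.0]],  # not an encoding, left untouched
}

result = select_destination_rows(expanded, ["HKFE", "LapPE"], n_dst_nodes=2)
print(len(result["HKFE"]), len(result["LapPE"]), len(result["x"]))  # 2 2 4
```

With tensors, the same slice `[:n_dst_nodes]` selects the leading rows along the node dimension, which matches the "first n_dst_nodes rows to keep" wording of the parameter description.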