TopoBench (TB)#

TopoBench (TB) is a modular Python library designed to standardize benchmarking and accelerate research in Topological Deep Learning (TDL). In particular, TB lets you train and compare the performance of a wide range of Topological Neural Networks (TNNs) across different topological domains, where a topological domain is a graph, a simplicial complex, a cellular complex, or a hypergraph.

Overview#
For detailed information, please refer to the `TopoBench: A Framework for Benchmarking Topological Deep Learning <https://arxiv.org/pdf/2406.06642>`__ paper.
The main pipeline trains and evaluates a wide range of state-of-the-art TNNs and Graph Neural Networks (GNNs) (see Neural Networks) on numerous and varied datasets and benchmark tasks (see Datasets). Additionally, the library can transform, i.e., lift, each dataset from one topological domain to another (see Liftings & Transforms), enabling for the first time an exhaustive inter-domain comparison of TNNs.
Get Started#
Create Environment#
First, clone the TopoBench repository and navigate into it:
git clone git@github.com:geometric-intelligence/topobench.git
cd topobench
Ensure conda is installed:
conda --version || echo "Conda not found! Please install it from https://docs.anaconda.com/free/miniconda/miniconda-install/"
Next, set up and activate a conda environment tb with Python 3.11.3:
conda create -n tb python=3.11.3
conda activate tb
Next, check the CUDA version of your machine:
which nvcc && nvcc --version
and ensure that it matches the CUDA version specified in the env_setup.sh file (CUDA=cu121 by default). If it does not match, update env_setup.sh accordingly by changing both the CUDA and TORCH environment variables to compatible values as specified on this website.
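As a rough aid for the check above, the following sketch turns the `release X.Y` field printed by `nvcc --version` into a tag of the same shape as the `CUDA=cu121` default. The helper name `cuda_tag_from_nvcc` and the sample output line are illustrative assumptions; nothing here is part of TopoBench or env_setup.sh.

```python
import re


def cuda_tag_from_nvcc(nvcc_output: str) -> str:
    """Turn the 'release X.Y' field of `nvcc --version` into a tag such as 'cu121'."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("no 'release X.Y' field found in the nvcc output")
    major, minor = match.groups()
    return f"cu{major}{minor}"


# Sample line in the usual nvcc format (an assumption, not captured from a real machine).
sample = "Cuda compilation tools, release 12.1, V12.1.105"
print(cuda_tag_from_nvcc(sample))
```

If the printed tag differs from the CUDA value in env_setup.sh, that is the signal to update both the CUDA and TORCH variables as described above.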
Next, set up the environment with the following command.
source env_setup.sh
This command installs the TopoBench library and its dependencies.
Run Training Pipeline#
Next, train the neural networks by running the following command:
python -m topobench
Customizing Experiment Configuration#
Thanks to its hydra-based implementation, you can easily override the default experiment configuration from the command line. For instance, the model and dataset can be selected as:
python -m topobench model=cell/cwn dataset=graph/MUTAG
Remark: By default, our pipeline identifies the source and destination topological domains, and applies a default lifting between them if required.
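To make the remark concrete, here is a minimal sketch of how a default lifting could be chosen from the two domain prefixes. The naming scheme (`liftings/<source>2<target>_default`, `liftings/no_lifting`) is inferred from config file names such as graph2cell_default.yaml and no_lifting.yaml; the functions `domain_of` and `default_lifting` are hypothetical and do not reproduce TopoBench's actual selection logic.

```python
def domain_of(config_path: str) -> str:
    """Return the domain prefix of a 'domain/name' config path, e.g. 'cell' for 'cell/cwn'."""
    return config_path.split("/", 1)[0]


def default_lifting(model: str, dataset: str) -> str:
    """Pick a default lifting config name from the dataset (source) and model (target) domains."""
    source, target = domain_of(dataset), domain_of(model)
    if source == target:
        # Same domain on both sides: nothing to lift.
        return "liftings/no_lifting"
    return f"liftings/{source}2{target}_default"


print(default_lifting("cell/cwn", "graph/MUTAG"))
```

Under these assumptions, a cell model trained on a graph dataset resolves to a graph-to-cell default, while matching domains need no lifting at all.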
Transforms allow you to modify your data before processing. There are two main ways to configure transforms: individual transforms and transform groups.
Configuring Individual Transforms#
When configuring a single transform, follow these steps:
Choose a desired transform (e.g., a lifting transform).
Identify the relative path to the transform configuration.
The folder structure for transforms is as follows:
├── configs
│   ├── data_manipulations
│   ├── transforms
│   │   └── liftings
│   │       ├── graph2cell
│   │       ├── graph2hypergraph
│   │       └── graph2simplicial
To override the default transform, use the following command structure:
python -m topobench model=<model_type>/<model_name> dataset=<data_type>/<dataset_name> transforms=[<transform_path>/<transform_name>]
For example, to use the discrete_configuration_complex lifting with the cell/cwn model:
python -m topobench model=cell/cwn dataset=graph/MUTAG transforms=[liftings/graph2cell/discrete_configuration_complex]
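When launching many runs from a script, the command structure above can be assembled programmatically. The sketch below is one possible way to do so; `build_command` is a hypothetical helper, and joining several transforms with commas inside the brackets follows general Hydra list syntax rather than anything stated in this guide.

```python
import shlex


def build_command(model, dataset, transforms=None):
    """Assemble the argument list for `python -m topobench` with optional transform overrides."""
    args = ["python", "-m", "topobench", f"model={model}", f"dataset={dataset}"]
    if transforms:
        # Transforms are passed as a single bracketed, comma-separated override.
        args.append("transforms=[" + ",".join(transforms) + "]")
    return args


args = build_command(
    "cell/cwn",
    "graph/MUTAG",
    ["liftings/graph2cell/discrete_configuration_complex"],
)
# shlex.join quotes the bracketed argument so the shell does not treat it as a glob pattern.
print(shlex.join(args))
```

Returning a list rather than a string keeps the arguments usable with `subprocess.run` without any shell quoting concerns.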
Configuring Transform Groups#
For more complex scenarios, such as combining multiple data manipulations, use transform groups:
Create a new configuration file in the configs/transforms directory (e.g., custom_example.yaml).
Define the transform group in the YAML file:
defaults:
- data_manipulations@data_transform_1: identity
- data_manipulations@data_transform_2: node_degrees
- data_manipulations@data_transform_3: one_hot_node_degree_features
- liftings/graph2cell@graph2cell_lifting: cycle
Important: When composing multiple data manipulations, use the @ operator to assign unique names to each transform.
Run the experiment with the custom transform group:
python -m topobench model=cell/cwn dataset=graph/ZINC transforms=custom_example
This approach allows you to create complex transform pipelines, including multiple data manipulations and liftings, in a single configuration file.
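The role of the @ operator can be illustrated with a toy model of the defaults list shown earlier. This is a simplified sketch, not Hydra's implementation: it only splits each `group@name: option` entry and shows that, without distinct names, several entries from the same group would compete for one key. The helper `parse_default` is hypothetical.

```python
def parse_default(entry: dict):
    """Split a one-item defaults entry 'group@name: option' into (group, name, option)."""
    (key, option), = entry.items()
    group, _, name = key.partition("@")
    return group, name or group, option


defaults = [
    {"data_manipulations@data_transform_1": "identity"},
    {"data_manipulations@data_transform_2": "node_degrees"},
    {"data_manipulations@data_transform_3": "one_hot_node_degree_features"},
    {"liftings/graph2cell@graph2cell_lifting": "cycle"},
]

# Keyed by the unique name after '@': every transform keeps its own slot.
composed = {}
for entry in defaults:
    group, name, option = parse_default(entry)
    composed[name] = f"{group}/{option}"

# Keyed by group only: the three data manipulations collapse into a single slot.
collapsed = {}
for entry in defaults:
    group, _, option = parse_default(entry)
    collapsed[group] = option

print(len(composed), len(collapsed))
```

In this toy model the named version keeps all four transforms, whereas keying by group alone retains only the last data manipulation, which is why each entry needs its own name.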
Additional Notes#
Automatic Lifting: By default, our pipeline identifies the source and destination topological domains and applies a default lifting between them if required.
Fine-Grained Configuration: The same CLI override mechanism applies when modifying finer configurations within a CONFIG GROUP. Please refer to the official hydra documentation for further details.
By mastering these configuration options, you can easily customize your experiments to suit your specific needs, from simple model and dataset selections to complex data transformation pipelines.
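For intuition about finer-grained overrides, the sketch below applies a dotted `key.subkey=value` override to a nested dictionary. It is a simplified stand-in for what a configuration framework does, and the key `optimizer.lr` is a hypothetical example rather than a documented TopoBench option. Values are kept as strings here, whereas Hydra also parses types.

```python
def apply_override(config: dict, override: str) -> None:
    """Apply a 'dotted.key=value' override to a nested dict in place."""
    dotted_key, _, raw_value = override.partition("=")
    *parents, leaf = dotted_key.split(".")
    node = config
    for name in parents:
        # Walk down the hierarchy, creating intermediate levels when missing.
        node = node.setdefault(name, {})
    node[leaf] = raw_value


# Hypothetical configuration fragment used only for illustration.
config = {"optimizer": {"lr": "0.001"}}
apply_override(config, "optimizer.lr=0.01")
print(config)
```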
Experiments Reproducibility#
To reproduce Table 1 from the TopoBench: A Framework for Benchmarking Topological Deep Learning paper, please run the following command:
bash scripts/reproduce.sh
Remark: We have additionally provided a public W&B (Weights & Biases) project with logs for the corresponding runs (updated on June 11, 2024).
Tutorials#
Explore our tutorials for further details on how to add new datasets, transforms/liftings, and benchmark tasks.
Neural Networks#
We list the neural networks trained and evaluated by TopoBench, organized by the topological domain over which they operate: graph, simplicial complex, cellular complex, or hypergraph. Many of these neural networks were originally implemented in `TopoModelX <pyt-team/TopoModelX>`__.
Graphs#
Simplicial complexes#
Cellular complexes#
Model | Reference |
---|---|
CAN | |
CCCN | Inspired by A learning algorithm for computational connected cellular network, implementation adapted from Generalized Simplicial Attention Neural Networks |
CXN | |
CWN | |
Hypergraphs#
Combinatorial complexes#
Model | Reference |
---|---|
GCCN | TopoTune: A Framework for Generalized Combinatorial Complex Neural Networks |
Remark: TopoBench includes TopoTune, a comprehensive framework for easily designing new, general TDL models on any domain using any (graph) neural network as a backbone. Please check out the extended TopoTune wiki page for further details on how to leverage this framework to define and train customized topological neural network architectures.
Liftings & Transforms#
We list the liftings used in TopoBench to transform datasets. Here, a lifting refers to a function that transforms a dataset defined on a topological domain (e.g., on a graph) into the same dataset but supported on a different topological domain (e.g., on a simplicial complex).
Structural Liftings#
The structural lifting is responsible for the transformation of the underlying relationships or elements of the data. For instance, it might determine how nodes and edges in a graph are mapped into triangles and tetrahedra in a simplicial complex. This structural transformation can be further categorized into connectivity-based, where the mapping relies solely on the existing connections within the data, and feature-based, where the data's inherent properties or features guide the new structure.
We enumerate below the structural liftings currently implemented in TopoBench; please check out the provided description links for further details.
Remark: Most of these liftings are adaptations of winning submissions to the ICML TDL Challenge 2024 (paper | repo); see the Structural Liftings wiki for a complete list of compatible liftings.
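As an illustration of a connectivity-based structural lifting, the sketch below lifts a graph to a simplicial complex by turning every clique into a simplex. It is a brute-force teaching example under the assumption that nodes are given only through the edge list; the function `clique_lifting` is hypothetical and is not TopoBench's implementation.

```python
from itertools import combinations


def clique_lifting(edges, max_dim=2):
    """Return {dim: list of simplices}, where every (dim + 1)-clique of the graph is a simplex."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    nodes = sorted(adjacency)
    simplices = {0: [(n,) for n in nodes]}
    for dim in range(1, max_dim + 1):
        # Keep a candidate only if all of its node pairs are connected (brute force).
        simplices[dim] = [
            candidate
            for candidate in combinations(nodes, dim + 1)
            if all(b in adjacency[a] for a, b in combinations(candidate, 2))
        ]
    return simplices


# A triangle (0, 1, 2) with a pendant edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
complex_ = clique_lifting(edges)
print(complex_[2])
```

Only the existing connections decide which triangles appear, which is what makes this lifting connectivity-based; a feature-based lifting would instead consult node attributes.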
Graph to Simplicial Complex#
Name | Type | Description |
---|---|---|
DnD Lifting | Feature-based | |
Random Latent Clique Lifting | Connectivity-based | |
Line Lifting | Connectivity-based | |
Neighbourhood Complex Lifting | Connectivity-based | |
Graph Induced Lifting | Connectivity-based | |
Eccentricity Lifting | Connectivity-based | |
Feature-Based Rips Complex | Both connectivity and feature-based | |
Clique Lifting | Connectivity-based | |
K-hop Lifting | Connectivity-based | |
Graph to Cell Complex#
Graph to Hypergraph#
Name | Type | Description |
---|---|---|
Expander Hypergraph Lifting | Connectivity-based | |
Kernel Lifting | Both connectivity and feature-based | |
Mapper Lifting | Connectivity-based | |
Forman-Ricci Curvature Coarse Geometry Lifting | Connectivity-based | |
KNN Lifting | Feature-based | |
K-hop Lifting | Connectivity-based | |
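To give a feel for a graph-to-hypergraph lifting, the sketch below builds one hyperedge per node from its k-hop neighbourhood. That reading of a k-hop lifting is an assumption made for illustration, and `k_hop_hyperedges` is a hypothetical function, not the code shipped with TopoBench.

```python
from collections import deque


def k_hop_hyperedges(edges, k=1):
    """Map each node to a hyperedge containing every node within k hops (itself included)."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    hyperedges = {}
    for source in sorted(adjacency):
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            if distance[node] == k:
                continue  # do not expand beyond k hops
            for neighbour in adjacency[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
        hyperedges[source] = frozenset(distance)
    return hyperedges


# Path graph 0 - 1 - 2 - 3.
path = [(0, 1), (1, 2), (2, 3)]
one_hop = k_hop_hyperedges(path, k=1)
two_hop = k_hop_hyperedges(path, k=2)
print(sorted(one_hop[1]), sorted(two_hop[0]))
```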
Pointcloud to Simplicial#
Name | Type | Description |
---|---|---|
Delaunay Lifting | Feature-based | |
Random Flag Complex | Feature-based | |
Pointcloud to Hypergraph#
Simplicial to Combinatorial#
Name | Type | Description |
---|---|---|
Coface Lifting | Connectivity-based | |
Hypergraph to Combinatorial#
Name | Type | Description |
---|---|---|
Universal Strict Lifting | Connectivity-based | |
Feature Liftings#
Feature liftings address the transfer of data attributes or features during mapping, ensuring that the properties associated with the data elements are consistently preserved in the new representation.
Name | Description | Supported Domains |
---|---|---|
ProjectionSum | Projects r-cell features of a graph to r+1-cell structures utilizing incidence matrices (B_{r}). | All |
ConcatenationLifting | Concatenate r-cell features to obtain r+1-cell features. | Simplicial |
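The projection idea behind ProjectionSum can be sketched with plain lists: each higher-order cell receives the sum of the features of the lower-order cells on its boundary, as read off an incidence matrix. The use of unsigned entries and the (lower cells x upper cells) layout are assumptions for this example, and `projection_sum` is a hypothetical function rather than the library's API.

```python
def projection_sum(incidence, features):
    """Sum, for every upper cell (column), the features of the lower cells (rows) incident to it."""
    n_lower, n_upper = len(incidence), len(incidence[0])
    n_channels = len(features[0])
    lifted = [[0.0] * n_channels for _ in range(n_upper)]
    for i in range(n_lower):
        for j in range(n_upper):
            if incidence[i][j] != 0:  # the sign is ignored: unsigned incidence
                for c in range(n_channels):
                    lifted[j][c] += features[i][c]
    return lifted


# Path graph 0 - 1 - 2 with edges e0 = (0, 1) and e1 = (1, 2).
B1 = [[-1, 0], [1, -1], [0, 1]]                # node-to-edge incidence matrix
x0 = [[1.0, 0.0], [2.0, 1.0], [4.0, 3.0]]      # two feature channels per node
x1 = projection_sum(B1, x0)
print(x1)
```

Here each edge ends up with the sum of its two endpoint features, so node signals become edge signals without any learned parameters.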
Data Transformations#
Especially useful in pre-processing steps, these are the general data manipulations currently implemented in TopoBench:
Transform | Description |
---|---|
OneHotDegreeFeatures | Adds the node degree as a one-hot encoding to the node features. |
NodeFeaturesToFloat | Converts the node features of the input graph to float. |
NodeDegrees | Calculates the node degrees of the input graph. |
KeepSelectedDataFields | Keeps only the selected fields of the input data. |
KeepOnlyConnectedComponent | Keeps only the largest connected component of the input graph. |
InfereRadiusConnectivity | Generates the radius connectivity of the input point cloud. |
InfereKNNConnectivity | Generates the k-nearest neighbor connectivity of the input point cloud. |
IdentityTransform | An identity transform that does nothing to the input data. |
EqualGausFeatures | Generates equal Gaussian features for all nodes. |
CalculateSimplicialCurvature | Calculates the simplicial curvature of the input graph. |
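As a small worked example of two of these manipulations, the sketch below computes node degrees from an edge list and then one-hot encodes them, in the spirit of NodeDegrees and OneHotDegreeFeatures. The functions `node_degrees` and `one_hot` are hypothetical stand-ins written with plain lists; the library's transforms operate on its own data objects.

```python
def node_degrees(edges, num_nodes):
    """Count how many undirected edges touch each node."""
    degrees = [0] * num_nodes
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    return degrees


def one_hot(values, num_classes):
    """Encode each integer value as a 0/1 vector of length num_classes."""
    return [[1 if value == k else 0 for k in range(num_classes)] for value in values]


# Star-like graph: node 1 is connected to nodes 0, 2 and 3.
edges = [(0, 1), (1, 2), (1, 3)]
degrees = node_degrees(edges, num_nodes=4)
encoded = one_hot(degrees, num_classes=max(degrees) + 1)
print(degrees, encoded[1])
```

Chaining the two steps mirrors the transform group shown earlier, where node_degrees is followed by one_hot_node_degree_features.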
Datasets#
Graph#
Dataset | Task | Description | Reference |
---|---|---|---|
Cora | Classification | Cocitation dataset. | |
Citeseer | Classification | Cocitation dataset. | |
Pubmed | Classification | Cocitation dataset. | |
MUTAG | Classification | Graph-level classification. | `Source <https://pubs.acs.org/doi/abs/10.1021/jm00106a046>`__ |
PROTEINS | Classification | Graph-level classification. | mic.oup.com/bioinformatics/article/21/suppl_1/i47/202991>`__ |
NCI1 | Classification | Graph-level classification. | |
NCI109 | Classification | Graph-level classification. | |
IMDB-BIN | Classification | Graph-level classification. | |
IMDB-MUL | Classification | Graph-level classification. | |
 | Classification | Graph-level classification. | |
Amazon | Classification | Heterophilic dataset. | |
Minesweeper | Classification | Heterophilic dataset. | |
Empire | Classification | Heterophilic dataset. | |
Tolokers | Classification | Heterophilic dataset. | |
US-county-demos | Regression | In turn each node attribute is used as the target label. | |
ZINC | Regression | Graph-level regression. | |
Simplicial#
Dataset | Task | Description | Reference |
---|---|---|---|
Mantra | Classification, Multi-label Classification | Predict topological attributes of manifold triangulations | |
Hypergraph#
Dataset | Task | Description | Reference |
---|---|---|---|
Cora-Cocitation | Classification | Cocitation dataset. | |
Citeseer-Cocitation | Classification | Cocitation dataset. | |
PubMed-Cocitation | Classification | Cocitation dataset. | |
Cora-Coauthorship | Classification | Coauthorship dataset. | |
DBLP-Coauthorship | Classification | Coauthorship dataset. | |
References#
To learn more about TopoBench, we invite you to read the paper:
@article{telyatnikov2024topobench,
title={TopoBench: A Framework for Benchmarking Topological Deep Learning},
author={Lev Telyatnikov and Guillermo Bernardez and Marco Montagna and Pavlo Vasylenko and Ghada Zamzmi and Mustafa Hajij and Michael T Schaub and Nina Miolane and Simone Scardapane and Theodore Papamarkou},
year={2024},
eprint={2406.06642},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2406.06642},
}
If you find TopoBench useful, we would appreciate it if you cited us!
Additional Details#
Hierarchy of configuration files
├── configs                                    <- Hydra configs
│   ├── callbacks                              <- Callbacks configs
│   ├── dataset                                <- Dataset configs
│   │   ├── graph                              <- Graph dataset configs
│   │   ├── hypergraph                         <- Hypergraph dataset configs
│   │   └── simplicial                         <- Simplicial dataset configs
│   ├── debug                                  <- Debugging configs
│   ├── evaluator                              <- Evaluator configs
│   ├── experiment                             <- Experiment configs
│   ├── extras                                 <- Extra utilities configs
│   ├── hparams_search                         <- Hyperparameter search configs
│   ├── hydra                                  <- Hydra configs
│   ├── local                                  <- Local configs
│   ├── logger                                 <- Logger configs
│   ├── loss                                   <- Loss function configs
│   ├── model                                  <- Model configs
│   │   ├── cell                               <- Cell model configs
│   │   ├── graph                              <- Graph model configs
│   │   ├── hypergraph                         <- Hypergraph model configs
│   │   └── simplicial                         <- Simplicial model configs
│   ├── optimizer                              <- Optimizer configs
│   ├── paths                                  <- Project paths configs
│   ├── scheduler                              <- Scheduler configs
│   ├── trainer                                <- Trainer configs
│   ├── transforms                             <- Data transformation configs
│   │   ├── data_manipulations                 <- Data manipulation transforms
│   │   ├── dataset_defaults                   <- Default dataset transforms
│   │   ├── feature_liftings                   <- Feature lifting transforms
│   │   ├── liftings                           <- Lifting transforms
│   │   │   ├── graph2cell                     <- Graph to cell lifting transforms
│   │   │   ├── graph2hypergraph               <- Graph to hypergraph lifting transforms
│   │   │   ├── graph2simplicial               <- Graph to simplicial lifting transforms
│   │   │   ├── graph2cell_default.yaml        <- Default graph to cell lifting config
│   │   │   ├── graph2hypergraph_default.yaml  <- Default graph to hypergraph lifting config
│   │   │   ├── graph2simplicial_default.yaml  <- Default graph to simplicial lifting config
│   │   │   └── no_lifting.yaml                <- No lifting config
│   │   ├── custom_example.yaml                <- Custom example transform config
│   │   └── no_transform.yaml                  <- No transform config
│   ├── wandb_sweep                            <- Weights & Biases sweep configs
│   │
│   ├── __init__.py                            <- Init file for configs module
│   └── run.yaml                               <- Main config for training
More information regarding Topological Deep Learning
Topological Graph Signal Compression
Architectures of Topological Deep Learning: A Survey on Topological Neural Networks
TopoX: a suite of Python packages for machine learning on topological domains