🌐 TopoBench (TB) 🍩#



📌 Overview#


TopoBench (TB) is a modular Python library designed to standardize benchmarking and accelerate research in Topological Deep Learning (TDL). In particular, TB allows training and comparing the performance of a wide range of Topological Neural Networks (TNNs) across different topological domains, where a topological domain refers to a graph, a simplicial complex, a cellular complex, or a hypergraph. For detailed information, please refer to the `TopoBench: A Framework for Benchmarking Topological Deep Learning <https://arxiv.org/pdf/2406.06642>`__ paper.

The main pipeline trains and evaluates a wide range of state-of-the-art TNNs and Graph Neural Networks (GNNs) (see ⚙️ Neural Networks) on numerous and varied datasets and benchmark tasks (see 📚 Datasets). Additionally, the library offers the ability to transform, i.e. lift, each dataset from one topological domain to another (see 🚀 Liftings & Transforms), enabling for the first time an exhaustive inter-domain comparison of TNNs.

🧩 Get Started#


Create Environment#

First, clone and navigate to the TopoBench repository:

git clone git@github.com:geometric-intelligence/topobench.git
cd topobench

Ensure conda is installed:

conda --version || echo "Conda not found! Please install it from https://docs.anaconda.com/free/miniconda/miniconda-install/"

Next, set up and activate a conda environment tb with Python 3.11.3:

conda create -n tb python=3.11.3
conda activate tb

Next, check the CUDA version of your machine:

which nvcc && nvcc --version

and ensure that it matches the CUDA version specified in the env_setup.sh file (CUDA=cu121 by default). If it does not match, update env_setup.sh accordingly by changing both the CUDA and TORCH environment variables to compatible values as specified on this website.
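For reference, the relevant variables in env_setup.sh look roughly like the lines below; the values shown are only illustrative and must be set to match your local CUDA installation and a compatible PyTorch release:

# Illustrative excerpt of env_setup.sh -- the exact variable format may differ in your copy.
CUDA=cu121    # CUDA tag matching `nvcc --version` (cu121 is the default)
TORCH=2.3.0   # PyTorch version compatible with that CUDA tag (value shown is illustrative)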

Next, set up the environment with the following command.

source env_setup.sh

This command installs the TopoBench library and its dependencies.
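As a quick sanity check, you can verify that the installed PyTorch build matches the CUDA version you configured (this only assumes a standard PyTorch installation):

python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"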

Run Training Pipeline#

Next, train the neural networks by running the following command:

python -m topobench

Customizing Experiment Configuration#

Thanks to the Hydra implementation, one can easily override the default experiment configuration through the command line. For instance, the model and dataset can be selected as follows:

python -m topobench model=cell/cwn dataset=graph/MUTAG

Remark: By default, our pipeline identifies the source and destination topological domains, and applies a default lifting between them if required.
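Since these overrides are plain Hydra syntax, Hydra's multirun mode can also be used to sweep over several values in a single call; a minimal sketch reusing configuration names that appear elsewhere in this guide:

# --multirun is standard Hydra; the comma-separated datasets are run one after another.
python -m topobench --multirun model=cell/cwn dataset=graph/MUTAG,graph/ZINC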

Transforms allow you to modify your data before processing. There are two main ways to configure transforms: individual transforms and transform groups.

Configuring Individual Transforms#

When configuring a single transform, follow these steps:

  1. Choose a desired transform (e.g., a lifting transform).

  2. Identify the relative path to the transform configuration.

The folder structure for transforms is as follows:

├── configs
│   └── transforms
│       ├── data_manipulations
│       └── liftings
│           ├── graph2cell
│           ├── graph2hypergraph
│           └── graph2simplicial

To override the default transform, use the following command structure:

python -m topobench model=<model_type>/<model_name> dataset=<data_type>/<dataset_name> transforms=[<transform_path>/<transform_name>]

For example, to use the discrete_configuration_complex lifting with the cell/cwn model:

python -m topobench model=cell/cwn dataset=graph/MUTAG transforms=[liftings/graph2cell/discrete_configuration_complex]

Configuring Transform Groups#

For more complex scenarios, such as combining multiple data manipulations, use transform groups:

  1. Create a new configuration file in the configs/transforms directory (e.g., custom_example.yaml).

  2. Define the transform group in the YAML file:

defaults:
- data_manipulations@data_transform_1: identity
- data_manipulations@data_transform_2: node_degrees
- data_manipulations@data_transform_3: one_hot_node_degree_features
- liftings/graph2cell@graph2cell_lifting: cycle

Important: When composing multiple data manipulations, use the @ operator to assign unique names to each transform.

  3. Run the experiment with the custom transform group:

python -m topobench model=cell/cwn dataset=graph/ZINC transforms=custom_example

This approach allows you to create complex transform pipelines, including multiple data manipulations and liftings, in a single configuration file.

Additional Notes#

  • Automatic Lifting: By default, our pipeline identifies the source and destination topological domains and applies a default lifting between them if required.

  • Fine-Grained Configuration: The same CLI override mechanism applies when modifying finer configurations within a CONFIG GROUP. Please refer to the official hydra documentation for further details.

By mastering these configuration options, you can easily customize your experiments to suit your specific needs, from simple model and dataset selections to complex data transformation pipelines.

Experiments Reproducibility#

To reproduce Table 1 from the TopoBench: A Framework for Benchmarking Topological Deep Learning paper, please run the following command:

bash scripts/reproduce.sh

Remark: We have additionally provided a public W&B (Weights & Biases) project with logs for the corresponding runs (updated on June 11, 2024).

Tutorials#

Explore our tutorials for further details on how to add new datasets, transforms/liftings, and benchmark tasks.

Neural Networks#

We list the neural networks trained and evaluated by TopoBench, organized by the topological domain over which they operate: graph, simplicial complex, cellular complex, hypergraph, or combinatorial complex. Many of these neural networks were originally implemented in `TopoModelX <https://github.com/pyt-team/TopoModelX>`__.

Graphs#

Simplicial complexes#

Cellular complexes#

Hypergraphs#

Combinatorial complexes#

Remark: TopoBench includes TopoTune, a comprehensive framework for easily designing new, general TDL models on any domain using any (graph) neural network as a backbone. Please check out the extended TopoTune wiki page for further details on how to leverage this framework to define and train customized topological neural network architectures.

Liftings & Transforms#

We list the liftings used in TopoBench to transform datasets. Here, a lifting refers to a function that transforms a dataset defined on a topological domain (e.g., on a graph) into the same dataset but supported on a different topological domain (e.g., on a simplicial complex).

Structural Liftings#

The structural lifting is responsible for the transformation of the underlying relationships or elements of the data. For instance, it might determine how nodes and edges in a graph are mapped into triangles and tetrahedra in a simplicial complex. This structural transformation can be further categorized into connectivity-based, where the mapping relies solely on the existing connections within the data, and feature-based, where the data’s inherent properties or features guide the new structure.

We enumerate below the structural liftings currently implemented in TopoBench; please check out the provided description links for further details.

Remark: Most of these liftings are adaptations of winning submissions to the ICML TDL Challenge 2024 (paper | repo); see the Structural Liftings wiki for a complete list of compatible liftings.
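Any of the liftings below can be applied through the transforms override introduced earlier. As a sketch, lifting a graph dataset to the simplicial domain might look as follows; the model name (simplicial/scn) and the lifting config path are assumptions to be checked against configs/model/simplicial and configs/transforms/liftings/graph2simplicial:

# Model and lifting names are illustrative -- list the config folders to see what is available.
python -m topobench model=simplicial/scn dataset=graph/MUTAG transforms=[liftings/graph2simplicial/clique]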

Graph to Simplicial Complex#

| Name | Type | Description |
| ---- | ---- | ----------- |
| DnD Lifting | Feature-based | Wiki page |
| Random Latent Clique Lifting | Connectivity-based | Wiki page |
| Line Lifting | Connectivity-based | Wiki page |
| Neighbourhood Complex Lifting | Connectivity-based | Wiki page |
| Graph Induced Lifting | Connectivity-based | Wiki page |
| Eccentricity Lifting | Connectivity-based | Wiki page |
| Feature-Based Rips Complex | Both connectivity and feature-based | Wiki page |
| Clique Lifting | Connectivity-based | Wiki page |
| K-hop Lifting | Connectivity-based | Wiki page |

Graph to Cell Complex#

| Name | Type | Description |
| ---- | ---- | ----------- |
| Discrete Configuration Complex | Connectivity-based | Wiki page |
| Cycle Lifting | Connectivity-based | Wiki page |

Graph to Hypergraph#

| Name | Type | Description |
| ---- | ---- | ----------- |
| Expander Hypergraph Lifting | Connectivity-based | Wiki page |
| Kernel Lifting | Both connectivity and feature-based | Wiki page |
| Mapper Lifting | Connectivity-based | Wiki page |
| Forman-Ricci Curvature Coarse Geometry Lifting | Connectivity-based | Wiki page |
| KNN Lifting | Feature-based | Wiki page |
| K-hop Lifting | Connectivity-based | Wiki page |

Pointcloud to Simplicial#

| Name | Type | Description |
| ---- | ---- | ----------- |
| Delaunay Lifting | Feature-based | Wiki page |
| Random Flag Complex | Feature-based | Wiki page |

Pointcloud to Hypergraph#

| Name | Type | Description |
| ---- | ---- | ----------- |
| Mixture of Gaussians MST Lifting | Feature-based | Wiki page |
| PointNet Lifting | Feature-based | Wiki page |
| Voronoi Lifting | Feature-based | Wiki page |

Simplicial to Combinatorial#

| Name | Type | Description |
| ---- | ---- | ----------- |
| Coface Lifting | Connectivity-based | Wiki page |

Hypergraph to Combinatorial#

| Name | Type | Description |
| ---- | ---- | ----------- |
| Universal Strict Lifting | Connectivity-based | Wiki page |

Feature Liftings#

Feature liftings address the transfer of data attributes or features during mapping, ensuring that the properties associated with the data elements are consistently preserved in the new representation.

| Name | Description | Supported Domains |
| ---- | ----------- | ----------------- |
| ProjectionSum | Projects r-cell features of a graph to r+1-cell structures utilizing incidence matrices (B_{r}). | All |
| ConcatenationLifting | Concatenates r-cell features to obtain r+1-cell features. | Simplicial |
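As a rough illustration of ProjectionSum, assuming the convention that the incidence matrix B_{r+1} has one row per r-cell and one column per (r+1)-cell, and ignoring orientation signs, the lifted features can be thought of as

x_{r+1} = B_{r+1}^T x_r

i.e. each (r+1)-cell sums the features of the r-cells on its boundary; the library's exact sign and normalization conventions may differ.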

Data Transformations#

Especially useful in pre-processing steps, these are the general data manipulations currently implemented in TopoBench:

| Transform | Description |
| --------- | ----------- |
| OneHotDegreeFeatures | Adds the node degree as one-hot encodings to the node features. |
| NodeFeaturesToFloat | Converts the node features of the input graph to float. |
| NodeDegrees | Calculates the node degrees of the input graph. |
| KeepSelectedDataFields | Keeps only the selected fields of the input data. |
| KeepOnlyConnectedComponent | Keeps only the largest connected component of the input graph. |
| InfereRadiusConnectivity | Generates the radius connectivity of the input point cloud. |
| InfereKNNConnectivity | Generates the k-nearest neighbor connectivity of the input point cloud. |
| IdentityTransform | An identity transform that does nothing to the input data. |
| EqualGausFeatures | Generates equal Gaussian features for all nodes. |
| CalculateSimplicialCurvature | Calculates the simplicial curvature of the input graph. |

📚 Datasets#


Graph#

| Dataset | Task | Description | Reference |
| ------- | ---- | ----------- | --------- |
| Cora | Classification | Cocitation dataset. | Source |
| Citeseer | Classification | Cocitation dataset. | Source |
| Pubmed | Classification | Cocitation dataset. | Source |
| MUTAG | Classification | Graph-level classification. | `Source <https://pubs.acs.org/doi/abs/10.1021/jm00106a046>`__ |
| PROTEINS | Classification | Graph-level classification. | `Source <https://academic.oup.com/bioinformatics/article/21/suppl_1/i47/202991>`__ |
| NCI1 | Classification | Graph-level classification. | Source |
| NCI109 | Classification | Graph-level classification. | Source |
| IMDB-BIN | Classification | Graph-level classification. | Source |
| IMDB-MUL | Classification | Graph-level classification. | Source |
| REDDIT | Classification | Graph-level classification. | `Source <https://proceedings.neurips.cc/paper_files/paper/2017/file/5dd9db5e033da9c6fb5ba83c7a7ebea9-Paper.pdf>`__ |
| Amazon | Classification | Heterophilic dataset. | Source |
| Minesweeper | Classification | Heterophilic dataset. | Source |
| Empire | Classification | Heterophilic dataset. | Source |
| Tolokers | Classification | Heterophilic dataset. | Source |
| US-county-demos | Regression | In turn each node attribute is used as the target label. | Source |
| ZINC | Regression | Graph-level regression. | Source |

Simplicial#

| Dataset | Task | Description | Reference |
| ------- | ---- | ----------- | --------- |
| Mantra | Classification, Multi-label Classification | Predict topological attributes of manifold triangulations. | Source |

Hypergraph#

| Dataset | Task | Description | Reference |
| ------- | ---- | ----------- | --------- |
| Cora-Cocitation | Classification | Cocitation dataset. | Source |
| Citeseer-Cocitation | Classification | Cocitation dataset. | Source |
| PubMed-Cocitation | Classification | Cocitation dataset. | Source |
| Cora-Coauthorship | Classification | Coauthorship dataset. | Source |
| DBLP-Coauthorship | Classification | Coauthorship dataset. | Source |

References#

To learn more about TopoBench, we invite you to read the paper:

@article{telyatnikov2024topobench,
      title={TopoBench: A Framework for Benchmarking Topological Deep Learning},
      author={Lev Telyatnikov and Guillermo Bernardez and Marco Montagna and Pavlo Vasylenko and Ghada Zamzmi and Mustafa Hajij and Michael T Schaub and Nina Miolane and Simone Scardapane and Theodore Papamarkou},
      year={2024},
      eprint={2406.06642},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2406.06642},
}

If you find TopoBench useful, we would appreciate it if you cite us!

Additional Details#

Hierarchy of configuration files

├── configs                   <- Hydra configs
│   ├── callbacks                <- Callbacks configs
│   ├── dataset                  <- Dataset configs
│   │   ├── graph                    <- Graph dataset configs
│   │   ├── hypergraph               <- Hypergraph dataset configs
│   │   └── simplicial               <- Simplicial dataset configs
│   ├── debug                    <- Debugging configs
│   ├── evaluator                <- Evaluator configs
│   ├── experiment               <- Experiment configs
│   ├── extras                   <- Extra utilities configs
│   ├── hparams_search           <- Hyperparameter search configs
│   ├── hydra                    <- Hydra configs
│   ├── local                    <- Local configs
│   ├── logger                   <- Logger configs
│   ├── loss                     <- Loss function configs
│   ├── model                    <- Model configs
│   │   ├── cell                     <- Cell model configs
│   │   ├── graph                    <- Graph model configs
│   │   ├── hypergraph               <- Hypergraph model configs
│   │   └── simplicial               <- Simplicial model configs
│   ├── optimizer                <- Optimizer configs
│   ├── paths                    <- Project paths configs
│   ├── scheduler                <- Scheduler configs
│   ├── trainer                  <- Trainer configs
│   ├── transforms               <- Data transformation configs
│   │   ├── data_manipulations       <- Data manipulation transforms
│   │   ├── dataset_defaults         <- Default dataset transforms
│   │   ├── feature_liftings         <- Feature lifting transforms
│   │   └── liftings                 <- Lifting transforms
│   │       ├── graph2cell               <- Graph to cell lifting transforms
│   │       ├── graph2hypergraph         <- Graph to hypergraph lifting transforms
│   │       ├── graph2simplicial         <- Graph to simplicial lifting transforms
│   │       ├── graph2cell_default.yaml  <- Default graph to cell lifting config
│   │       ├── graph2hypergraph_default.yaml <- Default graph to hypergraph lifting config
│   │       ├── graph2simplicial_default.yaml <- Default graph to simplicial lifting config
│   │       ├── no_lifting.yaml           <- No lifting config
│   │       ├── custom_example.yaml       <- Custom example transform config
│   │       └── no_transform.yaml         <- No transform config
│   ├── wandb_sweep              <- Weights & Biases sweep configs
│   │
│   ├── __init__.py              <- Init file for configs module
│   └── run.yaml                 <- Main config for training

More information regarding Topological Deep Learning:

  • Topological Graph Signal Compression
  • Architectures of Topological Deep Learning: A Survey on Topological Neural Networks
  • TopoX: a suite of Python packages for machine learning on topological domains


πŸ“’ Get in Touch!#

We are always open to collaborations and discussions on TDL research.
Feel free to reach out via email if you want to collaborate, do your thesis with our team, or open a discussion for various opportunities.
πŸ“§ Contact Email: topological.intelligence@gmail.com
▢️ YouTube Channel: Topological Intelligence