topobench.transforms.data_manipulations.group_homophily module

A transform that calculates the group combinatorial homophily of the input hypergraph.

class GroupCombinatorialHomophily(**kwargs)

Bases: BaseTransform

Calculates group combinatorial homophily of the input hypergraph.

This transformation implements the methodology from the paper: “Combinatorial Characterizations and Impossibilities for Higher-order Homophily”. It computes homophily metrics for hypergraphs by analyzing the relationship between node labels within hyperedges.

Parameters:
**kwargs : dict, optional

Additional parameters for the transform.

  • top_k : int, default=3
    Number of top hyperedge cardinalities to analyze.

Attributes:
type : str

Identifier for the transform type.

top_k : int

Number of top hyperedge cardinalities to analyze.

__init__(**kwargs)
calculate_D_matrix(H, labels, he_cardinalities, unique_labels, class_node_idxs)

Calculate the degree matrices D and D_t for the hypergraph.

Parameters:
H : torch.Tensor

Dense incidence matrix of the hypergraph.

labels : torch.Tensor

Node labels.

he_cardinalities : torch.Tensor

Cardinality of each hyperedge.

unique_labels : dict

Dictionary mapping labels to their counts.

class_node_idxs : dict

Dictionary mapping labels to node indices.

Returns:
tuple[torch.Tensor, torch.Tensor]
  • D_t_class : Type-t degree distribution matrix for each class

  • D : Degree matrix counting same-label nodes in hyperedges
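The type-t degree bookkeeping can be illustrated with a small NumPy sketch. It is not the TopoBench implementation: the helper name `type_t_degrees`, the use of NumPy instead of torch, the restriction to a single hyperedge size `k`, and the `(n_nodes, k)` output layout are all assumptions made for illustration. Here the type-t degree of a node is taken to be the number of size-`k` hyperedges containing it in which exactly `t` nodes share its label.

```python
import numpy as np


def type_t_degrees(H, labels, k):
    """Hypothetical helper: count type-t degrees for hyperedges of size k.

    H is a dense (n_nodes, n_hyperedges) 0/1 incidence matrix and labels
    holds one class label per node. Entry [v, t - 1] of the result counts
    the size-k hyperedges containing node v in which exactly t nodes
    carry the same label as v.
    """
    H = np.asarray(H, dtype=int)
    labels = np.asarray(labels)

    # Keep only the hyperedges with exactly k nodes.
    sizes = H.sum(axis=0)
    H_k = H[:, sizes == k]

    D_t = np.zeros((H.shape[0], k), dtype=int)
    for c in np.unique(labels):
        in_class = labels == c
        # Number of class-c nodes inside each size-k hyperedge.
        same = H_k[in_class].sum(axis=0)
        for t in range(1, k + 1):
            # For each class-c node, count its hyperedges with exactly t class-c members.
            D_t[in_class, t - 1] = H_k[in_class][:, same == t].sum(axis=1)
    return D_t


# Toy hypergraph: 4 nodes, hyperedges {0, 1, 2}, {1, 2, 3} and {0, 1}.
H = np.array([
    [1, 0, 1],
    [1, 1, 1],
    [1, 1, 0],
    [0, 1, 0],
])
labels = np.array([0, 0, 1, 1])
D_t = type_t_degrees(H, labels, k=3)
print(D_t)
```

In this toy example only the two hyperedges of size 3 contribute; the hyperedge {0, 1} is ignored because its cardinality differs from `k`.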

calculate_affinity_score(n_nodes, X_mod, t, k)

Calculate affinity score.

Parameters:
n_nodes : int

Total number of nodes.

X_mod : int

Total number of nodes in a class.

t : int

Type-t degree.

k : int

Max hyperedge cardinality.

Returns:
torch.Tensor

The affinity matrix.
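For orientation, the cited paper compares observed affinities against a combinatorial baseline. A minimal sketch of such a baseline follows, assuming the form C(|X| - 1, t - 1) · C(n - |X|, k - t) / C(n - 1, k - 1), where n is the number of nodes and |X| the class size. Whether `calculate_affinity_score` evaluates exactly this expression is an assumption based on its signature; the helper name `baseline_score` and the use of `math.comb` rather than `scipy.special.comb` are illustrative choices.

```python
from math import comb


def baseline_score(n_nodes, class_size, t, k):
    """Hypothetical helper: chance-level share of type-t memberships.

    Probability that a node of the class, joined by k - 1 other nodes
    drawn uniformly at random, ends up in a hyperedge with exactly t
    members of its own class.
    """
    favourable = comb(class_size - 1, t - 1) * comb(n_nodes - class_size, k - t)
    total = comb(n_nodes - 1, k - 1)
    return favourable / total


# 10 nodes, a class of 4 nodes, hyperedges of size 3.
scores = [baseline_score(10, 4, t, 3) for t in (1, 2, 3)]
print(scores)
```

Under this assumed form the scores for t = 1, ..., k sum to 1, so they can be read as a probability distribution over the possible type-t outcomes for a node of the class.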

calculate_baseline_matrix(he_cardinalities, unique_labels, class_node_idxs, count_labels, n_nodes)

Calculate the baseline affinity matrix for comparison.

Parameters:
he_cardinalities : torch.Tensor

Cardinality of each hyperedge.

unique_labels : dict

Dictionary mapping labels to their counts.

class_node_idxs : dict

Dictionary mapping labels to node indices.

count_labels : torch.Tensor

Count of nodes for each label.

n_nodes : int

Total number of nodes in the hypergraph.

Returns:
torch.Tensor

Baseline matrix containing expected affinity scores for each class and degree type.

forward(data)

Apply the transform to the input data.

Parameters:
data : torch_geometric.data.Data

The input data.

Returns:
torch_geometric.data.Data

The transformed data.

comb(N, k, *, exact=False, repetition=False)

The number of combinations of N things taken k at a time.

This is often expressed as “N choose k”.

Parameters:
N : int, ndarray

Number of things.

k : int, ndarray

Number of elements taken.

exact : bool, optional

For integers, if exact is False, then floating point precision is used, otherwise the result is computed exactly.

repetition : bool, optional

If repetition is True, then the number of combinations with repetition is computed.

Returns:
val : int, float, ndarray

The total number of combinations.

See also

binom

Binomial coefficient considered as a function of two real variables.

Notes

  • Array arguments accepted only for exact=False case.

  • If N < 0, or k < 0, then 0 is returned.

  • If k > N and repetition=False, then 0 is returned.

Examples

>>> import numpy as np
>>> from scipy.special import comb
>>> k = np.array([3, 4])
>>> n = np.array([10, 10])
>>> comb(n, k, exact=False)
array([ 120.,  210.])
>>> comb(10, 3, exact=True)
120
>>> comb(10, 3, exact=True, repetition=True)
220