topobench.transforms.data_manipulations.group_homophily module

A transform that calculates the group combinatorial homophily of the input hypergraph.

class GroupCombinatorialHomophily(**kwargs)

Bases: BaseTransform

Calculates group combinatorial homophily of the input hypergraph.

This transformation implements the methodology from the paper: “Combinatorial Characterizations and Impossibilities for Higher-order Homophily”. It computes homophily metrics for hypergraphs by analyzing the relationship between node labels within hyperedges.

Parameters:

**kwargs (dict, optional): Additional parameters for the transform.

top_k (int, default=3): Number of top hyperedge cardinalities to analyze.

Attributes:

type (str): Identifier for the transform type.
top_k (int): Number of top hyperedge cardinalities to analyze.

__init__(**kwargs)

calculate_D_matrix(H, labels, he_cardinalities, unique_labels, class_node_idxs)

Calculate the degree matrices D and D_t for the hypergraph.

Parameters:

H (torch.Tensor): Dense incidence matrix of the hypergraph.
labels (torch.Tensor): Node labels.
he_cardinalities (torch.Tensor): Cardinality of each hyperedge.
unique_labels (dict): Dictionary mapping labels to their counts.
class_node_idxs (dict): Dictionary mapping labels to node indices.

Returns:

tuple[torch.Tensor, torch.Tensor]

D_t_class : Type-t degree distribution matrix for each class
D : Degree matrix counting same-label nodes in hyperedges
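
The counting behind a type-t degree can be sketched in plain Python. This is an illustration only, assuming the definition used in the cited paper: for a fixed hyperedge size k, the type-t degree of a node is the number of size-k hyperedges containing it in which exactly t members (the node included) share its label. The function name `type_t_degrees` and the list-of-lists layout are hypothetical; the actual method operates on torch tensors, and its exact output layout is not documented here.

```python
def type_t_degrees(H, labels, k):
    """Count type-t degrees for hyperedges of size k.

    H is a dense incidence matrix given as a list of rows
    (one row per node, one column per hyperedge, entries 0 or 1).
    Returns degrees[v][t] for t in 0..k (index 0 is never used).
    Hypothetical helper, not part of the documented module.
    """
    n_nodes = len(H)
    n_edges = len(H[0]) if H else 0
    degrees = [[0] * (k + 1) for _ in range(n_nodes)]
    for e in range(n_edges):
        members = [v for v in range(n_nodes) if H[v][e]]
        if len(members) != k:
            continue  # only hyperedges of the requested cardinality count
        for v in members:
            # t counts members with the same label as v, v itself included
            t = sum(1 for u in members if labels[u] == labels[v])
            degrees[v][t] += 1
    return degrees


# Four nodes, two classes; hyperedges {0,1,2}, {0,2,3}, {0,1}
H = [
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 0],
]
labels = [0, 0, 1, 1]
degrees = type_t_degrees(H, labels, k=3)
```

Here the size-2 hyperedge {0,1} is skipped because only hyperedges of cardinality 3 are analyzed.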

calculate_affinity_score(n_nodes, X_mod, t, k)

Calculate affinity score.

Parameters:

n_nodes (int): Total number of nodes.
X_mod (int): Total number of nodes in a class.
t (int): Type-t degree.
k (int): Max hyperedge cardinality.

Returns:

torch.Tensor: The affinity matrix.
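
For orientation, the cited paper compares observed affinities against a combinatorial baseline: the probability that a node from a class of size |X| ends up with exactly t same-class members when the other k - 1 members of a size-k hyperedge are drawn uniformly at random from the remaining n - 1 nodes. The sketch below evaluates that expression with the standard library. It is an assumption that this method computes this quantity; only its argument list suggests so. The name `baseline_score` is hypothetical.

```python
from math import comb


def baseline_score(n_nodes, class_size, t, k):
    """Baseline probability of a type-t hyperedge under random membership.

    Hypothetical helper following the baseline formula of the cited
    paper; not taken from the documented module.
    """
    same_class = comb(class_size - 1, t - 1)      # choose t-1 same-class peers
    other_class = comb(n_nodes - class_size, k - t)  # choose the k-t others
    total = comb(n_nodes - 1, k - 1)              # all ways to pick k-1 peers
    return same_class * other_class / total


score = baseline_score(n_nodes=10, class_size=4, t=2, k=3)
```

With 10 nodes, a class of 4, and hyperedges of size 3, the numerator is 3 * 6 = 18 and the denominator is 36, so the baseline for t = 2 is 0.5.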

calculate_baseline_matrix(he_cardinalities, unique_labels, class_node_idxs, count_labels, n_nodes)

Calculate the baseline affinity matrix for comparison.

Parameters:

he_cardinalities (torch.Tensor): Cardinality of each hyperedge.
unique_labels (dict): Dictionary mapping labels to their counts.
class_node_idxs (dict): Dictionary mapping labels to node indices.
count_labels (torch.Tensor): Count of nodes for each label.
n_nodes (int): Total number of nodes in the hypergraph.

Returns:

torch.Tensor: Baseline matrix containing expected affinity scores for each class and degree type.
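
A baseline matrix with one row per class and one column per degree type can be sketched as follows, assuming the baseline formula of the cited paper and a single hyperedge cardinality k. The function name `baseline_matrix` and the nested-list output are hypothetical; the documented method takes different arguments and returns a torch.Tensor.

```python
from math import comb


def baseline_matrix(class_sizes, n_nodes, k):
    """Rows: classes. Columns: degree types t = 1..k.

    Hypothetical helper; each entry is the baseline probability of a
    type-t hyperedge for a node of the given class.
    """
    total = comb(n_nodes - 1, k - 1)
    return [
        [
            comb(size - 1, t - 1) * comb(n_nodes - size, k - t) / total
            for t in range(1, k + 1)
        ]
        for size in class_sizes
    ]


B = baseline_matrix(class_sizes=[4, 6], n_nodes=10, k=3)
row_sums = [sum(row) for row in B]
```

By Vandermonde's identity each row sums to 1, which makes a convenient sanity check: the baseline is a probability distribution over t for every class.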

forward(data)

Apply the transform to the input data.

Parameters:

data (torch_geometric.data.Data): The input data.

Returns:

torch_geometric.data.Data: The transformed data.

comb(N, k, *, exact=False, repetition=False)

The number of combinations of N things taken k at a time.

This is often expressed as “N choose k”.

Parameters:

N (int, ndarray): Number of things.
k (int, ndarray): Number of elements taken.
exact (bool, optional): For integers, if exact is False, then floating point precision is used, otherwise the result is computed exactly.
repetition (bool, optional): If repetition is True, then the number of combinations with repetition is computed.

Returns:

val (int, float, ndarray): The total number of combinations.
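
The two counting modes can be illustrated with the standard library. The sketch below uses `math.comb` rather than the function documented here, so it only demonstrates the arithmetic: "N choose k" without repetition, and the identity that combinations with repetition equal C(N + k - 1, k). The helper name `combinations_with_repetition` is hypothetical.

```python
from math import comb


def combinations_with_repetition(n, k):
    """Number of multisets of size k drawn from n distinct things."""
    return comb(n + k - 1, k)


plain = comb(5, 2)                               # pairs from 5 distinct items
with_rep = combinations_with_repetition(5, 2)    # pairs where an item may repeat
```

For N = 5 and k = 2 there are 10 plain combinations and 15 with repetition: the 5 extra cases are the pairs in which the same item is chosen twice.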