Tutorial¶
In this tutorial we will introduce the ASH
object that can be used to describe undirected, temporal hypergraphs with attributes on nodes.
Creating an ASH¶
Create an empty ASH with no nodes and no edges.
from ash_model import ASH
h = ASH(hedge_removal=True)
During the construction phase the hedge_removal
parameter allows to specify if the structure will allow hyperedge removal or not.
High-order Interactions: Hyperedges¶
h
can be grown by adding one interaction (i.e., an hyperedge) at a time.
Each interaction is univocally defined by an arbitrarily-large set of nodes, and has a start
and a end
timestamp, indicating its appearance and disappearance times respectively.
h.add_hyperedge({1,2,3,4}, start=0, end=1)
If nodes are not already present in the ASH, they will be automatically added when they appear in an interaction.
In the above example the interaction {1,2,3,4}
appears at time 0
and vanishes at time 1
, thus being present in both timestamps.
If the end
parameter is omitted, the hyperedge will only be active in start
.
Moreover, hyperedges are associated with a unique str
identifier of the form “en” where n is a positive number:
h.get_hyperedge_id_set()
>>> {"e1"}
Multiple interactions can be added at the same time: in such scenario all the interactions in the list will have the same appearance and disappearance times.
h.add_hyperedges([{1,2,3}, {2, 3}, {3, 1, 4}], start=1, end=2)
The same method can be used to add any ebunch
of interaction. An ebunch is any iterable container of interactions.
import networkx as nx
g = nx.barabasi_albert_graph(n=500, m=3)
h.add_hyperedges(nx.find_cliques(g), start=0)
One of the factors that make ash
faster with respect to other hypergraph libraries is the O(1)
access it provides to a node’s star.
That is to say that given a node is is possible to easily get the set of hyperedges it is contained in. It is also possible to conveniently restrict the result to hyperedges of a given size and/or active at a given timestamp:
h.get_star(1) # set of ids of the hyperedges that contain node 1
h.get_star(1, hyperedge_size=4, tid=2) # set of ids of the hyperedges that contain node 1,
# have size 4
# and are active at time 2
Node Profiles: Enrich Your Hypergraph!¶
As for hyperedges, it is also possible to add one node at a time, specifying start
, or start
and end
timestamps.
The same can be done for multiple nodes at a time, which will all have the same start
and end
timestamps:
# one node at a time
h.add_node(0, start=0)
h.add_node(1, start=0, end=1)
# multiple nodes, same time window
h.add_nodes([1,2,3], start=0)
h.add_nodes([4,5,6], start=0, end=1)
To incorporate the semantics on nodes, ash
leverages the NProfile
class.
This structure can be used to contain the attributes of a node:
from ash_model import NProfile
profile = NProfile(node_id=1, name='Alice', party='L') # add attributes at creation
profile.add_attribute('age', 24) # add attribute with dedicated method
Node profiles can be incorporated in the ASH
when adding a new node. This allows to model change in the node attributes by adding the same node at different times with different profiles:
profile = NProfile(node_id=1, name='Alice', party='L', age=24)
h.add_node(1, start=0, attr_dict=profile)
profile2 = NProfile(node_id=1, name='Alice', party='R', age=25)
h.add_node(1, start=1, attr_dict=profile2)
To see all information on a node profile through time, you can use the following:
profile = h.get_node_profile(1)
attrs = profile.get_attributes()
print(attrs)
>>> {'t': [[0, 1]], 'name': {0: 'Alice', 1: 'Alice'}, 'party': {0: 'L', 1: 'R'}, 'age': {0: 24, 1: 25}}
Time Respecting s-Walks¶
ash_model
integrates the s-analysis framework to generalize classic graph measures to hypergraphs, and extends it with temporal information.
In short, walks are extended to hypergraphs based on a parameter s
, which controls the minimum number of common nodes between subsequent hyperedges.
For instance:
from ash_model import ASH
from ash_model import paths
h = ASH()
h.add_hyperedge([1,2,3], start=0) # e1
h.add_hyperedge([2,3,4], start=0) # e2
h.add_hyperedge([4,5,6], start=0) # e3
h.add_hyperedge([1,2,3,4,5, 6], start=0) # e4
s = 2
paths.has_s_walk(h, s, "e1", "e3")
>>> True
paths.shortest_s_walk(h, s, "e1", "e3")
>>> ['e1', 'e4', 'e3']
paths.average_s_distance(h, s=2)
>>> 1.6666666666666667
For more info on the s-analysis framework please refer to the original paper. The novelty with ASH
is the extension to the dynamic scenario.
This means that ASH
is able to compute time-respecting s-walks.
h = ASH()
h.add_hyperedge([1, 2, 3], 0, 4)
h.add_hyperedge([1, 4], 0, 1)
h.add_hyperedge([1, 2, 3, 4], 2, 3)
trsw = paths.time_respecting_s_walks(h, s=1, hyperedge_from='e1', hyperedge_to='e3')
for _, ap in trsw.items():
walks = paths.annotate_walks(ap)
print('Shortest:', walks['shortest'])
print('Foremost', walks['foremost'])
In the above example, first we get all time respecting s-walks via time_respecting_s_walks()
; then we annotate each path and print the fastest and the foremost ones.
Other choices include fastest, heaviest, and all possible combinations.
High-order Mixing Patterns¶
Visual Analytics¶
ash
also provides a module that facilitates visualizing both static and temporal statistics.
from ash_model import viz
from matplotlib import pyplot as plt
ax = viz.plot_hyperedge_size_distribution(h)
ax.set_title('Hyperedge size distribution')
plt.show()
All functions here are a wrapper around either matplotlib
or seaborn
commands, and return the plt.Axes
object associated with the plot.
You can also pass a **kwargs
dictionary with matplotlib
plotting parameters (e.g., marker
).
Since all functions return the plt.Axes
object, you can customize it to your likings like the above.
To facilitate visualizing time series data emerging from the ASH
, the library provides two methods, one for the structural features,
one for the attribute-related dynamics. Let’s start with the former:
from ash_model import measures
func = measures.average_s_local_clustering_coefficient # function to be applied to all snapshots
func_params = {'s': 2} # parameters of the function above, if any
rolling = 3 # optional rolling window size for smoothing the time series
ax = viz.plot_structure_dynamics(h, func, func_params, rolling)
ax.set_title('Number of hyperedges in time')
plt.show()
- A couple of things to note here:
func
is aCallable
object that is applied to all snapshots.func_params
allows to pass arguments tofunc
. If the only parameters of a function areh
and/ortid
, orstart
, andend
, this parameter can be ignored/set toNone
.rolling
is an optional parameter that is used to smooth the time series. In case you have high-resolution temporal data, it is best to set this parameter; it basically computes the mean overrolling
timestamps.
Instead, for attribute dynamics we have – you guessed it: plot_attribute_dynamics()
attr_name = "party" # compute measure for this attribute
func = measures.average_hyperedge_profile_purity # function to be applied to all snapshots
func_params = {'by_label': True} # optional parameters of the function above
rolling = 3 # optional rolling window size for smoothing the time series
ax = viz.plot_attribute_dynamics(h, attr_name, func, func_params, rolling)
ax.set_title('Average hyperedge purity (by label)')
plt.show()
The main differences here is that we have to set the attribute name for which we want to compute the statistics.
In conclusion, if you want to compare different dynamic hypernetworks, you can do something like:
hs = [h0, h1, h2] # three different ASH instances
attr_name = "party" # compute measure for this attribute
func = measures.average_hyperedge_profile_purity # function to be applied to all snapshots
func_params = {'by_label': True} # optional parameters of the function above
rolling = 3 # optional rolling window size for smoothing the time series
fig, axs = plt.subplots(nrows=1, ncols=len(hs), figsize=(18, 6), sharey=True, sharex=True)
plt.subplots_adjust(hspace=0.8)
for h, ax in zip(hs, axs.ravel()):
viz.plot_attribute_dynamics(h, attr_name, func, func_params, rolling, ax=ax)
ax.grid(alpha=.2)
plt.suptitle('Average hyperedge purity (by label)', fontsize=32)
plt.show()
Reading and Writing¶
ash
allows to read/write hypernetworks in two ways, either:
from/to separate files for interactions and node profiles
from/to a single file
The former format allows to read/write interactions and node profiles separately. For the hyperedges:
from ash_model import readwrite as io
io.write_sh_to_csv(h, filename)
h = io.read_sh_from_csv(filename)
where each row is a tab-separated list of the form [nodes -> start, end]
, with nodes
being the members of each interactions, and start, end
the appearance-disappearance times of the interactions.
Likewise, for the profiles:
io.write_profiles_to_csv(h, filename)
profiles = io.read_profiles_from_csv(filename)
where each row is a comma-separated list of the form [node_id, tid, attr1, attr2...]
. Nodes that appear multiple times are stored as a separate row for each time they are active.
Profiles can also be stored as a jsonl file with similar structure:
io.write_profiles_to_jsonl(h, filename)
profiles = io.read_profiles_from_jsonl(filename)
The latter format encloses the whole ASH
in a json
file
io.write_ash_to_json(h, filename)
h = io.read_ash_from_json(filename)