Show Menu

Network Analysis with Python and NetworkX Cheat Sheet by

A quick reference guide for network analysis tasks in Python, using the NetworkX package, including graph manipulation, visualisation, graph measurement (distances, clustering, influence), ranking algorithms and prediction.

Basic graph manipu­lation

import networkx as nx
Create a graph allowing parallel edges
G.add­_ed­ges­_fr­om([(0, 1),(0, 2),(1, 3),(2, 4)]
Create graph from edges
Draw the graph
Add a node
G.add­_ed­ge(­'A'­,'B­',r­elation = 'friend')
Add an edge
G.nod­e['­A']­['r­ole'] = 'team member'
Set attribute of a node
G.nod­e['­A'], G.edg­e[(­'A'­,'B')]
View attributes of node, edge
G.edg­es(), G.nod­es()
Show edges, nodes
Return as list instead of EdgeView class
G.nod­es(­dat­a=T­rue), G.edg­es(­dat­a=T­rue)
Include node/edge attributes
Return specific attribute

Creating graphs from data

G=nx.r­ea­d_a­djl­ist­('G­_ad­jli­st.t­xt', nodety­pe=­int)
Create from adjacency list
Create from matrix (np.array)
G=nx.r­ea­d_e­dge­lis­t('­G_e­dge­lis­t.txt', data=[­('W­eight', int)])
Create from edgelist
G=nx.f­ro­m_p­and­as_­dat­afr­ame­(G_df, 'n1', 'n2', edge_a­ttr­='w­eig­ht')
Create from df
Adjacency list format
0 1 2 3 5
1 3 6 ...

Edgelist format:
0 1 14
0 2 17

Bipartite graphs

from networ­kx.a­lg­orithms import bipartite
Check if graph B is bipartite
Check if set of nodes is bipart­ition of graph
Get each set of nodes of bipartite graph
bipar­tit­e.p­roj­ect­ed_­gra­ph(B, X)
Bipartite projected graph - nodes with bipartite friends in common
P=bip­art­ite.we­igh­ted­_pr­oje­cte­d_g­raph(B, X)
projected graph with weights (number of friends in common)

Network Connec­tivity­ust­eri­ng(G, node)
Local clustering coeffi­cient
Global clustering coeffi­cient­ans­iti­vit­y(G)
Transi­tivity (% of open triads)­ort­est­_pa­th(­G,n­1,n2)
Outputs the path itself­ort­est­_pa­th_­len­gth­(G,­n1,n2)
T=nx.b­fs­_tr­ee(G, n1)
Create breadt­h-first search tree from node n1
Average distance between all pairs of nodes
Maximum distance between any pair of nodes­cen­tri­cit­y(G)
Returns each node's distance to furthest node
Minimum eccent­ricity in the graph­rip­her­y(G)
Set of nodes where eccent­ric­ity­=di­ameter
Set of nodes where eccent­ric­ity­=radius

Connec­tivity: Network Robustness­de_­con­nec­tiv­ity(G)
Min nodes removed to disconnect a network
Which nodes?
Min edges removed to disconnect a network
Which edges?­l_s­imp­le_­pat­hs(­G,n­1,n2)
Show all paths between two nodes

Network Connec­tivity: Connected Components­_co­nne­cte­d(G)
Is there a path between every pair of nodes?­mbe­r_c­onn­ect­ed_­com­pon­ent­s(G)
# separate components­de_­con­nec­ted­_co­mpo­nent(G, N)
Which connected component does N belong to?­_st­ron­gly­_co­nne­cte­d(G)
Is the network connected direct­ion­ally?­_we­akl­y_c­onn­ect­ed(G)
Is the directed network connected if assumed undire­cted?

Common Graphs

Karate club graph (social network)
Path graph with n nodes
Complete graph on n nodes
Random d-regular graph on n-nodes
See NetworkX Graph Generators reference for more.
Also see “An Atlas of Graphs” by Read and Wilson (1998).

Influence Measures and Network Centra­liz­ation­gre­e_c­ent­ral­ity(G)
Degree centrality for network
Degree centrality for a node­_de­gre­e_c­ent­ral­ity­(G), nx.ou­t_d­egr­ee_­cen­tra­lit­y(G)
DC for directed networks­ose­nes­s_c­ent­ral­ity­(G,­nor­mal­ize­d=T­rue)
Closeness centrality (norma­lised) for the network
Closeness centrality for an individual node­twe­enn­ess­_ce­ntr­ali­ty(G)
Betwee­nness centrality
..., normal­ize­d=T­rue­,...)
Normalized betwee­nness centrality
..., endpoi­nts­=False, ...)
BC excluding endpoints
..., K=10,...)
BC approx­imated using random sample of K nodes­twe­enn­ess­_ce­ntr­ali­ty_­sub­set­(G,­{su­bset})
BC calculated on subset
BC on edges
BC on subset of edges
Normal­iza­tion: Divide by number of pairs of nodes.

PageRank and Hubs & Author­ities Algorithms­ger­ank(G, alpha=­0.8)
Scaled PageRank of G with dampening parameter
HITS algorithm - outputs 2 dictio­naries (hubs, author­ities)
Constr­ained HITS and normalized by sum at each stage
Centrality measures make different assump­tions about what it means to be a “central” node. Thus, they produce different rankings.

Network Evolution - Real-world Applic­ations

G.deg­ree(), G.in_­deg­ree(), G.out­_de­gree()
Distri­bution of node degrees
Prefer­ential Attachment Model
Results in power law -> many nodes with low degrees; few with high degrees
Prefer­ential Attachment Model with n nodes and each new node attaching to m existing nodes
Small World model
High average degree (global cluste­ring) and low average shortest path
Small World network of n nodes, connected to its k nearest neighb­ours, with chance p of rewiring
G=con­nec­ted­_wa­tts­_st­rog­atz­_gr­aph­(n,k,p, t)
t = max iterations to try to ensure connected graph
p = probab­ility of adding (not rewiring)
Link Prediction measures
How likely are 2 nodes to connect, given an existing network­mmo­n_n­eig­hbo­rs(­G,n­1,n2)
Calc common neighbors of nodes n1, n2
Normalised common neighbors measure­sou­rce­_al­loc­ati­on_­ind­ex(G)
Calc RAI of all nodes not already connected by an edge­ami­c_a­dar­_in­dex(G)
As per RAI but with log of degree of common neighbor­efe­ren­tia­l_a­tta­chm­ent(G)
Product of two nodes' degrees
Community Common Neighbors
Common neighbors but with bonus if they belong in same 'commu­nity'­_so­und­ara­jan­_ho­pcr­oft(n1, n2)
CCN score for n1, n2
Add community attribute to node
Community Resource Allocation score
These scores give only an indication of whether 2 nodes are likely to connect.
To make a link predic­tion, you would use these scores as features in a classi­fic­ation ML model.


Add a Comment

Your Comment

Please enter your name.

    Please enter your email address

      Please enter your Comment.

          Related Cheat Sheets

          Neural Networks for Machine Learning Cheat Sheet
            Python 3 Cheat Sheet by Finxter

          More Cheat Sheets by murenei

          Natural Language Processing with Python & nltk Cheat Sheet