ENPM809G - NetworkX

Introductions to the NetworkX Python package for network analysis

In [1]:
# Magic line to ensure plotting happens in Jupyter
%matplotlib inline

# PyPlot is an object-oriented plot interface to matplotlib
import matplotlib.pyplot as plt 
In [2]:
# Load up the networkx package
import networkx as nx

Creating Graphs With NetworkX

NetworkX supports several graph constructs: undirected, directed, multigraph, and directed multigraph.

Creating a Simple Graph

In [3]:
g = nx.Graph()

Currently, this graph g is empty. It has no nodes (i.e., vertices) and no edges (i.e., links).

We'll need to add nodes and edges to the graph to analyze to recreate our toy graph.

In [4]:
# Recreate our toy example graph
g.add_node(0)
g.add_node(1)
g.add_node(2)
g.add_node(3)
g.add_node(4)
g.add_node(5)
g.add_node(6)
g.add_node(7)
g.add_node(8)
g.add_node(9)
g.add_node(10)

# Add edges
g.add_edge(0, 9)
g.add_edge(0, 8)
g.add_edge(0, 3)
g.add_edge(0, 2)
g.add_edge(0, 1)
g.add_edge(0, 4)
g.add_edge(0, 5)
g.add_edge(1, 2)
g.add_edge(5, 1)
g.add_edge(5, 2)
g.add_edge(5, 3)
g.add_edge(5, 4)
g.add_edge(5, 6)
g.add_edge(6, 7)
g.add_edge(7, 10)
In [6]:
print(len(g.nodes()))
print(len(g.edges()))
11
15

We can use NetworkX for rudimentary drawing too.

In [ ]:
 
In [8]:
nx.draw(g, with_labels=True)

plt.show()

It's easy to iterate through nodes and edges of a graph...

In [9]:
print("Nodes:")
for node in g.nodes():
    print("\tNode:", node)
    
print("Edges:")
for edge in g.edges():
    print("\tEdge:", edge)
Nodes:
	Node: 0
	Node: 1
	Node: 2
	Node: 3
	Node: 4
	Node: 5
	Node: 6
	Node: 7
	Node: 8
	Node: 9
	Node: 10
Edges:
	Edge: (0, 9)
	Edge: (0, 8)
	Edge: (0, 3)
	Edge: (0, 2)
	Edge: (0, 1)
	Edge: (0, 4)
	Edge: (0, 5)
	Edge: (1, 2)
	Edge: (1, 5)
	Edge: (2, 5)
	Edge: (3, 5)
	Edge: (4, 5)
	Edge: (5, 6)
	Edge: (6, 7)
	Edge: (7, 10)

Adding duplicate nodes or edges to a standard Graph object will have no effect.

In [10]:
g.add_node(0)
g.add_edge(7, 10)

print("Nodes:")
for node in g.nodes():
    print("\tNode:", node)
    
print("Edges:")
for edge in g.edges():
    print("\tEdge:", edge)
Nodes:
	Node: 0
	Node: 1
	Node: 2
	Node: 3
	Node: 4
	Node: 5
	Node: 6
	Node: 7
	Node: 8
	Node: 9
	Node: 10
Edges:
	Edge: (0, 9)
	Edge: (0, 8)
	Edge: (0, 3)
	Edge: (0, 2)
	Edge: (0, 1)
	Edge: (0, 4)
	Edge: (0, 5)
	Edge: (1, 2)
	Edge: (1, 5)
	Edge: (2, 5)
	Edge: (3, 5)
	Edge: (4, 5)
	Edge: (5, 6)
	Edge: (6, 7)
	Edge: (7, 10)

Node and Edge Attributes

We can add information to nodes and edges that help us keep track of information, labels, or weights.

In [11]:
# A dictionary with information for nodes
#  to which we will add attributes
node_attribute_dict = {
    0: {"label": "center"},
    5: {"label": "important"}
}

# Set the attributes
nx.set_node_attributes(g, node_attribute_dict)

# If you just print the data node list, you don't see the associated
#  attributes
print("Nodes without data:")
for node in g.nodes():
    print("\tNode:", node)
    
# Need to use the data=True named argument
print("Nodes w/ data:")
for node in g.nodes(data=True):
    print("\tNode:", node)
Nodes without data:
	Node: 0
	Node: 1
	Node: 2
	Node: 3
	Node: 4
	Node: 5
	Node: 6
	Node: 7
	Node: 8
	Node: 9
	Node: 10
Nodes w/ data:
	Node: (0, {'label': 'center'})
	Node: (1, {})
	Node: (2, {})
	Node: (3, {})
	Node: (4, {})
	Node: (5, {'label': 'important'})
	Node: (6, {})
	Node: (7, {})
	Node: (8, {})
	Node: (9, {})
	Node: (10, {})

Doing the same for edges, we can add weights...

In [12]:
# A dictionary with information for edges
#  to which we will add attributes
edge_attribute_dict = {
    (0, 5): {"weight": 5},
    (6, 5): {"weight": 2}, # Note order is irrelevant for undirected graph
}

# Set the attributes
nx.set_edge_attributes(g, edge_attribute_dict)

# If you just print the data node list, you don't see the associated
#  attributes
print("Edges without data:")
for e in g.edges():
    print("\tEdge:", e)
    
# Need to use the data=True named argument
print("Edges w/ data:")
for e in g.edges(data=True):
    print("\tEdge:", e)
Edges without data:
	Edge: (0, 9)
	Edge: (0, 8)
	Edge: (0, 3)
	Edge: (0, 2)
	Edge: (0, 1)
	Edge: (0, 4)
	Edge: (0, 5)
	Edge: (1, 2)
	Edge: (1, 5)
	Edge: (2, 5)
	Edge: (3, 5)
	Edge: (4, 5)
	Edge: (5, 6)
	Edge: (6, 7)
	Edge: (7, 10)
Edges w/ data:
	Edge: (0, 9, {})
	Edge: (0, 8, {})
	Edge: (0, 3, {})
	Edge: (0, 2, {})
	Edge: (0, 1, {})
	Edge: (0, 4, {})
	Edge: (0, 5, {'weight': 5})
	Edge: (1, 2, {})
	Edge: (1, 5, {})
	Edge: (2, 5, {})
	Edge: (3, 5, {})
	Edge: (4, 5, {})
	Edge: (5, 6, {'weight': 2})
	Edge: (6, 7, {})
	Edge: (7, 10, {})

We can also add such attributes when we create the graph.

In [14]:
g.add_node(11, label="extra")
g.add_edge(10, 11, weight=1)
In [15]:
print("Nodes:")
for node in g.nodes(data=True):
    print("\tNode:", node)
    
print("Edges:")
for edge in g.edges(data=True):
    print("\tEdge:", edge)
Nodes:
	Node: (0, {'label': 'center'})
	Node: (1, {})
	Node: (2, {})
	Node: (3, {})
	Node: (4, {})
	Node: (5, {'label': 'important'})
	Node: (6, {})
	Node: (7, {})
	Node: (8, {})
	Node: (9, {})
	Node: (10, {})
	Node: (11, {'label': 'extra'})
Edges:
	Edge: (0, 9, {})
	Edge: (0, 8, {})
	Edge: (0, 3, {})
	Edge: (0, 2, {})
	Edge: (0, 1, {})
	Edge: (0, 4, {})
	Edge: (0, 5, {'weight': 5})
	Edge: (1, 2, {})
	Edge: (1, 5, {})
	Edge: (2, 5, {})
	Edge: (3, 5, {})
	Edge: (4, 5, {})
	Edge: (5, 6, {'weight': 2})
	Edge: (6, 7, {})
	Edge: (7, 10, {})
	Edge: (10, 11, {'weight': 1})

We can also delete nodes.

In [17]:
nx.draw(g, with_labels=True)
plt.show()

g.remove_node(11)

nx.draw(g, with_labels=True)
plt.show()
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/Users/cbuntain/Development/thirdparty/anaconda3/lib/python3.6/site-packages/networkx/classes/graph.py in remove_node(self, n)
    582         try:
--> 583             nbrs = list(adj[n])  # list handles self-loops (allows mutation)
    584             del self._node[n]

KeyError: 11

During handling of the above exception, another exception occurred:

NetworkXError                             Traceback (most recent call last)
<ipython-input-17-591d07a5cac7> in <module>()
      2 plt.show()
      3 
----> 4 g.remove_node(11)
      5 
      6 nx.draw(g, with_labels=True)

/Users/cbuntain/Development/thirdparty/anaconda3/lib/python3.6/site-packages/networkx/classes/graph.py in remove_node(self, n)
    584             del self._node[n]
    585         except KeyError:  # NetworkXError if n not in self
--> 586             raise NetworkXError("The node %s is not in the graph." % (n,))
    587         for u in nbrs:
    588             del adj[u][n]   # remove all edges n-u in graph

NetworkXError: The node 11 is not in the graph.

Graph Metrics

NetworkX supports many graph metrics.

In [18]:
# What is the degree of each node?
g.degree()
Out[18]:
DegreeView({0: 7, 1: 3, 2: 3, 3: 2, 4: 2, 5: 6, 6: 2, 7: 2, 8: 1, 9: 1, 10: 1})
In [19]:
# What is the diameter of the graph?
nx.diameter(g)
Out[19]:
5
In [20]:
# What is the density of the graph?
nx.density(g)
Out[20]:
0.2727272727272727
In [21]:
# What's the global clustering coefficient?
print("Global Coefficient:", nx.transitivity(g))
Global Coefficient: 0.391304347826087
In [22]:
# Local Cluster Coeff for a given node
target = 0
print("Clustering Coefficient:", nx.clustering(g, target))
Clustering Coefficient: 0.23809523809523808
In [23]:
# What's the average LCC?
nx.average_clustering(g)
Out[23]:
0.41558441558441556
In [24]:
# LCC Happens to be the same as the density of 1.5 ego-net w/o target
subg = nx.ego_graph(g, target, 1.5)

# Show the ego graph
nx.draw(subg, with_labels=True)
plt.show()

# Remove the target
subg.remove_node(target)

# show us the resulting graph
nx.draw(subg, with_labels=True)
plt.show()

# What's the density?
print("Density of 1.5-degree ego-net:", nx.density(subg))
Density of 1.5-degree ego-net: 0.23809523809523808

Centrality Metrics

Calculating the various centrality measures is easy.

In [25]:
# Degree centrality
centrality = nx.degree_centrality(g)
[(x, centrality[x]) for x in sorted(centrality, key=centrality.get, reverse=True)]
Out[25]:
[(0, 0.7000000000000001),
 (5, 0.6000000000000001),
 (1, 0.30000000000000004),
 (2, 0.30000000000000004),
 (3, 0.2),
 (4, 0.2),
 (6, 0.2),
 (7, 0.2),
 (8, 0.1),
 (9, 0.1),
 (10, 0.1)]
In [26]:
# Closeness centrality
centrality = nx.closeness_centrality(g)
[(x, centrality[x]) for x in sorted(centrality, key=centrality.get, reverse=True)]
Out[26]:
[(5, 0.6666666666666666),
 (0, 0.625),
 (1, 0.5),
 (2, 0.5),
 (6, 0.5),
 (3, 0.47619047619047616),
 (4, 0.47619047619047616),
 (8, 0.4),
 (9, 0.4),
 (7, 0.37037037037037035),
 (10, 0.2777777777777778)]
In [27]:
# Betweenness centrality
centrality = nx.betweenness_centrality(g)
[(x, centrality[x]) for x in sorted(centrality, key=centrality.get, reverse=True)]
Out[27]:
[(5, 0.5222222222222223),
 (0, 0.43333333333333335),
 (6, 0.35555555555555557),
 (7, 0.2),
 (1, 0.0),
 (2, 0.0),
 (3, 0.0),
 (4, 0.0),
 (8, 0.0),
 (9, 0.0),
 (10, 0.0)]

Centrality as Node Attributes

We can even store these centrality values as node attributes for use later...

In [28]:
nx.set_node_attributes(g, centrality, "centrality")

print("Nodes:")
for node in g.nodes(data=True):
    print("\tNode:", node)
Nodes:
	Node: (0, {'label': 'center', 'centrality': 0.43333333333333335})
	Node: (1, {'centrality': 0.0})
	Node: (2, {'centrality': 0.0})
	Node: (3, {'centrality': 0.0})
	Node: (4, {'centrality': 0.0})
	Node: (5, {'label': 'important', 'centrality': 0.5222222222222223})
	Node: (6, {'centrality': 0.35555555555555557})
	Node: (7, {'centrality': 0.2})
	Node: (8, {'centrality': 0.0})
	Node: (9, {'centrality': 0.0})
	Node: (10, {'centrality': 0.0})

Visualization with NetworkX

In general, NetworkX isn't good for visualizations, but it does have some rudimentary capabilities and layout algorithms.

In [29]:
# Draw with a spring layout
nx.draw(g, with_labels=True)
plt.show()
In [30]:
nx.draw_spring(g, with_labels=True)
plt.show()
In [31]:
nx.draw_random(g, with_labels=True)
plt.show()
In [32]:
nx.draw_circular(g, with_labels=True)
plt.show()
In [33]:
nx.draw_shell(g, with_labels=True)
plt.show()
In [34]:
nx.draw_spectral(g, with_labels=True)
plt.show()

NetworkX separates drawing from layout as well, as we see here.

In [35]:
# Construct node positions using FR layout (as in Gephi)
node_positions = nx.fruchterman_reingold_layout(g)

# Draw with the given node positions specified
nx.draw(g, pos=node_positions, with_labels=True)
plt.show()

You can get more advanced, but maybe we should use Gephi at this point...

In [36]:
pos = nx.spring_layout(g, iterations=200)

# Randomly color
nx.draw(g, pos, node_color=range(len(g.nodes())), node_size=800, cmap=plt.cm.Blues)
plt.show()
In [37]:
# Color by centrality
max_c = max(centrality.values())
color_map = {x[0]:x[1]/max_c for x in centrality.items()}

nx.draw(g, pos, node_color=list(color_map.values()), node_size=800, cmap=plt.cm.hot)
plt.show()

Exporting from NetworkX

Getting into Gephi brings up a good question.

How do we export a graph from NetworkX?

NetworkX has lots of reading and writing capabilities:

In [38]:
# For simplicity, let's use GraphML
nx.write_graphml(g, "toy.graphml")
In [ ]: