Today I want to share a course I have been studying a bit lately:
Basics of NetworkX API, using Twitter network
To get you up and running with the NetworkX API, we will run through some basic functions that let you query a Twitter network that has been pre-loaded for you and is available in the IPython Shell as T. The Twitter network comes from KONECT and shows a snapshot of a subset of Twitter users. It is an anonymized Twitter network with metadata.
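Since T only exists inside the course shell, here is a minimal sketch of the same kind of basic queries on a tiny hand-made graph. The graph G, its node IDs, and the occupation values are made up for illustration and are not taken from the KONECT data.

```python
import networkx as nx

# Toy stand-in for the pre-loaded graph T (hypothetical data)
G = nx.DiGraph()
G.add_node(1, occupation="scientist")
G.add_node(2, occupation="politician")
G.add_node(3, occupation="celebrity")
G.add_edge(1, 2)
G.add_edge(2, 3)

# Basic size queries
n_nodes = len(G.nodes())
n_edges = len(G.edges())
print(n_nodes, n_edges)

# The graph class tells you whether edges are directed
print(type(G).__name__)

# In a directed graph, an edge 1 -> 2 does not imply 2 -> 1
print(G.has_edge(1, 2), G.has_edge(2, 1))
```

The same calls should work on T in the course shell, assuming it is a regular NetworkX graph object.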
Basic drawing of a network using NetworkX
NetworkX provides some basic drawing functionality that works for small graphs. We have selected a subset of nodes from the graph for you to practice using NetworkX’s drawing facilities. It has been pre-loaded as T_sub.
# Import necessary modules
import matplotlib.pyplot as plt
import networkx as nx

# Draw the graph to screen
nx.draw(T_sub)
plt.show()
Queries on a graph
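Specifically, you’re going to look for “nodes of interest” and “edges of interest”. The .nodes() method returns a list of nodes, while the .edges() method returns a list of tuples, in which each tuple shows the nodes that are present on that edge. (In NetworkX 2.x and later both return views rather than lists, but you iterate over them in the same way.) Passing data=True also yields the metadata dictionary of each node or edge.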
# date is needed for the comparison on the edge metadata
from datetime import date

# Use a list comprehension to get the nodes of interest: noi
noi = [n for n, d in T.nodes(data=True) if d['occupation'] == 'scientist']

# Use a list comprehension to get the edges of interest: eoi
eoi = [(u, v) for u, v, d in T.edges(data=True) if d['date'] < date(2010, 1, 1)]
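To try the two list comprehensions without the pre-loaded T, here is a small self-contained sketch. The toy graph, its occupations, and its dates are invented for the example. It uses d.get(...) for the node filter so that a node without an occupation key is skipped rather than raising a KeyError, which is my own precaution and not part of the course solution.

```python
from datetime import date

import networkx as nx

# Toy graph with node and edge metadata (hypothetical values)
G = nx.DiGraph()
G.add_node(1, occupation="scientist")
G.add_node(2, occupation="politician")
G.add_node(3, occupation="scientist")
G.add_edge(1, 2, date=date(2009, 5, 1))
G.add_edge(2, 3, date=date(2012, 3, 9))

# Nodes of interest: every node whose occupation is "scientist"
noi = [n for n, d in G.nodes(data=True) if d.get("occupation") == "scientist"]

# Edges of interest: every edge dated before 2010-01-01
eoi = [(u, v) for u, v, d in G.edges(data=True) if d["date"] < date(2010, 1, 1)]

print(noi)
print(eoi)
```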
Specifying a weight on edges
Weights can be added to edges in a graph, typically indicating the “strength” of an edge. In NetworkX, the weight is indicated by the 'weight' key in the metadata dictionary.
# Set the weight of the edge between nodes 1 and 10
# (T[u][v] works across NetworkX versions; the T.edge attribute exists only in 1.x)
T[1][10]['weight'] = 2

# Iterate over all the edges (with metadata)
for u, v, d in T.edges(data=True):

    # Check if node 293 is involved
    if 293 in [u, v]:

        # Set the weight to 1.1
        T[u][v]['weight'] = 1.1
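Here is a minimal sketch of the same weighting logic on a toy graph, so the effect can be checked by hand. The edge list is made up; only the node IDs 1, 10 and 293 echo the exercise. Inside the loop it writes to d directly, because the dictionary yielded by edges(data=True) is the edge's own metadata dictionary.

```python
import networkx as nx

# Toy undirected graph (hypothetical edges)
G = nx.Graph()
G.add_edges_from([(1, 10), (293, 1), (293, 5), (5, 10)])

# Set the weight of a single edge
G[1][10]["weight"] = 2

# Set the weight of every edge that touches node 293
for u, v, d in G.edges(data=True):
    if 293 in (u, v):
        d["weight"] = 1.1

# Edges never assigned a weight simply lack the key
for u, v, d in G.edges(data=True):
    print(u, v, d.get("weight"))
```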
Checking whether there are self-loops in the graph
As Eric discussed, NetworkX also allows edges that begin and end on the same node; while this would be non-intuitive for a social network graph, it is useful to model data such as trip networks, in which individuals begin at one location and end in another (or, in the self-loop case, return to where they started).
# Define find_selfloop_nodes()
def find_selfloop_nodes(G):
    """
    Finds all nodes that have self-loops in the graph G.
    """
    nodes_in_selfloops = []

    # Iterate over all the edges of G
    for u, v in G.edges():

        # Check if node u and node v are the same
        if u == v:

            # Append node u to nodes_in_selfloops
            nodes_in_selfloops.append(u)

    return nodes_in_selfloops
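As a quick sanity check, the hand-written function can be compared with the helpers NetworkX ships for the same job. This sketch assumes NetworkX 2.x or later, where nx.nodes_with_selfloops and nx.number_of_selfloops are module-level functions. The toy graph is invented for the example, and the function is restated as a comprehension only to keep the block short.

```python
import networkx as nx


def find_selfloop_nodes(G):
    """Return every node of G that has an edge to itself."""
    return [u for u, v in G.edges() if u == v]


# Toy directed graph with two self-loops (hypothetical data)
G = nx.DiGraph()
G.add_edges_from([(1, 1), (1, 2), (2, 3), (3, 3)])

found = find_selfloop_nodes(G)
builtin = list(nx.nodes_with_selfloops(G))
count = nx.number_of_selfloops(G)

print(found, builtin, count)
```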
Visualizing using Matrix plots
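nxviz is a package for visualizing graphs in a rational fashion. NetworkX can also convert graphs to and from matrix form: nx.to_numpy_matrix(T) returns the matrix form of a graph, and the corresponding nx.from_numpy_matrix(A) allows one to quickly create a graph from a NumPy matrix. The default graph type is Graph(); if you want to make it a DiGraph(), that has to be specified using the create_using keyword argument, e.g. nx.from_numpy_matrix(A, create_using=nx.DiGraph). (Both functions were removed in NetworkX 3.0 in favor of nx.to_numpy_array and nx.from_numpy_array.)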
# Import necessary modules
import matplotlib.pyplot as plt
import networkx as nx
import nxviz as nv

# Create the MatrixPlot object: m
m = nv.MatrixPlot(T)

# Draw m to the screen
m.draw()

# Display the plot
plt.show()

# Convert T to a matrix format: A
# (on NetworkX >= 3.0 use nx.to_numpy_array / nx.from_numpy_array instead)
A = nx.to_numpy_matrix(T)

# Convert A back to the NetworkX form as a directed graph: T_conv
T_conv = nx.from_numpy_matrix(A, create_using=nx.DiGraph())

# Check that the `category` metadata field is lost from each node
for n, d in T_conv.nodes(data=True):
    assert 'category' not in d.keys()
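The point of the final assertion is that a matrix only stores which nodes are connected, so node metadata does not survive the round trip. Here is a minimal sketch of that on a toy graph, using the array-based functions nx.to_numpy_array and nx.from_numpy_array. The node names and category values are made up, and NumPy is assumed to be installed.

```python
import networkx as nx

# Toy directed graph with a `category` field on each node (hypothetical data)
G = nx.DiGraph()
G.add_node("a", category="I")
G.add_node("b", category="P")
G.add_node("c", category="D")
G.add_edges_from([("a", "b"), ("b", "c")])

# Rows and columns follow the order of G.nodes()
A = nx.to_numpy_array(G)
print(A)

# Rebuild a directed graph from the matrix
G_conv = nx.from_numpy_array(A, create_using=nx.DiGraph)

# Node labels become integer indices and the metadata is gone
print(list(G_conv.nodes(data=True)))
print(list(G_conv.edges()))
```

If the labels matter, keeping list(G.nodes()) alongside the matrix is one way to map the integer indices back to the original node names.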
Visualizing using Circos plots
Circos plots are a rational, non-cluttered way of visualizing graph data, in which nodes are ordered around the circumference in some fashion, and the edges are drawn within the circle that results, giving a beautiful as well as informative visualization about the structure of the network.
# Import necessary modules
import matplotlib.pyplot as plt
from nxviz import CircosPlot

# Create the CircosPlot object: c
c = CircosPlot(T)

# Draw c to the screen
c.draw()

# Display the plot
plt.show()
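To see what "nodes on the circumference" means in coordinates, here is a rough sketch using plain NetworkX rather than nxviz. nx.circular_layout is not how CircosPlot is implemented as far as I know; it is just a convenient analogue that places nodes evenly on a circle. The six-node cycle graph is an arbitrary example, and NumPy is assumed to be installed.

```python
import math

import networkx as nx

# Small example graph: six nodes joined in a ring
G = nx.cycle_graph(6)

# Map each node to an (x, y) position on a circle around the origin
pos = nx.circular_layout(G)

# Distance of every node from the center of the circle
radii = [math.hypot(x, y) for x, y in pos.values()]

for node, (x, y) in pos.items():
    print(node, round(float(x), 3), round(float(y), 3))
```

Passing pos to nx.draw(G, pos=pos) would then draw the edges as chords inside that circle.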
Visualizing using Arc plots
Two keyword arguments that you will try here are node_order='keyX' and node_color='keyX', in which you specify a key in the node metadata dictionary to color and order the nodes by.
# Import necessary modules
import matplotlib.pyplot as plt
from nxviz import ArcPlot

# Create the un-customized ArcPlot object: a
a = ArcPlot(T)

# Draw a to the screen
a.draw()

# Display the plot
plt.show()

# Create the customized ArcPlot object: a2
a2 = ArcPlot(T, node_order='category', node_color='category')

# Draw a2 to the screen
a2.draw()

# Display the plot
plt.show()
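Conceptually, ordering nodes by a metadata key means sorting them on that key so that nodes with the same value sit next to each other along the axis. The sketch below shows that idea with plain Python sorting on a toy graph. It illustrates the concept only and is not a description of how nxviz does it internally; the node IDs and category values are invented.

```python
import networkx as nx

# Toy graph whose nodes carry a `category` field (hypothetical data)
G = nx.Graph()
G.add_node(1, category="P")
G.add_node(2, category="D")
G.add_node(3, category="I")
G.add_node(4, category="D")

# Sort (node, metadata) pairs by the chosen key; sorted() is stable,
# so nodes with equal categories keep their insertion order
ordered = sorted(G.nodes(data=True), key=lambda nd: nd[1]["category"])
order = [n for n, _ in ordered]

print(order)  # [2, 4, 3, 1]
```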
Resources: DataCamp, Network Analysis in Python I