NetworkX Analysis

In this homework, you will extend your work from previous homeworks and analyze your social network using the tools in NetworkX.

1. Risk and Metro Map Analysis

Use the code provided in class to analyze the Risk and Metro map data to answer the following questions.

  1. List the top 5 nodes by degree, closeness, and betweenness centrality. [3 pts]

    a. For the Risk Map [0.1 points per country per metric, total 1.5 pts]

    - Degree: Ukraine : 0.1463, East Africa : 0.1463, North Africa : 0.1463, China : 0.1463, Middle East : 0.1463
    - Closeness: Ukraine : 0.3203, Middle East : 0.3083, Afghanistan : 0.3037, Southern Europe : 0.2993, Ural : 0.2950
    - Betweenness: North Africa : 0.2133, Middle East : 0.2048, China : 0.1960, Ukraine : 0.1816, Siam : 0.1805

    b. For the Metro Map [0.1 points per station per metric, total 1.5 pts]

    - Degree: L'Enfant Plaza : 0.2778, Metro Center : 0.2222, Fort Totten : 0.1667, Gallery Place : 0.1667, Stadium Armory : 0.1667
    - Closeness: L'Enfant Plaza : 0.4500, Metro Center : 0.4186, Pentagon : 0.4186, Gallery Place : 0.4000, Rosslyn : 0.3913
    - Betweenness: L'Enfant Plaza : 0.5163, Pentagon : 0.3595, Rosslyn : 0.3203, Gallery Place : 0.2941, Metro Center : 0.2810

  2. Describe your observations about the overlap among these top-5 centrality sets. [2 pts]

    You could make many observations here. For one, in the Risk map the top-5 lists differ across the three centrality metrics, whereas in the Metro map L'Enfant Plaza ranks first under all three. We could surmise from this that L'Enfant Plaza is more critical to the Metro network than any single country is to the Risk map. Another observation concerns the variability of closeness and betweenness versus degree centrality: in the Risk map, all five top countries have the same degree centrality, whereas their closeness and betweenness scores vary more. An interesting related point is that many different graphs could generate the same degree centrality scores (see configuration models from class 6), but fewer graphs would generate these same closeness and betweenness scores.

    Many other observations are also possible here.
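For reference, the top-5 lists and their overlap can be computed along the following lines. This is a minimal sketch, not the code provided in class: the `top_n` helper is a hypothetical name introduced here, and NetworkX's built-in karate club graph stands in for the Risk or Metro graph.

```python
import networkx as nx


def top_n(scores, n=5):
    """Return the n highest-scoring (node, score) pairs, best first."""
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Stand-in graph (assumption); replace with the Risk or Metro graph from class.
G = nx.karate_club_graph()

metrics = {
    "Degree": nx.degree_centrality(G),
    "Closeness": nx.closeness_centrality(G),
    "Betweenness": nx.betweenness_centrality(G),
}

# Rank each metric separately and print in the same "name : score" style.
top5 = {name: top_n(scores) for name, scores in metrics.items()}
for name, ranked in top5.items():
    print(name + ": " + ", ".join(f"{node} : {score:.4f}" for node, score in ranked))

# Nodes that appear in the top 5 under every metric.
in_all = set.intersection(*({node for node, _ in ranked} for ranked in top5.values()))
print("In all three top-5 lists:", sorted(in_all))
```

NetworkX normalizes degree centrality by dividing each node's degree by n - 1, which is why tied degrees produce identical scores: a score of 0.1463 is consistent with 6 neighbors out of 41 possible.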

2. Social Network Analysis

Answer these questions using the data you collected for Homework 3.

An easy way to get your data into NetworkX is to export your Gephi visualization to a .graphml file, which you can then read into NetworkX using nx.read_graphml("filename.graphml").
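The import step might look like the sketch below. To keep the example self-contained, it first writes a tiny hypothetical graph (the names `alice`, `bob`, and `carol` are made up) to a temporary .graphml file in place of a real Gephi export, then reads it back.

```python
import os
import tempfile

import networkx as nx

# Hypothetical stand-in for a Gephi export: write a tiny graph to GraphML first.
G_out = nx.Graph()
G_out.add_edge("alice", "bob", weight=1.0)
G_out.add_edge("bob", "carol", weight=2.0)

path = os.path.join(tempfile.mkdtemp(), "filename.graphml")
nx.write_graphml(G_out, path)

# The step that matters for the homework: load the file into NetworkX.
G = nx.read_graphml(path)

# Quick sanity checks on what was loaded.
print(type(G).__name__, G.number_of_nodes(), G.number_of_edges())
print(list(G.nodes(data=True))[:3])
```

Inspecting `G.nodes(data=True)` is worthwhile with a real export, since the node identifiers may be IDs rather than display names; where the display names end up (for example in a `label` attribute) depends on the export and is an assumption to verify against your own file.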

  1. List the top 5 nodes in your social network by degree, closeness, and betweenness centrality. [1 pt]

  2. Briefly describe your observations about the overlap among these sets. Are they all the same? What justification can you provide for their centrality scores and order? [2 pts]

  3. How tightly clustered is your social graph? To answer this question, provide the density and clustering coefficient for your graph. [2 pts]
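As a rough illustration of the density and clustering measures in question 3, the sketch below computes them on a small made-up graph (a triangle with one pendant node), chosen so the values can be checked by hand. It reports both the average local clustering coefficient and transitivity, since either is sometimes called "the" clustering coefficient; which one is expected here is not stated, so treat that choice as an assumption.

```python
import networkx as nx

# Toy graph (assumption): triangle a-b-c plus a pendant node d attached to c.
G = nx.Graph()
G.add_edges_from([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])

# Density: fraction of possible edges present, 2m / (n(n-1)) = 8 / 12.
density = nx.density(G)

# Average of each node's local clustering: (1 + 1 + 1/3 + 0) / 4.
avg_clustering = nx.average_clustering(G)

# Transitivity: 3 * triangles / connected triples = 3 / 5.
transitivity = nx.transitivity(G)

print(f"density={density:.4f}")
print(f"average clustering={avg_clustering:.4f}")
print(f"transitivity={transitivity:.4f}")
```

If the graph loaded from GraphML turns out to be a multigraph, converting it with `nx.Graph(G)` before computing clustering is one option, at the cost of collapsing parallel edges.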

3. Creating Another Network

Using either the Twitter collection notebook or the Wikipedia notebook provided in class, create a new graphml file to answer the following questions.

For Twitter, consider creating a network of friends around a popular celebrity's account, your own account, or an organization in which you are interested (e.g., NASA or NOAA).

  1. Create and turn in a visualization of this graph using Gephi. [5 pts]

  2. Identify the top 5 most interesting nodes in the graph. [2 pts]

  3. Describe any group or clustering structure you see in the network. [3 pts]
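One possible way to support a description of group structure is to run a community detection algorithm alongside the Gephi visualization. The sketch below is an assumption about approach, not something the assignment requires: it uses NetworkX's greedy modularity method on a made-up graph of two 4-node cliques joined by a single edge, standing in for the collected network.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities, modularity

# Toy graph (assumption): two 4-node cliques joined by one bridging edge.
G = nx.barbell_graph(4, 0)

# Greedy modularity maximization returns a list of node sets.
communities = [set(c) for c in greedy_modularity_communities(G)]

# Modularity near 0 suggests little group structure; higher values suggest more.
q = modularity(G, communities)

for i, members in enumerate(communities):
    print(f"community {i}: {sorted(members)}")
print(f"modularity={q:.4f}")
```

The detected groups can then be compared with the clusters visible in the Gephi layout; disagreement between the two is itself worth describing.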