Community detection in your GitHub following network using Plotly and igraph

Discovering the underlying structure of a network is one of the central tasks of social network analysis. It is an active area of research in fields that model their systems as graphs, such as computer science, biology, and sociology.

In this post, we will explore how to detect communities in the GitHub following network of a user using Plotly. We will also learn the basic procedure so that you too can play with a social network of your choice.

To make things easier, we have developed a tool named octogrid that can help you visualize your GitHub network. It’s a Python package that makes it as easy as typing a command.

We start by importing all the required modules and then load the network file.

The file is in GML (Graph Modelling Language), a plain-text format that lists the graph's nodes and the edges between them. Octogrid can automatically generate the GML network file for a given username.

The variable community is a list that holds the community index of each node. We can count the total number of communities by counting the unique indices in the list. Once every node has a community, we can give each community its own color and paint the nodes accordingly.
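The counting and coloring step can be sketched in plain Python. The membership list and the palette below are hypothetical values chosen for illustration. In practice the list would come from the detection step, and the colors are whatever you prefer.

```python
# Hypothetical membership list: entry i is the community index of node i.
community = [0, 0, 1, 2, 1, 0, 3, 2]

# Hypothetical palette; it is reused cyclically if there are more
# communities than colors.
palette = ["darkgreen", "red", "lightgreen", "blue"]

# The unique indices tell us how many communities were found.
community_ids = sorted(set(community))
n_communities = len(community_ids)

# Map each community index to one color, then look up the color of every node.
color_of = {cid: palette[k % len(palette)] for k, cid in enumerate(community_ids)}
node_colors = [color_of[cid] for cid in community]

print(n_communities)
print(node_colors)
```

The resulting node_colors list has one entry per node, so it can be handed to a plotting library as the per-node marker color.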

The rest of the process is similar to Python Network Graph Comparison. Here is the code for it, only if you promise to read the post to the end.

Let’s explore the anatomy of communities for my following network, generated with octogrid.

There are 4 communities in this network. The one in “dark-green” represents the members of SDSLabs, a student group responsible for technical innovations on my campus. Then there is the community in “red”, which represents people from my campus who are not in SDSLabs.

The one in “light-green” is the set of users who follow people similar to the ones I follow. And the “blue” one represents the people with relatively few connections in this network.

By definition, communities are groups of vertices sharing common properties. In this case, where the only property we consider is a node's edges, a community is a set of nodes that are more densely connected to each other than to nodes outside the set.

We can observe the same phenomenon in the “SDSLabs” community, where every member is connected to almost all of the other members of the team.
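One simple way to check this "dense inside, sparse outside" property is to count how many edges stay within a community and how many cross between communities. The sketch below does that in plain Python on a made-up graph of two triangles joined by one edge. The edge list and membership are illustrative, not data from the network above.

```python
from collections import Counter

# Hypothetical undirected graph: nodes 0-2 and 3-5 form two triangles,
# linked by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]

# Hypothetical membership list: entry i is the community index of node i.
community = [0, 0, 0, 1, 1, 1]

internal = Counter()  # edges with both ends in the same community
crossing = 0          # edges whose ends lie in different communities

for source, target in edges:
    if community[source] == community[target]:
        internal[community[source]] += 1
    else:
        crossing += 1

print(dict(internal))
print(crossing)
```

Here each community keeps three edges to itself while only one edge crosses between them, which is the pattern a good partition should show.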

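The community detection algorithm we use from igraph was proposed in the paper Fast unfolding of communities in large networks [Blondel et al.]. It alternates between two phases: first, each node is moved to the neighboring community that gives the largest gain in modularity; then, the communities found so far are collapsed into single nodes of a new, smaller network. The two phases are repeated until the modularity of the network stops improving.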
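To make the quantity being maximized more concrete, here is a small plain-Python sketch of modularity for an undirected, unweighted graph. For each community it takes the fraction of edges that fall inside it and subtracts the squared fraction of edge ends attached to it, then sums over communities. The function name and the toy graph are illustrative and are not part of igraph or octogrid. This only scores a given partition and does not implement the two phases themselves.

```python
def modularity(edges, community):
    """Modularity of a partition of an undirected, unweighted graph.

    edges is a list of (source, target) pairs and community[i] is the
    community index of node i.
    """
    m = len(edges)
    internal = {}  # number of edges inside each community
    degree = {}    # number of edge ends attached to each community
    for source, target in edges:
        for node in (source, target):
            c = community[node]
            degree[c] = degree.get(c, 0) + 1
        if community[source] == community[target]:
            c = community[source]
            internal[c] = internal.get(c, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree.items()
    )


# Hypothetical graph: two triangles joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]

q_two_groups = modularity(edges, [0, 0, 0, 1, 1, 1])  # one group per triangle
q_one_group = modularity(edges, [0, 0, 0, 0, 0, 0])   # everything together
q_singletons = modularity(edges, [0, 1, 2, 3, 4, 5])  # every node alone

print(q_two_groups, q_one_group, q_singletons)
```

On this toy graph the partition that follows the two triangles scores higher than lumping everything together or keeping every node separate, which is why a modularity-maximizing method would settle on it.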

You can install octogrid using pip.

You can generate the GML network file for a user using the following command.

Octogrid can also publish the plotted network using Plotly.

We have open-sourced octogrid, in case you want to build something impressive with it. Happy Plotting!!