In every organization, people build and rely on informal networks to find information, advice, and collaborators. These invisible networks often differ from the formal organizational hierarchy. Uncovering these informal but effective networks, and understanding how information flows through the organization, is enormously valuable to its leaders.

In this article, we briefly explain what Organizational Network Analysis (ONA) is and how to measure a person's importance in the network. A small sample dataset is used to demonstrate our ONA experiment and network graph.

This post is part of a series of people analytics experiments:

Job skill match (Recruitment)
Employee attrition prediction (Employee Management)
Pay gap by gender, ethnicity, profession (Employee Compensation, future work)
Organizational network analysis (ONA)

Python code can be found on my GitHub.

What is Organizational Network Analysis

Organizational network analysis (ONA) is a method for studying and visualizing how communications, information, and decisions flow within a formal organization. It creates statistical and graphical models of the people and groups of an organization.

An organizational network (graph) consists of nodes and edges. Nodes are the people in the organization, and edges are the connections between them. When two people exchange information, a connection is established, and one person exerts influence on the other.

Often, ONA reveals that organization collaboration and influence are quite different from the formal reporting structures. Understanding this invisible network is essential for all aspects of effective organization operations. ONA helps organization leaders uncover hidden stars, increase operational effectiveness, prevent collaboration overload, and drive innovation.

How to Measure People/Node Centrality

To answer the question of who the most prominent employee in the network is, we first need to define some centrality metrics. Four are widely used: degree, eigenvector, closeness, and betweenness.

Degree Centrality

The degree of a node is defined as the number of edges connected to it. In a directed network, where edges have direction, degree centrality is usually split into indegree (incoming edges) and outdegree (outgoing edges).

Metrics of degree centrality can be interpreted as popularity.
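As a minimal sketch of the idea, indegree and outdegree can be counted directly from an edge list. The edges below are hypothetical and are not taken from the survey data.

```python
from collections import Counter

# Hypothetical directed edges (source, target), for illustration only.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("d", "c"), ("c", "a")]

# Outdegree counts edges leaving a node; indegree counts edges arriving at it.
out_degree = Counter(source for source, _ in edges)
in_degree = Counter(target for _, target in edges)

nodes = sorted({node for edge in edges for node in edge})
for node in nodes:
    print(node, "in:", in_degree[node], "out:", out_degree[node])
```

In this toy network, "c" receives the most incoming edges, which is what "popularity" means here.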

Eigenvector Centrality (Eigencentrality)

It assigns relative scores to all nodes in the network based on the concept that connections to high-scoring nodes contribute more to the score of the node in question than equal connections to low-scoring nodes. Google’s PageRank is a variant of eigenvector centrality.

Metrics of eigenvector centrality can be interpreted as influence.
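The scoring idea can be sketched with a few lines of power iteration. This is only an illustration on a hypothetical four-person undirected network, not the code used for the experiment; the function name and the iteration count are assumptions.

```python
def eigenvector_centrality(adjacency, iterations=100):
    """Power iteration: a node's score is the sum of its neighbors' scores."""
    scores = {node: 1.0 for node in adjacency}
    for _ in range(iterations):
        updated = {
            node: sum(scores[neighbor] for neighbor in adjacency[node])
            for node in adjacency
        }
        # Rescale so that the highest score is 1.0 after every step.
        largest = max(updated.values())
        scores = {node: value / largest for node, value in updated.items()}
    return scores


# Hypothetical undirected network: a triangle a-b-c plus d attached to c.
adjacency = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}

scores = eigenvector_centrality(adjacency)
for node, value in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(node, round(value, 3))
```

Node "d" has a single connection, but because that connection is to the best-connected node, its score is not negligible.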

Closeness Centrality

The closeness centrality of a node is the reciprocal of the average length of the shortest paths between the node and all other nodes in the graph. The more central a node is, the closer it is to all other nodes, and the higher its score.

Metrics of closeness centrality can be interpreted as centralness.
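A common formulation of closeness is the number of reachable nodes divided by the sum of shortest-path distances to them, so that shorter distances give higher scores. The sketch below computes it with a breadth-first search on a hypothetical unweighted network; the function name is an assumption.

```python
from collections import deque


def closeness(adjacency, source):
    """Reachable nodes divided by the sum of shortest-path distances to them."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    total = sum(distance.values())
    reachable = len(distance) - 1
    return reachable / total if total else 0.0


# Hypothetical undirected network: a triangle a-b-c plus d attached to c.
adjacency = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}

for node in adjacency:
    print(node, closeness(adjacency, node))
```

Node "c" is one step away from everyone, so it gets the maximum score of 1.0.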

Betweenness Centrality

Betweenness is a centrality measure of a node within a graph. It quantifies the number of times a node acts as a bridge along the shortest path between two other nodes. A bridge in a social network is someone who connects two different social groups.

Metrics of betweenness centrality can be interpreted as bridging.
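To make the bridge idea concrete, here is a compact sketch of Brandes' algorithm for unweighted, undirected graphs. The network is hypothetical: two three-person teams that are linked only through person "d".

```python
from collections import deque


def betweenness(adjacency):
    """Number of shortest paths between other node pairs that pass through each node."""
    scores = {node: 0.0 for node in adjacency}
    for source in adjacency:
        # Breadth-first search, recording shortest-path counts and predecessors.
        order = []
        predecessors = {node: [] for node in adjacency}
        path_count = {node: 0 for node in adjacency}
        distance = {node: -1 for node in adjacency}
        path_count[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    path_count[w] += path_count[v]
                    predecessors[w].append(v)
        # Walk back from the farthest nodes, accumulating each node's share.
        dependency = {node: 0.0 for node in adjacency}
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += path_count[v] / path_count[w] * (1 + dependency[w])
            if w != source:
                scores[w] += dependency[w]
    # Every undirected pair was visited from both ends, so halve the totals.
    return {node: value / 2 for node, value in scores.items()}


# Hypothetical network: team {a, b, c} and team {e, f, g}, linked only via d.
links = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("e", "f"), ("e", "g"), ("f", "g")]
adjacency = {}
for u, v in links:
    adjacency.setdefault(u, []).append(v)
    adjacency.setdefault(v, []).append(u)

result = betweenness(adjacency)
for node in sorted(result, key=result.get, reverse=True):
    print(node, result[node])
```

Person "d" has only two connections, yet every path between the two teams runs through them, so they receive the highest score.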

Other centrality measures are also available, such as Katz, PageRank, and cross-clique centrality. For more detailed explanations, see the wiki and also this article.

Dataset

The dataset used in this analysis was first published by Cross and Parker in 2004 and contains survey results from 46 employees.

Survey question:

“Please indicate how often you have turned to this person for information or advice on work-related topics in the past three months”.

0: I Do Not Know This Person; 1: Never; 2: Seldom; 3: Sometimes; 4: Often; and 5: Very Often.

Sample records look like this:

employee 1	employee 2	score
1	3	5
1	8	3
1	9	3
1	12	3
1	15	2

Our employee influence network is a directed graph, i.e. an arrow from employee e1 to e2 means that e1 has some “influence” on e2.

In this sample network, there are 46 people and 879 links between employees.
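As a rough sketch, records like the five sample rows above can be loaded into a directed graph with NetworkX, the Python graph package used later in this post. The edge direction (from "employee 1" to "employee 2") and the attribute name "score" are assumptions made for illustration; the full dataset is not reproduced here.

```python
import networkx as nx

# The five sample records shown above: (employee 1, employee 2, score).
records = [(1, 3, 5), (1, 8, 3), (1, 9, 3), (1, 12, 3), (1, 15, 2)]

G = nx.DiGraph()
# Assumption: each edge points from "employee 1" to "employee 2",
# and the survey answer is stored in an edge attribute named "score".
G.add_weighted_edges_from(records, weight="score")

print("people:", G.number_of_nodes())
print("links:", G.number_of_edges())
print("score for 1 -> 3:", G[1][3]["score"])
```

Loaded this way, the complete dataset would give the 46 nodes and 879 edges mentioned above.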

Our goal is to find the most influential people within the organization.

Experiment

We applied the eigenvector centrality metric to the sample dataset, since we are most interested in finding out who has the most influence in the organization. The higher the value, the more influence a person has.

Below are eigencentrality scores for the top and bottom 3 employees.

Top 3		Bottom 3
employee	eigencentrality	employee	eigencentrality
6	1.00	46	0.06
26	0.96	24	0.01
45	0.91	32	0.01
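Scores of this kind could be produced along the following lines with NetworkX. This is a sketch under stated assumptions rather than the exact code behind the table: the edges are hypothetical, the rescaling to a top score of 1.00 is inferred from the table, and the graph is reversed because NetworkX's eigenvector centrality scores a node by its incoming edges, while the arrows in this post point from the influencer to the influenced.

```python
import networkx as nx

# Hypothetical scored edges (influencer, influenced, score); not the survey data.
edges = [("a", "b", 5), ("a", "c", 4), ("a", "d", 3),
         ("b", "a", 2), ("c", "a", 1), ("d", "b", 1)]

G = nx.DiGraph()
G.add_weighted_edges_from(edges, weight="score")

# Reverse the graph so that outgoing influence, not incoming, drives the score.
raw = nx.eigenvector_centrality(G.reverse(), max_iter=1000, weight="score")

# Assumption: rescale so that the most influential person scores exactly 1.0.
top_value = max(raw.values())
scaled = {node: value / top_value for node, value in raw.items()}

for node, value in sorted(scaled.items(), key=lambda item: item[1], reverse=True):
    print(node, round(value, 2))
```

If the edges in a given dataset already point toward the influential person, the call to reverse() should be dropped.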

NetworkX is a Python package for studying structure, dynamics, and functions of complex networks. We use it to plot our organization network graph.

Red node (#6) represents the employee who has the most influence in the org.
Influence power is reflected by the size of the node.
Edge color shows influence score (1 to 5) of one person on another.
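The visual encoding listed above can be sketched with NetworkX and Matplotlib as follows. The edges, centrality values, colors, size formula, and output file name are all hypothetical choices for illustration, not the settings used for the figure in this post.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch also runs without a display
import matplotlib.pyplot as plt
import networkx as nx

# Hypothetical scored edges and centrality values, for illustration only.
edges = [("a", "b", 5), ("a", "c", 3), ("b", "c", 2), ("c", "a", 4)]
centrality = {"a": 1.0, "b": 0.4, "c": 0.7}

G = nx.DiGraph()
G.add_weighted_edges_from(edges, weight="score")

# Highlight the most influential node in red and size nodes by centrality.
top = max(centrality, key=centrality.get)
node_colors = ["red" if node == top else "lightblue" for node in G.nodes]
node_sizes = [300 + 1500 * centrality[node] for node in G.nodes]
# Color each edge by its survey score on a 1-to-5 scale.
edge_colors = [G[u][v]["score"] for u, v in G.edges]

fig, ax = plt.subplots(figsize=(6, 6))
pos = nx.spring_layout(G, seed=42)
nx.draw_networkx(G, pos=pos, ax=ax, node_color=node_colors, node_size=node_sizes,
                 edge_color=edge_colors, edge_cmap=plt.cm.Blues,
                 edge_vmin=1, edge_vmax=5)
ax.set_axis_off()
fig.savefig("ona_graph.png")
```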

Organizational Network Graph by NetworkX

Using the same dataset, we also plot an interactive 3D network graph with igraph and Plotly. A full-page version can be found here.

Future Work

Add employee attributes (e.g. age, gender, department) to the analysis, so that patterns can be identified along these dimensions.
Compute other relevant centrality metrics.

Again, Python code can be found on my GitHub.

Happy Machine Learning!