**Social network analysis** (SNA) is the methodical analysis of social networks. It views social relationships in terms of network theory, modeling them as *nodes* (representing individual actors within the network) and *ties* (representing relationships between those individuals). These networks are often depicted in a social network diagram, where nodes are drawn as points and ties are drawn as lines.

*Example of a social network diagram*

Relationships in a network can be either **directional** or **nondirectional**. In a directional relationship, one person is the initiator (or source of the relationship) while the other is the receiver (or destination of the relationship). For example, in the diagram above, node 1269 is the source while node 3777 is the destination. Relationships can also be described as **dichotomous** or **valued**. In a dichotomous relationship, the only information available is whether or not a relationship exists between two people, whereas in a valued relationship, a weight indicating the strength of the relationship is also available. To understand this better, let us look at the following data set from a fictional telecommunications company:

A social diagram representing one of the nodes (3777) in this data set is as follows:

In this diagram, the nodes are marked in dark colors while the weights of the relationships between nodes are marked in light colors. Since every relationship in this diagram has a weight, all of the relationships are valued and none are dichotomous. All of these relationships also appear to be directional, with a clear source and a clear destination.
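The distinctions above (directional vs. nondirectional, dichotomous vs. valued) can be captured in a small data structure. The following is only a sketch: the `Tie` class, its field names, and the `ego_network` helper are hypothetical, and apart from the node IDs 1269 and 3777 mentioned in the text, every ID and weight is invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tie:
    """One relationship between two nodes (all field names are illustrative)."""
    source: int                     # initiator of the relationship
    target: int                     # receiver of the relationship
    weight: Optional[float] = None  # None -> dichotomous, a number -> valued
    directed: bool = True           # False -> nondirectional

    @property
    def is_valued(self) -> bool:
        return self.weight is not None


def ego_network(ties, node):
    """Return every tie that involves `node`, as source or as target."""
    return [t for t in ties if node in (t.source, t.target)]


# Hypothetical ties: node IDs 1269 and 3777 appear in the text,
# but the other IDs and all weights are made up.
ties = [
    Tie(source=1269, target=3777, weight=12.0),
    Tie(source=3777, target=4401, weight=3.0),
    Tie(source=5150, target=4401),  # dichotomous: no weight recorded
]

ego = ego_network(ties, 3777)
print(len(ego), all(t.is_valued for t in ego))
```

Filtering the ties around a single node in this way gives the kind of one-node view that the diagram for node 3777 shows.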

Two common metrics used to describe social networks are **density** and **degree**. Both metrics measure connectivity, but density describes the entire network or communities within it, whereas degree describes individual nodes within the network.

**Network density**

Density represents the proportion of possible relationships in a network that are actually present. The value ranges from 0 to 1: the closer the value is to 0, the sparser the network, and the closer it is to 1, the denser the network.

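The number of possible relationships in a network is calculated using the formula:

$$\text{possible relationships} = 2 \times \frac{n(n-1)}{2} = n(n-1)$$

where n is the number of nodes in the network, n(n − 1)/2 is the number of distinct pairs of nodes, and 2 is the maximum number of relationships possible between any two nodes (one in each direction).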

So for example, in a network containing 3 nodes, the maximum number of possible relationships is:

$$3 \times (3-1) = 6$$

Assuming there are 3 relationships in this network, the density is 3 / 6 = 0.5.

Similarly, in a network containing 5 nodes, the maximum number of possible relationships is:

$$5 \times (5-1) = 20$$

Assuming there are only 4 relationships in this network, the density is 4 / 20 = 0.2.
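The two calculations above can be reproduced with a short sketch. It assumes directional relationships, so the number of possible relationships among n nodes is n × (n − 1), which matches the 6 and 20 used in the examples. The function names are illustrative and do not come from any particular library.

```python
def max_relationships(n: int) -> int:
    """Possible directed relationships among n nodes: n * (n - 1)."""
    return n * (n - 1)


def density(n_nodes: int, n_relationships: int) -> float:
    """Share of possible directed relationships that are actually present."""
    possible = max_relationships(n_nodes)
    if possible == 0:
        raise ValueError("density is undefined for fewer than 2 nodes")
    return n_relationships / possible


first = density(3, 3)    # 3 relationships out of 6 possible
second = density(5, 4)   # 4 relationships out of 20 possible
print(first, second, first > second)
```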

Therefore the first network is denser than the second network (since 0.5 > 0.2).

**Nodal degree**

The nodal degree of a node is the total number of relationships involving that node. Degree can be broken into two parts: **indegree** and **outdegree**. Indegree is the number of relationships in which a particular node is the target, whereas outdegree is the number of relationships in which a particular node is the source.
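As a rough illustration of these definitions, the sketch below counts indegree, outdegree, and total degree from a list of directed (source, target) pairs. The edge list is hypothetical and is not taken from the SPSS guide, and `degrees` is an illustrative helper name.

```python
from collections import Counter


def degrees(edges):
    """Map each node to (indegree, outdegree, degree) for directed edges."""
    indegree, outdegree = Counter(), Counter()
    for source, target in edges:
        outdegree[source] += 1   # the node initiates this relationship
        indegree[target] += 1    # the node receives this relationship
    nodes = set(indegree) | set(outdegree)
    return {
        n: (indegree[n], outdegree[n], indegree[n] + outdegree[n])
        for n in sorted(nodes)
    }


# Hypothetical directed edge list, invented for illustration only.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("C", "D"), ("D", "C")]
result = degrees(edges)
for node, (ind, outd, total) in result.items():
    print(node, ind, outd, total)
```

In this made-up network, A is only ever a source (indegree 0, outdegree 3) and B is only ever a target (indegree 1, outdegree 0).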

The following table illustrates nodal degrees in a 7-node network:

*(Example sourced from SPSS SNA User Guide)*

In the table above, A's degree is 3, split as indegree = 0 and outdegree = 3. This means that A is the source of 3 relationships in the network, whereas B, which has a degree of 1 (indegree = 1 and outdegree = 0), is the destination of 1 relationship in the network.

**Indegree is often treated as a measure of prestige**. Higher indegree values correspond to more relationships ending at that node. **In other words, those individuals are contacted by a high number of other individuals**: many other nodes are initiating relationships with the node. **Conversely, outdegree is treated as a measure of centrality**. Higher values correspond to more relationships originating from that node. **Those individuals contact a high number of other individuals**.

For the nodes in the example network, the degree values indicate that nodes A and D are the most active while nodes B and E are the least active. The indegree values reveal that node G has the most prestige. Based on the outdegree values, node A is the most central [Source: SPSS SNA User Guide].

**Examining social networks using KXEN's Infinite Insight**

KXEN's Infinite Insight has a highly advanced social network analysis module. Using the fictional call records data set indicated above, we can generate a social network diagram in just a few simple steps using Infinite Insight. Having generated the social network diagram, there are two ways to examine it:

* Top down

* Bottom up

Let us look at the top down approach first.

The top down view of the voice calls made during the month of May from the fictional data set is as follows:

This view shows that the network contains a large number of relationships. The green circles represent super communities, that is, users grouped together based on similar characteristics. The next level down appears as follows:

The diagram above displays a closer look at community # 10895 and its relationships with other communities in the network. You can drill down five levels in Infinite Insight all the way from the bird's eye view down to specific nodes in the network as shown below:

Infinite Insight also gives you the ability to adopt a bottom up approach and start at the node level and build up the social network. For example, we can start by examining node # 2218 and build the social network from the bottom up.

The visualization features within Infinite Insight give you the ability to color code variables of interest such as churn. Starting from the top down view, if you color code communities likely to churn in a specific color, you can then navigate down to specific nodes to understand exactly what is occurring and come up with action plans to reduce churn risk.

**Applications of Social Network Analysis**

Use cases for social network analysis are varied and include marketing as well as risk and fraud detection. Marketing applications include customer churn prediction and marketing campaigns. Since group characteristics can influence churn rates, you may be able to predict and prevent churn by using social network analysis to better understand group behavior and thereby individual behavior. Similarly, by identifying influencers within groups, you can target marketing campaigns at them, since the influence of a group member may make other members more likely to purchase the offering. On the risk and fraud detection front, social network analysis can be used to detect money laundering and credit card fraud (using merchant-buyer patterns).

© 2019 Data Science Central ®
