When we talk of Influence, the general problem setting is that of identifying a set of nodes in a social network that can be considered to be "Opinion Leaders": Individuals whose opinions can cause others around them to change their viewpoint or adopt a technology or product.
The paper by Kempe et al. presents a technique for maximizing the spread of Influence in a social network. The algorithm presented in this paper is a greedy algorithm for selecting N best nodes which if populated with some information could spread it to a large number of nodes in the network through a simple diffusion process. Although in real world social networks it is often unclear how the diffusion process really takes place or how to determine the correct values of the thresholds wherein a node adopts an idea if enough of it's neighbors have also adopted the idea. Jure et al have presented a different approach and a solution for selecting nodes to detect outbreaks in the networks through the analysis of information cascade.
Most recently, the paper "Influentials, Networks and Public Opinion Formation" by Watts and Dodds presents a strategic shift in the way we think about influence and they help casts the problem in a different manner. Their radical shift in approach is best summarized by the following excerpt from the abstract:
..We find that large cascades of influence are driven not by influentials but by a critical mass of easily influenced individuals.
Through simulation studies they validate this intuition. According to me the major importance of this result is in highlighting the fact that a) it is perhaps the followers that are more important in the analysis and b) we need to focus on what these followers are influenced by? No opinion leader or influential nodes is arbitrarily influential just about everything and for everyone. Dailykos may be influential about politics in a particular community and Gizmodo is influential in the geek community. While Dailykos might have some influence in the political community when it comes to technology, it is quite unlikely to have a significant impact on the geek bloggers about a new device. Additionally, Popularity, Influence and Authority are talked about in a similar fashion. This can be quite problematic and I remember having a great discussion with Matt Hurst while we were at KDD last year.
Recently, I had been think about the idea that every individual in the social network has some sort of a sphere of influence around them. The sphere of Influence of a node can be described as the region around that node that is in some way affected by the node.
One way to describe it would be to identify a region/local neighborhood of the node that has more links within the neighborhood than outside of it.
The approach I was thinking of turned out to have quite a bit of similarity with the paper by Aaron Clauset on "Finding a local community structure in networks". This paper presents a greedy heuristic that starts with the initial node and grows the region around the node by adding one node at a time. Each new node to be added is selected based on a local version of a modularity score (Originally proposed by Girvan and Newman). The local modularity score is calculated as
R = I/T
Where I is the current number of edges that point within the group of nodes discovered by the algorithm thus far (or the local community). T is the set of edges that point to as yet undiscovered edges plus those edges that point within the local community. I ran the above algorithm on the classic Karate club dataset. Following graph shows the local community structures or what I call the "Spheres of Influence" around a node.
The lines in red outline the sphere of influence around the node 4 and the lines in blue represent the sphere of influence around the node 32. In order to determine the boundary or the cut-off for the nodes, I use another measure called conductance of the graph. Conductance measure of the graph is related to random walks in the network. Given a partition of the graph, a low conductance would indicate that the graph is well knit and thus the probability that a random walk starting inside the partition terminates outside it is low. Thus, as the nodes are added we can compute the conductance scores. The following graph shows the conductance values of the subgraph induced by the sphere of influence around a given node. The local minimas correspond to the two levels (one in red and other in darker shade, around node 4).
This is still a relatively simple example and it would be worth exploring the Spheres of Influence in real world social networks like the subscription graph of FriendFeed or the followers in Twitter. Some interesting questions to ponder about:
- How do topics change as you move further away from the influence of the root node.
- Is the probability of receiving "Likes" and "Comments" higher within the sphere of Influence?
- What is the role of connectors and information bridges in such spheres of influence?
- Given the timing of each post, do most cascades occur within the spheres?