Interestingly, Random Walks and Spectral Clustering are quite closely related. As I dig into the literature, I find this body of work quite fascinating. I had read earlier about this connection between random walks and spectral clustering, however I found a neat paper by Dr. Marina Meila that provides a very clear explanation.
As an example, consider a small dataset of political books. Using Ncut (on the left) there are 3 communities that were identified. Now if we were to perform a random walk on this graph the probability of transitioning from node i to node j would be given by
Pij = Aij /sum(A(:,j));
In other words P = inv(D)*A; where A is the adjacency matrix for the graph and D is a diagonal matrix with degrees of each node. Thus P is a stochastic matrix, with all its rows summing up to 1. The result of which is shown in the graph on the right. Ofcourse, for very large graph the main difficulty is really the computation of the inverse, which can be quite expensive. And that is exactly where the NCut formulation of the problem comes in handy, since it does not involve the inverse. Normalized cut is performed by using the second smallest eigenvector of the graph Laplacian (L = D - A), to partition the nodes of the graph. The interesting thing is that the eigenvectors of L and P are the same and if \lambda is the eigenvalue of L the corresponding eigenvalue for P is (1 -\lambda). For more details refer Dr. Marina Meila's paper. Quite interesting!
Recent Comments