In the previous posts I talked about some of the papers I have read recently on spectral clustering and graph-based methods. I thought I would provide one more example to explain the use of the graph Laplacian. The basic idea is that the data lies on a 'manifold', which, to simplify greatly, means that the points sit on some lower-dimensional surface embedded in a higher-dimensional space.
For example, here is a random dataset generated from three Gaussian distributions. To obtain a graph from these points, an RBF kernel was used as the edge-weight (affinity) matrix, and the graph Laplacian was then computed from it. When we project the data onto the eigenvectors corresponding to the two smallest non-zero eigenvalues, the three Gaussian distributions are clearly distinguishable. This type of projection is known as a Laplacian eigenmap; for more details, please refer to the Laplacian eigenmaps paper by Belkin and Niyogi.
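Here is a minimal sketch of that pipeline in NumPy/SciPy. The bandwidth sigma, the cluster centres, and the sample sizes are my own illustrative choices, not values from the original figure.

```python
# Laplacian eigenmap sketch: three Gaussian blobs -> RBF affinity ->
# graph Laplacian -> project onto the low eigenvectors.
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)

# Three Gaussian blobs in 2-D (illustrative centres and spread).
X = np.vstack([
    rng.normal(loc=c, scale=0.5, size=(50, 2))
    for c in ([0, 0], [5, 0], [2.5, 4])
])

# RBF (Gaussian) kernel as the weighted adjacency matrix.
sigma = 1.0
sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
W = np.exp(-sq_dists / (2 * sigma ** 2))

# Unnormalized graph Laplacian L = D - W.
D = np.diag(W.sum(axis=1))
L = D - W

# Eigenvectors sorted by ascending eigenvalue; the first eigenvalue is
# ~0 (constant eigenvector), so project onto the next two.
vals, vecs = eigh(L)
embedding = vecs[:, 1:3]   # the 2-D Laplacian eigenmap coordinates
print(embedding.shape)     # (150, 2): the three blobs separate here
```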
As an example of how this applies to the community detection / semi-supervised graph learning problem, here is a graph of the political books dataset; the corresponding projection is shown below. While most of the conservative and liberal books can be identified, some neutral books are ambiguous, making them harder to classify correctly. Typically, for spectral clustering we would only need to consider the eigenvector corresponding to the second smallest eigenvalue of the graph Laplacian. It is also known as the 'Fiedler' vector and is used in graph partitioning, as the sketch below illustrates.
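Here is a minimal sketch of two-way partitioning with the Fiedler vector on a small hypothetical graph (two cliques joined by a single edge), not the actual political books data: splitting on the sign of the Fiedler vector recovers the two communities.

```python
# Fiedler-vector partitioning on a toy graph.
import numpy as np
from scipy.linalg import eigh

# Adjacency matrix: two 4-node cliques joined by one bridging edge.
A = np.zeros((8, 8))
for group in (range(0, 4), range(4, 8)):
    for i in group:
        for j in group:
            if i != j:
                A[i, j] = 1
A[3, 4] = A[4, 3] = 1  # the single edge between the cliques

L = np.diag(A.sum(axis=1)) - A   # unnormalized Laplacian
vals, vecs = eigh(L)
fiedler = vecs[:, 1]             # eigenvector of the 2nd smallest eigenvalue

# Splitting on the sign of the Fiedler vector recovers the two cliques.
print(np.where(fiedler < 0)[0], np.where(fiedler >= 0)[0])
```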
Please note that this series of blog posts on spectral methods is my attempt to share my understanding of the papers I have read recently. I am happy to receive further references and clarifications on the points discussed here.
On using graph Laplacian eigenmaps to project to lower dimensions with the three-Gaussians example you mentioned, I have to say that computing an RBF kernel is an unnecessary step. You could generate a graph directly by using a simple Euclidean distance metric on the ambient space of the data and then connecting the datapoints with k-NN, exponential weight decay, or epsilon-balls. Those are the main heuristic methods for graph construction that come to mind, although the exponential-decay one essentially treats the RBF kernel as a weighted graph; see the sketch below.
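For instance, here is a minimal sketch of the k-NN and epsilon-ball constructions, assuming Euclidean distance in the ambient space; the defaults for k and eps are just illustrative.

```python
# Two distance-based graph constructions from a point cloud X.
import numpy as np

def knn_graph(X, k=5):
    """Symmetric k-nearest-neighbour adjacency from Euclidean distances."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)          # exclude self-loops
    A = np.zeros_like(d)
    idx = np.argsort(d, axis=1)[:, :k]   # k nearest neighbours per point
    rows = np.repeat(np.arange(len(X)), k)
    A[rows, idx.ravel()] = 1
    return np.maximum(A, A.T)            # i ~ j if either is a neighbour

def epsilon_graph(X, eps=1.0):
    """Connect points whose Euclidean distance is below eps."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    A = (d < eps).astype(float)
    np.fill_diagonal(A, 0)
    return A
```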
However, kernels are very useful when the ambient (feature) space of your data does not have an inherent distance metric such as the Euclidean distance mentioned above. One really neat trick, in case you want to construct distance-based graphs from kernels, is the following distance identity.
Remember that the distance in ambient space between points A and B is defined as:

dist(A, B) = sqrt((A - B).(A - B)) = sqrt(A.A + B.B - 2 A.B).

Since the feature space induced by a kernel k has its own inner product, <phi(A), phi(B)> = k(A, B), you can substitute kernel evaluations for the dot products to get distances in that Hilbert space:

dist_k(A, B) = sqrt(<phi(A), phi(A)> + <phi(B), phi(B)> - 2<phi(A), phi(B)>) = sqrt(k(A, A) + k(B, B) - 2 k(A, B)).
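As a quick check, here is a minimal sketch of that identity in NumPy; the degree-2 polynomial kernel is just an illustrative choice.

```python
# Distance in the Hilbert space induced by a kernel, from kernel
# evaluations alone (no explicit feature map needed).
import numpy as np

def kernel_distance(a, b, k):
    """dist(a, b) in the feature space induced by kernel k."""
    return np.sqrt(k(a, a) + k(b, b) - 2 * k(a, b))

poly = lambda x, y: (np.dot(x, y) + 1) ** 2   # degree-2 polynomial kernel

a, b = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(kernel_distance(a, b, poly))  # distance between phi(a) and phi(b)
```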
On the other hand, I'm probably just repeating things you read in the spectral clustering papers. Love the site, btw; it reminds me of a lot of the research I used to do in college. Keep up the good work.
Posted by: Dimitrios Athanasakis | April 04, 2008 at 09:06 AM