Sometimes you learn about a new mathematical technique that is so intriguing that it can only be described as "beautiful". Non-negative matrix factorization (NMF) is one such method that I did not know of until quite recently. The details of the method are available in the paper "Document Clustering Based On Non-negative Matrix Factorization" by Wei Xu, Xin Liu, Yihong Gong.
The basic idea behind this method is to factorize a matrix X into two smaller matrices U and V, both of which are non-negative. This is achieved by minimizing the following objective function, subject to every entry of U and V being >= 0:

$$J = \frac{1}{2}\,\lVert X - UV^{T} \rVert_F^{2}$$
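To make the factorization concrete, here is a minimal NumPy sketch. It assumes the objective is the squared Frobenius reconstruction error of X against U times V transpose, and it uses the standard multiplicative update rules, which is one common way to solve this problem and not necessarily the exact procedure from the paper. The names `nmf` and `objective`, the iteration count and the small constant `eps` are my own choices for illustration.

```python
import numpy as np

def nmf(X, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative X (m x n) as U @ V.T, with U (m x k) and V (n x k) >= 0."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    # Start from strictly positive random factors.
    U = rng.random((m, k)) + eps
    V = rng.random((n, k)) + eps
    for _ in range(n_iter):
        # Multiplicative updates: each factor is rescaled by a ratio of
        # non-negative terms, so entries can never become negative.
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U) / (V @ (U.T @ U) + eps)
    return U, V

def objective(X, U, V):
    """Half the squared Frobenius norm of the reconstruction error."""
    return 0.5 * np.linalg.norm(X - U @ V.T, "fro") ** 2

# Toy data: a random non-negative matrix of rank 2 (hypothetical, not a real corpus).
rng = np.random.default_rng(1)
X = rng.random((6, 2)) @ rng.random((2, 8))
U, V = nmf(X, k=2)
print("reconstruction error:", objective(X, U, V))
```

Because the updates only ever multiply by non-negative ratios, the non-negativity constraint is maintained automatically, with no projection or clipping step.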
So if X is a term-by-document matrix, it can be factorized into two matrices U and V such that U holds the term-to-cluster associations and V transpose holds the cluster-to-document associations. Since U and V are non-negative, meaning all of their elements are >= 0, we can identify the cluster to which a document belongs by looking at that document's row in V and picking the dimension with the highest value.
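The assignment step is just an argmax over each document's row of V. The small V below is made up by hand purely for illustration: four documents, three clusters.

```python
import numpy as np

# Hypothetical document-by-cluster matrix V (one row per document).
V = np.array([
    [0.90, 0.05, 0.00],
    [0.10, 0.70, 0.20],
    [0.00, 0.10, 0.80],
    [0.60, 0.30, 0.10],
])

# Each document goes to the cluster whose column holds its largest value.
labels = V.argmax(axis=1)
print(labels)  # [0 1 2 0]
```

One caveat: the product of U and V transpose does not change if a column of U is scaled up and the matching column of V is scaled down, so the columns of V are only comparable once the scale of the factors has been fixed, for example by normalizing the columns of U.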
Singular Value Decomposition (SVD) decomposes X into dense matrices that can contain negative elements, and it is not always intuitive what the basis vectors really signify. With NMF, however, the clusters are directly available from the factorization. In addition, the sparsity of the resulting factors makes this technique quite appealing.
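The point about negative elements is easy to see on a toy matrix. The sketch below runs `np.linalg.svd` on a small, made-up, strictly positive "term by document" matrix with two obvious groups. The leading singular vector has entries of a single sign, and the second one has to be orthogonal to it, so it necessarily mixes positive and negative entries.

```python
import numpy as np

# Hypothetical count matrix: rows/columns 0-1 form one group, 2-3 the other.
X = np.array([
    [5.0, 4.0, 1.0, 1.0],
    [4.0, 5.0, 1.0, 1.0],
    [1.0, 1.0, 5.0, 4.0],
    [1.0, 1.0, 4.0, 5.0],
])

U_svd, s, Vt_svd = np.linalg.svd(X, full_matrices=False)

# Look at the two leading left singular vectors.
top2 = U_svd[:, :2]
has_negative = bool((top2 < 0).any())
print(top2.round(2))
print("negative entries present:", has_negative)
```

The second singular vector separates the two groups by sign rather than by magnitude, which is why reading cluster membership off SVD factors takes an extra step, whereas with non-negative factors the largest entry can be read directly.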
In the following example, I have clustered the CLASSIC3 dataset, which is a standard corpus frequently used for evaluating different clustering methods. Notice how the three collections CISI, MEDLINE and CRANFIELD line up nicely along the three different axes.
I like this method for its simplicity and intuition and have been exploring its use in clustering blog/social data.