I just received the beta invite to Evri.com (Yaaayy!). It is a really cool site that aims to help people find information. Right now they just have a browse interface: you can see the top concepts and named entities (primarily from news sources) and navigate through semantically related terms. The main idea behind their approach is that you can construct a graph of all the concepts and entities by analyzing the text. Here is an example of the top names in the news. Clicking the terms "Barack Obama" and "Ralph Nader" in the graph, for example, pulls up all the stories related to recent controversies.
One can browse through the graph or the popular terms. I checked out what they found on Obama. Here is a snapshot on the left. I think a really neat trick that Evri is using is the idea that working with sentence-level semantics can provide enough meaning to help organize information. Constructing a complete parse tree that is both syntactically and semantically accurate is a difficult problem, and there are many vagaries of natural language text that make it challenging. Evri, at least for now, bypasses some of these problems by organizing information around simple questions like "What is Obama doing?", which can be answered from easy-to-identify clues directly accessible in the text (criticizing, leading, denying, facing...). Similarly, for other entities like organizations one can ask "What is happening with Yahoo?" (bidding, rejecting, acquiring, etc.).
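To make the "what is X doing?" idea concrete, here is a minimal sketch. It is not Evri's actual method, which I have not seen. It assumes a small hand-made action lexicon (`ACTION_WORDS`) and made-up sample sentences, and it simply collects the action words found in sentences that mention an entity. A real system would rely on a tagger or parser instead of a word list.

```python
import re
from collections import defaultdict

# Hypothetical action lexicon, for illustration only.
ACTION_WORDS = {"criticizing", "leading", "denying", "facing",
                "bidding", "rejects", "acquires"}

# Made-up sample sentences, not real news text.
SENTENCES = [
    "Obama is criticizing the new policy.",
    "Yahoo rejects the latest offer.",
    "Obama is leading in the polls.",
]

def actions_by_entity(sentences, entities):
    """Map each entity to the action words in sentences that mention it."""
    index = defaultdict(list)
    for sentence in sentences:
        lowered = sentence.lower()
        tokens = re.findall(r"[a-z]+", lowered)
        actions = [t for t in tokens if t in ACTION_WORDS]
        for entity in entities:
            # Naive substring match; no disambiguation is attempted.
            if entity.lower() in lowered:
                index[entity].extend(actions)
    return dict(index)

print(actions_by_entity(SENTENCES, ["Obama", "Yahoo"]))
```

Note the obvious limitation: if a sentence mentions two entities, every action in it is attributed to both. Handling that properly is where the parsing work comes in.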
This is a fascinating approach to organizing information and I think that Evri has great potential. Let's think about it for a minute. One of my favorite pastimes is to go to Wikipedia, pull up a random article and then browse through related articles. It is this same serendipity, the feeling of chance discovery of something interesting, that is so compelling about Evri.
Evri also reminded me of the way I had hoped to implement SemNews, a semantic search engine that analyzed RSS snippets of news articles and processed them through OntoSem, an ontological-semantics-based Natural Language Processing system. Once the meaning representations were extracted, I would store them in an OWL store so that RDQL queries could be performed to find relevant news items. I believe that the way we can accomplish Dr. Tim Berners-Lee's vision of the Semantic Web is by advancing both information extraction (web scraping, entity annotation, etc.) and NLP techniques that would automatically annotate text and make it available in a machine-readable format.
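As a rough illustration of the store-then-query idea, here is a toy sketch in plain Python. It is not the SemNews code, it does not use a real OWL store, and the pattern matching is only a stand-in for what an RDQL query would do. The triples, the predicate names (`mentions`, `action`) and the `news:` ids are all made up for the example.

```python
# Hypothetical triples standing in for extracted meaning representations.
TRIPLES = [
    ("news:1", "mentions", "Barack Obama"),
    ("news:1", "action", "criticize"),
    ("news:2", "mentions", "Yahoo"),
    ("news:2", "action", "reject"),
    ("news:3", "mentions", "Barack Obama"),
    ("news:3", "action", "lead"),
]

def match(triples, subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    pattern = (subject, predicate, obj)
    return [t for t in triples
            if all(p is None or p == v for p, v in zip(pattern, t))]

def find_news(triples, entity, action):
    """Find item ids that mention `entity` and carry the given `action`."""
    mentioning = {s for s, _, _ in match(triples, predicate="mentions", obj=entity)}
    acting = {s for s, _, _ in match(triples, predicate="action", obj=action)}
    return sorted(mentioning & acting)

print(find_news(TRIPLES, "Barack Obama", "criticize"))  # prints ['news:1']
```

The point is only that once meaning is stored as structured statements, "find news where Obama criticizes something" becomes a simple lookup instead of a keyword search.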
Although the founder claims that Evri is not a search engine, it surely joins the group of NLP-based startups like Powerset and Hakia. Another powerful tool is Freebase, which uses Wikipedia as its primary source of information. Finally, it is also worth mentioning Kosmix, yet another startup, which aims to "Organize the Web so that you can explore, learn and discover".
The next obvious question is about the monetization and business model of these startups. Of course, the story goes that the information is more focused, so the ads will be more relevant... and, no surprise, that is indeed so TRUE. Just check out some of the advertisements on Evri. On the left is a screenshot of an advertisement on Barack Obama's info page. But I think there is an opportunity beyond simply relevant advertising!
Many companies have huge websites with lots of information, some of it organized and most of it not. If you want to ensure that your customers can get to the exact information they need, an Evri-like approach can be ideal for helping them browse through the various facets to find it. Applications to enterprises and enterprise search could be another monetization path for Evri.
Finally, IMHO, one hurdle that Evri faces could be dealing with noisy text, especially from social media. Many approaches that rely on the linguistic or grammatical correctness of sentences simply fail miserably when dealing with social media content. The second problem might be ensuring coverage of information. Right now, it seems to me that the news sources Evri relies on are primarily US-centric. As they aim to capture more of the audience outside the US, they will have to concentrate on foreign languages and on disambiguating named entities and location names. These are all interesting research problems and fun stuff to work on!