Disclaimer

  • Thoughts and comments expressed here are those of the author. Creative Commons License

semantic web

September 12, 2008

AAAI-SSS-09 Social Semantic Web: Where Web 2.0 Meets Web 3.0

This year's AAAI Spring Symposium has a session on "Social Semantic Web: Where Web 2.0 Meets Web 3.0". The workshop is organized by Mark Greaves, Li Ding, Jie Bao and Uldis Bojars. From the CFP for this event:

In this symposium, we are interested in bringing together the semantic web community and the social web community to promote the collaborative development and deployment of semantics in the World Wide Web context. We welcome constructive papers on, for example: (i) how semantic technologies, especially knowledge representation and collective intelligence, can benefit social web content organization and retrieval; (ii) how social web technologies can facilitate massive semantic content production; and (iii) how to address the requirements, e.g., reasoning scalability and semantic convergence issues, which emerge from the combination.


The papers are due October 1, and the Spring Symposium will be held March 23-25, 2009 at Stanford.

September 09, 2008

Semantic Ecoblogging and Citizen Science


Today, during our weekly eBiquity lab group meeting, Joel Sachs updated us on the Spire project.


Spire is a research project investigating how semantic web technologies can be used to support science in general and the field of ecoinformatics in particular.


During lunch, Joel and I talked about crowdsourcing and how social media and citizen science can help innovation and discovery in scientific applications. Many of a field researcher's tasks involve tedious and careful logging of the ecology and habitat in an environment. This is exactly where amateur scientists and hobbyists can make a big difference and contribute to science. Most amateurs contribute to such projects simply because they enjoy the work as a hobby. Two examples of how crowdsourcing is currently done in Spire:

  • Splickr: a tool that lets you query for geotagged photos on Flickr.
  • Spotter: "Spotter allows you to easily create RDF/OWL for species observations. It places a button on your Firefox toolbar that calls up a form with fields taken from our observation ontology (Observer, Reporter, Location, Common Name, Scientific Name, etc.)." (description quoted from the site)
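
To make the Spotter idea concrete, here is a minimal sketch of turning one observation form into RDF, written out as N-Triples. The namespace, the property names, and the sample values are all my own placeholders (assumptions), not the actual Spire observation ontology or Spotter's output. The literal escaping is also simplified.

```python
# Hypothetical namespace: the real Spire observation ontology URI is not given in this post.
OBS_NS = "http://example.org/observation#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def literal(value: str) -> str:
    """Quote a string as a plain literal, escaping backslashes and double quotes only."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def observation_to_ntriples(obs_id: str, fields: dict[str, str]) -> list[str]:
    """Return one N-Triples line for the type, plus one per filled-in form field."""
    subject = f"<{OBS_NS}{obs_id}>"
    lines = [f"{subject} <{RDF_TYPE}> <{OBS_NS}Observation> ."]
    for name, value in fields.items():
        if value:  # skip fields the observer left blank
            lines.append(f"{subject} <{OBS_NS}{name}> {literal(value)} .")
    return lines


# Made-up sample values, for illustration only.
triples = observation_to_ntriples("obs1", {
    "observer": "Jane Doe",
    "location": "Baltimore, MD",
    "commonName": "Northern Cardinal",
    "scientificName": "",  # an amateur may not know this yet
})
print("\n".join(triples))
```

Blank fields are simply left out, so someone else can add the scientific name later as one more triple about the same observation.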

While reading the book Crowdsourcing, I learned of similar efforts by NASA and Cornell University's Ornithology lab. Many discoveries in science are now coming from amateur researchers -- and not just from folks with multi-million dollar grants backing them. A few examples from the book are:

  • InnoCentive: many open research problems are being solved by part-time, hobbyist, amateur researchers.
  • INSPIRE: NASA is harnessing the wisdom of crowds to help analyze images.
  • The rediscovery of the Cozumel Thrasher, a rare bird species, by a group of amateur birders.
  • And many more examples... (I really recommend reading the book - it is well written and you will not be able to put it down!)

During the meeting and later, we were intrigued by the idea of developing an iPhone app to allow researchers and 'citizen scientists' to take pictures of species in the wild and upload them to Spire. One important point here is that an amateur researcher often might not know the scientific name of a species, but given enough eyes looking at the data being uploaded, we can divide up the task of labeling and categorizing the findings. Citizen scientists can contribute at various levels -- performing field studies, annotating, building ontologies, etc.

Since Spire is a Semantic Web application backed by ontologies, one can reason over the findings and learn, for example, that an invasive species is posing a significant threat to the local habitat. This is the ultimate aim of Spire and to me, this is the killer app for Web 3.0 -- if you really insist on a term like that!
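
As a toy illustration of what "reasoning over the findings" could look like, the sketch below flags sightings that fall outside a species' known native range. Everything here is an assumption on my part: the species and region names are placeholders, and the rule is a plain dictionary lookup rather than anything Spire actually runs. It also only surfaces candidates, since a non-native sighting is not by itself evidence of an invasive threat.

```python
# Placeholder observations as (species, region) pairs; not real Spire data.
observations = [
    ("Species A", "Region X"),
    ("Species B", "Region X"),
    ("Species A", "Region Y"),
]

# Hypothetical background knowledge: the regions where each species is native.
native_range = {
    "Species A": {"Region Y"},
    "Species B": {"Region X"},
}


def non_native_sightings(observations, native_range):
    """Return the (species, region) pairs observed outside the species' known native range."""
    flagged = []
    for species, region in observations:
        known = native_range.get(species)
        # Species with no range information are skipped rather than flagged.
        if known is not None and region not in known:
            flagged.append((species, region))
    return flagged


print(non_native_sightings(observations, native_range))
# prints [('Species A', 'Region X')]
```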

For more information on Spire please refer to the following publications.

June 26, 2008

Evri: Search Less, Understand More

I just received the beta invite to Evri.com (Yaaayy!). It is a really cool site that aims to help people find information. Right now they just have a browse interface. You can see the top concepts and named entities (primarily from news sources) and navigate through semantically related terms. The main idea behind their approach is that you can construct a graph of all the concepts and entities by analyzing the text. Here is an example of the top names in the news. Clicking the terms (from the graph) "Barack Obama" and "Ralph Nader", for example, would pull up all the stories related to recent controversies.

One can browse through the graph or the popular terms. I checked out what they found on Obama. Here is a snapshot on the left. I think a really neat trick Evri is using is the idea that working with sentence-level semantics can provide sufficient meaning to help organize information. Constructing a complete parse tree that is both syntactically and semantically accurate is a difficult problem. There are many vagaries of natural language text that make this challenging. Evri, at least for now, bypasses some of these problems by organizing information around simple questions like "What is Obama doing?", which can have easy-to-identify clues directly accessible from the text (criticizing, leading, denying, facing...). Similarly, for other entities like organizations, one can ask "What is happening with Yahoo?" (bidding, reject, acquire, etc.).
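
To show how far a shallow, sentence-level clue can go, here is a deliberately naive sketch of answering "What is happening with X?": it counts the first non-auxiliary word that follows an entity name in each sentence. This is my own guess at the flavor of the idea, not Evri's method, and the sentences are invented for the example.

```python
import re
from collections import Counter

# A tiny, hand-picked set of auxiliary words to skip; an assumption, not a real lexicon.
AUXILIARIES = {"is", "was", "has", "been", "will", "be"}


def actions_for(entity: str, sentences: list[str]) -> Counter:
    """Count the first non-auxiliary word following `entity` in each sentence."""
    counts = Counter()
    # Capture up to four words that come right after the entity name.
    pattern = re.compile(re.escape(entity) + r"\s+((?:\w+\s+){0,3}\w+)", re.IGNORECASE)
    for sentence in sentences:
        match = pattern.search(sentence)
        if not match:
            continue
        for word in match.group(1).lower().split():
            if word not in AUXILIARIES:
                counts[word] += 1
                break
    return counts


# Invented sentences, for illustration only.
sentences = [
    "Yahoo rejects the latest offer.",
    "Yahoo is bidding for a startup.",
    "Analysts say Yahoo will acquire a rival.",
    "Nothing relevant here.",
]
print(actions_for("Yahoo", sentences))
```

A real system would need part-of-speech tagging and lemmatization to fold "rejects" and "reject" together. Even so, the next word after the entity is often the verb you want.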


This is a fascinating approach to organizing information and I think that Evri has great potential. Let's think about it for a minute. One of my favorite pastimes is to go to Wikipedia, pull up a random article and then browse through related articles. It is this serendipity and the feeling of chance discovery of something interesting that is so compelling about Evri.

Evri also reminded me of the way I had hoped to implement SemNews, a semantic search engine that analyzed RSS snippets of news articles and processed them through OntoSem, an ontological-semantics-based Natural Language Processing system. Once the semantics/meaning representations were extracted, I would store the meanings in an OWL store so that RDQL queries could be performed to find relevant news items. I believe that the way we can accomplish Dr. Tim Berners-Lee's vision of the Semantic Web is by advancing both information extraction (web scraping, entity annotation, etc.) and NLP techniques that would automatically annotate text and make it available in a machine-readable format.
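
The store-then-query step can be sketched in a few lines. The code below is not RDQL syntax and not the SemNews implementation. It is only the triple-pattern idea behind such queries, run over an in-memory list, with made-up story identifiers and predicate names (`mentions`, `hasEvent`) that I am assuming for illustration.

```python
def match(pattern, triple, bindings):
    """Try to unify one (s, p, o) pattern with a triple; return extended bindings or None."""
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the same value everywhere it appears.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def query(patterns, triples):
    """Return every set of variable bindings that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for bindings in solutions
            for triple in triples
            if (extended := match(pattern, triple, bindings)) is not None
        ]
    return solutions


# Made-up extracted "meanings", one (subject, predicate, object) triple each.
store = [
    ("story1", "mentions", "Yahoo"),
    ("story1", "hasEvent", "acquire"),
    ("story2", "mentions", "Yahoo"),
    ("story2", "hasEvent", "reject"),
    ("story3", "mentions", "Evri"),
    ("story3", "hasEvent", "acquire"),
]

# "Find stories that mention Yahoo and describe an acquisition."
print(query([("?s", "mentions", "Yahoo"), ("?s", "hasEvent", "acquire")], store))
# prints [{'?s': 'story1'}]
```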

Although the founder claims that they are not a search engine, they surely join the group of NLP-based startups like Powerset and Hakia. Another powerful tool is Freebase, which primarily uses Wikipedia as its source of information. Finally, it is also worth mentioning Kosmix, yet another startup that aims to "Organize the Web so that you can explore, learn and discover".

The next obvious question is about the monetization and business model of these startups. Of course, the story goes... the information is more focused, so ads would be more relevant... and no surprise, that is indeed so TRUE. Just check out some of the advertisements on Evri. On the left is a screenshot of an advertisement on Barack Obama's info page. But I think there is an opportunity beyond simply relevant advertising!

Many companies have huge websites with lots of information -- some of it organized and most of it not. If you wanted to ensure that your customers can get to the exact information they need, an Evri-like approach could be ideal for helping them browse through the various facets to reach it. Applications to enterprises and enterprise search could be another monetization platform for Evri.

Finally, IMHO, one hurdle that Evri faces could be dealing with noisy text, especially in social media. Many approaches that rely on the linguistic or grammatical correctness of sentences simply fail miserably when dealing with social media content. The second problem might be ensuring coverage of information. Right now, it seems to me that the news sources Evri relies on are primarily US-centric. As they aim to capture more of an audience outside the US, they would have to concentrate on foreign languages and on disambiguating named entities and location names. These are all interesting research problems and fun stuff to work on!
