Disclaimer

  • Thoughts and comments expressed here are those of the author. Creative Commons License

web 2.0

July 19, 2008

What is the Dunbar's Number for Social Networks?

Many folks are really excited about FriendFeed. Personally, I have found that posts get a lot more comments when they appear on FriendFeed. Recently Yuval Atzmon's User21 blog released a list of the most followed users on FriendFeed. Since I too had a FriendFeed crawl running in much the same way as Yuval's, I decided to look at the complementary question: "How many users do people follow on FriendFeed?" While the crawl is not yet complete (so complete statistics will have to wait), the numbers are really striking! Some users follow more than 1,000 "friends":

sthayden 3190
scobleizer 3087
juliomedina 2760
thomashawk 2557
jasoncalacanis 2447
theillife 2045
mrsth 1961
pookakoo 1814
czarphanguye 1736
brynyoungblut 1716
eposter 1562
susangrisantiguitarist 1550


I find this really amazing. Unlike Twitter, FriendFeed posts are accompanied by longer conversations, so following someone can be more involved. I can barely keep up with all the information flying past me everywhere right now! I guess 1,500+ "friends" would be way too much for me!
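A tally like the one in the list is easy to compute once a crawl is stored as (follower, followed) pairs. The sketch below assumes that edge-list format; the function name and the sample data are hypothetical and are not the actual crawl.

```python
from collections import Counter


def following_counts(edges):
    """Count how many distinct users each user follows.

    edges: iterable of (follower, followed) pairs (assumed crawl format).
    """
    unique_edges = set(edges)  # a crawl may record the same edge twice
    return Counter(follower for follower, _ in unique_edges)


# Hypothetical sample data, not real crawl output.
sample_edges = [
    ("alice", "bob"), ("alice", "carol"), ("alice", "dave"),
    ("bob", "alice"), ("carol", "alice"), ("carol", "bob"),
    ("alice", "bob"),  # duplicate edge from a repeated crawl
]

counts = following_counts(sample_edges)
# Print users from most to fewest followed, ties broken by name.
for user, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
    print(user, n)
```

Deduplicating the edges first matters for a crawl that is still running, since pages revisited along the way would otherwise inflate the counts.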

Sociologists often talk about Dunbar's number, which

is the supposed cognitive limit to the number of individuals with whom any one person can maintain stable social relationships.

In human contact networks, Dunbar's number is said to be around 150. It may well be that social tools, and especially microblogging, are pushing this limit further. Studies on Twitter, LiveJournal and other social networking sites seem to support this observation. I wonder then: what would Dunbar's number be on social networks? 300? 500? Any guesses? Perhaps a comparison across the published papers that have studied different social networks would offer some clues.

[BTW, I am akshayjava on FriendFeed]

July 15, 2008

Google Calendar Feature Requests

I have started a Google Calendar to keep track of events and conferences in social media. Some of you may have already subscribed to it. However, there does not seem to be any way for me to tell exactly how many people are using it!

Feature Request #1: Show Number of Subscribers for a Calendar

This would certainly be quite a useful feature to have and, as it turns out, I am not the first person to request it.

Feature Request #2: Allow Tagging; Sharing Calendars a User has Added

A calendar as a shared resource for planning and organizing events in a community is an important tool. However, the calendar is still not as social as it could be! You can easily find new calendars to add. But what about tags? How about sharing? I would love to be able to create a tag cloud of events and a public list of calendars I have added.

Feature Request #3: Social Event Notification

If a user has made his or her events public, then why not show that user's friends an update on the events he or she plans to attend? I think that Dopplr does this at some level, but given that Google Calendar is a good place to consolidate all the events and schedules and Gmail is our universal contact list -- why not combine them to make it more social?

Anyway, these are just a few quick thoughts I had about Google Calendar. It is a great tool and has made my life much easier. To be fair, I have never used Outlook, so my view may be a bit skewed.

Lucky Number 2.0


I wanted to share a quick book recommendation. I picked up Sarah Lacy's recent book "Once You're Lucky. Twice You're Good." for a quick read during my flight.

It is a fantastic account of Web 2.0 and profiles successful social media startups, Silicon Valley entrepreneurs, venture capitalists and angel investors. The book takes you down a nostalgic road starting with the dotcom bubble all the way to the current buzz around social media and Web 2.0 (and why this is not a bubble!). It is an inspiring story and one that every entrepreneur (/wannabe/fanboy) would love to read. ;-)

But I recommend it mostly for the nuggets of wisdom in this book. If you are an entrepreneur, there is much to learn from the experiences and insights of those who have spent years building successful companies. Sarah does a great job of taking us through their voyage and writes in a unique and refreshing style of her own.

Indeed, an enjoyable book to read!

July 06, 2008

The Cold Start Problem in Social Media

Here is a classic cold start problem:

  • "Social Tools" can only be social if there are enough people on it.
  • And any social site is only as attractive as the number of friends you have on it.
  • The social site is only useful if there are enough people contributing to it. (be it annotating images, links or adding reviews)

The question is: how do you get enough users to adopt a tool and build sufficient traction around it that it attracts more users? I don't have all the answers, and perhaps entrepreneurs and folks in startups are more knowledgeable about this than I am. But this is a question I have been pondering for some time. Most of these points might be fairly obvious, but here are a few thoughts I'd like to share:

  • Above all, build something cool!
  • Realize that in many systems, 1% of users acting as contributors is all it takes! (see Clay Shirky's book "Here Comes Everybody" for more on this). Ensure that the reward mechanism is automatically built into the site. These 1% of all users are not the ones that are motivated by money. They use your tools because they enjoy them or because the tools solve some real problem they have been facing. I have seen some sites trying to "pay" users to add data to "seed" their site. For example, check out some of the high paying HITs on Mechanical Turk. In my opinion, this is like throwing money out of the window. Completely bogus way to jump start your site!
  • Provide APIs: One of the key factors that contributed to the success of Twitter was that they had a neat API that developers immediately adopted and had fun building cool toys with. These 3rd party tools in turn make it easy for users to contribute and engage with your site, thus breaking the cold start problem. For example, even though they were not developed by Twitter, the plethora of third party Twitter clients makes it easy to post your tweets.
  • Try to "seed" your site with datasets curated by web crawls, APIs, external databases or using the tools yourself. For example, if you are building a site that uses geotagging, you might consider using sites like geonames.org or if your site is around movies -- use IMDB or Amazon data to seed it.
  • Make sure stuff is findable and socially visible. Make it easy for your users to find the data they really care about and they will be willing to annotate. Moreover, make sure that it is easy for users to share what they annotate with their friends. The beauty of Facebook is the news feeds. People like to know what their friends are up to. On Twitter, I want to know what my friends are saying and be able to have conversations with them - without that Twitter is just a chat room.
  • Don't ever SPAM! Every week I have a bunch of emails coming from some random sites that a friend of mine once joined. Make sure that in your invitation email you include an option to not receive any requests from your site in the future. I am amazed when I don't see this option at all. If a user does not wish to join a network, please don't keep sending them emails requesting them to join every time one of their friends signs up! Also, I am a sucker for alpha/beta testing any new social media/social network site. Sometimes, I even try the ones that eventually end up spamming everyone on your email/IM. As a RULE -- never spam your potential users! That is the best way to piss them off even before they join.
  • Listen and iterate rapidly. Your alpha/beta users are the most important. Listen to what they have to say. Also, if you cannot convince your friends and family to use the tool -- why would anyone else bother?

The cold start problem has been studied in computer science, particularly for recommendation systems**. A good place to start is the paper:

Methods and metrics for cold-start recommendations. Schein, A.I.; Popescul, A.; Ungar, L.H.; Pennock, D.M.

I am quite interested in knowing how startups have approached this problem in real situations and particularly, if there is any analytical data available to show what worked and what did not? I guess this might be information that few would be willing to share so openly.

** On a related note: the blog "Duke Listens" is an excellent source for more on recommendation systems. Also check out the recent post on the cold start problem.

July 01, 2008

Gmail as a Universal Contact List

UPDATE2: Great news! Just as I was thinking about this issue yesterday, it looks like Google released its official AJAX client library for its Contact API. So is Google becoming a universal contact list after all? Now, with these tools and APIs available, third party sites have no excuse whatsoever for continuing to insist on asking for a username and password to import contacts!

UPDATE1: As it turns out, I totally forgot about the recent announcement of Google Friend Connect and the controversy that soon followed. This is the kind of approach I was thinking of; it just slipped my mind while writing this post late into the night. I think I had signed up for the private beta as well and am awaiting an invitation. Here is the video that explains Google Friend Connect.

I hope that with Google, Facebook and MySpace all trying to solve this problem, third party apps trying to import contact lists using passwords/credentials directly will soon be a thing of the past.

-----------------------------------------------------------------------------

Gmail has almost become a universal contact list. At least all social network sites think it is so...

I just don't understand why, every time I am asked for my Gmail user ID and password (to find friends on a network), I cringe but then finally give in -- only to get burnt, burnt and burnt (ouch!). What drives me nuts is when some of these sites get away with sending your password in plain text! Why do we put up with this nonsense, in this day and age?

One suggestion I have for this problem is to build a Gmail Friend Finder API that would allow Yet Another Social Network (YASN) to access our universal contact list. What I mean by this is: Gmail knows everyone I know and interact with. I trust Gmail and am generally more willing to let Gmail be the arbiter of my social information. Why is this a good idea? For starters, third party apps needn't ask users for their password. I just ask them to go and talk to Gmail to see if there are others on their site whom I might know and who might be interested in connecting with me.

Yesss! I am aware of OpenID and the Social Graph API. Here is a small glitch, though. The Social Graph API relies on FOAF/XFN and not everyone has that information published online. OpenID is more for authentication and IMHO, it's kinda unintuitive and difficult to explain even to tech savvy folks -- let alone my grandmother! Gmail on the other hand... everyone has an account there and we all 'get it'! To be fair here, the Microsoft Passport account was in some sense a precursor to all this, perhaps even a little too early for its time!

Following is an illustrative example of how I see this working:

The approach that I think might work better would involve developing an API for Gmail. When I first join YASN, instead of sending me an email directly, it outsources the verification process to the Gmail API. Gmail sends me an email to verify that it was actually me who signed up on YASN. Once I confirm, it sends YASN a confirmation that it has verified it is me. In addition, it sends a secret identifier that it requires YASN to send over SSL when asking for any of my data. Note that at this point Gmail already knows for certain that I am a member of YASN. Now, I want to check if any of my friends are on YASN. So YASN will connect once again with the Gmail friend finder, passing along the token/secret code that was sent to it when I completed the email verification. Now the only friends that the Gmail API sends to YASN are the ones who are connected to me on Gmail AND are also members of YASN.

Since YASN can only access limited information via the Friend Finder API, it cannot spam everyone on my email account. Additionally, since it does not have my password, it minimizes the risks of my account being hacked or YASN doing something malicious. Of course all this is just conceptual -- unless the Google/Gmail team actually implements some such API.
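To make the flow concrete, here is a minimal sketch in Python. Everything in it is hypothetical: no such Gmail API exists, and the names FriendFinder, verify_signup and find_friends are invented purely to illustrate a token-gated contact intersection. Email delivery and SSL transport are left out entirely.

```python
import hmac
import secrets


class FriendFinder:
    """Toy stand-in for the imagined (hypothetical) Gmail Friend Finder."""

    def __init__(self, contacts):
        self.contacts = contacts  # user email -> set of contact emails
        self.members = {}         # site name -> set of verified member emails
        self.tokens = {}          # (site, user) -> secret issued at verification

    def verify_signup(self, site, user):
        """Record that `user` confirmed joining `site`; return the site's secret."""
        self.members.setdefault(site, set()).add(user)
        token = secrets.token_hex(16)
        self.tokens[(site, user)] = token
        return token

    def find_friends(self, site, user, token):
        """Return only the contacts of `user` who are also members of `site`."""
        expected = self.tokens.get((site, user))
        if expected is None or not hmac.compare_digest(expected, token):
            raise PermissionError("unknown user or bad token")
        return self.contacts.get(user, set()) & self.members.get(site, set())


# Made-up addresses for illustration.
gmail = FriendFinder({"me@example.com": {"ann@example.com", "raj@example.com"}})
my_token = gmail.verify_signup("yasn", "me@example.com")
gmail.verify_signup("yasn", "ann@example.com")
friends = gmail.find_friends("yasn", "me@example.com", my_token)
print(friends)
```

In this toy run, YASN learns about ann@example.com because she is both a contact and a verified member, while raj@example.com, who never joined, is never revealed to the site.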

[Thanks Audumbar Chormale, for the discussions and the question that led to this post]

June 26, 2008

Evri: Search Less, Understand More

I just received the beta invite to Evri.com (Yaaayy!). It is a really cool site that aims to help people find information. Right now they just have a browse interface. You can see the top concepts and named entities (primarily from news sources) and navigate through semantically related terms. The main idea behind their approach is that you can construct the graph of all the concepts and entities by analyzing the text. Here is an example of the top names in the news. Clicking the terms "Barack Obama" and "Ralph Nader" in the graph, for example, would pull up all the stories related to recent controversies.

One can browse through the graph or the popular terms. I checked out what they found on Obama. Here is a snapshot on the left. I think that a really neat trick that Evri is using is the idea that working on sentence-level semantics can provide sufficient meaning to help organize information. Constructing a complete parse tree that is both syntactically and semantically accurate is a difficult problem. There are many vagaries of natural language text that make this challenging. Evri, at least for now, bypasses some of these problems by organizing information around simple questions like "what is Obama doing?", which have easy-to-identify clues directly accessible from the text (criticizing, leading, denying, facing...). Similarly, for other entities like organizations one can ask "What is happening with Yahoo?" (bidding, reject, acquire, etc.).
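As a toy illustration of the "what is X doing?" idea (and certainly not Evri's actual method, which is not public), the sketch below simply grabs the word that follows an entity name, skipping a helper verb such as "is". The function name and the sample sentences are made up; a real system would need proper parsing and entity recognition.

```python
import re
from collections import defaultdict


def actions_by_entity(sentences, entities):
    """Map each entity to the words that directly follow it in the text.

    A helper verb ("is", "was", "has been") right after the entity is skipped.
    """
    found = defaultdict(list)
    for sentence in sentences:
        for entity in entities:
            pattern = re.escape(entity) + r"\s+(?:(?:is|was|has been)\s+)?(\w+)"
            match = re.search(pattern, sentence)
            if match:
                found[entity].append(match.group(1).lower())
    return dict(found)


# Invented example sentences, not real news text.
sentences = [
    "Obama is criticizing the new proposal.",
    "Yahoo rejected the revised bid.",
    "Analysts say Obama is leading in the polls.",
    "Yahoo is weighing a counter-offer.",
]

result = actions_by_entity(sentences, ["Obama", "Yahoo"])
for entity, actions in result.items():
    print(entity, actions)
```

Even this crude pattern shows why the trick is attractive: the clue words sit right next to the entity, so no full parse tree is needed to group stories by what an entity is doing.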

This is a fascinating approach to organizing information and I think that Evri has great potential. Let's think about it for a minute. One of my favorite pastimes is to go to Wikipedia, pull up a random article and then browse through related articles. It is this serendipity and the feeling of chance discovery of something interesting that is so compelling about Evri.

Evri also reminded me of the way I had hoped to implement SemNews, a semantic search engine that analyzed RSS snippets of news articles and processed them through OntoSem, an ontological-semantics-based Natural Language Processing system. Once the semantics/meaning representations were extracted, I would store the meanings in an OWL store so that RDQL queries could be performed to find relevant news items. I believe that the way we can accomplish Dr. Tim Berners-Lee's vision of the Semantic Web is by advancing both information extraction (web scraping, entity annotation etc.) and NLP techniques that would automatically annotate text and make it available in machine-readable format.

Although the founder claims that they are not a search engine, they surely join the group of NLP-based startups like Powerset and Hakia. Another powerful tool is Freebase, which uses primarily Wikipedia as its source of information. Finally, it is also worth mentioning that Kosmix is yet another startup that aims to "Organize the Web so that you can explore, learn and discover".

The next obvious question that comes up is regarding the monetization and business model of these startups. Of course, the story goes... the information is more focused so ads would be more relevant... and, no surprise, that is indeed so TRUE. Just check out some of the advertisements on Evri. On the left is a screenshot of an advertisement on Barack Obama's info page. But I think there is an opportunity beyond simply relevant advertising!

Many companies have huge websites with lots of information -- some organized and most not quite as much. If you wanted to ensure that your customers are able to get to the exact information they need, an Evri-like approach can be ideal to help them browse through the various facets to get to what they really need. Applications to enterprises and enterprise search can be another monetization platform for Evri.

Finally, IMHO, one hurdle that Evri faces could be dealing with noisy text, especially with social media. Many approaches that rely on linguistic or grammatical correctness of sentences simply fail miserably when dealing with social media content. The second problem might be ensuring coverage of information. Right now, it seems to me like the news sources Evri relies on are primarily US centric. As they aim to capture more audience outside the US as well, they would have to concentrate on foreign languages and on disambiguating named entities and location names. These are all interesting research problems and fun stuff to work on!

June 20, 2008

Some things are just Semi-Social

Social Media is a lot about sharing. Prior to the growth of social software, it wasn't that people did not share stuff -- they just did it offline or via email. Now we share at a massive scale and a lot more easily. 

Some things we are willing to share "openly"

  • Music playlists (Last.fm)
  • Books we read (iread, shelfari)
  • Calendars and Travel plans (google calendar)
  • Status updates (via Twitter and Microblogging)
  • Restaurant recommendations (yelp)
  • Knowledge and expertise (via Wikipedia)

As we start to experiment with social software we realize that sharing is good and soon become open to sharing a lot more. There are some things, though, that just seem semi-social. What I mean by Semi-Social is roughly "Things I would not mind sharing with a small group of trusted friends and family members".

Until just a few years back there would have been a lot more people squirming if they were asked to share such 'sensitive data' with others. I see this perception slowly eroding away. There is a small, albeit enthusiastic bunch experimenting with new tools that fall into the category of Semi-Social. 

Some cases that I can think of are as follows:

  • Investment portfolio: One example is Covestor. I have an account there but it is under a pseudonym. I would not be that enthusiastic about revealing my pathetic attempt to bet on the stock market by watching (mostly tech) blogs. Sigh!
  • TV watching habits: I think Television as we know it today is completely broken. There is no social aspect to it whatsoever. At ICWSM, Noor Ali-Hassan presented a paper on "Social Media Scenarios for Television". What struck me about this talk was her statement that "Despite its social nature, there is a private aspect of TV that people want to preserve".
  • Income and financial information: This is something we had least anticipated. How did we get to a point where I am actually not that scared while putting all my bank details and credit card information into a site like Mint? Mint is not a social site as such. But it reflects how we are now willing to part with some really sensitive data. In contrast, there are other examples of recruitment sites like SimplyHired where people reveal their salary information and can search for companies by salary. A more recent startup that is quite similar is Glassdoor.
  • Location: Location can be an extremely sensitive piece of information. Fortunately, Yahoo's FireEagle provides access control for various applications and one can set the privilege that each app has to access location information (lat/long, zip, state, country etc.).

There will always be some who are at the extreme end of the spectrum and are quite comfortable with being completely (publicly) transparent about "sensitive data". However, most would still only dare to share some of this data with close friends and select people -- i.e. if there is enough value proposition in it for them. Some would be comfortable with aggregate analysis over the data as long as they are not personally identified or targeted in some way (advertising or otherwise).

Although it requires a great deal of courage to work with privacy-sensitive data, the opportunity to invent in the semi-social space may be quite large.

June 19, 2008

Email Interview: Nihaar Gupta, Youlicit

Nihaar Gupta, VP of Product Development at Youlicit, has kindly agreed to an email interview with SocialMedia Research Blog. Following are his responses to some of the questions I had for him:

1) Please describe Youlicit to us.

Youlicit at its core is a discovery engine (http://blog.youlicit.com/?p=23). We want to connect you to the most relevant and recommended information as effortlessly as possible. As of now, we are building a technology that allows a user to find the most recommended sites (recommended by people around the web) related to a given URL. We believe that people are the best judges of content and more often than not, the information you are looking for has been found by someone before. Our goal is to aggregate that information and allow the user to access it with the click of a button.


2) Tell us about your background, the team behind Youlicit, and how you started it.

Youlicit came about as a result of trying to solve our own frustrations with trying to find information on the web. With the enormous amount of user-generated content and annotations on the web, we saw a huge amount of valuable data that was inaccessible and fragmented. For the sake of brevity, the team bios & background are here http://blog.youlicit.com/?page_id=6


3) Please give us a brief overview of the technology behind Youlicit.

Youlicit aggregates user annotations of websites and other user generated content and analyzes it to create a URL-URL mapping of websites based on relevance and quality. Using this mapping we are able to deliver related and recommended sites to a user with a click of a button.


4) While using Youlicit plugin, I felt that one of the challenges is the coverage -- how do you plan to address this and build your current index?

We are constantly working on improving our coverage. There are two metrics we strive to maximize for our results, quality and relevance. In regards to quality, we’re always looking to increase our database of “quality sites” by tapping into the various kinds of user annotations (denoting quality content) that exist on the web (bookmarks, tags, votes, comments). In regards to relevance, we’re always researching novel ways to extrapolate connections between websites and map URL’s back to our database of “quality sites”.


5) How do you rank the 'Enhanced Links' in the plugin? Do you also take into account how many users actually click through the suggested links?

Each result in the Youlicit More widget (and on Youlicit’s site) has a score based on the metrics above, quality of the site and relevance to the item being queried. We are looking into ways of scoring the results from implicit/explicit feedback that we get from users (clicks, recommends).


6) How do you ensure that the Enhanced links feature is non-intrusive?

The current version has manifested itself after a few weeks of alpha testing with a handful of bloggers. That said, we are still looking for feedback on the user interface and would love to hear opinions on how to make it more useful and less intrusive for bloggers/blog readers.


7) How would you compare the plugin to Sphere's related blog posts?

While Sphere focuses on related & recent blogosphere content, we, at Youlicit, are trying to provide the blog reader with more seminal information related to the blogger’s topic of conversation. For instance, if you are reading a blog entry on global warming, you are more likely to receive the most recommended articles (blogs, sites, essays) on global warming from around the web rather than recent blog entries on that topic.


8) What are the other features on Youlicit?

Youlicit’s primary product is a Firefox extension that allows a user to access our results during his/her browsing experience. We are in the process of redesigning our website and streamlining the current offering to focus on this button. Down the road we would like to be able to deliver personalized recommendations for users as well as connect users to people based on transient and long-term interests (ideally using a person’s interests to enhance his/her social graph).


9) Would the plugin be advertising supported?

We do see advertising as a very possible source of monetization. Given the fact that we are providing contextually relevant information, the search model of advertising applies nicely. We are also exploring other possible means of monetization but as of right now the priority is to build something that people find useful.


10) What are the next things to look out for at Youlicit?

As I mentioned above, we are stripping down Youlicit to bring the focus back to its core: the Youlicit More functionality via the Firefox extension and blog widget. We expect to release a newly designed website very soon. And as always, we love to hear feedback on what you think so far and how we can improve.
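A note from the editor's side on answer 3: the interview does not say how the URL-URL mapping is actually computed. As a rough, assumption-laden sketch, one simple way to relate URLs from user annotations is to count how many users bookmarked both. The function name and the sample bookmarks below are hypothetical and should not be read as Youlicit's method.

```python
from collections import defaultdict
from itertools import combinations


def related_urls(bookmarks):
    """Build a URL -> [(other_url, shared_users), ...] mapping.

    bookmarks: user -> set of URLs that user annotated (assumed input format).
    Two URLs are related once for every user who annotated both of them.
    """
    co_counts = defaultdict(lambda: defaultdict(int))
    for urls in bookmarks.values():
        for a, b in combinations(sorted(urls), 2):
            co_counts[a][b] += 1
            co_counts[b][a] += 1
    # Rank neighbours by shared users, ties broken alphabetically.
    return {
        url: sorted(others.items(), key=lambda kv: (-kv[1], kv[0]))
        for url, others in co_counts.items()
    }


# Hypothetical annotations for illustration.
bookmarks = {
    "user1": {"a.example", "b.example", "c.example"},
    "user2": {"a.example", "b.example"},
    "user3": {"b.example", "c.example"},
}

mapping = related_urls(bookmarks)
print(mapping["a.example"])
```

A production system would presumably also weigh quality signals such as votes and comments, as the answers above suggest, rather than raw co-occurrence alone.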


Youlicit: Search Less and Find More!

Youlicit is a new tool that helps you "Search less and find more". Often we forget that search is only one means to find what we are looking for. Even search by itself is not the endpoint of an information need or a query. This tool reminds me of the "berry picking model" of Information Retrieval that I first read about in my IR class. The model basically says that:

Information need is not satisfied by a single set of documents but by bits and pieces found along the way.

The paper titled "The Design of Browsing and Berrypicking Techniques for the Online Search Interface" describes a searcher as

Moving through many actions towards a general goal of satisfactory completion of research related to an information need.

What Youlicit does is provide this ability implicitly, without the reader (or more generally a searcher) having to go through the trouble of navigating and mentally processing hyperlinks or firing search queries to find related content. Youlicit takes care of all that on your behalf. By providing a simple plugin, the Youlicit widget automatically highlights some of the related, relevant links and provides useful suggestions -- all without your audience ever leaving your blog. I love the idea and the neat implementation that these guys have built. (The very same need was what led me to hack this Wikipedia related widget a few weeks earlier.)

On the Youlicit site, you have lots more interesting tools. You can discover new content that is relevant to your interests, find related users and share links with them or follow their interests. Youlicit is paving the way for social browsing tools and is a neat concept that is well implemented. Their index does not seem to be very large at the moment and I feel that it would get better as they start to seriously scale up. In the interim, I feel that there might be stopgap solutions that they could employ -- for example, the Alexa related URLs for the links that are not currently in Youlicit's index.

In relation to this plugin, one tool that is similar is the Sphere plugin that shows related blog posts. I feel that Sphere serves a complementary need. From what I understand, Youlicit aims to find the interesting blogs and Web URLs one might want to look into in relation to a given hyperlink.

Another plugin is the Snap plugin -- which shows a screenshot of the outlink. However, in my opinion Snap does not really serve much purpose and is a bad tool from a usability perspective.

Youlicit is non-intrusive and you are gonna enjoy the serendipity of finding interesting new links! Give it a spin!

May 30, 2008

My First FireEagle App: Pizza Coupons Search

Yesterday, I discussed an idea around the FireEagle geolocation API. I was envisioning an app where, as you walk down the Mall or any other location with your mobile phone, it would pre-fetch relevant coupons and offers from the local restaurants. As grad students, we always learn to find good pizza deals online. So I decided to use the FireEagle API to develop a pizza coupon finder. The way it works is that it authenticates with FireEagle to access your current location, fetches the coupons from Google Maps, and then parses the output to display on your mobile phone or in a browser. If you already have a FireEagle account, you can try it at the following URL: http://wikimatix.com/coupon/pizza.php
First the application will try to authenticate with FireEagle and request the appropriate permission to access the exact or approximate location information; it then passes this to the Google Coupon Finder.

Finally you have all the coupons you need to order your fresh pizza. The documentation and example walkthrough code in FireEagle's developer area are excellent. It took hardly any time to put together this demo!
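The demo itself is a PHP page, but the two-step flow (look up a location, then turn it into a coupon search) can be sketched in a few lines of Python. This is only an outline under stated assumptions: get_location_stub is a hypothetical placeholder for the authenticated FireEagle lookup, and the base URL and parameter names are invented, not the real FireEagle or Google endpoints.

```python
from urllib.parse import urlencode


def get_location_stub():
    """Hypothetical stand-in for the authenticated location lookup.

    A real app would complete the OAuth handshake and ask the location
    service for whatever precision the user has granted.
    """
    return {"postal_code": "12345", "exact": False}


def build_coupon_query(location, keyword="pizza",
                       base_url="https://example.com/coupons"):
    """Build a coupon search URL for the given location (made-up endpoint)."""
    params = {"q": keyword + " coupons", "near": location["postal_code"]}
    return base_url + "?" + urlencode(params)


url = build_coupon_query(get_location_stub())
print(url)
```

Keeping the location lookup and the query building separate means the coupon search works the same way whether the user shares an exact position or only a postal code.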

I think that the possibilities that this opens up for mobile advertising are exciting. We should also keep an eye on Android -- this space is gonna be fun to watch. [Update: Fixed the broken link. Sorry]
