Disclaimer

  • Thoughts and comments expressed here are those of the author. Creative Commons License

social media

July 24, 2008

M3SN Workshop on Social Media

Just wanted to share a quick pointer to the First International Workshop on Modeling, Mining and Managing Evolving Social Networks (M3SN), to be co-located with IEEE ICDE 2009 in Shanghai. Following is the CFP, via Christian König.

I have also added the submission deadlines in the Social Media Calendar (add it to your Google calendar!).

Read this document on Scribd: M3SN Workshop on Social Media

July 20, 2008

Advertising Models: From Contextual to Conceptual/Semantic to Social

Contextual advertising relies on matching an advertisement with a page based on its content. Most often, advertisers bid on keywords and the ad platform finds appropriate pages on which to display these ads by matching keywords and phrases against the content. There are a number of situations where such an approach can fail. A few real-world examples are discussed in the paper by Broder et al. [2]:

a page about a famous golfer named “John Maytag” might trigger an ad for “Maytag dishwashers” since Maytag is a popular brand. Another example could be a page describing the Chevy Tahoe truck (a popular vehicle in US) triggering an ad about “Lake Tahoe vacations”. Polysemy is not the only culprit: there is a (maybe apocryphal) story about a lurid news item about a headless body found in a suitcase triggering an ad for Samsonite luggage! In all these examples the mismatch arises from the fact that the ads are not appropriate for the context.

These examples highlight the need to move from contextual to conceptual/semantic advertising models. The paper by Broder et al. suggests mapping the pages as well as the ads into a common ontology/taxonomy and thereby finding the appropriate higher-level concept (like politics or sports) that relates the two. There are very few papers on advertising models, since these are closely guarded secrets and a substantial part of a search company's revenue is tied to its ad platform's performance.
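To make the difference concrete, here is a minimal sketch contrasting plain keyword matching with matching through a shared taxonomy. The tiny taxonomy, the term lists and the function names are all made up for illustration; this is not the classifier from the Broder et al. paper, just a toy version of the "map both page and ad to a concept" idea.

```python
import string

# Hypothetical toy taxonomy: each concept is described by a few indicative terms.
TAXONOMY = {
    "sports/golf": {"golfer", "golf", "tournament", "putt", "fairway"},
    "home/appliances": {"dishwasher", "dishwashers", "washer", "appliance", "kitchen"},
    "autos/trucks": {"truck", "chevy", "engine", "towing", "vehicle"},
    "travel/vacations": {"vacation", "vacations", "lake", "resort", "hotel"},
}


def tokens(text):
    """Lowercase words with surrounding punctuation stripped."""
    words = (w.strip(string.punctuation).lower() for w in text.split())
    return {w for w in words if w}


def keyword_match(page, ad_keywords):
    """Contextual matching: the ad fires if any bid keyword appears on the page."""
    return bool(tokens(page) & {k.lower() for k in ad_keywords})


def classify(text):
    """Return the taxonomy concept with the largest term overlap, or None."""
    words = tokens(text)
    best, best_overlap = None, 0
    for concept, terms in TAXONOMY.items():
        overlap = len(words & terms)
        if overlap > best_overlap:
            best, best_overlap = concept, overlap
    return best


def concept_match(page, ad_text):
    """Semantic matching: the ad fires only if page and ad map to the same concept."""
    page_concept = classify(page)
    return page_concept is not None and page_concept == classify(ad_text)


page = "John Maytag, the famous golfer, won the golf tournament with a long putt."
ad_keywords = ["Maytag"]
ad_text = "Maytag dishwashers: the quiet kitchen appliance."

print(keyword_match(page, ad_keywords))  # the keyword matcher places the ad
print(concept_match(page, ad_text))      # the concept matcher does not
```

A real system would need a far larger taxonomy and a proper classifier, but even this toy version shows why the golfer page and the dishwasher ad stop matching once both are lifted to concepts.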

I believe that advertising is in its infancy and that more interesting approaches will soon replace the current state of the art. One problem is that, due to the lack of datasets, it is quite difficult to make a significant contribution to this area from academia.

The next avenue for advertising seems to be social advertising. While Facebook has its own approach to social advertising, I think the general idea is to use not just the context of the page but also social information to better place the advertisement. One question here is whether the ad placement targets the user or his/her audience. These might require slightly different models. For example, if my friends are all clicking on the iPhone ad on a social network, the platform might decide to also target ME personally when marketing the iPhone. On the other hand, if it can identify that a lot of users visit my profile because of the social media posts I write, then perhaps the advertising could target them instead.

One potential market for advertising that I think is completely untapped is the referral. Companies are ready to pay huge sums of money to get new clients. Cell phone companies, stock trading sites and banks often launch promotions where they pay upwards of $50 for referring a friend. But when was the last time you actually did that? I think that is the best way to alienate your friends -- by hawking corporate America's products and services or spamming their inboxes with unwanted referrals. But still, this is a huge market, worth billions -- if we can crack it! One approach I am thinking of is to build a referral platform (perhaps there are some out there -- I just don't know?) -- one which would benefit publishers and advertisers alike. As a publisher, I have a (sort of) general sense of what my audience would like. I might even decide to share the $50 I receive from the advertiser and pass the benefit on to my readers (since my payoff is in having the readers come to my blog!) -- thus subsidizing that iPhone you wanted to buy. In the current model, there isn't much incentive for me to share (a few cents???) or pass on the benefit to the final consumer. But for higher-valued products, my guess is that it might just work. Moreover, the referral platform manages the entire process, making it easier for advertisers to launch new schemes and manage their inventory of referral programs.

I had intended to write a brief note on some of the recent papers [1-4] on this topic but ended up sharing my thoughts on advertising instead -- which is perhaps more fun anyway :-).


[1] http://www.cs.cmu.edu/~deepay/mywww/papers/www08-interaction.pdf
[2] http://portal.acm.org/citation.cfm?id=1277837
[3] www.csulb.edu/web/journals/jecr/issues/20081/Paper1.pdf
[4] http://www2008.org/papers/pp231.html

July 19, 2008

What is the Dunbar's Number for Social Networks?

Many folks are really excited about FriendFeed. Personally, I have found that there are a lot more comments when something gets posted on FriendFeed. Recently, Yuval Atzmon's User21 blog released a list of the most followed users on FriendFeed. Since I too had a crawl of FriendFeed running in much the same way as Yuval, I decided to look at the complementary question: "How many users do people follow on FriendFeed?" While the crawl is not yet complete (and complete statistics will have to wait), the numbers are really striking! Some users follow more than 1,000 "friends":

sthayden 3190
scobleizer 3087
juliomedina 2760
thomashawk 2557
jasoncalacanis 2447
theillife 2045
mrsth 1961
pookakoo 1814
czarphanguye 1736
brynyoungblut 1716
eposter 1562
susangrisantiguitarist 1550


I find this really amazing. Unlike Twitter, FriendFeed posts are accompanied by longer conversations, so following them can be more involved. I can barely keep up with all the information flying past me everywhere right now! I guess 1,500+ "friends" would be way too much for me!

Sociologists often talk about Dunbar's number, which

is the supposed cognitive limit to the number of individuals with whom any one person can maintain stable social relationships.

In human contact networks, Dunbar's number is said to be around 150. It may well be the case that social tools, and especially microblogging, are pushing this limit further. Studies on Twitter, LiveJournal and other social networking sites seem to support this observation. I wonder then: what would Dunbar's number be on social networks? 300? 500??? Any guesses? Perhaps a comparison across the published papers that have studied different social networks would offer some clues.
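For anyone who wants to run this kind of comparison on their own crawl, here is a rough sketch of the computation. It assumes the crawl is available as (follower, followed) pairs; the edge list below is invented purely for illustration and is not data from my crawl.

```python
from collections import Counter

DUNBAR = 150  # the commonly quoted estimate mentioned above


def following_counts(edges):
    """Count how many distinct accounts each user follows.

    `edges` is an iterable of (follower, followed) pairs; duplicates are ignored.
    Users who follow no one never appear as a follower and are not counted.
    """
    return Counter(follower for follower, _ in set(edges))


def share_above(counts, limit=DUNBAR):
    """Fraction of users in `counts` who follow more than `limit` accounts."""
    if not counts:
        return 0.0
    return sum(1 for n in counts.values() if n > limit) / len(counts)


# Hypothetical crawl: alice follows 200 accounts, bob follows 2 (one edge repeated).
edges = [("alice", "user{}".format(i)) for i in range(200)]
edges += [("bob", "alice"), ("bob", "carol"), ("bob", "alice")]

counts = following_counts(edges)
print(counts["alice"], counts["bob"])
print(share_above(counts))
```

Keep in mind that a partial crawl is biased towards whichever users were reached first, so a fraction computed this way is only a rough indication.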

[BTW, I am akshayjava on FriendFeed]

July 17, 2008

Get a Room People!


I've been tinkering with Google's new virtual world, Lively. It seems a bit flaky at the moment and works only on Windows right now... but I was able to set it up. The neat thing about it is that the rooms can be embedded anywhere; in this blog, for example. Like MyBlogLog, Lively can also serve as a visitor log and, unlike Second Life, it isn't a walled garden. It is a fun thing to try out but, IMO, its utility seems low. What it lacks right now is the stickiness factor... a few of the 2D Facebook app games are more addictive than this. Usually in such situations, an occasional flash mob like the one on Huddle is a lot more fun!

Clustering Triples from Social Data

By far the most prevalent data available in social media is tagging information. For example, in del.icio.us a user may tag a URL, or in Flickr she may tag an image. One question that comes up is how to cluster social data that is rich in tags. Some available techniques ignore the user information and use only a bipartite graph consisting of tags and URLs. Another method is to represent two pieces of evidence (user-tag; tag-blog) in a tripartite graph (where nodes are of three different types: users, tags and URLs). However, even this type of structure misses the higher-order relation between the three nodes. Note that the information available is really in triples of the type <user, tag, url>. This information is not captured by the tripartite graph model. In particular, two users may be connected via a common tag even if the actual URLs they bookmarked are vastly different.

There are some techniques using Tensor Matrix Factorization that can handle such data. However, the question of how to deal with triple (or higher) information from social data is quite interesting. Moreover, being able to do so efficiently and in an online fashion would also be important. I believe that this topic may be of significant interest in the upcoming social media and data mining conferences. The implications of these techniques would be in building better recommendation systems and personalization algorithms.
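The information loss described above is easy to see on a toy example. The sketch below uses made-up users, tags and URLs, and the helper names are my own; it splits a few triples into the two edge sets of a tripartite graph and then checks which triples those edges are consistent with.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical tagging data as <user, tag, url> triples.
triples = {
    ("alice", "python", "http://example.com/snakes"),
    ("bob", "python", "http://example.com/programming"),
    ("carol", "python", "http://example.com/programming"),
}


def tripartite_edges(triples):
    """Split triples into the user-tag and tag-url edge sets of a tripartite graph."""
    user_tag = {(user, tag) for user, tag, _ in triples}
    tag_url = {(tag, url) for _, tag, url in triples}
    return user_tag, tag_url


def implied_triples(user_tag, tag_url):
    """Every <user, tag, url> combination that the two edge sets are consistent with."""
    return {(user, tag, url)
            for user, tag in user_tag
            for other_tag, url in tag_url
            if tag == other_tag}


def co_taggers(triples, key):
    """Pairs of users who share the same key, e.g. a tag or a (tag, url) pair."""
    users_by_key = defaultdict(set)
    for user, tag, url in triples:
        users_by_key[key(tag, url)].add(user)
    pairs = set()
    for users in users_by_key.values():
        pairs.update(combinations(sorted(users), 2))
    return pairs


user_tag, tag_url = tripartite_edges(triples)
spurious = implied_triples(user_tag, tag_url) - triples

same_tag = co_taggers(triples, key=lambda tag, url: tag)
same_tag_and_url = co_taggers(triples, key=lambda tag, url: (tag, url))

print(len(spurious))      # triples the graph cannot rule out but nobody created
print(same_tag)           # alice is linked to bob and carol through the tag alone
print(same_tag_and_url)   # only bob and carol actually tagged the same URL
```

The graph cannot tell the real triples from the spurious ones, which is exactly the higher-order relation that gets lost. Tensor-based methods keep the triples intact instead of flattening them into pairwise edges.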

[Thanks Vlad Korolev for some of the discussions related to this post]

July 15, 2008

Google Calendar Feature Requests

I have started a Google Calendar to keep track of events and conferences in social media. Some of you may have already subscribed to it. However, there does not seem to be any way for me to tell exactly how many people are using it!

Feature Request #1: Show Number of Subscribers for a Calendar

This would certainly be quite a useful feature to have and, as it turns out, I am not the first person to request it.

Feature Request #2: Allow Tagging; Sharing Calendars a User has Added

A calendar, as a shared resource for planning and organizing events in a community, is an important tool. However, the calendar is still not as social as it can be! You can easily find new calendars to add. But what about tags? How about sharing? I would love to be able to create a tag cloud of events and a public list of calendars I have added.

Feature Request #3: Social Event Notification

If a user has made his or her events public, then why not show that user's friends an update on the events he or she plans to attend? I think that Dopplr does this at some level, but given that Google Calendar is a good place to consolidate all events and schedules and GMail is our universal contact list -- why not combine the two to make it more social?

Anyway, these are just a few quick thoughts I had about Google Calendar. It is a great tool and has made my life much easier. To be fair, I have never used Outlook, so my view may be a bit skewed.

SSM 2008 Deadline Extended

The organizers of CIKM 2008 Workshop on Search in Social Media have extended their paper submission deadline from July 20, 2008 to July 27, 2008. For more information please visit the homepage of the workshop. Below is an excerpt from the CFP:

  Social applications are the fastest growing segment of the web. Social media are a fascinating phenomenon because they establish new forums for content creation, allow people to connect to each other and share information, and permit novel applications at the intersection of people and information. However, whereas in the general web search is a critical application that drives usability, social media has been primarily popular for connecting people, not for finding information. While there has been progress on searching particular kinds of social media, such as blogs, search in others (facebook/myspace/flickr) are not as well understood. The purpose of this workshop is to focus the attention of the research community on this emerging topic, and to bring together information retrieval and social media researchers to consider the following questions: How should we search in social media? What are the needs of users, and models of those needs, specific to social media search? What models make the most sense? How does search interact with existing uses of social media? What works and what doesn't?

This workshop is chaired by

Note that the event has been updated in the Social Media Calendar as well. To keep track of social media research conferences and venues please subscribe via the calendar.

Lucky Number 2.0


I wanted to share a quick book recommendation. I picked up Sarah Lacy's recent book "Once You're Lucky. Twice You're Good." for a quick read during my flight.

It is a fantastic account of Web 2.0 and profiles successful social media startups, Silicon Valley entrepreneurs, venture capitalists and angel investors. The book takes you down a nostalgic road, starting with the dotcom bubble and going all the way to the current buzz around social media and Web 2.0 (and why this is not a bubble!). It is an inspiring story and one that every entrepreneur (/wannabe/fanboy) would love to read. ;-)

But I recommend it mostly due to the nuggets of wisdom in this book. If you are an entrepreneur, there is much to learn from the experiences and insights of those who have spent years building successful companies. Sarah does a great job at taking us through their voyage and writes in a unique and refreshing style of her own.

Indeed, an enjoyable book to read!

July 07, 2008

A Basic Proxy Server Using Google App Engine

I had been hoping to find time to check out Google App Engine. It has been widely touted as Google's answer to Amazon EC2 (which I have used a bit recently). Here is a very simple hack -- a basic proxy server built on Google App Engine:

http://proxifyit.appspot.com/

I have seen many "Free" anonymization/proxy servers online -- but in my experience, most of them are pretty slow or just crowded with advertisements. This is where Google App Engine (or EC2 for that matter) could be a good platform! It was dead simple to code up this demo by just following the guide and using the URL Fetch API.

My first reaction: Google App Engine is real fun! It was much easier than using EC2 and the turnaround time was unbelievable -- Python gurus would love it!

Unfortunately, I am pretty swamped right now but I would like to get to the following tasks to improve it:

  • Handle relative URLs correctly
  • Handle Stylesheets
  • When clicking a new link pass it to proxifyit again
  • Options to disable cookies
  • Options to disable javascript
  • Options to strip out ads
  • Add a captcha if there are too many requests in a given period of time.
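The first three items on this list (relative URLs, stylesheets, and routing clicked links back through the proxy) come down to the same step: resolve every link against the URL of the fetched page and rewrite it to point at the proxy. Here is a rough sketch using only the Python standard library. The `/proxy?url=` prefix is a hypothetical route, not what proxifyit actually uses, and a regular expression is only a stand-in for a real HTML parser (it handles double-quoted `href` and `src` attributes and nothing else).

```python
import re
from urllib.parse import quote, urljoin

# Hypothetical route on the proxy that takes the target URL as a query parameter.
PROXY_PREFIX = "/proxy?url="

# Matches double-quoted href="..." and src="..." attributes only.
ATTR_RE = re.compile(r'(?P<attr>\b(?:href|src))="(?P<url>[^"]*)"', re.IGNORECASE)


def proxify_links(html, page_url, prefix=PROXY_PREFIX):
    """Rewrite href/src attributes so every link goes back through the proxy.

    Relative URLs are first resolved against `page_url`, the address of the
    page the proxy fetched, and then percent-encoded into the proxy URL.
    """
    def rewrite(match):
        absolute = urljoin(page_url, match.group("url"))
        return '{}="{}{}"'.format(match.group("attr"), prefix, quote(absolute, safe=""))

    return ATTR_RE.sub(rewrite, html)


html = '<a href="/about">About</a> <img src="img/logo.png">'
print(proxify_links(html, "http://example.com/blog/post.html"))
```

Since stylesheets are linked through `href` as well, they get rewritten by the same pass. URLs inside the CSS itself, or ones built by JavaScript, would still need separate handling.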

Nothing like a quick hack at the end of a busy day... :-)

July 06, 2008

The Cold Start Problem in Social Media

Here is a classic cold start problem:

  • "Social Tools" can only be social if there are enough people on it.
  • And any social site is only as attractive as the number of friends you have on it.
  • The social site is only useful if there are enough people contributing to it. (be it annotating images, links or adding reviews)

The question is: how do you get enough users to adopt a tool and build sufficient traction around it that it attracts more users? I don't have all the answers, and perhaps entrepreneurs and folks in startups are more knowledgeable about this than I am. But this is a question I have been pondering for some time. Most of these points might be fairly obvious, but here are a few thoughts I'd like to share:

  • Above all, build something cool!
  • Realize that in many systems, 1% of users acting as contributors is all it takes! (See Clay Shirky's book "Here Comes Everybody" for more on this.) Ensure that the reward mechanism is automatically built into the site. These 1% of users are not the ones motivated by money. They use your tools because they enjoy it or because it solves some real problem they have been facing. I have seen some sites trying to "pay" users to add data to "seed" their site. For example, check out some of the high-paying HITs on Mechanical Turk. In my opinion, this is like throwing money out of the window: a completely bogus way to jump start your site!
  • Provide APIs: One of the key factors that contributed to the success of Twitter was that they had a neat API that developers immediately adopted and had fun building cool toys with. These third-party tools in turn make it easy for users to contribute and engage with your site, thus breaking the cold start problem. For example, even though they were not developed by Twitter, the plethora of third-party Twitter clients makes it easy to update your tweets.
  • Try to "seed" your site with datasets curated by web crawls, APIs, external databases or using the tools yourself. For example, if you are building a site that uses geotagging, you might consider using sites like geonames.org or, if your site is around movies, use IMDB or Amazon data to seed it.
  • Make sure stuff is findable and socially visible. Make it easy for your users to find the data they really care about and they will be willing to annotate. Moreover, make sure that it is easy for users to share what they annotate with their friends. The beauty of Facebook is the news feeds. People like to know what their friends are up to. On Twitter, I want to know what my friends are saying and be able to have conversations with them; without that, Twitter is just a chat room.
  • Don't ever SPAM! Every week I have a bunch of emails coming from some random sites that a friend of mine once joined. Make sure that your invitation email includes an option to not receive any requests from your site in the future. I am amazed when I don't see this option at all. If a user does not wish to join a network, please don't keep sending them emails requesting them to join every time one of their friends signs up! Also, I am a sucker for alpha/beta testing any new social media/social network site. Sometimes I even try the ones that eventually end up spamming everyone on your email/IM. As a RULE -- never spam your potential users! That is the best way to piss them off even before they join.
  • Listen and iterate rapidly. Your alpha/beta users are the most important. Listen to what they have to say. Also, if you cannot convince your friends and family to use the tool -- why would anyone else bother?

The cold start problem has been studied in computer science, particularly for recommendation systems**. A good place to start is the paper:

Methods and metrics for cold-start recommendations. Schein, A.I.; Popescul, A.; Ungar, L.H.; Pennock, D.M.
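To give a flavor of the problem in the recommendation setting: a brand-new item has no ratings, so pure collaborative filtering has nothing to work with. One common workaround is to fall back on item attributes. The sketch below is only an illustration of that general idea with made-up movie attributes and function names of my own; it is not the model from the paper above.

```python
def jaccard(a, b):
    """Jaccard similarity between two attribute sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score_new_item(new_item_attrs, liked_items_attrs):
    """Score an unrated item by its average attribute overlap with liked items.

    `liked_items_attrs` is a list of attribute sets, one per item the user
    already liked. With no history at all, the score is 0.0.
    """
    if not liked_items_attrs:
        return 0.0
    total = sum(jaccard(new_item_attrs, attrs) for attrs in liked_items_attrs)
    return total / len(liked_items_attrs)


# Hypothetical user history: attribute sets of two movies the user liked.
liked = [{"sci-fi", "action"}, {"sci-fi", "drama"}]

# Two new releases that nobody has rated yet.
print(score_new_item({"sci-fi", "action"}, liked))
print(score_new_item({"romance", "comedy"}, liked))
```

The same trick does not help with a brand-new user, of course; there you need some other signal, which is where the seeding and social visibility points above come in.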

I am quite interested in knowing how startups have approached this problem in real situations and particularly, if there is any analytical data available to show what worked and what did not? I guess this might be information that few would be willing to share so openly.

** On a related note: the blog "Duke Listens" is an excellent source for more on recommendation systems. Also check out the recent post on the cold start problem.
