Contact Me


  • Akshay Java's Facebook profile

Disclaimer

  • Thoughts and comments expressed here are those of the author. Creative Commons License


April 28, 2008

Avg Size of a Web Document Compared to A Blog

I read this fascinating article (via Techmeme) that indicates that "the average Web page size has tripled since 2003". IMHO, the average is still a problematic measure when dealing with power-law distributions, where a handful of very large values can dominate the mean. I like the example that Clay Shirky mentions in the book "Here Comes Everybody":

If Bill Gates walked into a bar ... we'd all be Millionaires ......... ON AVERAGE!

The same holds when we are talking about Web pages. Nevertheless, the study is quite interesting and provides a very good analysis of how Web content is changing.
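To make the bar example concrete, here is a minimal sketch comparing the mean and the median before and after one very wealthy visitor walks in. All of the dollar figures are made up for illustration; none of them come from the article or the book.

```python
from statistics import mean, median

# Hypothetical net worths (in dollars) of nine bar patrons; made-up numbers.
patrons = [40_000, 55_000, 60_000, 75_000, 80_000,
           90_000, 120_000, 150_000, 200_000]

# One extremely wealthy visitor walks in (an illustrative figure, not a real one).
with_visitor = patrons + [50_000_000_000]


def summarize(values):
    """Return (mean, median) for a list of numbers."""
    return mean(values), median(values)


before = summarize(patrons)
after = summarize(with_visitor)

print(f"before: mean={before[0]:,.0f}  median={before[1]:,.0f}")
print(f"after:  mean={after[0]:,.0f}  median={after[1]:,.0f}")
```

The mean jumps by several orders of magnitude while the median barely moves, which is why a median (or the whole distribution) is usually a safer summary for heavy-tailed data such as page sizes.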

This made me wonder:
"What is the contribution of Social Media content in tripling the size of an average Web page?"

[Graph: Blogsweb_2, homepage sizes for the blog and Web page datasets]

While this is not a comprehensive study, I did a very quick back-of-the-envelope experiment to see what this would look like. I fetched the top 400 Web pages, as ranked by Alexa. Similarly, I took a set of 400 blogs (from the Buzzmetrics dataset) and cached their homepages as well (wget -p <url>). The graph compares the sizes of the homepages from the two datasets.
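For anyone who wants to repeat the measurement, here is a rough sketch of how the cached pages could be sized up. It assumes each homepage was saved by wget into its own directory under a hypothetical `cache/web` or `cache/blogs` folder; those paths and the function names are my own placeholders, not the scripts actually used for the graph.

```python
import os
from statistics import mean, median


def dir_size_bytes(path):
    """Total size in bytes of all regular files under path."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total


def sizes_by_site(root):
    """Map each top-level directory under root (one per site) to its size."""
    if not os.path.isdir(root):
        return {}
    return {
        entry: dir_size_bytes(os.path.join(root, entry))
        for entry in sorted(os.listdir(root))
        if os.path.isdir(os.path.join(root, entry))
    }


def summarize(sizes):
    """Report count, mean and median so one huge page cannot hide the rest."""
    values = list(sizes.values())
    return {"n": len(values), "mean": mean(values), "median": median(values)}


if __name__ == "__main__":
    # Hypothetical cache locations; adjust to wherever wget wrote its output.
    for label, root in [("web", "cache/web"), ("blogs", "cache/blogs")]:
        sizes = sizes_by_site(root)
        if sizes:
            print(label, summarize(sizes))
        else:
            print(label, "no cached pages found under", root)
```

Reporting the median next to the mean keeps the comparison honest if a few homepages turn out to be enormous.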

It looks like the size of a blog is, "on average", larger than the size of a regular Web page, suggesting that a good deal of the increase in the size of a Web page could be due to Social Media content.

I think a more detailed study here would be insightful.


Comments


I wonder how much of the blogs' size is from optimization issues (images, widgets and such). Have you tried a couple of different blog sets? I would be curious to see if that changes the times.

Disclosure: I work for Nielsen Online, formerly BuzzMetrics.

Hi Stephen, my guess is you are absolutely right! Most of the size for blog data might be contributed by widgets and images. Most of these are not highly optimized -- see my own blog for example!!

I do want to try out a couple of other datasets and frankly 400 blogs is hardly a dataset! But I am curious to see results in a larger setting. It would also be worthwhile to examine the avg post size, number of HTML elements and JavaScript usage etc.
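As a starting point for the element and JavaScript counts, something like the following sketch could work. It uses only Python's standard `html.parser`; the `TagCounter` class and the sample markup are made up for illustration and have not been run against the actual datasets.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Count opening tags and externally loaded scripts in an HTML page."""

    def __init__(self):
        super().__init__()
        self.tags = Counter()
        self.external_scripts = 0

    def handle_starttag(self, tag, attrs):
        self.tags[tag] += 1
        # A <script> with a src attribute pulls in an external file.
        if tag == "script" and any(name == "src" for name, _ in attrs):
            self.external_scripts += 1


def count_tags(html_text):
    """Parse html_text and return the populated TagCounter."""
    parser = TagCounter()
    parser.feed(html_text)
    parser.close()
    return parser


# Made-up sample markup standing in for a cached homepage.
sample = (
    "<html><body><div><img src='a.png'>"
    "<script src='w.js'></script><script>var x = 1;</script>"
    "</div></body></html>"
)
result = count_tags(sample)
print("elements:", sum(result.tags.values()))
print("script tags:", result.tags["script"])
print("external scripts:", result.external_scripts)
```

Running this over every cached homepage would give a per-site element count and script count to set alongside the raw byte sizes.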

It also reminds me of the study "Toward a PeopleWeb", by Raghu Ramakrishnan and Andrew Tomkins. According to their analysis, the Social Web contributes 4-5 times more content than professionally edited text.

I hope I can find time to do some more analysis.

Thanks for your comments!

Would it be a reasonable approximation to just grab the feed from several blogs? That would avoid all the cruft outside each post, and you'd be dealing with just the text from the blogs. To be fair, if the content included an image, you could grab that image too...

Just a thought

To an extent, I wanted to measure exactly how 'bloated' a blog homepage is. So if it has many widgets and images, it is indeed going to be bulky. You bring up an interesting point about the size of a post vs. the size of a regular Web page. I can try to run a few scripts and see if I can pull out some numbers from the Buzzmetrics dataset. Thanks for the suggestion, I'm sure it will make another interesting blog post! Gimme a while and I shall try to put out something...
