Thursday, July 15, 2004

Why Blogs Have So Much Googlejuice

From a November 2001 Science magazine article by Jon Kleinberg and Steve Lawrence:

Several studies have shown that the number of links to and from individual pages is distributed according to a power law over many orders of magnitude; the fraction of pages with n in-links is roughly n^(-α) for α ~ 2.1.
In other words, taking that formula at face value, fewer than 0.8 percent of all webpages have 10 inbound links from external sites, and fewer than 0.2 percent have 20 such links. Add in the fact that links are weighted by source influence, and we're done: a single link from the likes of Glenn Reynolds will likely lead to several other links, many from other influential sources.
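For anyone who wants to check the arithmetic, here is a minimal Python sketch of where the 0.8 and 0.2 percent figures come from. It assumes the quoted estimate is read literally as n^(-α) with α = 2.1 and no normalizing constant; the function name `inlink_fraction` is made up for illustration and does not come from the article.

```python
ALPHA = 2.1  # exponent quoted in the Science article


def inlink_fraction(n, alpha=ALPHA):
    """Rough (unnormalized) power-law estimate of the fraction of pages
    with n in-links. Hypothetical helper, for illustration only."""
    return n ** -alpha


for n in (10, 20):
    print(f"n = {n}: about {inlink_fraction(n) * 100:.2f} percent of pages")
# n = 10: about 0.79 percent of pages
# n = 20: about 0.19 percent of pages
```

Doubling the number of inbound links cuts the estimated share of pages by a factor of 2^2.1, a bit more than four, which is why the figures fall off so quickly.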

This power-law distribution of inbound links, and therefore of influence, also suggests that attempts by spammers* to game the system by flooding the web with sham websites, all referring to each other, are likely to fail. To get the results they require, they have to be linked to by high-ranking sites, and it inevitably follows that popular blogs that allow anonymous commenting will prove irresistible targets for trackback spam and phony comments. As the most popular bloggers seem either unable or simply unwilling to spend much of their free time policing comment sections, I suppose it's just as well that sites like Instapundit and Andrew Sullivan don't permit commenting or trackback pings; the ongoing trend towards commenter registration is more a feel-good measure than any sort of panacea for dealing with this issue.

*Or "Search Engine Optimizers" as they prefer to call themselves.