The first step was to get a solid set of data I could analyze, so I grabbed the Technorati Top 100 as of May 19, 2005. It looked like this:
Technorati Top 100 | Position 5/19/05 | Links | Sources | Link/Source |
---|---|---|---|---|
Boing Boing | 1 | 22532 | 14623 | 1.5409 |
InstaPundit | 2 | 15190 | 10425 | 1.4571 |
Daily Kos | 3 | 15833 | 9509 | 1.6651 |
Gizmodo | 4 | 12278 | 9259 | 1.3261 |
Fark | 5 | 10216 | 9121 | 1.1201 |
EnGadget | 6 | 15051 | 7869 | 1.9127 |
Davenetics | 7 | 7571 | 7408 | 1.0220 |
Eschaton | 8 | 8713 | 6279 | 1.3876 |
Dooce | 9 | 6797 | 5990 | 1.1347 |
Andrew Sullivan | 10 | 7680 | 5916 | 1.2982 |
The Best Page In The Universe | 11 | 6333 | 5603 | 1.1303 |
Talking Points Memo: by Joshua Micah Marshall | 12 | 7592 | 5581 | 1.3603 |
lgf: anti-idiotarian | 13 | 8275 | 5514 | 1.5007 |
kottke.org | 14 | 7278 | 5483 | 1.3274 |
WIL WHEATON DOT NET | 15 | 6314 | 5368 | 1.1762 |
Metafilter | 16 | 7591 | 5086 | 1.4925 |
Doc Searls | 17 | 5690 | 4947 | 1.1502 |
(In)formacae (In)utilidade | 18 | 6040 | 4934 | 1.2242 |
Wonkette | 19 | 5877 | 4761 | 1.2344 |
Scripting News | 20 | 5728 | 4671 | 1.2263 |
Power Line | 21 | 7477 | 4567 | 1.6372 |
Balmasque | 22 | 4544 | 4504 | 1.0089 |
Corante | 23 | 7686 | 3949 | 1.9463 |
A list Apart | 24 | 5536 | 3946 | 1.4029 |
Something Awful | 25 | 4512 | 3869 | 1.1662 |
Megatokyo | 26 | 4154 | 3828 | 1.0852 |
Michelle Malkin | 27 | 6091 | 3594 | 1.6948 |
Arts and Letters Daily | 28 | 3983 | 3588 | 1.1101 |
Gawker | 29 | 4453 | 3557 | 1.2519 |
Afterall it was the best I ever had | 30 | 3591 | 3517 | 1.0210 |
The Volokh Conspiracy | 31 | 5873 | 3513 | 1.6718 |
Scobleizer | 32 | 5524 | 3429 | 1.6110 |
Jeffrey Zeldman | 33 | 4134 | 3381 | 1.2227 |
This Modern World | 34 | 3913 | 3364 | 1.1632 |
The Web Standards Project | 35 | 3810 | 3281 | 1.1612 |
Joel on Software | 36 | 4514 | 3279 | 1.3766 |
Media Matters for America | 37 | 6809 | 3205 | 2.1245 |
Television without pity | 38 | 3859 | 3193 | 1.2086 |
Kuro5hin | 39 | 4208 | 3135 | 1.3423 |
Lileks | 40 | 3824 | 3118 | 1.2264 |
Hugh Hewitt | 41 | 4573 | 3107 | 1.4718 |
Joel Veitch | 42 | 3774 | 3061 | 1.2329 |
Truthout | 43 | 6528 | 3023 | 2.1594 |
Baghdad Burning | 44 | 3519 | 2985 | 1.1789 |
Buzz machine | 45 | 4145 | 2971 | 1.3952 |
fleugel | 46 | 3670 | 2919 | 1.2573 |
Informed Comment | 47 | 3905 | 2887 | 1.3526 |
Doppler: redefining podcasting | 48 | 3040 | 2848 | 1.0674 |
geek and proud | 49 | 3166 | 2835 | 1.1168 |
loadmemory (Asian site) | 50 | 3324 | 2822 | 1.1779 |
Photojunkie | 51 | 2860 | 2807 | 1.0189 |
Ross Rader | 52 | 2976 | 2736 | 1.0877 |
The Truth Laid Bear | 53 | 4127 | 2735 | 1.5090 |
Joi Ito | 54 | 5165 | 2671 | 1.9337 |
ScrappleFace | 55 | 3480 | 2609 | 1.3338 |
LexText | 56 | 2671 | 2577 | 1.0365 |
Google Blog | 57 | 3688 | 2551 | 1.4457 |
Xbox | 58 | 4221 | 2545 | 1.6585 |
My life in a Bush of Ghosts | 59 | 2519 | 2515 | 1.0016 |
Astronomy picture of the day | 60 | 3498 | 2511 | 1.3931 |
Crooked Timber | 61 | 3617 | 2508 | 1.4422 |
Vodka Pundit | 62 | 3085 | 2358 | 1.3083 |
Captain’s quarter | 63 | 3671 | 2357 | 1.5575 |
A small victory | 64 | 3223 | 2344 | 1.3750 |
Gato Fedorento | 65 | 2574 | 2340 | 1.1000 |
Mezzoblue | 66 | 2952 | 2316 | 1.2746 |
PostSecret | 67 | 2707 | 2310 | 1.1719 |
Samizdata.net | 68 | 2872 | 2270 | 1.2652 |
Lawrence Lessig | 69 | 2949 | 2243 | 1.3148 |
Counterpunch | 70 | 3278 | 2234 | 1.4673 |
Democratic Underground | 71 | 3913 | 2229 | 1.7555 |
Right Wing News | 72 | 2967 | 2215 | 1.3395 |
StopDesign | 73 | 3037 | 2210 | 1.3742 |
iBiblio | 74 | 3105 | 2206 | 1.4075 |
Samizdata.net (mistake?) | 75 | 2743 | 2198 | 1.2480 |
Abrupto | 76 | 2935 | 2186 | 1.3426 |
gene7299 (Asian MSNSpaces site) | 77 | 3215 | 2169 | 1.4822 |
Where is Raed | 78 | 2409 | 2166 | 1.1122 |
B3TA: We love the web | 79 | 2614 | 2140 | 1.2215 |
Talkleft | 80 | 2901 | 2139 | 1.3562 |
Wizbang | 81 | 3358 | 2128 | 1.5780 |
m1net (MSN spaces site) | 82 | 3548 | 2117 | 1.6760 |
Hoder | 83 | 5422 | 2110 | 2.5697 |
CTRL+Alt+Del | 84 | 2315 | 2075 | 1.1157 |
Brad DeLong | 85 | 2715 | 2069 | 1.3122 |
Blogs for Bush | 86 | 3560 | 2036 | 1.7485 |
Neil Gaiman | 87 | 2194 | 2027 | 1.0824 |
Gothamist | 88 | 2729 | 2011 | 1.3570 |
Thought Mechanics | 89 | 2197 | 2010 | 1.0930 |
IMAO | 90 | 2905 | 2006 | 1.4482 |
Dan Gillmor (old weblog) | 91 | 2600 | 2000 | 1.3000 |
HINAGATA | 92 | 2186 | 1978 | 1.1052 |
Dean’s World | 93 | 2985 | 1970 | 1.5152 |
Defamer | 94 | 2372 | 1948 | 1.2177 |
USS Clueless | 95 | 2570 | 1941 | 1.3241 |
Dive into Mark | 96 | 2540 | 1910 | 1.3298 |
Pandagon | 97 | 2822 | 1909 | 1.4783 |
Blogging.la | 98 | 3061 | 1906 | 1.6060 |
Why are you worshipping the ground I blog on? | 99 | 2238 | 1887 | 1.1860 |
Daring Fireball | 100 | 2573 | 1879 | 1.3693 |
Total | | 479580 | 350934 | 1.3628 |
Average | | 4795.8 | 3509.34 | 1.3628 |
Median | | 3679.5 | 2814.5 | 1.3267 |
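
The links-per-source column and the summary rows can be reproduced with a few lines of Python. This is a minimal sketch, not the spreadsheet I actually used: the `rows` list holds only the first five rows of the table, and the names `rows` and `links_per_source` are made up for illustration.

```python
from statistics import mean, median

# (blog, links, sources) for the first five rows of the table above.
rows = [
    ("Boing Boing", 22532, 14623),
    ("InstaPundit", 15190, 10425),
    ("Daily Kos", 15833, 9509),
    ("Gizmodo", 12278, 9259),
    ("Fark", 10216, 9121),
]


def links_per_source(links, sources):
    """Average number of links contributed by each linking site."""
    return links / sources


ratios = [links_per_source(links, sources) for _, links, sources in rows]

for (name, _, _), ratio in zip(rows, ratios):
    print(f"{name}: {ratio:.4f}")

print(f"mean of ratios:   {mean(ratios):.4f}")
print(f"median of ratios: {median(ratios):.4f}")

# Dividing the column totals is a different statistic from averaging the
# per-blog ratios, because it weights the biggest blogs more heavily.
total_links = sum(links for _, links, _ in rows)
total_sources = sum(sources for _, _, sources in rows)
print(f"ratio of totals:  {total_links / total_sources:.4f}")
```

The last line matters for reading the summary rows. Dividing the table's totals (479580 / 350934) gives about 1.3666, so the 1.3628 shown in the Total row appears to be the average of the per-blog ratios rather than the ratio of the totals.
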
Nothing particularly revealing here, as the data shows things that we already knew. First of all, the top blogs end up getting a lot of links. This is hardly news, and it’s clear that the Technorati 100 follows a standard long-tail distribution. In fact, it’s almost amazing how well the data lines up. When you look at the number of sources, you end up with a long-tail graph:

However, when you start looking at the links, things get a little funkier. You still get something like a power law, but a less clean one, it seems:
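
One rough way to put a number on how closely each column follows a power law is to fit a straight line on log-log axes: a clean power law shows up as a straight line, and the R² of the fit says how straight it is. The sketch below does that with ordinary least squares on the top ten rows only. `loglog_fit` is a hypothetical helper, and a least-squares fit on log-log data is a quick diagnostic, not a rigorous power-law test.

```python
import math

# Sources and links for positions 1-10 in the table above.
sources = [14623, 10425, 9509, 9259, 9121, 7869, 7408, 6279, 5990, 5916]
links = [22532, 15190, 15833, 12278, 10216, 15051, 7571, 8713, 6797, 7680]


def loglog_fit(values):
    """Least-squares fit of log(value) against log(rank).

    Returns (slope, r_squared). Rank starts at 1 for the first value.
    """
    xs = [math.log(rank) for rank in range(1, len(values) + 1)]
    ys = [math.log(v) for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, r_squared


for label, column in (("sources", sources), ("links", links)):
    slope, r2 = loglog_fit(column)
    print(f"{label}: slope={slope:.3f}, R^2={r2:.3f}")
```

A column that tracks a power law more closely gives an R² nearer 1. Feeding in all 100 rows instead of ten would make the comparison more meaningful.
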

Basically, the tail doesn’t hold up as neatly for links. Which brings up an interesting question: is there a power law in the averages as well? Are the top blogs getting more links from each source on average, or do they get around the same number of links per source, just from more sources? To answer that, I decided to graph the number of links per source for each blog. It looked like this:

What I find fascinating is that the A-list bloggers, on average, do not seem to receive more links from the same sites. They just receive links from more sites. In fact, links per source look fairly consistent across the board. If you look back at the table, you will notice I calculated a few extra values for the set. Your average A-list blogger gets about 1.36 links from each source that links to it. What’s more interesting is that the median for those top 100 bloggers is 1.33 links per source, very close to the average. From there we can conclude that a typical site links about 1.3 times to a blog it links to. The A-list bloggers just happen to be receiving links from more sites (at least as far as Technorati sees the world!).
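
To make that comparison concrete, here is a small sketch that sets the ten highest-ranked blogs in the table against the ten lowest-ranked ones. It uses only those twenty rows, and `summarize` is a hypothetical helper, so treat it as an illustration of the reasoning rather than the full analysis.

```python
from statistics import mean

# (links, sources) for positions 1-10 and 91-100 in the table above.
top_ten = [
    (22532, 14623), (15190, 10425), (15833, 9509), (12278, 9259),
    (10216, 9121), (15051, 7869), (7571, 7408), (8713, 6279),
    (6797, 5990), (7680, 5916),
]
bottom_ten = [
    (2600, 2000), (2186, 1978), (2985, 1970), (2372, 1948),
    (2570, 1941), (2540, 1910), (2822, 1909), (3061, 1906),
    (2238, 1887), (2573, 1879),
]


def summarize(group):
    """Return (mean sources, mean links per source) for a group of blogs."""
    mean_sources = mean(sources for _, sources in group)
    mean_ratio = mean(links / sources for links, sources in group)
    return mean_sources, mean_ratio


top_sources, top_ratio = summarize(top_ten)
bottom_sources, bottom_ratio = summarize(bottom_ten)

print(f"top 10:    {top_sources:.0f} sources, {top_ratio:.2f} links per source")
print(f"bottom 10: {bottom_sources:.0f} sources, {bottom_ratio:.2f} links per source")
print(f"sources gap: {top_sources / bottom_sources:.1f}x")
print(f"links-per-source gap: {top_ratio / bottom_ratio:.2f}x")
```

On these twenty rows, the top ten average more than four times as many sources as the bottom ten, while links per source barely moves (about 1.39 versus 1.34).
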
The clear strategy here is that if you want to climb into the esteemed A-list, you need to get more sites to link to your blog. If you only have the same sites linking to your blog on a regular basis, you won’t make it there. This means that the blog world, in a way, is no different from the real world in terms of how popularity is built: one person at a time. However, it could also bring some interesting new insight for bloggers who seek to dominate a niche: it could be argued that niche blogging will only get you so far. If the power laws hold true for niches (and it seems they do), then there is only room for a few people at the top… and if your niche is not one with lots of people in it, forget becoming famous beyond that little niche.
But what does that mean in terms of the wider world? Should we really trust the Technorati numbers? What is the impact of this information? How does this compare to traditional media? Well, that (and more), dear reader, is a subject for future entries, so stay tuned to the Secrets of the A-List blogger series for more.