2003-02-16
Google buy Pyra 2

More thoughts:

1

The similarity to Archive.org is intriguing me. Archive is fed by Alexa, who are owned by Amazon. Amazon have shit-hot collaborative filtering, but no great web corpus. Google, on the other hand, have an enormous corpus, but a relatively poor collaborative filtering technique. Just perhaps they've realised the value of metadata and nicely marked up data...

Okay, so they're converging on the same thing.

Ben X Hammersley: hence buying blogger
metadirk: indeed
Ben X Hammersley: it's a good point - and blogger data is lovely and fresh and link filled
metadirk: and not just that
metadirk: the metadata for free is there too
metadirk: times of posts
metadirk: referrers from BlogThis!
Ben X Hammersley: author name
Ben X Hammersley: author location
metadirk: add in blogrolling as a blogger feature
Ben X Hammersley: all tasty goodness for a search engine

Metadata-for-free is an important one. Imagine Google noticing that the HTTP_REFERER is a particular article. If they haven't got that article in Google News, it needs scraping. If there are a lot of referrers, that's important too.

Blogrolling is also important, but given that's Google's strength they probably won't need a dedicated app for that. They've already got the PageRank system that should work wonders. Think GoogleDex, sliced and diced any way you want it.

(And I would *so* love to see Amazon do the same in a year's time with Alexa's data and the various weblog+book tools that are kicking off now.)

2

Ben Hammersley also points out the web services angle. Given the Blogger API, this also gives them a starting point for accepting weblogs.com pings. That's the next data injection point.

3

Concerns:

- centralisation. Should search even be a central job? Are weblogs becoming infrastructure, and should infrastructure be hosted? It seems a bit odd to have a shared editing environment in a single place, at the expense of people's desktops.
- trust.
  Good weblogs aren't Google's business.
- old new media eats new new media. What about LiveJournal, MT, Radio Userland?
- It's odd to have Blogger move over the landscape in this way, and I've said before that I don't think weblog publishing systems are architected in the correct way:
  http://interconnected.org/home/2003_01_12_archive.shtml#90174091

I think maybe what we're seeing is a weird parallel to how email has split into a distributed side and a centralised side. Blogger is the Hotmail of weblogs. Now the way is good and clear for a weblog system more in line with the way the net really works.

4

Some links:

http://weblog.siliconvalley.com/column/dangillmor/archives/000802.shtml
http://www.dashes.com/anil/index.php?archives/005129.php
http://slashdot.org/article.pl?sid=03/02/16/0728230

5

God that's it. GOOGLE ARE BUILDING THE MEMEX.

They've got one-to-one connections. Links. Now they've realised - like Ted Nelson - that the fundamental unit of the web isn't the link, but the trail. And the only place that's online is... weblogs.

There are two levels to the trail:

1 - what you see
2 - what you do

("And what you feel on another track" -- what song is that?)

And the trail is, in its simplest form, organised chronologically. Later it gets more complex. Look to see Google introduce categories based on DMOZ as a next step.

So, the GOOGLE TOOLBAR tracks everything you do on the web, giving you low-level anonymous trails tying the web together. These are analogous to the strings of physics, or the rows and columns of Excel. This is 1, what you see.

Now there's the semantics, the meaning extracted from these, and that's done with the human mind. This is 2, what you do. What you choose to elevate.

Now these trails are the basic units. The combination of the two is startling. Oh, and you can analyse how people search to add extra data. Stop and start points.
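
To make the low-level trail idea a bit more concrete, here's a minimal sketch in Python. Everything in it is my own assumption for illustration: the example URLs, the `trails` data, and the `hop_counts` and `worn_track` names are hypothetical, and this says nothing about what Google actually does. It treats each anonymous trail as a chronological list of pages, counts how often each page-to-page hop is followed, and then walks the most-travelled hops from a starting point.

```python
from collections import Counter

# Hypothetical anonymous trails: the pages one person visited, in order.
trails = [
    ["search:memex", "a.example/bush", "b.example/xanadu"],
    ["search:memex", "a.example/bush", "c.example/weblog"],
    ["search:memex", "a.example/bush", "b.example/xanadu"],
]

def hop_counts(trails):
    """Count how many times each page-to-page hop appears across all trails."""
    counts = Counter()
    for trail in trails:
        # Pair each page with the one visited immediately after it.
        counts.update(zip(trail, trail[1:]))
    return counts

def worn_track(start, counts, max_hops=10):
    """Follow the most-travelled hop from each page until a dead end or a repeat."""
    path = [start]
    seen = {start}
    while len(path) <= max_hops:
        options = {
            dst: n
            for (src, dst), n in counts.items()
            if src == path[-1] and dst not in seen
        }
        if not options:
            break
        nxt = max(options, key=options.get)
        path.append(nxt)
        seen.add(nxt)
    return path

counts = hop_counts(trails)
track = worn_track("search:memex", counts)
```

That only covers level 1, what you see. Level 2, what people choose to elevate in their weblogs, could perhaps be layered on by weighting hops that a weblog post links to, but that's a guess at one possible design.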

Imagine searching at Google, and then:

- this trail is highly followed
- do you only want to see what people suggest, or where people went?
- here's a worn track in the interweb. Follow the Google Pixie!
- this trail is uncommon, but made by someone we see (by your weblog) that you value

And next, it's the true Memex. The Google appliance based on microfiche, punchcards and cameras...

6

Last thought, because I think I've reached what's really happening with the Memex. I'm back onto the push (find a goal, push at it, don't interweave with the ecosystem) and pull (grow grow grow, what happens happens. More like nature).

Google, Microsoft, etc: these people are Push. Sometimes Push leaps ahead. And sometimes they'll do it really well. But they're grist to the mill for Pull. Pull will happen slow and steady, like BSD appearing under Mac OS X. Sooner or later these Google ideas will distribute, appear as interlocking parts in the net, tied into the ecosystem.

At the moment, the trail can only be built centrally, with Push. And it's being done because it's True To The Medium and *has* to be done. But it changes the shape of the environment. There's now an incentive space to encourage building trails. So Pull will come along, build gradually, and in another decade trails will be part of the foundations, and Xanadu and the Memex will have won.

--
more here: http://interconnected.org/notes/2003/02/Google_buy_Pyra.txt
the weblog: http://interconnected.org/home/