Sunday, April 30, 2006

"Manual" is the new "algorithmic"

A recent conversation with a friend about the whole Web 2.0 madness got him to flatter me into pimping my opinion on the subject to the blog. The title of the post was (of course) inspired by a conversation with the other wife.

In the beginning, there were directories (think Yahoo! and the ODP). These were manually populated by some trusted community of people, who made sure that the links pointed to relevant content. Eventually the amount of content on the web grew way beyond the capability of manual discovery, and fairly complicated algorithms crawled and sifted through the mounds of crud to find the data relevant to most queries (think Google).

Once it became obvious that search was an incredibly powerful driving force for web commerce, it wasn't long before an entire community of black-hat search engine optimizers (SEOs) popped up to manipulate the rankings to their advantage. After all, "There was GOLD in them thar SERPs!" (Search Engine Result Pages). Most search engines of course have groups of people dedicated to making sure the ranking algorithms are wise to the SEOs' tricks.

Fast-forward to 2003(ish), and the pendulum swings back to manual labor. del.icio.us and Flickr introduce this novel concept: let users "tag" content (URLs and images respectively) with words representative of the content (like "family", "poodle", "jazz"). This works great. Free labelled data! Naively, one could use this as a direct relevance statement. An object tagged with the term "jazz" must obviously be a valid search result for the query "jazz", right? Quite so, but the real power of this turns up if you can generalize the labelling to unlabelled content on the web. That's exactly what machine learning algorithms do. If Yahoo!'s smart, their boffins are using their acquisitions of del.icio.us and Flickr to do exactly that.
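For the code-minded, here's a toy sketch of both ideas: the naive "tag equals relevance" lookup, and a crude way of generalizing tags to untagged content. Everything in it is made up for illustration: the page names, the texts, the `guess_tags` helper and its threshold are all hypothetical, and the word-overlap nearest-neighbour trick is just a stand-in for whatever real machine learning a search engine would use.

```python
from collections import defaultdict

# Hypothetical hand-tagged pages: text plus the tags users gave them.
TAGGED = {
    "page_a": ("miles davis trumpet quintet live recording", {"jazz"}),
    "page_b": ("coltrane saxophone quartet studio recording", {"jazz"}),
    "page_c": ("poodle grooming tips for puppy owners", {"poodle"}),
}


def words(text):
    """Split text into a set of lowercase words."""
    return set(text.lower().split())


def build_index(tagged):
    """Naive relevance: a tag maps straight to the pages carrying it."""
    index = defaultdict(set)
    for page, (_, tags) in tagged.items():
        for tag in tags:
            index[tag].add(page)
    return index


def jaccard(a, b):
    """Word-set overlap between 0.0 (disjoint) and 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def guess_tags(text, tagged, threshold=0.1):
    """Borrow the tags of the most similar tagged page (1-nearest-neighbour).

    Returns an empty set when nothing is similar enough.
    """
    target = words(text)
    best_page, best_score = None, 0.0
    for page, (page_text, _) in tagged.items():
        score = jaccard(target, words(page_text))
        if score > best_score:
            best_page, best_score = page, score
    if best_page is None or best_score < threshold:
        return set()
    return set(tagged[best_page][1])
```

Calling `build_index(TAGGED)["jazz"]` gives back the two jazz pages, and `guess_tags("live trumpet quintet recording session", TAGGED)` borrows the "jazz" tag from the closest tagged page. Note that the sketch trusts every tag it is handed.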

From a naive point of view, it would appear that we're done. We've solved the relevance problem if the users themselves tell us what's relevant. Right?

Not quite.

Keep in mind that the only reason index spam wasn't a problem with algorithmic search from 1998 to about 2002 was that it didn't (yet) drive commerce. Once Yahoo! really does start using that label data, and the black-hats catch on that tagging is being used to influence search results, what's going to stop an SEO from tagging affiliate pages for online casinos with "cooking"? Pretty much nothing. At that point the value of the labelled data is zilch. We'll have to resort to natural language techniques for summarization to automatically generate tags. Guess what? That's back to algorithmic information retrieval again.

So that's my $0.02. We're in a temporarily happy phase where "manual is the new algorithmic" (smile Coe). In a couple of years' time we'll be back to where we started. Enjoy it while it lasts.


Thursday, April 27, 2006

"i" is the new "e"

iPod, iMac, iRobot, iGoogle, iCan't really understand what the fascination with this letter is. It's like back in the days of the dot-com boom when people would prepend an "e" all willy-nilly to any idea and instantly receive venture capital.

Now Nintendo's gone and named its next generation console "Wii" (pronounced like "We") since it's a "console for everyone". Ooh-ooh, I have a couple more:
  • iPii: Especially after all those cups of coffii
  • F#$k yii: Suitable invective if yii bii annoyed by superfluous "i"s

Sunday, April 16, 2006

Misplaced in Translation

A recent trip to the motherland (pun unintended until I noticed it) made me realize that growing up in India placed a few words in my vocabulary which --- while rarely used now in the Republic --- bring a nostalgic sniffle:
  • gum-boots: Hideous rubber boots that we wore to school during the monsoon at the age of about 6. Wonderfully watertight, but that only meant that the water dripping into them from the raincoat sticking to your knees had nowhere to go.
  • lift: Elevator. But not the fancy-schmancy stuff with the automatic doors. These are the ones with the collapsible metal grating which you have to drag open and shut. Oh, and if you don't shut it completely, you'll be haunted by the midi-style "Jingle Bells" all your life.
  • "yoo-dee-clone": This mysterious fragrant stuff that my grandmother believed would cure everything from colds to fractures. When I got around to reading, I found out it was eau de cologne.
  • flat: Apartment. Not rented but owned; in a building where everyone knows everyone else and their birthday.
  • chai: Not the strawberry-raspberry-mango flavored foo-foo crap that Starbucks foists. This is the real stuff. Sold at most street corners (milk and sugar included) and strong enough to make you sit up and bark. Perfect during the monsoons while those gum-boots are drying off. If you're in a hurry, you can ask for a "cutting" (half) serving. By the way, do any of my non-desi readers realize that chai means "tea"? So when you ask for "chai tea" (with or without the passion-fruit infusion) you just sound kinda silly?
Anyone else have a list they want to throw in?

Saturday, April 15, 2006


I've noticed that my friends are starting to fall into a couple of categories. Those that we are grateful have absolutely no hope of offspawn, and some others who have simply the most adorable kids ever. I mean, listen to the man. Doesn't he sound positively ga-ga over the little object?

And in case you're wondering (as I know my dearest aging relatives are): no. Jayita and I love our friends' kids because they all have this eminently endearing property: a return policy.


SFJazz Collective

Jayita and I just returned from attending the members-only concert of the SFJazz Collective. Outstanding performances all around, especially from Nicholas Payton (trumpet) and Eric Harland (drums). One of Payton's compositions -- "Sudoku" -- captured the spirit of the game perfectly. The arrangement had a pecking staccato melody, broken up into intervals of two or three notes performed by each instrument, with each leading into and weaving around the other. One immediately felt the hunt-and-peck nature of the game's solution, while the sections which had flurries of simultaneous chords alluded to the cascade of solved cells that usually results from a breakthrough.

The most awesome part of the evening? Getting to meet the band members after the concert. Oh yeah, those are the autographs I got on my CD :)) Jealous yet??

Thursday, April 13, 2006

Google Calendar

I had to keep my trap shut about this for far too long, but now that it's been released, Google Calendar rocks! :))


Thursday, April 06, 2006

New Toy

Picked up this little beast yesterday at Fry's. I'm happy. Very happy. They even gave me a discount to match an online deal that was $100 cheaper. Did I mention I was happy?

Wednesday, April 05, 2006


Meant to post this on the 21st of March, but I (naturally) forgot to carry the USB cable for my camera. Now that I have the picture... Happy 95th birthday, Grandma.