Monday, December 24, 2007

Papers from WSDM 2008 on click position bias and social bookmark data

The WSDM conference is being held Feb 11-12 at Stanford University. I am not sure I will make it down from Seattle for it, but, if you are in the SF Bay Area and are interested in search and data mining on the Web, it is an easy one to attend.

Most of the papers for the conference do not appear to be publicly available yet, but I wanted to highlight two of the ones I could find.

Microsoft Researchers Nick Craswell, Onno Zoeter, Michael Taylor and Bill Ramsey wrote "An Experimental Comparison of Click Position-Bias Models" (PDF) for WSDM 2008. The work looks at models for how "the probability of click is influenced by ... [the] position in the results page".

The basic problem here is that just putting a search result high on the page tends to get it more clicks even if that search result is less relevant than ones below it. If you are trying to learn which results are relevant by looking at which ones get the most clicks, you need to model and then attempt to remove the position bias.

The authors conclude that a "cascade model" which assumes "that the user views search results from top to bottom, deciding whether to click each result before moving to the next" most closely fits searcher click behavior when they look at the top of the search results. However, their "baseline model" -- which assumes "users look at all results and consider each on its merits, then decide which results to click" (that is, position does not matter) -- seemed most accurate for items lower in the search results.

The authors say this suggests there may be "two modes of results viewing": one where searchers click the first thing that looks relevant in the top results and another, used when they fail to find anything good there, where they scan all the results before clicking anything.
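To make the difference between the two models concrete, here is a minimal Python sketch. It is my own illustration rather than code from the paper. It assumes each result has a single relevance number (the chance that a searcher who examines it will click) and that, in the cascade model, a click ends the scan. The function names and the example numbers are made up.

```python
from typing import List


def baseline_click_probs(relevance: List[float]) -> List[float]:
    """Baseline: position is ignored, so P(click) is just the result's relevance."""
    return list(relevance)


def cascade_click_probs(relevance: List[float]) -> List[float]:
    """Cascade: the searcher scans top to bottom and only reaches a rank
    if nothing above it was clicked (assumption: one click ends the scan)."""
    probs = []
    reach = 1.0  # probability the searcher gets as far as the current rank
    for r in relevance:
        probs.append(reach * r)  # must reach this rank, then choose to click
        reach *= 1.0 - r         # continue only if this result was skipped
    return probs


if __name__ == "__main__":
    # Hypothetical relevance values for ranks 1, 2, 3.
    relevance = [0.2, 0.5, 0.5]
    print("baseline:", baseline_click_probs(relevance))
    print("cascade: ", cascade_click_probs(relevance))
```

With these made-up numbers, the second and third results are equally relevant, but under the cascade assumption the third gets fewer clicks simply because fewer searchers ever reach it. That gap is the position bias you would need to remove before treating clicks as a relevance signal.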

By the way, if you like this paper, don't miss Radlinski & Joachims' work on learning relevance rank from clickstream data. It not only discusses positional bias in click behavior in search results, but also attempts the next and much more ambitious step of optimizing relevance rank by learning from click behavior. The Craswell et al. WSDM 2008 paper does cite some older work by Joachims and Radlinski, but not this fun and more recent KDD 2007 paper.

The second paper I wanted to point out is by Paul Heymann, Georgia Koutrika and Hector Garcia-Molina at Stanford, "Can Social Bookmarks Improve Web Search?" I was not able to find that paper, but a slightly older tech report (PDF) with the same title is available (found via ResourceShelf). This paper looks at whether data from social bookmark sites like del.icio.us can help us improve Web search.

This is a question that has been subject to much speculation over the last few years. On the one hand, social bookmark data may provide high-quality labels on web pages because the bookmarks tend to be created by people for their own use (to help with re-finding). On the other hand, manually labeling the Web is a gargantuan task and it is unclear if the data is substantially different from what we can extract automatically.

Unfortunately, as promising as social bookmarking data might seem, the authors conclude that it is not likely to be useful for Web search. While they generally find the data to be of high quality, they say the data only covers about 0.1% of the Web, only a small fraction of those are not already crawled by search engines, and the tags in social bookmarking data almost always are "obvious in context" and "would be discovered by a search engine." Because of this, the social bookmarking data "are unlikely to be numerous enough to impact the crawl ordering of a major search engine, and the tags produced are unlikely to be much more useful than a full text search emphasizing page titles."

Update: It looks like I will be attending the WSDM 2008 conference after all. If you are going, please say hello if you see me!

1 comment:

Anonymous said...

I beg to differ on social bookmarking data. For general search, the findings are very likely to be true. Human tagging just won't scale enough, but for specific, quasi-vertical searches, IMO, it works really well. If I search my del.icio.us network, chosen based on interest, I find that the results are very good. It definitely fits the idea of building search around trust.