Benjamin Piwowarski, Andrew Trotman, Mounia Lalmas
In information retrieval research, comparing retrieval approaches requires test collections
consisting of documents, user requests and relevance assessments. Obtaining
relevance assessments that are as sound and complete as possible is crucial for
the comparison of retrieval approaches. In XML retrieval, the problem of
obtaining sound and complete relevance assessments is further complicated by
the structural relationships between retrieval results.
A major difference between XML retrieval and flat document retrieval is that the relevance of
elements (the retrievable units) is not independent of that of related
elements. This has major consequences for the gathering of relevance assessments.
This paper describes investigations into the creation of sound and complete
relevance assessments for the evaluation of content-oriented XML retrieval as
carried out at INEX, the evaluation campaign for XML retrieval. The campaign,
now in its seventh year, has used three substantially different approaches to
gathering assessments and has finally settled on a highlighting method for marking
relevant passages within documents, even though the objective is to collect
assessments at element level. The different methods of gathering assessments at
INEX are discussed and contrasted. The highlighting method is shown to be the
most reliable of the methods.