TY  - CONF
TI  - A Comparative Analysis of Cascade Measures for Novelty and Diversity
AU  - Clarke, Charles L.A.
AU  - Craswell, Nick
AU  - Soboroff, Ian
AU  - Ashkan, Azin
T3  - WSDM '11
AB  - Traditional editorial effectiveness measures, such as nDCG, remain standard for Web search evaluation. Unfortunately, these traditional measures can inappropriately reward redundant information and can fail to reflect the broad range of user needs that can underlie a Web query. To address these deficiencies, several researchers have recently proposed effectiveness measures for novelty and diversity. Many of these measures are based on simple cascade models of user behavior, which operate by considering the relationship between successive elements of a result list. The properties of these measures are still poorly understood, and it is not clear from prior research that they work as intended. In this paper we examine the properties and performance of cascade measures with the goal of validating them as tools for measuring effectiveness. We explore their commonalities and differences, placing them in a unified framework; we discuss their theoretical difficulties and limitations, and compare the measures experimentally, contrasting them against traditional measures and against other approaches to measuring novelty. Data collected by the TREC 2009 Web Track is used as the basis for our experimental comparison. Our results indicate that these measures reward systems that achieve a balance between novelty and overall precision in their result lists, as intended. Nonetheless, other measures provide insights not captured by the cascade measures, and we suggest that future evaluation efforts continue to report a variety of measures.
C1  - New York, NY, USA
C3  - Proceedings of the Fourth ACM International Conference on Web Search and Data Mining
DA  - 2011///
PY  - 2011
DO  - 10.1145/1935826.1935847
DP  - ACM Digital Library
SP  - 75
EP  - 84
LA  - en
PB  - ACM
SN  - 978-1-4503-0493-1
UR  - http://doi.acm.org/10.1145/1935826.1935847
Y2  - 2019/01/27/21:34:38
ER  - 

TY  - CONF
TI  - Novelty and Diversity in Information Retrieval Evaluation
AU  - Clarke, Charles L.A.
AU  - Kolla, Maheedhar
AU  - Cormack, Gordon V.
AU  - Vechtomova, Olga
AU  - Ashkan, Azin
AU  - Büttcher, Stefan
AU  - MacKinnon, Ian
T3  - SIGIR '08
AB  - Evaluation measures act as objective functions to be optimized by information retrieval systems. Such objective functions must accurately reflect user requirements, particularly when tuning IR systems and learning ranking functions. Ambiguity in queries and redundancy in retrieved documents are poorly reflected by current evaluation measures. In this paper, we present a framework for evaluation that systematically rewards novelty and diversity. We develop this framework into a specific evaluation measure, based on cumulative gain. We demonstrate the feasibility of our approach using a test collection based on the TREC question answering track.
C1  - New York, NY, USA
C3  - Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
DA  - 2008///
PY  - 2008
DO  - 10.1145/1390334.1390446
DP  - ACM Digital Library
SP  - 659
EP  - 666
LA  - en
PB  - ACM
SN  - 978-1-60558-164-4
UR  - http://doi.acm.org/10.1145/1390334.1390446
Y2  - 2019/01/27/19:15:02
ER  - 