TY - CONF
TI - Hierarchical Faceted Metadata in Site Search Interfaces
AU - English, Jennifer
AU - Hearst, Marti
AU - Sinha, Rashmi
AU - Swearingen, Kirsten
AU - Yee, Ka-Ping
T3 - CHI EA '02
AB - One of the most pressing usability issues in the design of large web sites is that of the organization of search results. A previous study on a moderate-sized web site indicated that users understood and preferred dynamically organized faceted metadata over standard search. We are now examining how to scale this approach to very large collections, since it is difficult to present hierarchical faceted metadata in a manner appealing and understandable to general users. We have iteratively designed and tested interfaces that address these design challenges; the most recent version is receiving enthusiastic responses in ongoing usability studies.
C1 - New York, NY, USA
C3 - CHI '02 Extended Abstracts on Human Factors in Computing Systems
DA - 2002///
PY - 2002
DO - 10.1145/506443.506517
DP - ACM Digital Library
SP - 628
EP - 639
LA - en
PB - ACM
SN - 978-1-58113-454-4
UR - http://doi.acm.org/10.1145/506443.506517
Y2 - 2018/07/06/01:46:11
ER -

TY - CONF
TI - Optimizing search engines using clickthrough data
AU - Joachims, Thorsten
T2 - KDD '02
AB - This paper presents an approach to automatically optimizing the retrieval quality of search engines using clickthrough data. Intuitively, a good information retrieval system should present relevant documents high in the ranking, with less relevant documents following below. While previous approaches to learning retrieval functions from examples exist, they typically require training data generated from relevance judgments by experts. This makes them difficult and expensive to apply. The goal of this paper is to develop a method that utilizes clickthrough data for training, namely the query-log of the search engine in connection with the log of links the users clicked on in the presented ranking.
Such clickthrough data is available in abundance and can be recorded at very low cost. Taking a Support Vector Machine (SVM) approach, this paper presents a method for learning retrieval functions. From a theoretical perspective, this method is shown to be well-founded in a risk minimization framework. Furthermore, it is shown to be feasible even for large sets of queries and features. The theoretical results are verified in a controlled experiment. It shows that the method can effectively adapt the retrieval function of a meta-search engine to a particular group of users, outperforming Google in terms of retrieval quality after only a couple of hundred training examples.
C1 - Edmonton, Alberta, Canada
C3 - Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
DA - 2002/07/23/
PY - 2002
DO - 10.1145/775047.775067
DP - dl.acm.org
SP - 133
EP - 142
LA - en
PB - ACM
SN - 978-1-58113-567-1
UR - http://dl.acm.org/citation.cfm?id=775047.775067
Y2 - 2019/01/18/20:54:23
ER -

TY - CONF
TI - Faceted Metadata for Image Search and Browsing
AU - Yee, Ka-Ping
AU - Swearingen, Kirsten
AU - Li, Kevin
AU - Hearst, Marti
T3 - CHI '03
AB - There are currently two dominant interface types for searching and browsing large image collections: keyword-based search, and searching by overall similarity to sample images. We present an alternative based on enabling users to navigate along conceptual dimensions that describe the images. The interface makes use of hierarchical faceted metadata and dynamically generated query previews. A usability study, in which 32 art history students explored a collection of 35,000 fine arts images, compares this approach to a standard image search interface.
Despite the unfamiliarity and power of the interface (attributes that often lead to rejection of new search interfaces), the study results show that 90% of the participants preferred the metadata approach overall, 97% said that it helped them learn more about the collection, 75% found it more flexible, and 72% found it easier to use than a standard baseline system. These results indicate that a category-based approach is a successful way to provide access to image collections.
C1 - New York, NY, USA
C3 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
DA - 2003///
PY - 2003
DO - 10.1145/642611.642681
DP - ACM Digital Library
SP - 401
EP - 408
LA - en
PB - ACM
SN - 978-1-58113-630-2
UR - http://doi.acm.org/10.1145/642611.642681
Y2 - 2018/08/09/19:17:02
ER -

TY - CONF
TI - Accurately Interpreting Clickthrough Data As Implicit Feedback
AU - Joachims, Thorsten
AU - Granka, Laura
AU - Pan, Bing
AU - Hembrooke, Helene
AU - Gay, Geri
T2 - SIGIR'05
AB - This paper examines the reliability of implicit feedback generated from clickthrough data in WWW search. Analyzing the users' decision process using eyetracking and comparing implicit feedback against manual relevance judgments, we conclude that clicks are informative but biased. While this makes the interpretation of clicks as absolute relevance judgments difficult, we show that relative preferences derived from clicks are reasonably accurate on average.
C3 - Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, 2005
DA - 2005///
PY - 2005
DP - ACM Digital Library
SP - 154
EP - 161
LA - en
Y2 - 2019/01/18/20:45:44
ER -

TY - CONF
TI - Improving Web Search Ranking by Incorporating User Behavior Information
AU - Agichtein, Eugene
AU - Brill, Eric
AU - Dumais, Susan
T3 - SIGIR '06
AB - We show that incorporating user behavior data can significantly improve ordering of top results in a real web search setting.
We examine alternatives for incorporating feedback into the ranking process and explore the contributions of user feedback compared to other common web search features. We report results of a large scale evaluation over 3,000 queries and 12 million user interactions with a popular web search engine. We show that incorporating implicit feedback can augment other features, improving the accuracy of a competitive web search ranking algorithm by as much as 31% relative to the original performance.
C1 - New York, NY, USA
C3 - Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
DA - 2006///
PY - 2006
DO - 10.1145/1148170.1148177
DP - ACM Digital Library
SP - 19
EP - 26
LA - en
PB - ACM
SN - 978-1-59593-369-0
UR - http://doi.acm.org/10.1145/1148170.1148177
Y2 - 2019/01/18/18:14:12
ER -

TY - CONF
TI - Design recommendations for hierarchical faceted search interfaces
AU - Hearst, Marti
AB - This paper presents interface design recommendations for faceted navigation systems, based on 13 years of experience in experimenting with and evaluating such designs.
C3 - ACM SIGIR workshop on faceted search
DA - 2006/08//
PY - 2006
SP - 1
EP - 5
LA - en
PB - Seattle, WA
ER -

TY - CONF
TI - Novelty and Diversity in Information Retrieval Evaluation
AU - Clarke, Charles L.A.
AU - Kolla, Maheedhar
AU - Cormack, Gordon V.
AU - Vechtomova, Olga
AU - Ashkan, Azin
AU - Büttcher, Stefan
AU - MacKinnon, Ian
T3 - SIGIR '08
AB - Evaluation measures act as objective functions to be optimized by information retrieval systems. Such objective functions must accurately reflect user requirements, particularly when tuning IR systems and learning ranking functions. Ambiguity in queries and redundancy in retrieved documents are poorly reflected by current evaluation measures. In this paper, we present a framework for evaluation that systematically rewards novelty and diversity.
We develop this framework into a specific evaluation measure, based on cumulative gain. We demonstrate the feasibility of our approach using a test collection based on the TREC question answering track.
C1 - New York, NY, USA
C3 - Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
DA - 2008///
PY - 2008
DO - 10.1145/1390334.1390446
DP - ACM Digital Library
SP - 659
EP - 666
LA - en
PB - ACM
SN - 978-1-60558-164-4
UR - http://doi.acm.org/10.1145/1390334.1390446
Y2 - 2019/01/27/19:15:02
ER -

TY - CONF
TI - Diversifying Search Results
AU - Agrawal, Rakesh
AU - Gollapudi, Sreenivas
AU - Halverson, Alan
AU - Ieong, Samuel
T3 - WSDM '09
AB - We study the problem of answering ambiguous web queries in a setting where there exists a taxonomy of information, and that both queries and documents may belong to more than one category according to this taxonomy. We present a systematic approach to diversifying results that aims to minimize the risk of dissatisfaction of the average user. We propose an algorithm that well approximates this objective in general, and is provably optimal for a natural special case. Furthermore, we generalize several classical IR metrics, including NDCG, MRR, and MAP, to explicitly account for the value of diversification. We demonstrate empirically that our algorithm scores higher in these generalized metrics compared to results produced by commercial search engines.
C1 - New York, NY, USA
C3 - Proceedings of the Second ACM International Conference on Web Search and Data Mining
DA - 2009///
PY - 2009
DO - 10.1145/1498759.1498766
DP - ACM Digital Library
SP - 5
EP - 14
LA - en
PB - ACM
SN - 978-1-60558-390-7
UR - http://doi.acm.org/10.1145/1498759.1498766
Y2 - 2019/01/27/21:41:12
ER -

TY - CONF
TI - An Axiomatic Approach for Result Diversification
AU - Gollapudi, Sreenivas
AU - Sharma, Aneesh
T3 - WWW '09
AB - Understanding user intent is key to designing an effective ranking system in a search engine. In the absence of any explicit knowledge of user intent, search engines want to diversify results to improve user satisfaction. In such a setting, the probability ranking principle-based approach of presenting the most relevant results on top can be sub-optimal, and hence the search engine would like to trade-off relevance for diversity in the results. In analogy to prior work on ranking and clustering systems, we use the axiomatic approach to characterize and design diversification systems. We develop a set of natural axioms that a diversification system is expected to satisfy, and show that no diversification function can satisfy all the axioms simultaneously. We illustrate the use of the axiomatic framework by providing three example diversification objectives that satisfy different subsets of the axioms. We also uncover a rich link to the facility dispersion problem that results in algorithms for a number of diversification objectives. Finally, we propose an evaluation methodology to characterize the objectives and the underlying axioms. We conduct a large scale evaluation of our objectives based on two data sets: a data set derived from the Wikipedia disambiguation pages and a product database.
C1 - New York, NY, USA
C3 - Proceedings of the 18th International Conference on World Wide Web
DA - 2009///
PY - 2009
DO - 10.1145/1526709.1526761
DP - ACM Digital Library
SP - 381
EP - 390
LA - en
PB - ACM
SN - 978-1-60558-487-4
UR - http://doi.acm.org/10.1145/1526709.1526761
Y2 - 2019/01/27/22:06:28
ER -

TY - CONF
TI - What Do Exploratory Searchers Look at in a Faceted Search Interface?
AU - Kules, Bill
AU - Capra, Robert
AU - Banta, Matthew
AU - Sierra, Tito
T3 - JCDL '09
AB - This study examined how searchers interacted with a web-based, faceted library catalog when conducting exploratory searches. It applied eye tracking, stimulated recall interviews, and direct observation to investigate important aspects of gaze behavior in a faceted search interface: what components of the interface searchers looked at, for how long, and in what order. It yielded empirical data that will be useful for both practitioners (e.g., for improving search interface designs), and researchers (e.g., to inform models of search behavior). Results of the study show that participants spent about 50 seconds per task looking at (fixating on) the results, about 25 seconds looking at the facets, and only about 6 seconds looking at the query itself. These findings suggest that facets played an important role in the exploratory search process.
C1 - New York, NY, USA
C3 - Proceedings of the 9th ACM/IEEE-CS Joint Conference on Digital Libraries
DA - 2009///
PY - 2009
DO - 10.1145/1555400.1555452
DP - ACM Digital Library
SP - 313
EP - 322
LA - en
PB - ACM
SN - 978-1-60558-322-8
UR - http://doi.acm.org/10.1145/1555400.1555452
Y2 - 2018/08/07/18:20:12
ER -

TY - CONF
TI - A Comparative Analysis of Cascade Measures for Novelty and Diversity
AU - Clarke, Charles L.A.
AU - Craswell, Nick
AU - Soboroff, Ian
AU - Ashkan, Azin
T3 - WSDM '11
AB - Traditional editorial effectiveness measures, such as nDCG, remain standard for Web search evaluation.
Unfortunately, these traditional measures can inappropriately reward redundant information and can fail to reflect the broad range of user needs that can underlie a Web query. To address these deficiencies, several researchers have recently proposed effectiveness measures for novelty and diversity. Many of these measures are based on simple cascade models of user behavior, which operate by considering the relationship between successive elements of a result list. The properties of these measures are still poorly understood, and it is not clear from prior research that they work as intended. In this paper we examine the properties and performance of cascade measures with the goal of validating them as tools for measuring effectiveness. We explore their commonalities and differences, placing them in a unified framework; we discuss their theoretical difficulties and limitations, and compare the measures experimentally, contrasting them against traditional measures and against other approaches to measuring novelty. Data collected by the TREC 2009 Web Track is used as the basis for our experimental comparison. Our results indicate that these measures reward systems that achieve a balance between novelty and overall precision in their result lists, as intended. Nonetheless, other measures provide insights not captured by the cascade measures, and we suggest that future evaluation efforts continue to report a variety of measures.
C1 - New York, NY, USA
C3 - Proceedings of the Fourth ACM International Conference on Web Search and Data Mining
DA - 2011///
PY - 2011
DO - 10.1145/1935826.1935847
DP - ACM Digital Library
SP - 75
EP - 84
LA - en
PB - ACM
SN - 978-1-4503-0493-1
UR - http://doi.acm.org/10.1145/1935826.1935847
Y2 - 2019/01/27/21:34:38
ER -

TY - CONF
TI - On Query Result Diversification
AU - Vieira, Marcos R.
AU - Razente, Humberto L.
AU - Barioni, Maria C. N.
AU - Hadjieleftheriou, Marios
AU - Srivastava, Divesh
AU - Traina, Caetano
AU - Tsotras, Vassilis J.
T3 - ICDE '11
AB - In this paper we describe a general framework for evaluation and optimization of methods for diversifying query results. In these methods, an initial ranking candidate set produced by a query is used to construct a result set, where elements are ranked with respect to relevance and diversity features, i.e., the retrieved elements should be as relevant as possible to the query, and, at the same time, the result set should be as diverse as possible. While addressing relevance is relatively simple and has been heavily studied, diversity is a harder problem to solve. One major contribution of this paper is that, using the above framework, we adapt, implement and evaluate several existing methods for diversifying query results. We also propose two new approaches, namely the Greedy with Marginal Contribution (GMC) and the Greedy Randomized with Neighborhood Expansion (GNE) methods. Another major contribution of this paper is that we present the first thorough experimental evaluation of the various diversification techniques implemented in a common framework. We examine the methods' performance with respect to precision, running time and quality of the result. Our experimental results show that while the proposed methods have higher running times, they achieve precision very close to the optimal, while also providing the best result quality. While GMC is deterministic, the randomized approach (GNE) can achieve better result quality if the user is willing to tradeoff running time.
C1 - Washington, DC, USA
C3 - Proceedings of the 2011 IEEE 27th International Conference on Data Engineering
DA - 2011///
PY - 2011
DO - 10.1109/ICDE.2011.5767846
DP - ACM Digital Library
SP - 1163
EP - 1174
LA - en
PB - IEEE Computer Society
SN - 978-1-4244-8959-6
UR - http://dx.doi.org/10.1109/ICDE.2011.5767846
Y2 - 2019/01/27/22:10:26
ER -