TY  - JOUR
TI  - Determining the informational, navigational, and transactional intent of Web queries
AU  - Jansen, Bernard J.
AU  - Booth, Danielle L.
AU  - Spink, Amanda
T2  - Information Processing & Management
AB  - In this paper, we define and present a comprehensive classification of user intent for Web searching. The classification consists of three hierarchical levels of informational, navigational, and transactional intent. After deriving attributes of each, we then developed a software application that automatically classified queries using a Web search engine log of over a million and a half queries submitted by several hundred thousand users. Our findings show that more than 80% of Web queries are informational in nature, with about 10% each being navigational and transactional. In order to validate the accuracy of our algorithm, we manually coded 400 queries and compared the results from this manual classification to the results determined by the automated method. This comparison showed that the automatic classification has an accuracy of 74%. Of the remaining 25% of the queries, the user intent is vague or multi-faceted, pointing to the need for probabilistic classification. We discuss how search engines can use knowledge of user intent to provide more targeted and relevant results in Web searching.
DA  - 2008/05/01/
PY  - 2008
DO  - 10.1016/j.ipm.2007.07.015
DP  - ScienceDirect
VL  - 44
IS  - 3
SP  - 1251
EP  - 1266
J2  - Information Processing & Management
LA  - en
SN  - 0306-4573
UR  - http://www.sciencedirect.com/science/article/pii/S030645730700163X
Y2  - 2018/03/28/23:33:46
ER  - 

TY  - JOUR
TI  - Search log analysis: What it is, what's been done, how to do it
AU  - Jansen, Bernard J.
T2  - Library & Information Science Research
AB  - The use of data stored in transaction logs of Web search engines, Intranets, and Web sites can provide valuable insight into understanding the information-searching process of online searchers. This understanding can enlighten information system design, interface development, and devising the information architecture for content collections. This article presents a review and foundation for conducting Web search transaction log analysis. A methodology is outlined consisting of three stages, which are collection, preparation, and analysis. The three stages of the methodology are presented in detail with discussions of goals, metrics, and processes at each stage. Critical terms in transaction log analysis for Web searching are defined. The strengths and limitations of transaction log analysis as a research method are presented. An application to log client-side interactions that supplements transaction logs is reported on, and the application is made available for use by the research community. Suggestions are provided on ways to leverage the strengths of, while addressing the limitations of, transaction log analysis for Web-searching research. Finally, a complete flat text transaction log from a commercial search engine is available as supplementary material with this manuscript.
DA  - 2006/09/01/
PY  - 2006
DO  - 10.1016/j.lisr.2006.06.005
DP  - ScienceDirect
VL  - 28
IS  - 3
SP  - 407
EP  - 432
J2  - Library & Information Science Research
LA  - en
SN  - 0740-8188
ST  - Search log analysis
UR  - http://www.sciencedirect.com/science/article/pii/S0740818806000673
Y2  - 2018/03/20/23:18:18
ER  - 

TY  - JOUR
TI  - Searching the web: The public and their queries
AU  - Spink, Amanda
AU  - Wolfram, Dietmar
AU  - Jansen, Major B. J.
AU  - Saracevic, Tefko
T2  - Journal of the American Society for Information Science and Technology
AB  - In studying actual Web searching by the public at large, we analyzed over one million Web queries by users of the Excite search engine. We found that most people use few search terms, few modified queries, view few Web pages, and rarely use advanced search features. A small number of search terms are used with high frequency, and a great many terms are unique; the language of Web queries is distinctive. Queries about recreation and entertainment rank highest. Findings are compared to data from two other large studies of Web queries. This study provides an insight into the public practices and choices in Web searching.
DA  - 2001///
PY  - 2001
DO  - 10.1002/1097-4571(2000)9999:9999<::AID-ASI1591>3.0.CO;2-R
DP  - Wiley Online Library
VL  - 52
IS  - 3
SP  - 226
EP  - 234
LA  - en
SN  - 1532-2890
ST  - Searching the web
UR  - https://onlinelibrary.wiley.com/doi/abs/10.1002/1097-4571%282000%299999%3A9999%3C%3A%3AAID-ASI1591%3E3.0.CO%3B2-R
Y2  - 2019/01/21/23:57:39
ER  - 

TY  - JOUR
TI  - Real life, real users, and real needs: a study and analysis of user queries on the web
AU  - Jansen, Bernard J.
AU  - Spink, Amanda
AU  - Saracevic, Tefko
T2  - Information Processing & Management
AB  - We analyzed transaction logs containing 51,473 queries posed by 18,113 users of Excite, a major Internet search service. We provide data on: (i) sessions: changes in queries during a session, number of pages viewed, and use of relevance feedback; (ii) queries: the number of search terms, and the use of logic and modifiers; and (iii) terms: their rank/frequency distribution and the most highly used search terms. We then shift the focus of analysis from the query to the user to gain insight to the characteristics of the Web user. With these characteristics as a basis, we then conducted a failure analysis, identifying trends among user mistakes. We conclude with a summary of findings and a discussion of the implications of these findings.
DA  - 2000/03/01/
PY  - 2000
DO  - 10.1016/S0306-4573(99)00056-4
DP  - ScienceDirect
VL  - 36
IS  - 2
SP  - 207
EP  - 227
J2  - Information Processing & Management
LA  - en
SN  - 0306-4573
ST  - Real life, real users, and real needs
UR  - http://www.sciencedirect.com/science/article/pii/S0306457399000564
Y2  - 2019/01/27/22:52:32
ER  - 

TY  - JOUR
TI  - Analysis of a Very Large Web Search Engine Query Log
AU  - Silverstein, Craig
AU  - Marais, Hannes
AU  - Henzinger, Monika
AU  - Moricz, Michael
T2  - SIGIR Forum
AB  - In this paper we present an analysis of an AltaVista Search Engine query log consisting of approximately 1 billion entries for search requests over a period of six weeks. This represents almost 285 million user sessions, each an attempt to fill a single information need. We present an analysis of individual queries, query duplication, and query sessions. We also present results of a correlation analysis of the log entries, studying the interaction of terms within queries. Our data supports the conjecture that web users differ significantly from the user assumed in the standard information retrieval literature. Specifically, we show that web users type in short queries, mostly look at the first 10 results only, and seldom modify the query. This suggests that traditional information retrieval techniques may not work well for answering web search requests. The correlation analysis showed that the most highly correlated items are constituents of phrases. This result indicates it may be useful for search engines to consider search terms as parts of phrases even if the user did not explicitly specify them as such.
DA  - 1999/09//
PY  - 1999
DO  - 10.1145/331403.331405
DP  - ACM Digital Library
VL  - 33
IS  - 1
SP  - 6
EP  - 12
LA  - en
SN  - 0163-5840
UR  - http://doi.acm.org/10.1145/331403.331405
Y2  - 2018/03/29/00:38:59
ER  - 