Balisage Paper: When 57,300,000 Full Text Search Results Are Just Too Many
Pat Case
Congressional Research Service, Library of Congress
Abstract
The Web changed the paradigm for full-text search. Searching Google for "search engines" returns 57,300,000 results as of this writing, an impressive result set. Web search engines favor simple searches, speed, and relevance ranking, and the end user most often finds a wanted result or two on the first page of search results. This paradigm is less useful for searching collections of homogeneous data and documents than it is for searching the Web. When searching a collection, end users may need to review everything the collection holds on a topic, may want a clean result set of only the 6 high-quality results, or may need to confirm that there are no wanted results, because finding nothing within a collection sometimes answers a question about a topic or about the collection itself. To accomplish these tasks, end users need more functionality for returning small, manageable result sets. The W3C XQuery and XPath Full Text Recommendation (XQFT) offers extensive end-user functionality, restoring the control that librarians and expert searchers enjoyed before the Web. XQFT offers more end-user functionality and control than any other full-text search standard to date: more match options, more logical operators, more proximity operators, and more ways to return a manageable result set. XQFT searches are also completely composable with XQuery string, number, date, and node queries, bringing the power of full-text search and database querying together for the first time. XQFT searches run directly against XML, enabling searches on any element or attribute. XQFT implementations are standards-driven, based on shared semantics and syntax, so a search written for one implementation is portable and may be used in others.