Federated search technology provides libraries and other research centers with a means to offer patrons a unified search interface for accessing disparate resources. Improving search services is critical for meeting the expectations of today's library patrons, who are accustomed to the speed and comprehensiveness of Web searches.
- Executive Summary
- Market Dynamics
- Market Leaders
- Market Trends
- Strategic Planning Implications
- Web Links
- Related Reports
Federated search technology provides a single interface to search multiple databases, returning an integrated search results list. The technology is primarily used by libraries to provide patrons with a unified interface for searching multiple subscription information services. Most products let users personalize their results to some extent, such as by picking how they want results sorted (e.g., by source, by title), how many results they want per page, and what level of detail they want to have displayed for each result (e.g., title only, summary).
Due to the prevalence of Web searching, today's users have high expectations for search technology, but current federated search products do not meet these expectations. The results provided are not focused enough, and they are often returned too slowly. Search technology is still struggling to meet user demands, and commercial and university-based development projects are underway.
The federated search market consists primarily of companies that focus on services for libraries. Google is beta testing services for academic research, however. Although Google Scholar only provides access to document excerpts, it could serve as a model for the future.
Organizations that are considering purchasing a federated search product should use caution. Not only does the technology have some noteworthy shortcomings, but it sometimes requires extensive customization to meet the specific needs of individual libraries. Many libraries may be better served by having their federated search application hosted by a vendor.
Today's library patrons expect that search services will be as simple as Google or Yahoo. Libraries subscribe to a range of subscription information services, however, so searching these resources has historically been complex and time-consuming. Federated search technology aims to simplify this process for users by providing a single interface that can be used to search multiple data sources. The results are presented in an integrated list or in a list that is segmented according to the source of the result.
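The single-interface idea described above can be illustrated with a minimal Python sketch. It is not taken from any vendor's product: the connector functions, the source names, and the helper names `search_all` and `integrated` are all hypothetical. It assumes each data source can be wrapped in a function that takes a query string and returns a list of titles, sends the query to every source in parallel, and then offers both a segmented and an integrated view of the results.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical connector type: one function per data source. A real connector
# would query a licensed database; here it only needs to return titles.
Connector = Callable[[str], List[str]]


def search_all(query: str, connectors: Dict[str, Connector]) -> Dict[str, List[str]]:
    """Send the query to every source in parallel; results are keyed by source."""
    with ThreadPoolExecutor(max_workers=max(1, len(connectors))) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in connectors.items()}
        # Collect in the order the sources were registered (the segmented view).
        return {name: future.result() for name, future in futures.items()}


def integrated(segmented: Dict[str, List[str]]) -> List[str]:
    """Flatten the per-source lists into one list, labeling each hit with its source."""
    return [
        f"{title} [{source}]"
        for source, titles in segmented.items()
        for title in titles
    ]


if __name__ == "__main__":
    # Stand-in sources used only for demonstration.
    demo_connectors: Dict[str, Connector] = {
        "catalog": lambda q: [f"{q} handbook"],
        "journals": lambda q: [f"{q} review", f"{q} study"],
    }
    by_source = search_all("solar", demo_connectors)
    print(by_source)
    print(integrated(by_source))
```

Querying the sources in parallel means the total wait is roughly that of the slowest source rather than the sum of all of them, which bears on the complaint that federated results are often returned too slowly.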
The technology is primarily used by libraries and other research centers that subscribe to licensed databases. Licensed databases are subscription-based services that collect information on a particular topic (e.g., medical information) or from a particular type of source (e.g., academic journals). To search these databases, the technology translates a user's natural language or Boolean query into a command that a database's native search function understands. This translation occurs for each database being searched. Federated search technology can also search freely available sources such as the Web and online public access catalogs (online bibliographies of a library's collection).
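The per-database translation step can be sketched as follows. This is an illustrative Python example only: the three target syntaxes, the target names (`journal_db`, `medical_db`, `web`), and every function name are assumptions made for the sketch, not the native query languages of any real service. The query is reduced to a list of terms that must all match.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List
from urllib.parse import urlencode


@dataclass
class Query:
    """A simple Boolean query in which every term is required (implicit AND)."""
    terms: List[str]


def to_keyword_syntax(query: Query) -> str:
    # Hypothetical target that expects quoted terms joined by AND.
    return " AND ".join(f'"{term}"' for term in query.terms)


def to_plus_syntax(query: Query) -> str:
    # Hypothetical target that expects "+" in front of each required term.
    return " ".join(f"+{term}" for term in query.terms)


def to_url_syntax(query: Query) -> str:
    # Hypothetical Web target that takes the query as a URL parameter.
    return "search?" + urlencode({"q": " ".join(query.terms)})


# One translator per target; the target names are invented for this example.
TRANSLATORS: Dict[str, Callable[[Query], str]] = {
    "journal_db": to_keyword_syntax,
    "medical_db": to_plus_syntax,
    "web": to_url_syntax,
}


def translate_for_all(query: Query) -> Dict[str, str]:
    """Translate one user query into the native form of every target."""
    return {name: translate(query) for name, translate in TRANSLATORS.items()}


if __name__ == "__main__":
    for target, native in translate_for_all(Query(["heart", "disease"])).items():
        print(f"{target}: {native}")
```

Keeping one small translator per target means that supporting a new database is a matter of adding an entry to the table, which is one plausible reason products differ in how many sources they can reach.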
University libraries in particular are feeling pressure to improve their search interfaces. Many students are bypassing certain traditional academic research sources in favor of easier to find but possibly less reliable Web sources. Technologically, libraries have struggled to keep pace. For instance, the Z39.50 standard was designed to provide unified searching across multiple resources. Although it is still used, this standard cannot provide access to all of the sources that today's users need to access.
The federated search market is populated mainly by specialty vendors focused on software and services for libraries and other research-based organizations. Large companies with broader focuses have shown little interest in pursuing the market. A notable exception is Google, whose Google Scholar service is now available in a beta version. Google has reached agreements with several providers of subscription-based content to make their information available through Google Scholar. This content would not otherwise be reached by Google or any ordinary search engine.
The Google Scholar model is a test of a federated search approach much different from most currently available commercial products. Whereas products from WebFeat, Ex Libris, and others enable users to search data sources that a library has licensed, Google Scholar allows any Web user to search through sources without charge.
Google Scholar is not yet a true competitor to other players in the federated search market. One reason is that libraries cannot install it to search their resources--it is available for free directly to the public. Another reason is that it does not provide access to the full text of articles: users can only access excerpts. There has been considerable controversy about what Google should allow users to access and about what effect this access might have on the rights and business interests of copyright holders. Although these issues have yet to be settled, Google Scholar provides an interesting test case of a different approach to searching. In the future, an all-encompassing approach may appeal to users more than a search service offered by a single library.
One-size-fits-all search services may prove to have limited effectiveness. The prospect of meeting the needs of all users may simply be too difficult. An alternative, and potentially better, approach is the use of multiple specialized search portals. An example of this approach comes from the California Digital Library, a research effort supported by the University of California that is conducting some of the most progressive research in federated search technology. Rather than trying to create a single search interface for every user, it is developing a series of portals, each with a different focus. For instance, its SmartStart is targeted specifically at undergraduates. With this focus in mind, SmartStart presents a limited number of high quality results rather than overwhelming students with a large number of results to parse.
Auto-Graphics. Auto-Graphics offers AGent Portal, a Web-based search tool that can search a user's choice of resources, including licensed and unlicensed databases, catalogs, the Web, and Z39.50 sources. Libraries can integrate AGent Portal into their existing Web sites or customize the interface and the way that search results are presented. Individual users can also customize their own search results and save searches with the company's My AGent software. If it is integrated with the company's AGent Resource Sharing application, AGent Portal can be used to facilitate the searching and borrowing of materials from other libraries. Auto-Graphics also offers AGent Verso, an integrated library system product; AGent Digital Collections, which enables search of specific types of digital collections (e.g., maps, sound recordings); and AGent MARCit, which is for cataloguing.
The Ex Libris Group. The Ex Libris Group's MetaLib is a portal that enables libraries to conduct federated searches of their digital information sources. The company's DigiTool lets libraries or groups of libraries manage digital assets, storing metadata about the files to facilitate better searching. The company also offers Aleph, a library automation product; SFX, which is for linking to electronic resources; and Verde, a tool designed to help libraries manage the electronic portions of their collections. In addition, the company offers application hosting services for its products.
Fretwell-Downing Informatics. Fretwell-Downing Informatics (FDI), headquartered in England, focuses on a variety of information management areas. The company's ZPORTAL lets users sort, filter, and save search results as well as customize their interfaces. Libraries can create "virtual collections" to group resources by type for more focused searching. ZPORTAL integrates with other FDI products, including Virtual Document Exchange for managing resources held by other libraries, Z'MBOL for unified searching of text and metadata, and Z2WEB for making proprietary sources accessible to Z39.50 standard searches. In late 2005, OCLC PICA, a large European library services provider, acquired Fretwell-Downing. OCLC PICA soon after merged with Sisis Informationsysteme, a German provider of library services. These changes are too recent to have impacted ZPORTAL, but they may in the future.
SerialsSolutions. Founded by a librarian, SerialsSolutions focuses on products designed to help users search electronic serials. The company's federated search offering is CentralSearch, a subscription service. It can search full text, A&I, and free databases as well as a library's own catalog. Libraries can customize the interface for their particular needs. The company also provides ArticleLinker, which resolves links to references in search results and provides paths to digitally stored information. In addition, SerialsSolutions supplies Access & Management Suite (AMS), which aims to help libraries make their serials more easily available to users; FullMarc records service for maintaining online public access catalogs; and Electronic Resource Management System (ERMS) for discovering, tracking, and managing electronic resources such as journals.
WebFeat. WebFeat is dedicated exclusively to search technology for libraries. The current version of WebFeat's federated search offering is WebFeat 3. It can be used with licensed and unlicensed databases, Z39.50 sources, and other resources. The SMART tool lets libraries track patrons' usage of the system. Through the MyWebFeat component, users can personalize their search results as well as the interface itself. As an option, WebFeat can host the system. Libraries can integrate WebFeat into their own Web page or build custom interfaces from pre-existing templates. The system's administration console lets libraries choose which targets to search; different groups of targets can be designated for different subject areas so that search results will be more focused.
Strategic Planning Implications
The federated search system should return results in a well-organized, quickly-graspable format. Two key ways that products attempt to do this are by de-duping, which means removing duplicate results from the final list, and by ranking results according to their relevancy. Although some duplicates are likely inevitable, and relevancy rankings cannot perfectly reflect a user's judgment on what is important, these are important features to evaluate. Various products perform these tasks with differing levels of effectiveness, so soliciting feedback from organizations that are currently using the products under evaluation and, if possible, testing the products directly will aid in the product selection process.
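The de-duping and relevancy ranking described above can be sketched in a few lines of Python. The sketch rests on two loudly stated assumptions that the report does not make: duplicates are detected by comparing normalized titles, and every source supplies a relevance score on a comparable scale. Commercial products may use quite different matching and ranking rules, and the names `Result`, `normalize_title`, and `merge_results` are invented here.

```python
import re
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Result:
    title: str
    source: str
    score: float  # assumed relevance score; higher means more relevant


def normalize_title(title: str) -> str:
    """Lowercase the title, drop punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9\s]", "", title.lower())
    return " ".join(cleaned.split())


def merge_results(result_lists: Iterable[List[Result]]) -> List[Result]:
    """De-dupe by normalized title, keep the best-scored copy, then rank by score."""
    best: Dict[str, Result] = {}
    for results in result_lists:
        for result in results:
            key = normalize_title(result.title)
            if key not in best or result.score > best[key].score:
                best[key] = result
    return sorted(best.values(), key=lambda r: r.score, reverse=True)


if __name__ == "__main__":
    from_db_a = [
        Result("Federated Search: A Primer", "db_a", 0.7),
        Result("Library Portals", "db_a", 0.9),
    ]
    from_db_b = [Result("federated search  a primer.", "db_b", 0.8)]
    for hit in merge_results([from_db_a, from_db_b]):
        print(f"{hit.score:.1f}  {hit.title}  ({hit.source})")
```

Title matching of this kind misses duplicates whose titles differ in substance (for example, a subtitle present in one source but not another), which is one reason some duplicates are likely inevitable and why hands-on testing of each product's behavior is worthwhile.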
In order for a federated search tool to access a licensed database, it must authenticate to it. Some products will fail to authenticate to certain databases. Before purchase, an organization should ensure that the tool can authenticate to all the services to which it subscribes. If possible, the prospective buyer should also identify the services it will subscribe to in the foreseeable future and ensure that the product under evaluation can authenticate to them as well.
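The pre-purchase check described above amounts to comparing two lists. A minimal Python sketch follows, assuming the buyer has a list of its current and planned subscriptions and the vendor has supplied a list of services its product can authenticate to. The function name and the service names are hypothetical, and matching by case-insensitive name is a simplification.

```python
from typing import Iterable, List


def unsupported_services(subscribed: Iterable[str], supported: Iterable[str]) -> List[str]:
    """Return subscribed services missing from the vendor's supported list, sorted by name."""
    # Compare names case-insensitively and ignore stray surrounding spaces.
    supported_names = {name.strip().lower() for name in supported}
    return sorted(
        service for service in subscribed
        if service.strip().lower() not in supported_names
    )


if __name__ == "__main__":
    # Invented service names, used only to show the comparison.
    library_subscriptions = ["Medical DB", "Journal Archive", "News Index"]
    vendor_supported = ["medical db", "news index"]
    print(unsupported_services(library_subscriptions, vendor_supported))
```

An empty result means the vendor claims coverage of every listed service; the claim still needs to be confirmed by testing authentication against each database.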
When evaluating the results that a federated search offering provides, quality should be weighted much more heavily than quantity. Metrics that measure search effectiveness by the number of hits have limited value. Unfortunately, there are no objective measures of quality--quality can only be measured by users performing a search and making judgments based on how helpful the results are. Soliciting feedback from libraries that are already using the product will provide this type of feedback.
In most cases, federated search technology will require customization to access all the information targets a library requires and to provide an interface that is suitable for a particular library. The closer a product is to meeting these goals out-of-the-box, the better, but customers should expect to spend a considerable amount of time customizing the system.
Federated search technology is still developing, having yet to reach maturity. Reports from users indicate that federated search technology has noteworthy shortcomings. The primary end users of federated search technology are not professional researchers, but ordinary students and library visitors. Therefore, queries will typically be in natural language, and results may be better when they are focused, with limited sifting required. Implementing several themed portals on top of the same underlying technology will improve the search experience for many users.
Ultimately, it is impossible to determine the future effectiveness of a federated search product. Customers should therefore be cautious about investing too heavily in a particular technology. Any commitment to a certain vendor or product should include an understanding of upgrade options to future versions. Vendors that are aggressively improving their technology should be preferred over those that are putting less effort into development. Some vendors offer their software as a hosted service; this option may reduce a library's commitment to a single technology, providing more flexibility for the future than would an in-house implementation.
About the Author
Geoff Keston is a project manager for a leading technology consulting and services company. In this role, he has been responsible for the successful completion of enterprise software implementations, network upgrades, and telephony implementations for major retailers, financial firms, and public institutions. Geoff also writes extensively on issues relating to software, data networking, and e-commerce, as well as on the cultural, economic, and political issues raised by technology. He is a Microsoft Certified Systems Engineer and a Certified Novell Administrator.