EnterpriseSearchCenter.com
 
RESOURCES FOR EVALUATING ENTERPRISE SEARCH TECHNOLOGIES
December 13, 2006

Table of Contents

Featured Content: Federated Search Library Market Study from Faulkner (free PDF)
FAST Acquires Platefood Limited
Swets and MuseGlobal Partner to Deliver SwetsWise Searcher Hosted Solution
DocuLex and dtSearch Introduce New Version of WebSearch
Hot Banana Unveils Multilingual Web CMS Module
Ask.com Debuts AskCity
Content analytics
XMetaL marks its spot
Searching for BI
More counterterrorism partnering

Featured Content: Federated Search Library Market Study from Faulkner (free PDF)

Preview

Federated search technology provides libraries and other research centers with a means to offer patrons a unified search interface for accessing disparate resources. Improving search services is critical for meeting the expectations of today's library patrons, who are accustomed to the speed and comprehensiveness of Web searches.  Click here for your free PDF of this report.

Report Contents

  • Executive Summary
  • Market Dynamics
  • Market Leaders
  • Market Trends
  • Strategic Planning Implications
  • Web Links
  • Related Reports

Executive Summary

Federated search technology provides a single interface to search multiple databases, returning an integrated search results list. The technology is primarily used by libraries to provide patrons with a unified interface for searching multiple subscription information services. Most products let users personalize their results to some extent, such as by picking how they want results sorted (e.g., by source, by title), how many results they want per page, and what level of detail they want to have displayed for each result (e.g., title only, summary).

Due to the prevalence of Web searching, today's users have high expectations for search technology, but current federated search does not meet these expectations. The results provided are not focused enough, and they are often returned too slowly. Search technology is still struggling to meet user demands, and commercial and university-based development projects are underway.

The federated search market consists primarily of companies that focus on services for libraries. Google is beta testing services for academic research, however. Although Google Scholar only provides access to document excerpts, it could serve as a model for the future.

Organizations that are considering purchasing a federated search product should use caution. Not only does the technology have some noteworthy shortcomings, but it sometimes requires extensive customization to meet the specific needs of individual libraries. Many libraries may be better served by having their federated search application hosted by a vendor.

Market Dynamics

Today's library patrons expect that search services will be as simple as Google or Yahoo. Libraries subscribe to a range of subscription information services, however, so searching these resources has historically been complex and time-consuming. Federated search technology aims to simplify this process for users by providing a single interface that can be used to search multiple data sources. The results are presented in an integrated list or in a list that is segmented according to the source of the result.

The technology is primarily used by libraries and other research centers that subscribe to licensed databases. Licensed databases are subscription-based services that collect information on a particular topic (e.g., medical information) or from a particular type of source (e.g., academic journals). To search these databases, the technology translates a user's natural language or Boolean query into a command that each database's native search function understands; this translation occurs separately for every database being searched. Federated search technology can also search freely available sources such as the Web and online public access catalogs (online bibliographies of a library's holdings).
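The translation step described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the target names (`target_a`, `target_b`) and both "native" syntaxes are invented and do not correspond to any real database or product.

```python
# Hypothetical sketch: one user query is translated once per search target.
# The two "native" syntaxes below are invented for illustration only.

def parse_query(query):
    """Split a simple 'term AND term ...' Boolean query into its terms."""
    return [t.strip() for t in query.split(" AND ") if t.strip()]

def to_keyword_syntax(terms):
    """Hypothetical target A: every required keyword is prefixed with '+'."""
    return " ".join("+" + t for t in terms)

def to_field_syntax(terms):
    """Hypothetical target B: fielded clauses joined by '&'."""
    return " & ".join('any="{}"'.format(t) for t in terms)

# One translator per target, mirroring "translation occurs for each database".
TRANSLATORS = {
    "target_a": to_keyword_syntax,
    "target_b": to_field_syntax,
}

def translate(query):
    """Return the native command for every configured target."""
    terms = parse_query(query)
    return {name: fn(terms) for name, fn in TRANSLATORS.items()}

if __name__ == "__main__":
    for target, native in translate("asthma AND children").items():
        print(target, "->", native)
```

Keeping one translator function per target means a new database can be supported by adding a single entry, which is one plausible reason products differ in how many sources they can reach.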

University libraries in particular are feeling pressure to improve their search interfaces. Many students are bypassing certain traditional academic research sources in favor of easier-to-find but possibly less reliable Web sources. Technologically, libraries have struggled to keep pace. For instance, the Z39.50 standard was designed to provide unified searching across multiple resources. Although it is still used, this standard cannot reach all of the sources that today's users need.

Market Trends

The federated search market is populated mainly by specialty vendors focused on software and services for libraries and other research-based organizations. Large companies with broader focuses have shown little interest in pursuing the market. A notable exception is Google, whose Google Scholar service is now available in a beta version. Google has reached agreements with several providers of subscription-based content to make their information available through Google Scholar. This content would not otherwise be reached by Google or any ordinary search engine.

The Google Scholar model is a test of a federated search approach much different from most currently available commercial products. Whereas products from WebFeat, Ex Libris, and others enable users to search data sources that a library has licensed, Google Scholar allows any Web user to search through sources without charge.

Google Scholar is not yet a true competitor to other players in the federated search market. One reason is that libraries cannot install it to search their resources--it is available for free directly to the public. Another reason is that it does not provide access to the full text of articles: users can only access excerpts. There has been considerable controversy about what Google should allow users to access and about what effect this access might have on the rights and business interests of copyright holders. Although these issues have yet to be settled, Google Scholar provides an interesting test case of a different approach to searching. In the future, an all-encompassing approach may appeal to users more than a search service offered by a single library.

One-size-fits-all search services may prove to have limited effectiveness. The prospect of meeting the needs of all users may simply be too difficult. An alternative, and potentially better, approach is the use of multiple specialized search portals. An example of this approach comes from the California Digital Library, a research effort supported by the University of California that is conducting some of the most progressive research in federated search technology. Rather than trying to create a single search interface for every user, it is developing a series of portals, each with a different focus. For instance, its SmartStart is targeted specifically at undergraduates. With this focus in mind, SmartStart presents a limited number of high-quality results rather than overwhelming students with a large number of results to parse.

Market Leaders

Auto-Graphics. Auto-Graphics offers AGent Portal, a Web-based search tool that can search a user's choice of resources, including licensed and unlicensed databases, catalogs, the Web, and Z39.50 sources, among other resources. Libraries can integrate AGent Portal into their existing Web sites or customize the interface and the way that search results are presented. Individual users can also customize their own search results and save searches with the company's My AGent software. If it is integrated with the company's AGent Resource Sharing application, AGent Portal can be used to facilitate the searching and borrowing of materials from other libraries. Auto-Graphics also offers AGent Verso, an integrated library system product; AGent Digital Collections, which enables search of specific types of digital collections (e.g., maps, sound recordings); and AGent MARCit, which is for cataloguing.

The Ex Libris Group. The Ex Libris Group's MetaLib is a portal that enables libraries to conduct federated searches of their digital information sources. The company's DigiTool lets libraries or groups of libraries manage digital assets, storing metadata about the files to facilitate better searching. The company also offers Aleph, a library automation product; SFX, which is for linking to electronic resources; and Verde, a tool designed to help libraries manage the electronic portions of their collections. In addition, the company offers application hosting services for its products.

Fretwell-Downing Informatics. Fretwell-Downing Informatics (FDI), headquartered in England, focuses on a variety of information management areas. The company's ZPORTAL lets users sort, filter, and save search results as well as customize their interfaces. Libraries can create "virtual collections" to group resources by type for more focused searching. ZPORTAL integrates with other FDI products, including Virtual Document Exchange for managing resources held by other libraries, Z'MBOL for unified searching of text and metadata, and Z2WEB for making proprietary sources accessible to Z39.50 standard searches. In late 2005, OCLC PICA, a large European library services provider, acquired Fretwell-Downing. OCLC PICA soon after merged with Sisis Informationsysteme, a German provider of library services. These changes are too recent to have impacted ZPORTAL, but they may in the future.

SerialsSolutions. Founded by a librarian, SerialsSolutions focuses on products designed to help users search electronic serials. The company's federated search offering is CentralSearch, a subscription service. It can search full text, A&I, and free databases as well as a library's own catalog. Libraries can customize the interface for their particular needs. The company also provides ArticleLinker, which resolves links to references in search results and provides paths to digitally stored information. In addition, SerialsSolutions supplies Access & Management Suite (AMS), which aims to help libraries make their serials more easily available to users; FullMarc records service for maintaining online public access catalogs; and Electronic Resource Management System (ERMS) for discovering, tracking, and managing electronic resources such as journals.

WebFeat. WebFeat is dedicated exclusively to search technology for libraries. The current version of WebFeat's federated search offering is WebFeat 3. It can be used with licensed and unlicensed databases, Z39.50 sources, and other resources. The SMART tool lets libraries track patrons' usage of the system. Through the MyWebFeat component, users can personalize their search results as well as the interface itself. As an option, WebFeat can host the system. Libraries can integrate WebFeat into their own Web page or build custom interfaces from pre-existing templates. The system's administration console lets libraries choose which targets to search; different groups of targets can be designated for different subject areas so that search results will be more focused.

Strategic Planning Implications

A federated search system should return results in a well-organized, quickly graspable format. Two key ways that products attempt to do this are by de-duping, which means removing duplicate results from the final list, and by ranking results according to their relevancy. Although some duplicates are likely inevitable, and relevancy rankings cannot perfectly reflect a user's judgment on what is important, these are important features to evaluate. Various products perform these tasks with differing levels of effectiveness, so soliciting feedback from organizations that are currently using the products under evaluation and, if possible, testing the products directly will aid in the product selection process.
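To make de-duping and relevancy ranking more concrete, here is a minimal Python sketch of merging result lists from several sources. It is an illustration under stated assumptions, not a description of how any product in this report works: duplicates are detected only by comparing normalized titles, and the relevancy score is a toy count of query words found in the title. The source names and titles are invented.

```python
import re

def normalize(title):
    """Lowercase and strip punctuation so near-identical titles compare equal."""
    return re.sub(r"[^a-z0-9 ]", "", title.lower()).strip()

def relevance(title, query):
    """Toy score (an assumption): distinct query words that appear in the title."""
    title_words = set(normalize(title).split())
    return sum(1 for w in set(normalize(query).split()) if w in title_words)

def merge_results(results_by_source, query):
    """De-dupe results across sources, then rank them by the toy score."""
    seen = {}
    for source, titles in results_by_source.items():
        for title in titles:
            key = normalize(title)
            if key not in seen:  # keep only the first copy of each title
                seen[key] = {"title": title, "source": source}
    # sorted() is stable, so equally scored results keep their arrival order.
    return sorted(seen.values(),
                  key=lambda r: relevance(r["title"], query),
                  reverse=True)

if __name__ == "__main__":
    hits = {
        "db_one": ["Asthma in Children", "Adult Sleep Patterns"],
        "db_two": ["Asthma in children.", "Childhood Asthma Triggers"],
    }
    for row in merge_results(hits, "asthma children"):
        print(row["source"], "|", row["title"])
```

Title matching alone will miss duplicates whose titles differ between sources, which is one reason the report treats some duplicates as inevitable.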

In order for a federated search tool to access a licensed database, it must authenticate to it. Some products will fail to authenticate to certain databases. Before purchase, an organization should ensure that the tool can authenticate to all the services to which it subscribes. If possible, the prospective buyer should also identify the services it will subscribe to in the foreseeable future and ensure that the product under evaluation can authenticate to them as well.
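The pre-purchase check described above amounts to comparing two lists. The short Python sketch below shows one way to do that; the service names and the vendor's supported list are hypothetical placeholders, and a real evaluation would rely on the vendor's published compatibility list and on direct testing.

```python
def find_auth_gaps(subscribed_services, product_supported):
    """Return subscribed services the product is not known to authenticate to."""
    supported = {name.lower() for name in product_supported}
    return sorted(s for s in subscribed_services if s.lower() not in supported)

if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    current = ["Medical Index A", "Journal Archive B"]
    planned = ["Newspaper Database C"]
    vendor_list = ["medical index a", "Newspaper Database C"]
    # Check current and planned subscriptions together.
    print(find_auth_gaps(current + planned, vendor_list))
```

Including planned subscriptions in the same check reflects the advice to confirm support for services the library expects to add later.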

When evaluating the results that an enterprise search offering provides, quality should be weighted much more heavily than quantity. Metrics that measure search effectiveness by the number of hits have limited value. Unfortunately, there are no objective measures of quality--quality can only be measured by users performing a search and judging how helpful the results are. Soliciting feedback from libraries that are already using the product will provide this type of feedback.

In most cases, federated search technology will require customization to access all the information targets a library requires and to provide an interface that is suitable for a particular library. The closer a product is to meeting these goals out-of-the-box, the better, but customers should expect to spend a considerable amount of time customizing the system.

Federated search technology is still developing, having yet to reach maturity. Reports from users indicate that federated search technology has noteworthy shortcomings. The primary end users of federated search technology are not professional researchers, but ordinary students and library visitors. Therefore, queries will typically be in natural language, and results may be better when they are focused, with limited sifting required. Implementing several themed portals on top of the same underlying technology will improve the search experience for many users.

Ultimately, it is impossible to determine the future effectiveness of a federated search product. Customers should therefore be cautious about investing too heavily in a particular technology. Any commitment to a certain vendor or product should include an understanding of upgrade options to future versions. Vendors that are aggressively improving their technology should be preferred over those that are putting less effort into development. Some vendors offer their software as a hosted service; this option may reduce a library's commitment to a single technology, providing more flexibility for the future than would an in-house implementation.

About the Author

Geoff Keston is a project manager for a leading technology consulting and services company. In this role, he has been responsible for the successful completion of enterprise software implementations, network upgrades, and telephony implementations for major retailers, financial firms, and public institutions. Geoff also writes extensively on issues relating to software, data networking, and e-commerce, as well as on the cultural, economic, and political issues raised by technology. He is a Microsoft Certified Systems Engineer and a Certified Novell Administrator.

Web Links

Back to Contents...

FAST Acquires Platefood Limited

Fast Search & Transfer, a developer of search technologies, has announced it has signed a definitive agreement to acquire Platefood Limited. Platefood Limited was originally set up as a joint investment in 2005 by Schibsted Søk AS, Sensis Pty, and FAST to offer online search services and search-based advertising solutions to media and online directories companies.

FAST has acquired all the remaining shares in Platefood Limited including 19.99% from Schibsted Søk AS and 61.01% from Sensis Pty Ltd for EUR 8,100,000 in cash. FAST owned 19% prior to this transaction. The deal closed November 27, 2006, and FAST is accounting for the transaction using the purchase method. Platefood will be included in the consolidated financial statements of FAST from the acquisition date. FAST intends to consolidate the Platefood business into its current operations to capitalize on the market for search-based monetization solutions. Platefood Performance, a search monetization and advertising solution for selling, managing, and delivering pay-for-performance advertising, leverages FAST's Enterprise Search Platform (ESP) to offer a flexible solution. Schibsted and Sensis will continue as customers of the Platefood monetization solution.

As part of the agreement, FAST will assume all employees of Platefood, the majority of whom are located in London. FAST will continue to offer full support and maintenance to Platefood customers without interruption.

(www.fastsearch.com)  

Swets and MuseGlobal Partner to Deliver SwetsWise Searcher Hosted Solution

Swets has launched the SwetsWise Searcher hosted service. It is the result of an alliance between Swets and MuseGlobal, Inc. The new service extends the SwetsWise Searcher offering to two versions: the SwetsWise Searcher hosted version, which provides an economical federated search option for organizations that prefer off-site implementation, and SwetsWise Searcher Enterprise, the existing version that is implemented locally on the customer's servers.

SwetsWise Searcher delivers a web-based federated search service designed to facilitate implementation. For a single competitive price, customers can select up to 30 source packages from a list. A source package for SwetsWise Online Content--Swets' ejournal gateway providing a single point of access to a collection of ejournals--is automatically included. Capabilities such as searching open access catalogs and exporting citation information to citation management software in real time will be included. Individual users will also be able to save search results in their own "Personal WorkRoom" and set many search preferences using SwetsWise Searcher's personalized profile capabilities. Its web-based utilities are designed to enable control over the configuration and customization process, allowing customer administrators to change interface and system options without assistance from Swets.

(www.swets.com; www.museglobal.com)

DocuLex and dtSearch Introduce New Version of WebSearch

DocuLex, Inc., a document management company, and dtSearch Corp., a developer of text search software for enterprise and developer customers around the world, have introduced a new version of WebSearch, DocuLex's web-based document management application, providing "Instant Document Access" from any location. The new version includes Active Directory and public key infrastructure (PKI) enhancements. Active Directory integration is designed to enable heightened security via access permission grants and subsequent tracking. WebSearch also serves as a document hosting facilitator, with PKI providing encryption and digital signature security for outsourcing daily-use information access to stored electronic files. The program manages SSL remote security and retrieval of various file formats, including MS Office, PDF, and email. Retrieval tracking and activity logging assist with compliance with privacy laws, including HIPAA and Sarbanes-Oxley.

Also new for users is an updated interface with functionality similar to an internet search engine keyword query. The server-based software automates complex document organizational functions, including file room views, which provide a visual representation of hard copy document storage housed in a physical records library.

DocuLex WebSearch utilizes the dtSearch Engine for Win & .NET. The dtSearch Engine can index over a terabyte of text in a single index, as well as create and simultaneously search an unlimited number of indexes. Indexed search time is typically less than a second, across terabytes of data. The dtSearch Engine supports distributed or federated searching across multiple data sources, and includes access to dtSearch's built-in Web Spider. The dtSearch Engine offers more than two dozen search options with support for various data types. After a search, the dtSearch Engine highlights hits in HTML, XML, and PDF, while displaying links, formatting, and images. The built-in dtSearch Spider supports static and dynamic web content, with WYSIWYG hit-highlighting. Additionally, the dtSearch Engine converts other data types (word processor, database, spreadsheet, email and attachments, ZIP, Unicode, etc.) to HTML for display with highlighted hits. WebSearch is a component of DocuLex's flagship Archive Studio, a portfolio consisting of open system document management software including capture and retrieval programs for any business environment.

(www.doculex.com; www.dtsearch.com)  

Hot Banana Unveils Multilingual Web CMS Module

Hot Banana Software, Inc., a provider of web content management software for marketing and a wholly owned subsidiary of J.L. Halsey, has introduced a multilingual content management plug-in that automates the process of translating and publishing website content into multiple languages. Connexion Corporate Communications s.a., a Belgian business-to-business multilingual communications agency and integrator of web content management software in Europe, independently developed the multilingual content management module and is making the module available to all Hot Banana channel partners and clients.

From the Hot Banana interface, site publishers can send web pages, portions of web pages, and even entire websites to an XML-compatible translation memory system (TMS), such as DéjàVu, Trados, or SDLX. Once human translators have edited the content in their TMS of choice, their translation work is sent back to Hot Banana's web CMS for publication. The translated content, navigation, metadata, pictures, and links appear in the same layout as the original pages, which the Hot Banana user can then validate further, route for approval, or publish instantly. The new multilingual translation module works with any Hot Banana implementation. Connexion Corporate Communications licenses the module directly.

(www.hotbanana.com; www.connexion.be)

Ask.com Debuts AskCity

Ask.com, a search destination and wholly owned business of IAC/InterActiveCorp, has introduced AskCity, a new local search service that integrates local information on the web with an "all-in-one" user interface and search tools. AskCity represents a cross-IAC integration, leveraging content from several IAC properties, including Citysearch, Ticketmaster, and ServiceMagic, among others. Also integrated into AskCity are content and functionality from non-IAC properties, including Fandango and OpenTable. The new features of AskCity include the ability to search four verticals (Business, Event, Movie, and Map and Directions) in a single interface.

AskCity is presented in a three-column format to allow users to search, organize, share, and transact all on one page, without screen reloads. The search results are displayed in the center pane and each of those results is integrated onto the map in the right pane. The left pane hosts the search box and a series of links that let the user refine results. AskCity offers broad and deep local content, including data acquired through direct partnerships with brands like Active, Citysearch, Eventsource, Fandango, MuseumTix, OpenTable, Reserve America, ServiceMagic, Ticketmaster, TicketWeb, TripAdvisor, StepUp, Wcities, and others. AskCity incorporates direct access within the search results to transaction sites, including Ticketmaster and TicketWeb (to purchase event tickets), Fandango (to purchase movie tickets), OpenTable (to reserve a table), and ServiceMagic (to arrange an appointment for a contractor or other service). Users can narrow searches by zip code and even neighborhood. AskCity "bounds" the area graphically on the associated map for visual reference.

AskCity automatically generates suggestions that narrow or expand searches by neighborhood, cuisine, or movie genre, to name a few; users can apply these suggestions to refine queries or to explore. Search results can be saved on the search results pane with the "Pin It" link. This allows users to continue searching (without losing pinned results) and save additional results from subsequent searches. In effect, this allows users to create an itinerary to plan a date or run an errand.

(www.ask.com)

Content analytics

IBM has announced major steps intended to assist in the open development and standardization of search and content analytics software.

The Organization for the Advancement of Structured Information Standards (OASIS) has established a technical committee to standardize the Unstructured Information Management Architecture (UIMA) specification. Additionally, the Apache Software Foundation has established an incubator project for developing UIMA-based software. These efforts are based on IBM's development of UIMA software and its experience with clients and partners in deploying content analytic solutions.

The new Apache incubator project will start with an initial contribution from IBM of the UIMA Version 2.0 source code. The Apache Software Foundation provides support for open-source software projects characterized by a collaborative, consensus-based development process, an open, pragmatic software license and a desire to create high-quality software.

In addition, Carnegie Mellon University's Language Technologies Institute is hosting a UIMA Component Repository, where developers can post information about their analytics components and anyone can find out more about free and commercially available UIMA-compliant analytics. Free analytic tools that can work with UIMA include those from the General Architecture for Text Engineering (GATE) and OpenNLP communities. Commercial analytics are available from IBM, as well as from other software vendors such as Attensity, ClearForest, Temis, and Nstein.

XMetaL marks its spot

XMetaL has released Author 5.0, an enterprise-ready content creation and publishing solution that can be deployed and integrated with the rest of an organization's content management system. XMetaL Author 5.0 is available in the following editions: Author, DITA Edition, Enterprise Edition, and XMAX--an embeddable, Web-based edition.

XMetaL reports new functionality in Author 5.0 includes:

  • more publishing options--the ability to publish cost-effectively, directly from the desktop, with the extensible publishing framework that includes extended support for the DITA Open Toolkit, as well as the RenderX-powered XSL-FO engine for high-quality PDF output;
  • improved content repository or management system (CMS) integration--capability for a single-interface access enabled by the new XMetaL Connector; and
  • enhanced DITA support--DITA specialization support for the rapid development of authoring interfaces for new content types and extended DITA map-editing capabilities for leveraging the full power of DITA maps for content reuse and document assembly.

Searching for BI

Endeca and Clarabridge have announced a partnership that calls for integration between the Endeca Information Access Platform (IAP) and popular business intelligence (BI) tools from Clarabridge.

According to the terms of the deal, Endeca will resell Clarabridge BI Connectors for Business Objects, Cognos and MicroStrategy BI platforms as value-added components of the Endeca IAP. Endeca says the combined offering will help its IAP customers extract additional value from BI investments, unite reports and key metrics with other key application data and content sources, and extend business intelligence to a wide audience of users.

More counterterrorism partnering

Inxight Federal Systems Group has partnered with Mosaic, a small business providing consulting to the U.S. government, to offer professional services and knowledge-based solutions to government organizations focused on enemy threats.

Inxight reports its partnership with Mosaic will bring the ability to understand and represent the requirements of analysts in defining and delivering enterprise solutions. Mosaic employs senior technical resources across a broad range of technical competencies that are essential to the future of analytical systems development.
