EnterpriseSearchCenter.com Home
 
RESOURCES FOR EVALUATING ENTERPRISE SEARCH TECHNOLOGIES
October 04, 2006

Table of Contents

Featured Content: Best of Both Worlds (Case Study on Enterprise Search in Practice)
First Call: Enterprise Search Summit -- Want to Speak?
Basis Technology Acquires Translingual Technologies
FAST offers new platform and SDK
Complete text analytics
Paid Search Has Slight Edge in Conversion Rates Over Organic Search, According to Study
KNOVA Releases KNOVA 7
ISYS Announces Arrival of ISYS 8
SiteCatalyst 13 from Omniture
Convera Launches New Product Platform to Combine Web and Enterprise Search
ISYS releases new search suite
ARUP Laboratories Selects ISYS:web for Web Search, Navigation, and Discovery

Featured Content: Best of Both Worlds (Case Study on Enterprise Search in Practice)

Organization: The Montague Institute www.montague.com
Vendor of Choice: Autonomy Ultraseek www.ultraseek.com



At the Montague Institute, we make a living through our content. Unlike many organizations, we do not use a search engine to find a needle in the intranet haystack. Instead, we use it as one of several discovery options in an environment where authors and editors pay close attention to document selection, preparation, and metadata.

The Montague Institute publishes briefings, course books, and two periodicals: the Montague Institute Review and the Knowledge Base Editor's Digest. Our public website contains abstracts of articles in the Review, while the full text is available to members on a passworded website.

Institute content, on the web since 1995, has reference value for both the public and members. Many people continue to read articles that are several years old. Over time, vendor names change and new buzzwords appear. A part of our value added is to help readers navigate this changing landscape through cross references and definitions.

How to Give Readers the Best of Both Worlds
By 1999, we realized that both our authors and readers needed a better way to find articles on a specific topic, vendor, or concept. Our CFO, who remembered the value of indexes from his days as a Ph.D. candidate, suggested that we create a topical A-Z index for the Review and other website content. To do this, of course, we needed to develop a list of terms and associate them with web pages. As the index evolved, we added thesaurus terms (cross references) and definitions.

For some tasks, such as finding a known item or a unique word quickly, a search engine works better than an index. For a while, we used the Google search code on our site, but we soon developed a list of must-have features that weren't available with the free Google box:

  • Create a list of "Best Bets"—editor-selected pages that display first in the list of search results;
  • Customize the search page—e.g., remove the Google logo and add a link to our A-Z index;
  • Customize the results page—bypass automatically created document summaries and use our own descriptions;
  • Search member content—add a members-only search box that could access the full text of articles on the passworded website;
  • Add topics and cross references—add "see also" references and a list of related topics to the search results.

Selecting a Search Engine
In looking for a replacement for the free Google search, we had three basic requirements:

  1. Leverage our existing metadata to make search more accurate and give users more discovery options.
  2. No programming required.
  3. Low cost.

When we went shopping in 2001, Ultraseek (then called Inktomi) was the obvious choice. We bought the Content Classification Engine (at that time an extra cost option) to provide a topic hierarchy, bread crumb trail, and related topic display in the search results.

Implementation
Installing Inktomi and getting it to crawl our content was relatively easy. Customizing the "look and feel" of the pages was more difficult because it involved changing the complex code of the Inktomi search and results pages. With the current version of Ultraseek, this task is much easier and can be done by making choices in a web-based "style editor." Few code changes are now required.

After the initial cosmetic changes, we turned our attention to customizing Ultraseek's behavior in the following ways:

Searching public and member content. We wanted nonmembers to find out that an article exists, but we did not want them to be able to access the full-text version of it. To do this, we configured two Ultraseek collections: one with article abstracts and all other content on our public website, and the other with full-text articles on a passworded site.

Reorganizing content. We reorganized the folders on our site to isolate the "wheat" (articles and substantive pages) from the "chaff" (nonessential pages like navigation links). Then we instructed Ultraseek to crawl only the folders containing substantive content.
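The crawl restriction described above amounts to a path filter. The following is a minimal sketch of that idea in Python, not Ultraseek configuration; the folder names, the example.com URLs, and the function `should_crawl` are hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical folders that hold substantive content (the "wheat").
CONTENT_FOLDERS = ("/review/", "/briefings/")


def should_crawl(url):
    """Return True only if the URL path falls under a content folder."""
    path = urlparse(url).path
    # str.startswith accepts a tuple and matches any of its prefixes.
    return path.startswith(CONTENT_FOLDERS)


# Sample URLs for illustration only.
urls = [
    "http://www.example.com/review/best-of-both-worlds.html",
    "http://www.example.com/nav/links.html",
]
crawl_list = [u for u in urls if should_crawl(u)]
```

Keeping substantive pages in their own folders is what makes a filter this simple possible: the rule is about where a page lives, not what it contains.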

Entering topics and "Best Bets." We entered rules that told Ultraseek how to select documents and "Best Bets" for each topic created by the Content Classification Engine (CCE). This task was made easier by the effort we had already put into selecting and classifying documents for the A-Z index.
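As an illustration of the "Best Bets" idea, here is a minimal Python sketch that places editor-selected pages ahead of the engine's own results. It is not how Ultraseek or the CCE implements rules; the `BEST_BETS` table, the page paths, and the function name are assumptions made for the example.

```python
# Hypothetical editor-maintained table: query -> pages that must be listed first.
BEST_BETS = {
    "taxonomy": ["/review/taxonomy-basics.html"],
}


def rank_with_best_bets(query, engine_results):
    """Place editor-selected pages first, then the engine's results, without duplicates."""
    pinned = BEST_BETS.get(query.strip().lower(), [])
    rest = [url for url in engine_results if url not in pinned]
    return pinned + rest


ranked = rank_with_best_bets("Taxonomy", ["/a.html", "/review/taxonomy-basics.html"])
```

Queries with no entry in the table pass through unchanged, so the editors only need to maintain entries for topics they care about.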

Customizing search results. By default, Ultraseek displays a computed summary, the URL, file size, publisher, relevancy percentage, and a "find similar" link for each document it finds. We wanted to omit the file size, relevancy percentage, and "find similar" link. In place of the computed summary, we wanted a description composed by our editors, and in place of the publisher, a true publication date (not the last-modified date). This involved making sure both the description and publication date were entered as metadata elements in each document and then telling Ultraseek where to find them.
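To show what reading a description and publication date from document metadata can look like, here is a minimal Python sketch using the standard library's HTML parser. The meta tag name "pubdate", the sample page, and the function `result_entry` are hypothetical; the actual tag names and Ultraseek settings are not given in this article.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collect <meta name="..." content="..."> pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attr = dict(attrs)
            name = attr.get("name")
            content = attr.get("content")
            if name and content is not None:
                self.meta[name.lower()] = content


def result_entry(url, html):
    """Build a result listing from editor-supplied metadata only."""
    parser = MetaExtractor()
    parser.feed(html)
    return {
        "url": url,
        "description": parser.meta.get("description", ""),
        "published": parser.meta.get("pubdate", ""),  # assumed tag name
    }


# Sample page for illustration only.
page = (
    '<html><head>'
    '<meta name="description" content="Editor-written summary.">'
    '<meta name="pubdate" content="2006-10-04">'
    '</head><body>Article text</body></html>'
)
entry = result_entry("/review/sample.html", page)
```

The listing carries only the URL, the editor's description, and the publication date; file size and relevancy percentage are simply never added.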

Entering document metadata. Until recently, Ultraseek looked for metadata within each document. That meant adding metadata tags and values to all the pages that we wanted to appear in the search results. At the time, we used a program called Metabot, which scanned each web page and put existing metadata into a spreadsheet format. Each web page was displayed as a row, and each metatag was displayed as a column. You could add new metatags (columns) and new values in each cell. When you saved your work, Metabot would automatically insert the metatags into the right document. Eventually, we learned how to do the same thing with a relational database. Today, Ultraseek can read metadata directly from an external database, obviating the need to insert tags into documents.
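The spreadsheet layout described above (one row per web page, one column per metatag) can be sketched in a few lines of Python. This is an illustration of the layout only, not Metabot's actual behavior; the page paths, tag names, and values are made up for the example.

```python
import csv
import io

# Hypothetical metadata already scanned from two pages.
pages = {
    "/review/a.html": {"description": "Article A", "pubdate": "2001-05-01"},
    "/review/b.html": {"description": "Article B"},
}


def to_spreadsheet(pages):
    """One row per page, one column per metatag; missing values are left blank."""
    columns = sorted({tag for meta in pages.values() for tag in meta})
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["url"] + columns, restval="", lineterminator="\n"
    )
    writer.writeheader()
    for url, meta in pages.items():
        writer.writerow({"url": url, **meta})
    return buffer.getvalue()


sheet = to_spreadsheet(pages)
```

A blank cell in the output marks a page that still lacks a value for that metatag, which is what an editor would look for when filling in gaps.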

Creating a thesaurus file. Ultraseek can read a table of equivalents ("thesaurus") to expand a search. Typically this file is used for acronyms, spelling variations, and synonyms. We exported the "see also" terms from our A-Z index database in an XML format that Ultraseek can read.
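Query expansion from a table of equivalents can be sketched as follows. This Python example illustrates the general technique only; it does not reproduce Ultraseek's XML thesaurus format, and the `THESAURUS` entries and function name are assumptions.

```python
# Hypothetical table of equivalents; the real file format is not shown here.
THESAURUS = {
    "roi": ["return on investment"],
    "return on investment": ["roi"],
}


def expand_query(query):
    """Return the normalized query plus any equivalent terms, original first."""
    key = query.strip().lower()
    return [key] + [term for term in THESAURUS.get(key, []) if term != key]


expanded = expand_query("ROI")
```

A search front end would then run the query against every term in the returned list, so a reader who types an acronym also finds articles that spell the phrase out.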

Results
Internally, we like to use Ultraseek to find known items and articles containing specific products or terms. For example, it's much quicker to locate an article called "Best of both worlds" or find all articles that mention "Verity" using Ultraseek than using the A-Z index. However, when finding articles that we've published on a certain topic, such as "return on investment" or "search engines," the index is better. That's because Ultraseek can find an article only if the search word actually appears in its text, unless you create and update a rule for each keyword (in our case, more than 600 keywords).

For internal use, we rarely use the Ultraseek topic hierarchy and thesaurus features. Instead, we rely on the A-Z index. But Ultraseek query reports indicate that topics are heavily used by the public. During the most recent 90-day period, for example, all but two of the top Ultraseek queries were topics (e.g., "Research & searching > research tools & techniques" or "Management tools and techniques").

Interestingly, the total number of Ultraseek queries during this period was almost identical to the total number of terms accessed in the A-Z index. In other words, website visitors used the index and the search box in equal numbers.

Once or twice a year, we run an Ultraseek search against our content collection using terms from the A-Z index. This can be automatically done using a database script. The result is a spreadsheet that shows all the items that Ultraseek found for each term. An editor reviews the list to see whether we've missed any documents that should be added to the index.
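The periodic audit described above can be sketched as a loop over index terms. The Python below is a minimal illustration under stated assumptions: `toy_search` stands in for a real search engine call, and the documents and index entries are invented sample data, not the Institute's script.

```python
def audit_index(index, search):
    """For each index term, list documents the search finds that the index lacks."""
    report = {}
    for term, indexed_docs in index.items():
        missing = [doc for doc in search(term) if doc not in indexed_docs]
        if missing:
            report[term] = missing
    return report


# Toy stand-ins for the content collection and the search engine.
documents = {
    "/a.html": "search engines and taxonomies",
    "/b.html": "choosing search engines",
}


def toy_search(term):
    """Naive full-text match; a real audit would query the search engine."""
    return sorted(url for url, text in documents.items() if term.lower() in text.lower())


# Hypothetical A-Z index: term -> pages already listed under it.
index = {"search engines": {"/a.html"}, "taxonomies": {"/a.html"}}
report = audit_index(index, toy_search)
```

The report lists only candidates. As in the workflow above, an editor still decides whether each document belongs under the term.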

As search technology evolves, we will evaluate new products, but we will always offer our users the best of both worlds: full-text search and a topical A-Z index.


First Call: Enterprise Search Summit -- Want to Speak?

 

Call for Speakers

Enterprise Search Summit

SUBMIT YOUR PROPOSAL

[Deadline: November 15, 2006]

You are invited to submit a proposal to speak at Enterprise Search Summit 2007, May 15-16, 2007, in New York. We are seeking dynamic and knowledgeable speakers who are responsible for selecting, implementing, and managing enterprise search solutions within their organizations and who can creatively communicate their in-the-trenches know-how. Preference will be given to in-house search professionals and IT managers with hands-on experience and to enterprise search consultants and experts.

The emphasis for this year's Enterprise Search Summit is "beyond the basics," focusing on how enterprise search software and solutions really work inside organizations and going in depth into the complex issues and problems that challenge experienced search managers.

Enterprise Search Summit is an intense, expert-led, 2-day learning experience that covers how to develop and implement — and enhance — cutting-edge internal search capabilities. It offers a structured opportunity for information managers and IT professionals to learn strategies and build the skill-sets needed to make the content that they are acquiring, publishing, organizing, and managing not only searchable, but "findable." It has become the most important conference of the year for the enterprise search industry and those who use enterprise and site search software and solutions.

Proposals should emphasize one of the following aspects of enterprise search and should be focused on practical solutions, not abstract or theoretical research:

  • Next-generation search: faceted navigation, entity extraction, contextual search, clustering, and visualization
  • Audio and video search
  • Tuning search (best bets, analytics and other tools and tactics to enhance search results)
  • Integration and back-end refinements
  • Taxonomies, metadata, and classification
  • Searching unstructured content — data mining and text analysis
  • Compliance requirements and search
  • Troubleshooting your search application
  • Best practices and lessons learned
  • Search and content management — did your CMS come with search and does it work?
  • Implementing search — what it takes to get search up and running
  • Selecting the right search engine: the choices and the decision process
  • Upgrading your enterprise search engine
  • The ROI on search — how do you prove the value?
  • Case studies and real-world search

PLEASE BE VERY SPECIFIC about your topic! What exactly have you done? What is the most salient aspect of your work? Proposals that focus on a single facet of a search project or on one phase of a search implementation have the best chances for acceptance. Do not submit a proposal to describe the entire life history of your project. Instead, choose an important part of the project and concentrate your proposal on it. What do you know/do best? Broad overviews of any of the above topics are not appropriate.

Conference attendees come to Enterprise Search Summit to learn about effective enterprise search tools and solutions, best practices, and success stories that they can adopt to meet their own challenges. Preference will be given to proposals for dynamic and stimulating sessions and to speakers who can effectively deliver specific details about their knowledge, experience, and expertise that attendees can use to implement better enterprise search solutions. Proposals from search vendors are more likely to be seriously considered if the proposal comes directly from a customer or is a customer case study where the customer is the primary speaker.

To submit a proposal, go to the Enterprise Search Summit Call for Speakers form. The deadline to submit proposals is November 15, 2006.

To see last year's program, click here.

Nancy Garman, Director, Conference Development, Information Today, Inc., ngarman@infotoday.com


Basis Technology Acquires Translingual Technologies

Basis Technology, a provider of enterprise software solutions for multilingual text retrieval and analysis, has acquired the intellectual property assets of Translingual Technologies LLC. As part of this transaction, Dr. Scott Miller has joined the firm as chief scientist. Translingual Technologies is a two-year-old Massachusetts startup, which has developed algorithms for making foreign-language content accessible to monolingual English speakers. Terms of the acquisition were not disclosed.

(www.basistech.com)  


FAST offers new platform and SDK

FAST Search and Transfer (FAST) recently launched the Personal Search Platform (PSP), which extends FAST's Enterprise Search Platform to individual desktops.

The offering includes FAST Personal Search, which locates information locally, across the enterprise or on the Web, as well as a software developer's kit (SDK). The SDK is an OEM-specific software development component that enables software providers to extend their solution functionality to the enterprise desktop. FAST PSP provides an enterprise-focused personal search platform for delivering branded solutions.

FAST says PSP enables organizations to expand personalized intelligence by connecting users to the local, enterprise and Web sources they need to find everything they want--desktop files, e-mails, premium content and favorite Web content--with just one query. With a higher degree of personalization, FAST PSP delivers results that match the perspective and needs of each user, while preserving security and information access standards.

The company explains PSP provides a 360-degree view of the enterprise and the desktop, thus eliminating what it calls "application hopping" and searching. Users can take action on items directly from FAST PSP, regardless of application, speeding work and eliminating delays as each result opens with full application functionality. Further, it says, PSP can be configured easily, personalized (for both the enterprise and the user), and deployed seamlessly throughout an organization within minutes. FAST adds that PSP adheres to industry-standard security protocols and policies, enabling users to see and search that content for which they are authorized.


Complete text analytics

Attensity has introduced what it says is the market's first complete text analytics suite. Attensity 4 includes new methods of searching, querying, charting and graphing freeform text dynamically in a browser-based interface.

Features include:

  • unified architecture--all applications are integrated and use common commands;
  • improved discovery and analytics--freeform text exploration has been significantly enhanced;
  • new text search offering embedded into workflow--search results can be viewed as links to documents ranked by relevancy, to hot spots of prevalent text strings inside documents, and to other relevant words;
  • a "manage" module--classifies users as either administrators, authors or read-only viewers for enhanced security;
  • "breadcrumbs"--a horizontal trail of links appears across the top of the page;
  • enhancements to the existing extraction engine library;
  • expansion of entity libraries in an atomic architecture; and
  • categorization for "nested" hierarchies to organize and report on fact analysis of text.

Attensity 4 works inside, alongside or outside an organization's current business intelligence applications. It will initially support MySQL, Microsoft SQL Server, the Teradata data warehouse and Oracle. The first operating systems supported will be Windows XP, Windows 2000 and Linux.


Paid Search Has Slight Edge in Conversion Rates Over Organic Search, According to Study

WebSideStory, Inc., a provider of digital marketing and analytics solutions, has announced the results of a new study showing that paid search has only a slight edge, about 9%, in conversion rates over organic search. During the first eight months of this year, paid search--keywords bought on a pay-per-click basis at search engines such as Google, Yahoo and MSN--had a median order conversion rate of 3.40% at business-to-consumer (B2C) ecommerce sites using the company's HBX Analytics technology. This compared with a conversion rate of 3.13% for organic search results, defined as non-paid or natural search engine listings, during the same January-to-August timeframe, according to the WebSideStory Index, a compilation of ecommerce, site search, and global internet user trends. The study analyzed more than 57 million search engine visits. Order conversions occurred during the same session.
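The 9% figure is a relative difference between the two conversion rates, not a gap in percentage points. A short Python check using only the two rates reported in the study makes the distinction explicit; the variable names are chosen for this example.

```python
paid_rate = 3.40     # median order conversion rate for paid search, in percent
organic_rate = 3.13  # median order conversion rate for organic search, in percent

# Difference in percentage points.
absolute_gap = paid_rate - organic_rate

# Difference expressed as a percentage of the organic rate.
relative_edge = absolute_gap / organic_rate * 100

print(f"{absolute_gap:.2f} percentage points, {relative_edge:.1f}% relative")
```

The gap is 0.27 percentage points, which is roughly 8.6% of the organic rate and rounds to the 9% cited.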

(www.websidestory.com)


KNOVA Releases KNOVA 7

KNOVA Software, Inc., a provider of Intelligent Customer Experience applications, has announced the general availability of KNOVA 7, the new version of the company's application suite. KNOVA 7 is highlighted by personalized Microsites, new actionable analytics, a new Visual Search Manager, and collaborative authoring.

Features of KNOVA 7 include: multi-dimensional segmentation that drives personalized experiences based on user profile and user intent; Microsites designed to enable creation and management of distinct customer experiences based on brand, product, geographies, user roles, or other criteria; the Recommendation Manager for managing cross-sells, up-sells, and promotions and delivering targeted news, alerts, recommendations, and offers based on user profiles and user queries; and context-aware knowledge pagelets for site construction and integration with any portal architecture.

Search and navigation capabilities include: Visual Search Manager for drag-and-drop search experience optimization; self-learning Adaptive Navigation, powered by KNOVA's Cognitive Processor technology, in which search results dynamically adjust based on the success of other users; collaborative authoring with reusable information components, auto-classification, and natural content capture; content presentation that reflects the reputation of individual documents and people; and embedded analytics that provide insights into usage trends, root causes, knowledge gaps, and resolution success.

(www.knova.com)


ISYS Announces Arrival of ISYS 8

ISYS Search Software, a supplier of enterprise search solutions for business and government, has announced the arrival of ISYS 8, the company's suite for search, navigation, and discovery. The suite is designed to address information access needs and comprises three core products: ISYS:desktop 8, ISYS:web 8, and ISYS:sdk 8.

Some of the new features are text mining, ediscovery, and expertise location via ISYS Entities; power and scalability through content caching, scripting, and federated search support; adherence to business rules via tuning and editorial controls such as Best Bets; full-featured search capabilities through enhanced ISYS SearchTrends; and content support.

(www.isys-search.com)


SiteCatalyst 13 from Omniture

Omniture reports it has developed a new set of Web 2.0 business process optimization (BPO) tools designed to increase the productivity and effectiveness of online business professionals.

SiteCatalyst 13 has packaged best-practices expertise and technology for:

  • social networking--analyze the value and relevance of user-generated content,
  • blogs--measure their consumption and influence,
  • rich Internet applications (RIA)--quantify rich-media engagement and abandonment,
  • dynamic site search--provide visitors with self-optimizing search results, and
  • visitor interaction profiling--one-to-one targeting with complete user profiles.

SiteCatalyst 13 also includes the Online Business Administration Console, key features of which are:

  • fast and accurate creation, configuration and management of thousands of report suites, including pre-configured suite templates tailored for specific industries and site types;
  • management of user access and permissions for individuals, groups and functions by roles and entitlements;
  • support for the deployment and management of multiple currencies and languages;
  • automatic generation of data collection code by application type, including Web pages, wireless devices and RIA; and
  • open access to external provisioning systems through a Web services API and software developer kit to automate all administrative functionality.


Convera Launches New Product Platform to Combine Web and Enterprise Search

Convera Corporation, a provider of search technologies for professional workers, has launched the TrueKnowledge Platform. The platform applies Convera's advanced search technologies to enterprise and web environments and is designed to enable individuals to access information both within their corporation and across the internet in a single view.

Through the unified TrueKnowledge Platform, Convera will deliver three solutions: Convera TrueKnowledge for Web--a hosted service that provides a customized search engine for content publishers serving professional markets; Convera TrueKnowledge for Discovery--a bundled hardware and software product, deployed behind an organization's firewall, designed to enable the organization to search billions of internal documents as well as information on the web; and Convera TrueKnowledge for Enterprise--a software-only enterprise search offering with the ability to incorporate selected web content. Convera is currently making TrueKnowledge for Discovery available to RetrievalWare customers who have large-scale and mixed enterprise and web search needs. Convera has also established a migration path to TrueKnowledge for Enterprise following the delivery of RetrievalWare 8.2 in the third quarter of this fiscal year.

Both TrueKnowledge for Discovery and TrueKnowledge for Enterprise will support RetrievalWare intellectual property. Stored queries, profiles and taxonomies built and deployed on RetrievalWare can be migrated to the new platform. Convera customers under maintenance contracts will be upgraded to the TrueKnowledge for Enterprise software solution at no additional expense. Optional functionality delivered as part of TrueKnowledge for Enterprise will carry an additional cost. TrueKnowledge for Web and TrueKnowledge for Discovery are available immediately. Convera plans to release TrueKnowledge for Enterprise in 2007.

(www.convera.com)


ISYS releases new search suite

ISYS Search Software has introduced a significant new version of its search software suite. ISYS 8 includes three major components to deliver the entire range of search functionality: ISYS:desktop 8, ISYS:web 8 and ISYS:sdk 8.

The company reports that new features in the suite include:

  • ISYS Entities for text mining, e-discovery and expertise location;
  • content caching, scripting and federated search reporting for improved power and scalability;
  • editorial controls and tuning to facilitate adherence to business rules;
  • deep search analytics through enhanced ISYS SearchTrends; and
  • broader content support, including Microsoft SharePoint and Office 2007.

ISYS says the entity feature is important to Version 8 because it automatically extracts and displays the "who, what and where" of a search, allowing users to understand the context of their results and the connections and associations between their search terms and the content. It also enables users to drill down, locate topic experts and discover information they might not have known existed. The entity feature is even offered in ISYS:desktop, making it the first desktop search application to offer automatic entity extraction, says ISYS.

The company further highlights the following capabilities found through the suite:

Best Bets--designed to enable administrators to ensure that a specific document appears as the first result when given a certain query.

ISYS Federator--the ability to federate queries across remote indexes, including content collections from multiple ISYS:web servers as if those indexes were maintained locally.

Microsoft SharePoint--out-of-the-box support for Windows SharePoint Services and SharePoint Portal Server, enabling organizations to include SharePoint content in their ISYS indexes while adhering to SharePoint's security settings.

Enhanced SearchTrends--gives organizations the ability to analyze search behavior and modify search features to better cater to their users.


ARUP Laboratories Selects ISYS:web for Web Search, Navigation, and Discovery

ISYS Search Software, a global provider of enterprise search solutions for business and government, has announced it was selected by ARUP Laboratories, a national clinical and anatomic pathology reference laboratory and a provider of laboratory research and development, for advanced web search functionality on its public website. ARUP Laboratories, an enterprise of the University of Utah and its Department of Pathology, provides clinical and research information for its clients, offering more than 2,000 tests and test combinations, along with a Clinical Guide to Laboratory Medicine. ISYS:web was selected to power the search functionality on ARUP's website and to provide visitors with access to its content repository.

(www.isys-search.com; www.aruplab.com)
