EnterpriseSearchCenter.com Home
 
RESOURCES FOR EVALUATING ENTERPRISE SEARCH TECHNOLOGIES
April 16, 2008

Table of Contents

Your Users Are Talking to You -- A Look at the 100 Top Search Terms
IBM Upgrades Enterprise Search Software
Google Closes Acquisition of DoubleClick
Vivisimo Secures $4 Million in First Institutional Funding Round
SAS Acquires Teragram
Northern Light Incorporates Social Search Principles Into Online Business Research Tool
AccessData Releases AccessData eDiscovery
Warding off insider threats
OEM-ing with Exalead
Atempo Announces Launch of Next-Generation Digital Archiving Solution

Your Users Are Talking to You -- A Look at the 100 Top Search Terms

If you have a search engine, you should have search logs. And if you have search logs, you should have access to a periodic report on the most frequently used search terms. (Some older search engines have extremely limited logging and reporting, but you can write scripts or import the logs into a database and generate search reports that way.) All the search term frequency reports I’ve seen fall into the same pattern, with a high number of queries for frequent search terms quickly trailing off into unique terms. I have found that the top 100 search terms provide a good idea of what people are looking for with your search engine. (Note that you do need a statistically significant number of searches to make this work, so if your search engine only gets a few thousand searches a week, do your analysis over a month or two.)
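For engines without built-in reporting, a report like this can be produced with a short script. The following is a minimal sketch in Python, assuming the queries have already been extracted from the log into a list of strings; the function names and the normalization rule (lowercasing and collapsing whitespace) are illustrative choices, not features of any particular search engine.

```python
from collections import Counter


def normalize(query: str) -> str:
    """Lowercase and collapse whitespace so trivially different queries count together."""
    return " ".join(query.lower().split())


def count_terms(queries):
    """Count normalized queries, skipping empty ones."""
    return Counter(normalize(q) for q in queries if q.strip())


def top_terms(queries, n=100):
    """Return the n most frequent search terms as (term, count) pairs."""
    return count_terms(queries).most_common(n)


def head_share(queries, n=100):
    """Fraction of all searches accounted for by the top n terms."""
    counts = count_terms(queries)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(count for _, count in counts.most_common(n)) / total
```

Calling `top_terms(queries)` gives the top 100 list, and `head_share(queries)` shows how much of the total search traffic those terms represent.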

This curve may look familiar: it’s the Long Tail, as popularized by Chris Anderson’s book of the same name. The curve is shallower for online bookstores and other sites with a high percentage of "known item" searches (such as title or author), while informational and research search engines have a steeper curve and a higher proportion of unique search terms. Infrequent terms are where a free-text search engine, with its dynamic automatic indexing system and content-independent relevance ranking, shines. As Anderson points out, search provides customer access to market niches, explaining much of the success of Amazon and Netflix, among others.

The Long Tail is interesting, and there are many things to learn from it. However, for search analysis, the place to start is the "Short Head": the most common search terms, used by hundreds of people on your search engine. While many top search terms are single words, a high proportion tend to be phrases, where the user is looking for a specific thing and is willing to type its entire name. This indicates that the search engine relevance heuristics should recognize phrase matches and rank documents with phrase matches higher in the results lists than nonphrase word matches.

The quality of the search results for the most popular terms will vary, depending not just on the quality of the relevance algorithm but also on the similarity of the user vocabulary to the content vocabulary, the content itself, the navigation and site or portal design, and the sophistication of the searchers. If you replicate the top searches and examine the results pages, you will see where the results seem generally useful and where they are confusing, allowing you to focus improvements on common and problematic situations.

The Value of Vocabulary

You may be surprised by your employees’ search vocabulary. The most frequent search terms tend to be the diametric opposite of marketing or internal organizational terms: They’re more likely to be industry jargon or common forms, such as "HR" instead of "Staff Services." Sometimes the terms they search on are products or acronyms for projects that are completely obsolete, yet people still come to the search engine looking for them. Understanding the terms your user group searches provides opportunities for content creators to add more user-friendly vocabulary to the pages. This is a form of search engine optimization that is particularly important in public-facing sites, but that can be deployed inside organizations as well. You can set the search engine to automatically expand the search to include the standardized term, or display Search Suggestions (sometimes called "Best Bets"), which can provide information on the current status of a given piece of information and link to landing pages, internal or external, for more help.
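Automatic expansion is normally configured inside the search engine itself, but the idea can be sketched in a few lines of Python. The `SYNONYMS` table below is hypothetical apart from the "HR" and "Staff Services" pair taken from the example above, and `expand_query` is an illustrative name, not part of any product's API.

```python
# Hypothetical mapping from user vocabulary to the organization's official terms.
SYNONYMS = {
    "hr": ["staff services"],
    "help desk": ["wwhd"],
}


def expand_query(query, synonyms=SYNONYMS):
    """Return the normalized query followed by any standardized terms mapped to it."""
    key = " ".join(query.lower().split())
    expanded = [key]
    for alternative in synonyms.get(key, []):
        if alternative not in expanded:
            expanded.append(alternative)
    return expanded
```

A search for "HR" would then be run as a search for either "hr" or "staff services", so pages that use only the official label still appear.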

In addition to content and search, information architecture and taxonomies can derive great advantage from user vocabularies. With this information, taxonomy designers can make their labels and categories easier for users to understand (or at least add synonyms and cross-reference entries). User vocabulary can also provide insight for marketers and fans of organizational charts, who mistakenly assume that their mental models and vocabularies are shared by end users.

Navigational Searches

Many search terms may be navigational searches, even for topics that are displayed in big red letters on the home page. I’ve found that about a third of site visitors simply prefer searching to clicking, but in other cases finding your top-level navigational items among search terms may be a symptom of problems with the navigational links. Examples include situations where the corporate postal forms are hidden in a buried subdirectory, or the link to the worldwide help desk is shown as WWHD. Knowing there’s a problem allows the site designer to fix the navigation, to the ultimate benefit of the users.


Users May Be Dazed and Confused

If your search pages aren’t consistent with your overall design, some site visitors may think your search engine is Google, employees may mix up your intranet portal with the web, and you’ll find they search for terms like "sex" and "mp3." It’s always a good idea to make your search results page fit the same look and feel as your intranet or site, with the same colors, graphic design elements, and basic layout.

Content and Search Suggestions

Some of the top search terms may be for topics not covered within your intranet or website. Adding and adjusting content, from changing a web page title tag to creating a new topic to indexing a large data source, can often solve significant information needs.

In other cases, the content is there, but it just isn’t coming up in the first few items on the results list. For these popular search terms, most modern search engines provide a means to create manual search suggestions, using human judgment to direct people to the most useful pages for these topics. While the search engine is dynamic and automated—great for the Long Tail—search suggestions are a great way to deal with cases where you know what the right answer is for frequent queries.
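Most engines offer an administrative screen for this, but the mechanism itself is simple. The Python sketch below assumes a hand-maintained table of suggestions keyed by normalized query; the `BEST_BETS` entries, their titles and paths, and the `engine_search` callable are all hypothetical stand-ins for whatever your engine provides.

```python
# Hypothetical hand-curated suggestions: normalized query -> list of (title, path).
BEST_BETS = {
    "expense report": [("Expense Report Form", "/finance/forms/expense-report")],
    "help desk": [("Worldwide Help Desk", "/support/help-desk")],
}


def search_with_suggestions(query, engine_search, best_bets=BEST_BETS):
    """Combine manual suggestions with the engine's own automated results.

    engine_search is any callable that takes the query string and returns
    a list of results; it stands in for the real search engine here.
    """
    key = " ".join(query.lower().split())
    return {
        "suggestions": best_bets.get(key, []),
        "results": engine_search(query),
    }
```

The results page can then show the suggestions above the automated results for frequent queries, while Long Tail queries with no entry in the table fall through to the engine alone.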

The top 100 terms that were searched on an engine can be enlightening, surprising, and, yes, even depressing. But they can help improve your search engine, and with some cooperation from content providers, they can also significantly improve corporate sites, portals, and intranets.
 


About the Author 

AVI RAPPOPORT (consult7@searchtools.com) is a leading authority on enterprise search engines for websites, intranets, and topical portals. She is the founder of Search Tools Consulting (www.searchtools.com).



IBM Upgrades Enterprise Search Software

IBM introduced a new version of its OmniFind enterprise search software, designed to help users quickly and easily find, manipulate, and share information across their business or organization. IBM OmniFind Enterprise Edition 8.5 features enhancements designed to help customers gain insight and value from their business information, which is the aim of IBM's cross-company Information on Demand strategy. The enhancements enable the search engine to support the latest Lotus collaboration and social software, allowing early adopters of Lotus tools such as Lotus Quickr and Lotus Connections to further improve productivity, business networking, and knowledge sharing. The new version also includes an interface that refines and graphically displays relevant search results; full global language support for Japanese, Chinese, and Korean; and support for the latest versions of Red Hat Linux, Windows Server, IBM FileNet Enterprise Content Management software, and the IBM Lotus Collaboration Suite.

(www.ibm.com)


Google Closes Acquisition of DoubleClick

Google Inc. announced that it has completed its acquisition of DoubleClick, a company that offers online ad serving and management technology to advertisers, web publishers, and ad agencies.

(www.google.com, www.doubleclick.com)
 


Vivisimo Secures $4 Million in First Institutional Funding Round

Vivisimo, a provider of enterprise search software and expertise, announced the completion of its first institutional investment round of $4 million (USD), funded by Portland, Maine-based North Atlantic Capital. The financing will support Vivisimo's continued global expansion. North Atlantic Capital is an expansion-stage venture capital firm investing primarily in tech-enabled business service, information technology, and communications companies.

(www.vivisimo.com, www.northatlanticcapital.com)
 


SAS Acquires Teragram

SAS, a provider of business intelligence (BI) and advanced analytics, announced the acquisition of privately held Teragram, a provider of natural language processing (NLP) and advanced linguistic technology. The acquisition is intended to enhance SAS' own text mining and analytical BI offerings, and extend them to enterprise and mobile search.
With the addition of more resources from SAS, Teragram's existing customers and OEM partners will see enhanced R&D and support from Teragram, which will operate as a SAS company.
 
(www.teragram.com, www.sas.com)
 


Northern Light Incorporates Social Search Principles Into Online Business Research Tool

Northern Light’s CEO, David Seuss, sees social search as a beacon on the horizon. It’s still relatively faint, but with his company’s help, it’s growing ever brighter. "We want to bring a social element to business research in a big way," Seuss says. "We think there is a lot of stuff that goes on in the social computing world that is irrelevant and distracting in a business setting. But some of those concepts make sense in a research setting."

With the principles of social search in mind, Northern Light has lately been tinkering with a new product, Northern Light Search (www.nlsearch.com), a freely available online business research tool. Officially launched this month, Northern Light Search provides access to thoroughly vetted business and industry news from thousands of hand-selected business news sites, leading business publications, industry authority blogs, regional newspapers, and national news sources. Users can search, browse, and analyze content, personalize the site, set search defaults, and create and subscribe to alerts.

"NLSearch is for serious use on business issues and business problems," Seuss says. "We believe the lessons we’re learning from the enterprise portal business apply here. Researchers should be able to leverage the knowledge of those researchers whose interests are the same as their own. A lot of Web 2.0 products miss that link. When a search engine retrieves the most popular search results, that’s good in a general way, but it’s not specific to me. What I need are genuine collaborators. I need someone who knows something about the specific topic I’m searching. I want to be able to leverage what they’ve already learned."

NLSearch links to the Northern Light Market Intelligence Wiki, which provides an overview of industries and business trends, with information about market segments, issues, companies, and regulatory actions. "We use the same engine that powers Wikipedia," Seuss says. "We have focused on market intelligence and research, and we employ an editorial staff. We’ve seeded the wiki with very good material that will get it off the ground. The hope is that users will add their own material, and that this will become another valuable resource." Seuss says that other social applications are in the works for NLSearch, such as tag clouds and autocompletion.

The search engine also deploys a trademarked text analytics tool called MI Analyst. The tool combines the capabilities of Northern Light’s free-text searching with advanced text analytics that were developed specifically for market intelligence applications. MI Analyst includes entity extraction for facets such as companies, government agencies, IT technologies and markets, job titles, business issues, strategic scenarios, and sources.

MI Analyst allows users to see a summary sentiment score (i.e., whether the document author’s tone is positive or negative) for each company and sort search results according to sentiment. MI Analyst also allows users to uncover the relationships between entities revealed by the document set, mitigating the research barrier of unfamiliar or unidentified concepts, synonyms, acronyms, and aliases. Users can also perform trend analysis with display and data export options.

Seuss says that NLSearch’s primary market consists of small companies and independent researchers that are interested in online business research but haven’t bought in to the high-priced content aggregators out there. "Small and medium businesses are the ones we’re reaching out to," he says. "The more narrow focus on business content, as opposed to the entire universe that Google News would give is one piece of value, and overlaying the meaning extraction capabilities is a second dimension of that value."

(www.nlsearch.com)


AccessData Releases AccessData eDiscovery

AccessData announced the release of AccessData (AD) eDiscovery, designed to be an automated, in-house eDiscovery solution. It is designed to address all data equally, enabling organizations to identify, preserve, and process data from desktops, servers, popular data repositories, databases, and email. In addition, AD eDiscovery can be used by anybody from paralegals to highly technical IT personnel. AD eDiscovery maps to the Electronic Discovery Reference Model, virtually walking an organization through the eDiscovery process. Key features of AccessData eDiscovery include a web-driven solution that allows anybody to work with the technology; a wizard-driven workflow that streamlines the collection process by managing custodians, data sources, and collections across matters and time; and forensic collection of data from workstations, laptops, network shares, email servers, Documentum, SharePoint, Open Text, and databases.

(www.accessdata.com)
 


Warding off insider threats

Raytheon Oakley Systems has chosen e-discovery software from Kazeon Systems as part of its insider threat solution.

Raytheon Oakley, a division of Raytheon Security, will use Kazeon's Information Server to search for and classify sensitive data, such as Social Security numbers, within an organization's network, and to store the data securely. Raytheon Oakley then creates a footprint for the data, which allows it to be monitored to prevent insider threats of theft and data leaks, according to a press release from Kazeon.

Derek Smith, president of Raytheon Oakley Systems, says the combined solution covers data at rest or in motion across the network and endpoints. "Our customers are increasingly demanding electronic data discovery as part of their security portfolio. They know it's impossible to protect critical data if they don't know what it is or where it is in their enterprise," Smith says.

If uninformed users expose sensitive data, Raytheon Oakley can detect the policy violation, notify the user of the policy and report the violation. When a malicious insider tries to steal intellectual property or customer data, Raytheon Oakley can prevent the activity, alert the appropriate investigator, prioritize the incident by severity and capture visual evidence of the crime, according to the news release.

Raytheon Oakley is a reseller of Kazeon Information Server.


OEM-ing with Exalead

Exalead has announced a new original equipment manufacturer (OEM) and reseller agreement with Atempo, a provider of cross-platform data protection and archiving solutions.

The deal calls for Atempo to integrate Exalead one:enterprise OEM Edition, including Exalead's full content index and search capabilities, into the Atempo Digital Archive (ADA), a file-based enterprise archiving platform designed to optimize the archiving and retrieval process for business critical files, directories and unstructured data types.

Further, Atempo will resell Exalead one:desktop, Exalead's fully unified enterprise desktop search solution that allows users to search for information regardless of format or location, including multimedia files, e-mails, contacts, tasks, notes and other documents.


Atempo Announces Launch of Next-Generation Digital Archiving Solution

Atempo, Inc., a provider of cross-platform data protection and archiving solutions, introduced the new version of its file archiving software solution, Atempo Digital Archive (ADA) 2.0. The software is designed to simplify the process of long-term data retention for mid-market and larger organizations and offers new features such as file de-duplication, full content indexing and search capabilities, and a utility that identifies inactive, fixed-content data ready for archive. Atempo Digital Archive is a hierarchical storage management (HSM) solution that provides the option for user-initiated archiving through an easy-to-use interface. This feature gives end users the ability to drag and drop files into predefined archives as needed, and lets them customize their archiving policies to match specific project needs within organizations.

(www.atempo.com)
 

 