EnterpriseSearchCenter.com Home
 
RESOURCES FOR EVALUATING ENTERPRISE SEARCH TECHNOLOGIES
August 09, 2006

Table of Contents

Featured Content--Search: The Quiet Revolution
New This Week in the Demo Center
Inxight and Visual Analytics Partner for Government Ops
Visualize unstructured data
Deep Web diving
InfoSearchMedia Announces Alliance with WSI
Optimizing DAM workflow
ZoomInfo Offers Desktop Access; Selected by AOL for Search
NTT DoCoMo Launches Keyword Search Service

Featured Content--Search: The Quiet Revolution

Click here to download your free PDF of this Enterprise Search Center exclusive article.

Search technology has been around for more than four decades. But only in the past ten years, as the World Wide Web has become an integral part of the technology landscape, has it occupied a prominent place in our work and personal lives. And only in the past four years has search finally become a hot and lucrative area of technology development. Why the delay?

First, search technologies require computing power to sort through massive amounts of text. We are finally at that point, even with our desktop machines.

Second, good search requires language understanding. Language is often complex and ambiguous, so it's no trivial problem to figure out what a document is about. Once computing power was no longer the barrier, simple string matching could be replaced by elegant language analysis and complex matching algorithms. That change is transforming not only today's search engines, but also other software applications.

Third, there was little demand for this degree of elegance until organizations needed to find their "stuff." When companies discovered that lost information put them at risk for noncompliance, costly lawsuits, product flaws, or poor decisions, then information access became a priority. Fear can induce new interest in technologies that were once quiet backwaters.

Fourth, the web is vast, and search is its entry point. The web has made search mainstream.

Fifth, and most important, our lives have changed during the past decade. Our work and personal lives have become intertwined. We check our email on our PDAs during our kids' soccer games, interrupt work to cheer a goal, then search the web for local fast-food restaurants after the game—and download a map to get there. At work, we send flowers to our mother for her birthday in the midst of trying to meet a deadline. We use the same devices and expect to use the same tools for all our tasks. Why not? This means that we need tools to support both work and personal tasks, and that they must look the same, even if the requirements for security, search and discovery, and communication are different underneath.  

So now we have demand, we have technologies, and we have computing power. Can we get the kind of information access that we need? Not quite yet. But the elements for driving development of better information platforms are finally in place. What is lacking is deep understanding of information interactions, and how to automate them effectively. The first barrier is how to divine meaning in both a query and a document. Understanding language is not so easy. Those of you who have read "Amelia Bedelia" books to your kids know that "waiting for the bread to rise" is quite different from "getting a rise" out of someone. The fact is that language is complex and ambiguous. It is also the foundation for most human interactions. In cyberspace, it is the way most of us interact with a computer.

We need to translate the following steps in a normal human information exchange into a human-computer interaction:

1. Question: I ask you a question: "Can you tell me where Winter is?"

2. Remove ambiguity: Knowing that it is July and that I am driving a car and look lost, you ask, "Do you mean Winter Street or Winter Drive?" You never even think that I might be asking about winter the season.

3. Refine and narrow: "Winter Street, in Weston."

4. Answer, in the context of where we are at the moment: "Three blocks up and on the left."

Now, if we try to translate this simple four-step interaction to querying a search engine, how far do we get? In today's (or actually yesterday's) search technology, which we might call basic or commodity search, we are no farther than step 1: we ask a question, and get back some matches with no further interaction, no removal of ambiguity, and little chance to refine the question. In other words, there is none of the human give and take that we expect when we ask a human a question. Instead, we are reduced to making guesses about how the search engine might have matched our query. This guessing game is challenging, but not very rewarding. And since a search is rarely performed as an end in itself, it wastes time and also retrieves less-than-optimum answers.
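
To make the gap concrete, here is a minimal Python sketch of the four-step exchange above. It is only an illustration, not how any particular search engine works: the tiny index, the `asker_is_driving` context flag, and every function name are hypothetical, and the placeholder answers in parentheses stand in for content the article does not supply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    kind: str    # "street" or "season"
    answer: str


# Hypothetical toy index; only the Winter Street answer comes from the article.
INDEX = [
    Candidate("Winter Street", "street", "Three blocks up and on the left."),
    Candidate("Winter Drive", "street", "(directions to Winter Drive)"),
    Candidate("winter", "season", "(a definition of the season)"),
]


def match(query):
    """Step 1: plain keyword matching, which is where basic search stops."""
    words = {w.strip("?.,!\"'").lower() for w in query.split()}
    return [c for c in INDEX if c.name.split()[0].lower() in words]


def apply_context(candidates, context):
    """Step 2a: drop readings that the situation makes implausible."""
    if context.get("asker_is_driving"):
        return [c for c in candidates if c.kind == "street"]
    return candidates


def clarifying_question(candidates):
    """Step 2b: ask only when more than one reading survives."""
    if len(candidates) < 2:
        return None
    return "Do you mean " + " or ".join(c.name for c in candidates) + "?"


def refine(candidates, reply):
    """Step 3: narrow the candidates using the asker's reply."""
    return [c for c in candidates if c.name.lower() in reply.lower()]


found = match("Can you tell me where Winter is?")
found = apply_context(found, {"asker_is_driving": True})
print(clarifying_question(found))
found = refine(found, "Winter Street, in Weston.")
print(found[0].answer)  # Step 4: answer once a single reading remains
```

Keyword matching alone returns all three readings of "Winter," including the season. The context filter and the clarifying question do the work that a human listener does without thinking.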

Search has become the starting point for many tasks because it is the only way to get at the information that we need. Humans are adaptable, and they will use whatever tools are available, even if they are imperfect. But better, more human-like interaction is crucial as computers become even more pervasive. Finding information is part of many business processes and personal tasks. Buying a product, getting directions or phone numbers, conducting research, and monitoring events all require some sort of search. If we think of search as a beginning, rather than an end in itself, then a basic search box with no further interaction is not enough. Search technologies are the only ones that try to divine the meaning of words. They have the potential to make computers conversational and interactive. To get to that point, though, we will have to add technologies that enable computers to:

1. Understand the meanings of words, phrases, sentences, and documents.

2. Interact intelligently: ask questions to remove ambiguity, without being perceived as stupid.

3. Find the useful clues that determine context, as a person would.

4. Unite all the information to which we need access so that we don't have to repeat the process in multiple applications or on multiple sites.

5. Add understanding for non-language information—images, sounds, and possibly gestures.

In other words, the next incarnation of search has to do more than dump possibly relevant documents—or worse, pointers to documents—onto the desktop.

To improve search, we must mimic how people exchange information. Because people ask terrible questions, we must find a way to use other clues—their gestures, tone of voice, location, and what we know about them—in order to flesh out the intention behind their query. That's the next great challenge for search and discovery tools.

We are well on the way to adding language understanding, the first requirement. Most enterprise search engines have added technologies like categorization; identification of names of people, places, and things (entity extraction); and even sentiment extraction, which determines whether the tone of a document is positive or negative about a specific topic or thing. In the chart, note how, on top of basic search, we have layered technologies for understanding and mining language, images, and data. Each successive technology layer adds features that enable more advanced finding and exploration of information.

Outlook for Search and Retrieval

Not so long ago, content was king. In 2006, context will reign. This change comes from the realization that we will never get people to ask the right questions. Instead, we must seek other clues to what they are seeking. In addition, for text-mining applications such as product early warning, compliance, legal discovery, or sentiment analysis, the surrounding content provides a context for data, or for an otherwise ambiguous statement. Context is also gathered in new access applications from the framework in which a question is asked or from the type of task that is in process. So, geolocation added to search lets a wireless company return contextually relevant lists of movies or restaurants. A user's role added to search indicates the types of information needed. Recent search history or personal information provides additional clues.
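
As one hedged illustration of geolocation as context, the Python sketch below re-ranks text-match results by how far each one is from the person asking. The result records, the scores, the coordinates, and the `km_penalty` weight are all invented for the example. A real service would tune or learn such a weight rather than hard-code it.

```python
import math

EARTH_RADIUS_KM = 6371.0


def distance_km(a, b):
    """Great-circle distance between two (lat, lon) points (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def rerank(results, user_location, km_penalty=0.05):
    """Subtract a distance penalty from each text-match score, then sort."""
    def contextual_score(result):
        km = distance_km(user_location, result["location"])
        return result["score"] - km_penalty * km
    return sorted(results, key=contextual_score, reverse=True)


# Hypothetical results for the query "restaurants"; a higher score means a
# better text match, before location is taken into account.
results = [
    {"name": "Far Diner", "score": 0.9, "location": (42.36, -70.06)},
    {"name": "Near Cafe", "score": 0.6, "location": (42.37, -71.06)},
]

user_location = (42.36, -71.06)
for result in rerank(results, user_location):
    print(result["name"])
```

On text match alone the distant restaurant wins. Once location is counted, the nearby one moves to the top, which is the sense in which context, not content, decides the answer.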

Certainly, context is the reason that the advertising revenue for web search has grown so quickly. Context is used in web search in order to return ads that are relevant to the query terms in the search. Expect that new information-access applications will wring out every implicit clue that they can to improve search results.

Some of the other trends we note in the search and retrieval market include:

• Meaning-based computing. IDC believes that eventually these technologies—text mining, text analytics, categorization, speech analysis, and translation—will be embedded in most people-facing applications in order to improve human-computer interaction. This represents a sizable opportunity for vendors in this market.

• Browsing, in addition to searching. Presenting browsable results promotes discovery of the unexpected, and it makes search results instantly understandable. Master data management projects should also consider using categorization technologies to bootstrap mapping among diverse schemas.

• Question answering instead of search to provide automatic technical and customer support. Technical support sites are improved when question answering returns answers instead of just a list of documents that contain the right answer.

• Sentiment extraction for monitoring consumer opinion of products, services and people.

• Spam reduction.

• Email monitoring for compliance and/or for email management and mining. Email is a prime source for finding product ideas or for discovering telltale indicators of corporate misbehavior.

• Compliance monitoring of all text and speech regulations.

• Convergence of database and content technologies. Newer search architectures began offering "BI Lite" starting in 2006. Master data management, a trend still largely confined to the database side of the information divide, will gradually extend to the content side. Content technologies have a long history of information normalization: taxonomies, categorization, and controlled vocabularies all predate computers. Using content technologies to categorize and normalize schemas is a logical step that will unite structured and unstructured information and streamline the master data management process, which is still largely manual.

In the next five years, search will become even more pervasive. Questions and answers are part of most human tasks. These technologies will be embedded in cell phones, cars, gas pumps, home entertainment centers, call centers, and transit systems. Good search today requires more than keyword matching. Now, and in the future, search and discovery tools will be expected to help people locate, explore, search, compare, analyze, cluster, categorize, differentiate, aggregate, synthesize, and sort information of all sorts. That will require more than a search engine—it demands a true information discovery platform.

SUSAN FELDMAN Vice President, Content Technologies Research, IDC.

New This Week in the Demo Center

Groxis presents a demonstration of its visualization search software Grokker. For this demo Groxis lets you "grok" a federated collection of open Web sources (Yahoo!, Wikipedia, and Amazon Books). Try searching for your company, someone you know, or virtually any topic of interest, then check out the map view. Groxis makes the 10th vendor to join our demo center.

Other demos available at this time are: Siderean, exalead, Vivisimo, Coveo, Thunderstone, Synomia, Northern Light, Mondosoft, and Isys. Check them all out at the Enterprise Search Center Web site. Just click on the Demo Center navigation bar item or on the Featured Demo logos on the Home Page.

Inxight and Visual Analytics Partner for Government Ops

Industry leaders Inxight Federal Systems, a provider of federated search, extraction, and visualization solutions for government organizations, and Visual Analytics, Inc. (VAI), a provider of interactive visual analysis and information sharing technologies, have announced that Inxight's entity and fact extraction technologies will be integrated with the Digital Information Gateway (DIG) Symphony platform and VisuaLinks from Visual Analytics to visualize the tagged data and relationships. Combining Inxight ThingFinder Professional with VisuaLinks and DIG Symphony products is designed to enable analysts to understand relationships and events more effectively.

Inxight ThingFinder transforms unstructured text by identifying the critical entities, such as people, places, organizations, weapons, currencies, phone numbers, license plates, vehicles, and so on. It also extracts facts and events involving these entities, such as travel events, purchase events, and member-of relationships. DIG Symphony and VisuaLinks platforms from VAI then operate on these entities, enabling investigators and analysts to turn volumes of disparate data into actionable intelligence. DIG Symphony offers an integrated methodology for managing data processing and workflow operations for individual projects or across an enterprise. DIG Symphony provides data manipulation capabilities that include data transformations, data standardization, multi-source integration, entity resolution, and scoring/ranking of results. VisuaLinks is a platform-independent (Java-based), graphical analysis tool used to discover patterns, trends, associations, hidden networks, and non-obvious relationships across multiple data sources.

(www.inxightfedsys.com; www.visualanalytics.com)

Visualize unstructured data

Inxight Federal Systems, a wholly owned subsidiary of Inxight, and Visual Analytics (VAI) announce that Inxight's ThingFinder fact extraction software will be integrated with VAI's VisuaLinks and its Digital Information Gateway (DIG) Symphony to view tagged data and relationships. The two companies say the combination of ThingFinder, VisuaLinks and DIG Symphony will enable analysts to understand relationships and events more effectively than other solutions in the marketplace.

Inxight says ThingFinder transforms unstructured text by identifying important entities, such as people, places, organizations, weapons, currencies, phone numbers, license plates, vehicles, etc. Further, it adds, it also extracts facts and events involving these entities, such as travel events, purchase events and "member-of" relationships. DIG Symphony and VisuaLinks platforms then operate on these entities to bring clarity to investigations and intelligence analysis, enabling investigators and analysts to turn large volumes of disparate data into actionable intelligence, says VAI.

Deep Web diving

QL2 Software and TEMIS have formed an alliance to deliver a mutually complementary system for creating industry-specific and application-oriented data analysis reports from what they call locked and hidden content.

QL2 explains its WebQL 3.0 platform automates information extraction from the Web, as well as from other unstructured data sources, and then reformats it into structured and actionable formats. WebQL gathers information from both inside and outside firewalls, as well as monitors password-protected sources (such as newswires, trade journals, e-mail repositories, Web sites and blogs) and then displays the relevant data. With the ability to extract and integrate data from virtually any source, QL2 claims, the product gives the enterprise seamless access to any type of information to further extend business and competitive intelligence, enterprise search, text analytics and other business operations solutions.

TEMIS says its Online Miner 3.2 is a text analytics and discovery solution that enhances the use of information by extracting documents' key concepts and their meanings for automatic classification and the discovery of new relationships and associations. It also provides a number of graphical options, enabling results visualization and navigation within the discovered content and allowing faster access to relevant information. TEMIS further believes Online Miner plays a critical role in fields where information processing is complex due to the great volume of data, such as in competitive intelligence, customer relationship management, scientific intelligence and reputation management.

InfoSearchMedia Announces Alliance with WSI

InfoSearch Media, Inc., an online content provider, has announced an alliance between the company's ContentLogic division and an Internet consulting group, WSI (We Simplify the Internet). Under this new agreement, all WSI Consultants will offer WSI WEB WORDS, utilizing ContentLogic's search engine optimization services, to their customers. The new WEB WORDS product includes identification of key phrases and writing customized web content in order to improve user experience and generate higher search engine rankings. Through the partnership, WSI Consultants will supply their customers with key phrases and content to optimize the organic search results.

(www.infosearchmedia.com; www.wsiconsultants.com)

Optimizing DAM workflow

Canto has released the latest version of its digital asset management (DAM) software, which has been specifically designed to integrate into creative workflow. Cumulus Version 7 is also said to offer significant performance gains and enhancements for security, Web-based DAM, automation and usability. The company further cites new features to promote increased asset control, faster production and more efficient collaboration.

The company adds that all Cumulus solutions now offer the ability to identify and manage the relationships between assets: For example, all the images referenced in QuarkXPress or InDesign layouts, pages contained in MS Word documents and even PowerPoint slides in presentations can be virtually connected to the assets that reference them. Cumulus 7 Enterprise can even catalog referenced files automatically when the layout files are cataloged.

Another new feature of Cumulus 7, Shared Collections, reduces search times and eliminates the risk of co-workers and clients using the wrong assets, as they immediately only see the asset records related to their current projects, without any searching. Users can now also label and rate asset records with visual tags that make it easy to see which assets have been approved, rejected, put on hold and which are new.

Other enhancements include "Actions" to automate workflow steps, new image conversion and processing features, creation of PPT presentations and printing improvements. Further, says Canto, Web services integration enables Cumulus 7 to become a standards-compliant participant, with minimal development efforts, in complex, Web-based application infrastructures.

ZoomInfo Offers Desktop Access; Selected by AOL for Search

Zoom Information, Inc., a search engine for discovering people, companies, and relationships, has announced the general availability of its Tools and Developer Resources Center. ZoomInfo will deliver its people and company information to users whenever and wherever they need it. These integrations are immediately available for download at the website.

AOL has announced that ZoomInfo was chosen to power the people and company search for AOL's new business-oriented instant messaging offering, the AIM Pro service. AIM Pro is free to all Internet users and can be downloaded from ZoomInfo's Resources Center or directly from AOL. AIM Pro users looking for background information on a new acquaintance, business partner, or vendor can find what they need at the base of the AIM Pro Buddy List feature. By selecting the People Search icon, users have access to ZoomInfo's index of over 31 million business people and two million companies.

The ZoomInfo Tools and Developers Resources Center also includes downloads for Firefox and Intellext. As ZoomInfo continues to grow and focus on providing valuable content to its users, the Tools and Developers Resources Center will continue to provide the most up-to-date list of these services.

(www.zoominfo.com/tools; www.aol.com/aimpro)  

NTT DoCoMo Launches Keyword Search Service

NTT DoCoMo, Inc., Japan's mobile communications company, and its eight subsidiaries have launched a keyword search service that will enable users to perform searches from the Japanese iMenu portal for access to i-mode sites. The keyword searches will be free for all i-mode users. The service will also provide access to information from websites other than i-mode.

(www.nttdocomo.com)
