EnterpriseSearchCenter.com Home
RESOURCES FOR EVALUATING ENTERPRISE SEARCH TECHNOLOGIES
February 18, 2009

Table of Contents

Learning about Google via Google
Searching for answers naturally
Intriguing integration
User-based relevance
Exalead Announces CloudView OEM Edition 5.0
Mark Logic Releases MarkMail 2.0
BP Logix Announces Workflow Director 2.0
MadCap Software Launches New Version of Feedback Server
Higher visibility for manufacturing
Managing multiple worldwide sites
Milabra Secures $1.4 M, Announces Image and Video Recognition Engine
Open Text Extends eDiscovery Offering
Amazon and Google Add E-Books to Phones

Learning about Google via Google

A visit to a neighbor’s house last week taught me something about Google’s demographic attack on the TV establishment. A 12-year-old had a laptop with YouTube.com videos and an instant messenger running. She also had a mobile phone and was watching text messages arrive from her friends with a quick glance. In the background, without the sound on, was a cable TV show.

I asked her if she could multitask. She told me, "No, I’m not multitasking yet. I will when I start my homework." Apparently YouTube (purchased by Google in late 2006) provides entertainment for millions of middle school kids. That raises the question: What can one find on Google about Google, a service that seems to fit seamlessly into the lives of 12-year-old girls on a school night?

Quite a bit, it turns out. If you want to know about the latest ideas that Google employees are monitoring, look no farther than Google Video and YouTube. Here’s an example. Do you want to learn about Google’s BigTable technology? Click here: videoplay?docid=7278544055668715642. You can watch SuperGoogler Jeff Dean explain how Google approached traditional database problems. The answer, as it turns out, was to not use traditional database technology. Google invented BigTable, and you can download the Apache Foundation’s open source version of it here (http://hadoop.apache.org/hbase).

What if you want to hear Sergey Brin talking about search in August 2007? No problem. Click here (youtube.com/watch?v=Ka9IwHvkfU).

Do you want to sit in on Google’s on-campus lecture series? Again, the data are available. Just navigate to Google Video (http://www.video.google.com), enter the search phrase "Google lecture series" or some variation, and begin exploring.

Google makes an enormous amount of information about itself available. Let’s take a quick look at several sources of information. Please explore these links.

1. If you’re a Google application aficionado, here’s a resource for you. The Google Friends Newsletter (google.com/contact/newsletter.html), delivered to your inbox monthly, gives you Google-specific info like new feature and update announcements, usage tips and Google company notes. For example, the September newsletter (http://groups.google.com/group/google-friends/browse_thread/thread/6452afbdd1707c0d) announced the Moderator program (http://moderator.appspot.com) and the launch of a charity project (project10tothe100.com); updates to the Picasa (http://picasa.google.com) photo album, the Chrome browser, and Google Maps (http://maps.google.com); and a tidbit marking the GOOG’s 10-year anniversary that month.

You can even look back at old newsletters in the Google Friends archive (google.com/googlefriends/archive.html) to locate past release information and trace the evolution of Google products.

2. Want to know what Google knows? Take a look at Google Labs white papers (http://research.google.com/pubs/papers.html), technical papers written by Google’s own people. A partial list of topics: artificial intelligence and data mining; audio, video and image processing; human-computer interaction and visualization; machine learning; software engineering; and much, much more.

3. Google also offers video functions for businesses (google.com/apps/intl/en/business/collaboration.html). Companies can use that functionality to share videos in a secure environment hosted on Google servers rather than cramming files onto their own computers. Any employee can access shared videos through a standard browser. This video service would work for internal training or corporate announcements, and it keeps the e-mail server from overloading during distribution.

Here are some suggestions for making use of this wealth of information about Google that you can access using Google itself.

First, understand that Google has technology in its DNA. As a result, running a query for an obvious concept such as "search technology" won’t return useful information. The trick I use is to learn the Google lingo. Let me give you two examples.

If you want to understand how Google improves the performance of its search system, you have to learn Google jargon. To illustrate: Assume you want to know how Google prevents contention over database access. If you search "row locking," you won’t find the answer without a great deal of searching. Try searching Google.com with the query "chubby lock" and look at the results. You will see that the secret word is "chubby"—Google’s name for its breakthrough technology.

In another example, say you want to learn about Google’s automatic spawning of smart agents to resolve problems in content parsing. If you search Google for "artificial intelligence," you will spend hours sifting through results. Instead, try searching for "janitors agents" and look at the results. Google is a math club with a mathematician’s sense of humor, and its hottest technologies have mundane names like "janitor." Ho ho snort snort.

The trick then is to learn the vocabulary to unlock Google’s secrets.

Second, Google provides a window to its future applications and services in its technical papers. Navigate to the aforementioned Google Labs page (http://research.google.com/pubs/papers.html) and scan the boldface headings such as Distributed Systems and Parallel Computing. Note that each heading has a paper count. The 85 after Distributed Systems tells you that there are 85 papers available. To identify what’s hot, I keep a list of those headings and periodically visit the site to update the paper counts. When I see a big jump, such as the one in artificial intelligence and data mining a year ago, I know that the topic is getting some attention at Google. Conversely, when a topic’s paper count doesn’t change or decreases, I surmise that Google is not beavering away in this field.
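The paper-count bookkeeping described above can be sketched in a few lines of Python. This is only an illustration of the idea, not a tool the column itself uses: the function names, the threshold of 10 papers, and every count except the 85 for Distributed Systems are made-up assumptions, and the snapshots are typed in by hand rather than fetched from the site.

```python
from typing import Dict, List, Tuple


def find_hot_topics(previous: Dict[str, int], current: Dict[str, int],
                    min_jump: int = 10) -> List[Tuple[str, int]]:
    """Return (heading, increase) pairs whose count rose by at least min_jump."""
    jumps = []
    for heading, count in current.items():
        # A heading missing from the earlier snapshot is treated as starting at 0.
        increase = count - previous.get(heading, 0)
        if increase >= min_jump:
            jumps.append((heading, increase))
    # Largest jumps first.
    return sorted(jumps, key=lambda pair: pair[1], reverse=True)


def find_stalled_topics(previous: Dict[str, int],
                        current: Dict[str, int]) -> List[str]:
    """Return headings whose count did not change or went down."""
    return [heading for heading, count in current.items()
            if heading in previous and count <= previous[heading]]


# Hand-recorded snapshots from two visits (illustrative numbers only).
last_visit = {
    "Distributed Systems and Parallel Computing": 80,
    "Artificial Intelligence and Data Mining": 60,
    "Machine Learning": 40,
}
this_visit = {
    "Distributed Systems and Parallel Computing": 85,
    "Artificial Intelligence and Data Mining": 95,
    "Machine Learning": 40,
}

hot = find_hot_topics(last_visit, this_visit)
stalled = find_stalled_topics(last_visit, this_visit)
print("Hot:", hot)
print("Stalled:", stalled)
```

With these sample numbers, only the artificial intelligence heading clears the threshold, and the heading whose count did not move is listed as stalled.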

Third, read what Google itself says about what it is doing. You can find this information in Google News. Here’s how to do it:

  • Navigate to Google News and enter a query such as "Eric Schmidt" or substitute your favorite Googler’s name.
  • Run the query.
  • Ignore the results and scroll to the bottom of the news results page. You will see an option to create an e-mail alert for your query. Click on the option to create an e-mail alert for Eric Schmidt. By doing so, you will receive each day a list of articles in which top Googler Eric Schmidt is mentioned, quoted or discussed. I use one alert per Googler I want to monitor.

The Google Alerts (google.com/alerts) service is free and eliminates the need to pay attention to 4,500 individual news sources. I run queries on country-specific Google news services when a Googler is outside the United States. On my Web log Beyond Search (http://arnoldit.com/wordpress), I broke a story about Google’s research lab activity in Israel, which was reported to me by Google itself. I used Google Translate (http://translate.google.com/translate_t#) to make the Hebrew story easier for me to figure out.

To wrap up, Google tries to be secretive, but it spouts information itself like Old Faithful. Last year, I suggested to Google that it pay me to identify these and other sources of useful, strategic information and close them up to preserve what could be proprietary information. Google, true to form, ignored my suggestion. So you can take advantage of these tips.

However, I still have a couple of them up my sleeve. What was the name of Google’s top researcher in next-generation data management technology before he changed it in year 2000?

I’m not telling because that would tip you to one of the most sensitive and important research efforts in Google’s history. Hint: You can find the answer by searching Google.com.

Back to Contents...

Searching for answers naturally

The Provincial Agency for Health Services (Azienda Provinciale per i Servizi Sanitari) in Trento, Italy, has implemented COGITO Answers from Expert System to make its Web site more user-friendly and interactive.

Adriano Passerini, director of the agency’s customer relation service, explains, "The need to implement an information search service on the portal of the APSS came from observing the difficulties users experienced when connecting to the home page. The selected solution, with its ability to query in natural language, seems to be suiting our needs well. The searches are easier and produce accurate answers. Moreover, we can rely on the structuring of a permanent system to improve the search, based on a periodic evaluation of the questions posed to the system that don’t result in clear answers."

Emanuele Seu, the IT director who manages the information portal on the agency’s Web site, says, "The system has helped us better connect with users. With its automated semantic understanding, it makes interaction easy and stress-free—and guarantees the best quality of answers."

Expert System reports that one month after deploying its solution, the Provincial Agency for Health Services saw the following benefits:

  • 80 percent increase in online queries sent through the search engine, proving the system is being used;
  • 70 percent increase in time spent on Web pages sent by the solution, proving patients are finding the information provided by the system is valuable and worth reading; and
  • 46 percent increase in user visits that start and end on the same page, proving patients are finding exactly what they need in one click.

Intriguing integration

A strategic alliance has been announced between Concept Searching, which is known for its concept-based search, automatic classification, semantic metadata generation and taxonomy management software, and WAND, which develops structured multilingual vocabularies spanning a wide range of industries. The partnership will provide clients with existing industry-specific taxonomies developed by WAND and integrated into Concept Searching’s technologies.

The companies report clients will be able to access all the functionality of Concept Searching’s technology, including compound term metadata generation, automatic classification and the Concept Searching taxonomy tool. The WAND predefined taxonomies are flexible and can be easily customized and modified to address unique organizational requirements. They will also be available through an extensive partner channel and with conceptClassifier for SharePoint, Concept Searching’s fully integrated classification solution for the Microsoft platform.

User-based relevance

Recommind has released MindServer Search 6.0, the latest version of its enterprise search product. Built on Recommind’s CORE (Context Optimized Relevancy Engine) platform, Version 6.0 is said to significantly extend the solution’s functionality, reach and accuracy by adding features such as enhanced relevancy tuning and by extending the system’s federated search framework.

Recommind reports its CORE platform is a fully automated information management layer that seamlessly and securely integrates structured and unstructured data inside and outside of enterprise networks. The new version is further said to allow enterprises to boost certain search results based on select properties of a document, including: freshness, rank, specific metadata and document length.
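To make the idea of property-based boosting concrete, here is a minimal Python sketch. It is not Recommind's implementation or API: the `Document` fields, the `boosted_score` formula, and all weights are hypothetical, and only two of the properties named above (freshness and a metadata field) are modeled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Document:
    title: str
    base_score: float  # relevance score from the text match alone
    age_days: int      # days since the document was last modified
    doc_type: str      # an example metadata field


def boosted_score(doc: Document,
                  freshness_boost: float = 0.5,
                  fresh_within_days: int = 30,
                  type_boosts: Optional[Dict[str, float]] = None) -> float:
    """Multiply the base score by 1 plus whichever boosts apply."""
    boost = 1.0
    if doc.age_days <= fresh_within_days:
        boost += freshness_boost          # reward recently changed documents
    boost += (type_boosts or {}).get(doc.doc_type, 0.0)  # metadata boost
    return doc.base_score * boost


def rank(docs: List[Document], **boost_options) -> List[Document]:
    """Return documents ordered by boosted score, highest first."""
    return sorted(docs, key=lambda d: boosted_score(d, **boost_options),
                  reverse=True)


docs = [
    Document("Old memo", base_score=1.0, age_days=400, doc_type="memo"),
    Document("New policy", base_score=0.8, age_days=5, doc_type="policy"),
]
ranked = rank(docs, type_boosts={"policy": 0.25})
print([d.title for d in ranked])
```

In this toy data the newer policy document overtakes the older memo even though its text-match score is lower, which is the effect a freshness or metadata boost is meant to have.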

Search results can feature "Best Bets," which are pre-selected files linked to particular queries, says Recommind. Also available is a "Sponsored Links" feature, which brings external or indexed documents to the user’s attention through specific queries, but places them outside the standard search results area. In addition, users can choose to boost results based on their individual profile or that of their team.

MindServer Search 6.0’s federated search capabilities further enable users to search across internal and external data sources with one query. The extended federated search framework in MindServer Search increases user productivity by integrating internal and external results in the same result set and highlighting search terms in external sources, according to Recommind.
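The general shape of a federated query, one query sent to several sources with the hits merged into a single list and the search terms highlighted, might look like the following Python sketch. Everything here is an assumption made for illustration: the stub source functions, the `federated_search` and `highlight` names, and the naive choice to compare scores directly across sources. It does not describe how MindServer Search works internally.

```python
import re
from typing import Callable, Dict, List

Hit = Dict[str, object]


def federated_search(query: str,
                     sources: Dict[str, Callable[[str], List[Hit]]]) -> List[Hit]:
    """Send one query to every source and merge the hits into one result set."""
    merged: List[Hit] = []
    for name, search in sources.items():
        for hit in search(query):
            merged.append({**hit, "source": name})  # remember where it came from
    # Naive merge: assumes scores from different sources are comparable.
    merged.sort(key=lambda hit: hit["score"], reverse=True)
    return merged


def highlight(text: str, query: str) -> str:
    """Wrap each whole-word query term found in text in square brackets."""
    terms = [re.escape(term) for term in query.split()]
    if not terms:
        return text
    pattern = re.compile(r"\b(" + "|".join(terms) + r")\b", re.IGNORECASE)
    return pattern.sub(r"[\1]", text)


# Stub sources standing in for an internal index and an external service.
def search_internal(query: str) -> List[Hit]:
    return [{"title": "Enterprise search rollout plan", "score": 0.9}]


def search_external(query: str) -> List[Hit]:
    return [{"title": "Trends in enterprise search", "score": 0.7}]


query = "enterprise search"
results = federated_search(query, {"internal": search_internal,
                                   "external": search_external})
highlighted = [highlight(str(hit["title"]), query) for hit in results]
for hit, title in zip(results, highlighted):
    print(hit["source"], title)
```

A real system would need to normalize scores between sources and apply each source's security rules before merging; the sketch leaves both out.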

Exalead Announces CloudView OEM Edition 5.0

Exalead, a provider of information access software, announced the availability of CloudView OEM Edition 5.0, a search and information application development platform designed for ISVs and SaaS providers. Exalead CloudView OEM Edition 5.0 offers semantic technologies that automatically analyze, categorize, enhance, and align structured and unstructured data. The natural language processing modules (lemmatization, dynamic categorization, spell-checking, spell suggestions, etc.) are designed to evolve automatically and in real time. CloudView OEM Edition 5.0 is an SOA- and WOA-compliant development platform that has been designed to be embedded into other software offerings. Its modular architecture allows product architects to include only those components that their functional needs require. Exalead CloudView OEM Edition 5.0 is currently tested to support 500+ terabytes of data per server cluster and 100 million documents on a single server. It is capable of indexing 30 million database objects in 10 hours and 60 gigabytes of mail data in a single hour, and can index at speeds of up to 9,000 documents per second.
 
(www.exalead.com)

Mark Logic Releases MarkMail 2.0

Mark Logic Corporation announced the availability of a new version of MarkMail, a free service for searching mailing list archives. Powered by MarkLogic Server, MarkMail 2.0 introduces several new capabilities for enhanced user customization, personalization, and notification. The new version of MarkMail enables users to register for personalized views of search results, secure access to private mailing list archives, and create Really Simple Syndication (RSS) feeds in order to be alerted to new content sources. New features of MarkMail 2.0 include: an individual username and password for full access to the new MarkMail 2.0 capabilities once registered; personalized display settings; the ability to define a group of messages interrelated by topic, date range, sender, or recipient; personalized RSS feeds; and multi-tenant privacy.

(www.marklogic.com)

BP Logix Announces Workflow Director 2.0

BP Logix, a provider of web-based business process automation and reporting solutions, announced the release of BP Logix Workflow Director 2.0. The product is scheduled to be generally available on March 2, 2009. BP Logix Workflow Director is built on an integrated document management and workflow automation platform that enables business users to manage, automate, and report on their organization’s business processes. It provides storage, categorization, and search technologies for all documents, eForms and workflow processes. The new release incorporates the following product components: BPM/Workflow automation, project management/business rules engine, eForms processing, and reporting/activity monitoring.

(www.bplogix.com)

MadCap Software Launches New Version of Feedback Server

MadCap Software, a provider of multi-channel content authoring solutions and a showcase company for Microsoft Visual Studio 2005 and Microsoft XPS, announced that the MadCap Feedback Server 2.0 is now available. The MadCap Feedback Server is a server-based feedback system for content development teams that provides Web 2.0 features to enable collaboration among team members and the readers of their documentation. MadCap Feedback Server version 2.0 adds several new features, including customizable user profiles, email notifications, context-sensitive help (CSH) ID tracking, and performance and scalability enhancements. With version 2.0, authors can reset the topic view, topic rating, and search statistics at any time in order to view only the new statistics. There are also two filtering options: one for filtering out irrelevant IP addresses and another for filtering topic statistics by dates. The Feedback Server allows readers to rate published information using a standard five-star system. Readers’ topic ratings are tabulated, providing an overall community average score that all readers can then see on a page-by-page basis.
 
(www.madcapsoftware.com)

Higher visibility for manufacturing

Endeca and Kalypso, a professional services firm focused on helping clients innovate, have announced a strategic partnership to deliver information visibility to manufacturers. The companies reportedly will work together to bolster engineering, product design and product lifecycle management (PLM)-related initiatives for clients by providing technology to locate and manage information across an organization’s global platform.

The partnership is said to enable Kalypso to continue to bring industry-leading product innovation and PLM expertise to the world’s top manufacturing organizations, and represents the latest in a series of organizational investments at Endeca to expand its presence in manufacturing.

Extreme market conditions, increasing consolidation and rapid shifts in consumer demand have placed additional risks and upfront costs on product innovation and new product development. These challenges are compounded by the volume of data that engineers have at their disposal to inform and influence new product design decisions.

Advanced information visibility solutions, such as those offered by Endeca, provide cross-functional groups with new, contextual information, as well as historical data that can reduce research within multiple systems and promote innovation by bringing forth new development possibilities. These solutions introduce opportunities for manufacturers to increase margins, lower costs, reduce downstream supply chain risks and create new competitive advantages from their existing information investments.

Managing multiple worldwide sites

Open Text has launched the newest release of its Web Solutions, which the company says expands the ability of customers to translate, manage and synchronize multiple Web sites worldwide.

The enhanced functionality in the new version is part of a complete solution called Open Text Web Solutions for Multi-Sites that gives customers everything they need to develop a global Web presence, according to Open Text. It helps large enterprises to more efficiently share corporate information, leverage existing translation memory tools such as Across Systems or SDL TRADOS, and personalize the user experience. For small and midsize businesses, it provides a simple and cost-effective tool for expanding globally.

Open Text reports Web Solutions for Multi-Sites adds new content distribution services that make it easy for Web localization teams to share and update content via a centralized content pool. Team members are automatically notified when content changes or updates become available. Once localized content is added to this pool, all countries that use the same language can use the content for their local Web sites. A new Content Translation Management module offers translation workflow integration, expanded support for computer-aided translation tools, and export to standardized XML Localization Interchange File Format (XLIFF).

Milabra Secures $1.4 M, Announces Image and Video Recognition Engine

Milabra announced a $1.4 million financing round from private investors. Private investors included: Murphy Endeavors, an investment firm in Red Bank, N.J., Nexus Holdings Group, and Dr. Richard Turner, a visiting scientist at Carnegie Mellon University and former fellow with the Systems and Software Consortium. The company also launched a web-based image and video-recognition engine for online media companies. Milabra’s engine automatically "reads" actual images, not image file names or tags. Its technology also adds text labels to images. Milabra’s engine can learn to recognize any class of images, works with different kinds of image content including photos and video, and is delivered as a suite of web services. Milabra’s image recognition engine is the platform for a suite of web services that customers access with one standard integration exercise. Over 14 services include image ad-tagging, copyright protection, duplicate prevention, and adult content filtering.

(www.milabra.com)

Open Text Extends eDiscovery Offering

Open Text, a global provider of Enterprise Content Management (ECM) software, announced that it has extended its new eDiscovery solution to its Open Text eDOCS customers. Open Text eDiscovery Early Case Assessment is being offered through a strategic relationship with Recommind, a provider of search-powered information risk management (IRM). The solution combines Recommind’s Insite Legal Hold application with Open Text ECM Suite. Open Text eDOCS customers will gain expanded eDiscovery capabilities woven into their overall content, records, and email management practices.

Open Text also announced the availability of the latest release of Open Text Web Solutions. This new release offers a number of additional enhancements, including helping customers translate, manage, and synchronize multiple websites worldwide.

The enhanced functionality is part of a complete solution called Open Text Web Solutions for Multi-Sites. It helps large enterprises to share corporate information, leverage existing translation memory tools such as Across Systems or SDL TRADOS, and personalize the user experience. Web Solutions for Multi-Sites adds new content distribution services that allow teams to share and update content via a centralized content pool. Team members are automatically notified when content changes or updates become available. Once localized content is added to this pool, all countries that use the same language can use this content for their local websites. A new Content Translation Management module offers translation workflow integration, expanded support for computer-aided translation tools, and export to the standardized XLIFF format.

(www.opentext.com)

Amazon and Google Add E-Books to Phones

Google and Amazon announced they are making more e-books available on mobile phones. Google announced on its Book Search blog that the 1.5 million public domain books it had scanned, which can be accessed for free on PCs, were now available on cell phones, including the iPhone and T-Mobile G1. According to reports, Amazon is working on making the titles currently available on its e-book reader, the Kindle, accessible on phones.

(www.amazon.com, www.google.com)
