EnterpriseSearchCenter.com Home
RESOURCES FOR EVALUATING ENTERPRISE SEARCH TECHNOLOGIES
October 12, 2011

Table of Contents

Image recognition: A job for smart software or an average human
KMWorld 2011 Conference, Nov. 1 to 3, Washington, D.C.: KM trends, practices & conversations
Improved legal hold from StoredIQ
Managing digital assets in SharePoint
Deeper inspection of unstructured content
Janya's Semantex Version 5.0
Enterprise Search Summit Expands to Europe

Image recognition: A job for smart software or an average human

Can an enterprise use software to figure out what a digital image or a video is "about"? (About, as I am using the word, means looking at a snapshot of a farm and recognizing the pigs, the cows and the chickens.)

Visualize your office building monitored by surveillance cameras. Instead of a human security guard watching for an intrusion, software "watches" the digital video and makes a decision about a specific individual attempting to enter the building. The image recognition system plucks a person's face from the real-time video stream, matches it to a database and determines whether he or she is a vice president or a stranger without access permission. The system "recognizes" the executive and unlocks the door.

For many years, security professionals have funded, tested and tweaked commercial systems to make image recognition of faces a reliable reality, not a science fiction fantasy. Alas, software sufficiently "smart" to figure out the identity of an individual or to determine the "aboutness" of a digital image is a pot of gold long sought after but not yet found.

Google and celebrity facial recognition

But advances are being made. In May 2011, Google's patent "Automatically Mining Person Models of Celebrities for Visual Search Applications" set off a flurry of commentary on blogs and mainstream publications like Forbes. Patent US20110116690 was being downloaded when Google's chairman, Eric Schmidt, was explaining that image recognition was "too creepy." (See "Facial Recognition: Google Chairman Warns US Govt", May 20, 2011, at http://goo.gl/DPOuj.)

When I want some insight into next-generation search technology, I navigate to Google Research's Publications by Googlers at http://research.google.com/pubs/papers.html. Although not a comprehensive archive, the technical papers provide a useful glimpse into search technologies from some of the world's most sophisticated engineers and scientists. In the category Audio, Video and Image Processing, there were more than 100 technical papers, last I checked.

One research report suggested that Google's experts were testing a taxonomy with more than 1,000 categories. The idea was to use "smart software" to figure out what a video is "about." To me, the Google method echoes Autonomy's (autonomy.com) approach, and it demonstrates that Google algorithms can categorize video without metadata at an acceptable level of accuracy.

A 2009 article indicated that Google is working to figure out the "what" in imagery. And yet another report suggested that Google has powerful image functionality that remains, for now, on the sidelines. Is that due to a decision dictated by financial, legal or technical factors? There is scant information about Google's plans for its image recognition technology. What is clear is that Google has invested time and effort in figuring out the content of static images and digital video. When Google does move, the impact on the market could be significant because of its near-monopolistic control of search and retrieval.

Current examples of what's available

My view is that Google's consumer image search is useful, probably as good as, if not better than, comparable systems from Bing, TinEye and Flickr.

I prefer the image search function of Exalead (a division of Dassault Systemes), which returns relevant images without the malware-attracting iframes used by Google. What few of my colleagues in the field of enterprise search know is that Exalead's system has for several years offered image search features only now becoming available on Google. For example, it automatically recognizes an image suitable for desktop wallpaper and displays a hot link to it. Exalead's portrait or landscape option has been available for a long time, and the company has also pushed ahead in video search.

Autonomy also offers image and video search systems. Other vendors include such companies as OpenText's Nstein unit, which uses technology from Imprezzeo. Nstein employs content-based image retrieval and facial recognition. Its system has been tailored to the needs of those engaged in publishing. The user inputs or identifies a sample image. The system then displays matches. With some clicking, the result set can be narrowed to the image the user requires. Nstein provides a software development kit for the system.

A firm called IQ Engines offers an image recognition system that performs "computer vision search." You upload an image to the system. After a minute of processing, the system either displays matches or reports that the image was not in the database.

Kooaba is a visual recognition startup. The company offers a photo management system for licensees and an iPhone application. The user takes a picture of an object and uploads it to Kooaba. The system then "finds" similar images.

A key point is that these systems are using metadata like the date, time, file type and user-generated description of an image. Algorithms create a "fingerprint" for color, shapes and other discernible characteristics. If an image appears in a PowerPoint, the name of the PowerPoint "author" may be attached to the digital object. These systems are not figuring out whether the image is a prize-winning heifer or a Volkswagen Jetta.
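To make the "fingerprint" idea concrete, here is a minimal sketch of one common technique, a coarse color histogram compared by histogram intersection. It is an illustration under stated assumptions, not the method used by any vendor named above: the function names, the four-level quantization and the toy pixel lists are all hypothetical.

```python
from collections import Counter

def color_fingerprint(pixels, levels=4):
    """Return a normalized, coarse color histogram for a list of (r, g, b) pixels."""
    step = 256 // levels  # width of each quantization bucket per channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bucket: n / total for bucket, n in counts.items()}

def similarity(fp_a, fp_b):
    """Histogram intersection: near 1.0 for matching color mixes, 0.0 for disjoint ones."""
    return sum(min(share, fp_b.get(bucket, 0.0)) for bucket, share in fp_a.items())

# Hypothetical toy "images": flat lists of RGB pixels rather than real files.
mostly_red = [(250, 10, 10)] * 90 + [(10, 10, 250)] * 10
also_red = [(240, 20, 5)] * 80 + [(10, 10, 250)] * 20
mostly_green = [(10, 240, 10)] * 100

query = color_fingerprint(mostly_red)
score_red = similarity(query, color_fingerprint(also_red))
score_green = similarity(query, color_fingerprint(mostly_green))
```

The red images score high against each other and the green one scores zero, yet nothing in the fingerprint says what either picture shows. That is the limitation described above: similar colors, not recognized heifers.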

Image recognition applications

Confusion about image search, image recognition and image systems is flourishing. One reason is the failure to distinguish between the different applications to which image recognition can be applied.

Certain types of image processing work well, are well understood and have a measurable impact. A good example is the machine vision sector of image recognition. Cognex is one of the leaders in machine vision. The company's products make it possible to process barcodes for inventory control. Its technology can "look at" a stream of manufactured components and "see" those with defects. You may want to check out Orpix Computer Vision, Pattern Recognition Company and Microscan, among others.

Cognex, despite the soft economy, reported record revenue in its first quarter of 2011. The firm seems likely to push beyond $300 million in revenues. One indication of the strength of this company is its cash position. The firm had a war chest in May 2011 of more than $300 million in cash and investments. At a time when traditional enterprise search vendors are struggling to stay afloat or tap investors for additional cash, Cognex is flying high.

There are some important differences between the image recognition needs in markets served by Cognex and the needs for image recognition on the part of marketing, sales and business development people. A Cognex machine vision solution can be focused on a well-defined domain, often with specific attributes or "tells." A defective chip, for example, may exhibit a different refractive index or have a discernible color variation. The technology to recognize a defect in a production line setting is extremely sophisticated. The return on investment can be calculated. Even at competitive labor rates, machine vision can pay for itself through greater speed, higher accuracy and lower cost than manual methods.

In marketing and sales, however, the person putting together a slide presentation needs an image of a product (relatively easy to find if there is metadata attached to the available pictures), or an image to show an intangible quality such as vigor (relatively hard even if someone has indexed an in-house image collection). Image management systems based on metadata provided by the camera or by a human indexer are available from a number of vendors. One can use the InMagic (inmagic.com) system as an image retrieval system. Clever system administrators can make a traditional database like Oracle (oracle.com) or SQL Server (microsoft.com/sqlserver) provide access to images.
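As a rough illustration of metadata-driven retrieval from a traditional database, the sketch below uses Python's built-in sqlite3 module as a stand-in for Oracle or SQL Server. The table layout, file paths and descriptions are hypothetical, and a production system would likely rely on the database's own full-text features rather than a LIKE match.

```python
import sqlite3

# In-memory database standing in for a corporate image catalog (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE images ("
    "id INTEGER PRIMARY KEY, path TEXT, taken TEXT, source TEXT, description TEXT)"
)
conn.executemany(
    "INSERT INTO images (path, taken, source, description) VALUES (?, ?, ?, ?)",
    [
        ("img/widget_front.jpg", "2011-03-02", "camera", "widget product shot, front view"),
        ("img/widget_box.jpg", "2011-03-05", "camera", "widget in retail packaging"),
        ("img/runner.jpg", "2010-07-19", "indexer", "runner crossing finish line; energy, vigor"),
        ("img/sunrise.jpg", "2009-01-11", "camera", ""),  # no description, so it is unfindable by keyword
    ],
)

def find_images(conn, keyword):
    """Return paths whose description mentions the keyword, oldest first."""
    cur = conn.execute(
        "SELECT path FROM images WHERE description LIKE ? ORDER BY taken",
        (f"%{keyword}%",),
    )
    return [path for (path,) in cur]

product_shots = find_images(conn, "widget")
vigor_shots = find_images(conn, "vigor")
sunrise_shots = find_images(conn, "sunrise")
```

The product query succeeds because the pictures carry descriptive metadata. The "vigor" query succeeds only because a human indexer typed that word, and the untagged sunrise image cannot be found at all.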

But for larger collections of digital images (what used to be called 35-mm slide collections), one needs specialized digital asset management (DAM) systems from such vendors as Adobe, Canto or Microsoft iView, among others. Those systems offer version management and support for different file types such as Adobe Photoshop, PDF, TIFF and vector drawing files. The systems include access controls, essential if an organization is doing work for certain government agencies. They focus on reducing bottlenecks in workflows.

Even with fancy systems, the amount of time required to find a specific image or a specific segment of digital video is indeterminate. Exalead's video search system does allow the user to view a video at the point at which the query matches the content of a digital video.

And what about video?

Video can pose some additional challenges. Digital video is an unwieldy beast with an appetite for storage and a generous side dish of bandwidth. One company that has received accolades from industry groups and analysts is Altus, whose flagship product is vSearch. The company offers on-demand rich media solutions for a range of enterprise applications. The system can be used for knowledge sharing within an organization, a sales-enabling service, an educational service or a system to deliver video from a conference with multiple, simultaneous presentations.

Altus has positioned itself as providing a service that "transforms enterprise video into a valuable asset for any organization. vSearch creates a cloud-based learning environment that combines enterprise video with PowerPoint slide synchronization and scrolling transcripts into an accessible video content archive that is searchable down to the spoken word or specific point of interest. Content can be viewed as streaming media or on-demand presentations from any computer, tablet or smart phone, allowing instant access to knowledge anytime or anywhere." The Altus approach is to deliver video search as software as a service (SaaS).
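Searching video "down to the spoken word" generally comes down to indexing a time-stamped transcript. The sketch below shows that idea only; it is not Altus's or Exalead's implementation. It assumes a transcript already exists (the speech-to-text step is out of scope), and the segment data and function names are hypothetical.

```python
def build_index(transcript):
    """Map each lowercased word to the start times (seconds) of segments containing it."""
    index = {}
    for start, text in transcript:
        # Strip simple punctuation and de-duplicate words within a segment.
        words = {w.strip(".,?!;:") for w in text.lower().split()}
        for word in words:
            if word:
                index.setdefault(word, []).append(start)
    return index

def seek_points(index, query):
    """Return the playback offsets at which a single query word is spoken."""
    return index.get(query.lower(), [])

# Hypothetical transcript: (segment start in seconds, spoken text).
transcript = [
    (0, "Welcome to the quarterly sales briefing."),
    (42, "Our enterprise search rollout is on schedule."),
    (95, "Questions about search relevance come up often."),
]

index = build_index(transcript)
search_hits = seek_points(index, "Search")
briefing_hits = seek_points(index, "briefing")
missing_hits = seek_points(index, "pricing")
```

A player can then jump straight to each returned offset. The index knows what was said, not what appears on screen, which is why the visual question raised next remains open.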

Still, the question that interests me is, "Are these systems from sophisticated technology companies able to look at an image or a frame in the video and 'figure out' what the picture represents?" The sci-fi version of image recognition is out of reach. The meaning of a picture depends on a context that, at this time, requires a human to discern. For now, humans still have a role to play in finding just the right image for any given situation. For rich media, we will not see the end of that good old-fashioned function called indexing for a few years yet.

KMWorld 2011 Conference, Nov. 1 to 3, Washington, D.C.: KM trends, practices & conversations

The annual KMWorld Conference encompasses all the essential pieces of information that power today's effective enterprise, including knowledge creation, publishing, sharing, finding, mining, reusing and more. It also features several co-located events.

Riffing off the conference theme, "Networked Enterprises: Empowered to Share & Apply Knowledge," leading Web strategist and social media guru Jeremiah Owyang of the Altimeter Group (altimetergroup.com) opens KMWorld 2011 on Nov. 1 at the Washington Marriott Wardman Park in Washington, D.C., with a look at collaborative enterprises of the future. Using key new research from Altimeter, Owyang talks about social business readiness and how the most advanced organizations have established baseline governance, adopted enterprisewide response processes, developed ongoing education programs, and now share best practices and lead social media through a dedicated, shared central hub.

Last year, Lynda Braksiek, manager of knowledge and critical skills management for Rockwell Collins, took a team of KM and IT individuals to KMWorld 2010. She says, "I had not been to a KMWorld conference in five years, so I was first impressed with the diversity of the sessions. While I chose the KM tracks typically, having the emphasis on search and SharePoint gave our IT folks an opportunity and the desire to attend the conference as well. However, to my surprise, they chose not to attend all technology tracks once they arrived. They were intrigued with the KM, soft skill focus and behavior-related sessions and discussions in the KM track.

"One of our experts said he finally saw the connection of KM practices and principles to the technologies we implement at our company. He said that it really is nice to understand the purpose behind our KM initiative even more. So, while I was grateful to see him reach this 'ah-ha' moment, I also learned we still had some work to do at our company with educating our employees on the value of knowledge management. Lesson learned!"

Focus on people

Braksiek continues, "Overall, the greatest experience was the networking with our KM and IT teams in the evening as we discussed the implications of new trends in KM and technology to our company. In addition, I found a lot of the sessions on KM practices and social media extremely valuable. In particular, I appreciated the opinion of experts that we need to refocus our KM programs on people, not technology. While technology is a key enabler, people are at the heart of all our KM programs."

The 2011 KMWorld Conference Nov. 1 to 3 is geared to practitioners, innovators, and experienced and novice knowledge management professionals.  

Improved legal hold from StoredIQ

StoredIQ has announced an enhanced release of StoredIQ Legal Hold, a solution for managing the entire legal hold process. From sending notifications and tracking acknowledgements to analyzing custodial data and, finally, collecting and preserving it, legal and IT users can gain complete control of and insight into the duty-to-preserve process with a reliable, repeatable and auditable solution that integrates hold notifications with the collection and preservation of data.

Because Legal Hold is fully integrated with DiscoveryIQ, StoredIQ's e-discovery application, companies can ensure compliance with case law, the company claims. Organizations can initiate hold notifications, track acknowledgements, perform early case assessment across all matter-relevant data, and perform single-instance collection to a secure retention platform.

Managing digital assets in SharePoint

Well-respected SharePoint content lifecycle management provider Metalogix has formed a partnership with Equilibrium, which offers on-demand solutions for content visualization and digital asset management.

The companies say the strategic alliance will combine Metalogix StoragePoint and Equilibrium MediaRich to give mutual customers an enhanced experience when searching and inspecting SharePoint document libraries, even when content has been offloaded from the SharePoint database to improve system performance.

Trevor Hellebuyck, Metalogix VP of enterprise technology, says, “The partnership between Equilibrium and Metalogix helps resolve the fundamental problem of storing digital assets in SharePoint. When used together, MediaRich and StoragePoint enable SharePoint users and administrators to not only instantly visually search and view all digital content, but also have the unstructured content BLOBs [binary large objects] offloaded so that SharePoint databases can perform at optimum levels.”

Deeper inspection of unstructured content

Proofpoint has chosen technology from ISYS Search Software to enhance its cloud-based e-mail security, e-discovery and compliance solutions. The company will use ISYS Document Filters for text extraction of unstructured data.

Wade Chambers, executive VP of engineering at Proofpoint, says, “One of the keys to delivering Proofpoint’s best-of-breed e-mail security and archiving solutions is working with partners that are invested in our success. In addition to its technical prowess, ISYS also brings a level of service not matched by competitors.”

According to a press release from ISYS, key capabilities of ISYS Document Filters include:

  • enterprise format support: more than 400 common and legacy file and e-mail formats supported, plus archives and containers such as ZIPs and MSGs;
  • deep inspection of content: mine previously hidden content, including tracked changes, comments, notes, annotations and embedded Web links;
  • comprehensive platform coverage: Windows, Linux, Mac OS, Solaris, HP-UX and AIX; and
  • full support: high-definition rendering of text and coverage of all character sets and encodings, including Unicode.

Janya's Semantex Version 5.0

Janya has launched Version 5.0 of its natural language processing and unstructured data mining platform. Key new features include improved language support, broader support for office document formats, new output formats, optimized preconfigured levels of processing and highly scalable service-oriented architecture support for big data deployments.

The company explains the semantic analysis platform has historically powered enterprise, SaaS, social media analysis and government intelligence applications. The new enhancements are said to add value for existing customers and provide additional capabilities enabling a range of government, commercial and academic uses. Semantex 5.0 can power solutions including market research, competitive intelligence, scientific, patent and medical data mining, e-discovery and compliance monitoring, says Janya.

New features in Semantex 5.0 include:

  • improved multilingual support,
  • easier customization and semantic web support,
  • increased scalability for SOA and big data deployments,
  • Janya Document Filters for office document format support,
  • new output formats, and
  • performance enhancements.

Enterprise Search Summit Expands to Europe

Join us in London this October for two days of plenary and panel sessions, technical and implementation tracks, and case studies from corporate, public sector and not-for-profit organisations, supported by a range of networking opportunities to promote debate and dialogue and help you to learn from your peers.

Topics covered include:

  • multilingual search
  • open source search applications
  • federated search
  • search centres of excellence
  • search business case development
  • mobile search
  • SharePoint search
  • technology trends
  • enterprise search analytics
  • search-based applications

Enterprise Search Europe, 24-25 October 2011, Hilton London Olympia.
