Enterprise Search Center

RESOURCES FOR EVALUATING ENTERPRISE SEARCH TECHNOLOGIES

September 05, 2007

Table of Contents

Roundtable Discussion: E-Discovery

Documentum 6 unveiled

Asbru CMS Adds further Search Engine Optimization

EOS.Web Launches RSS Capability

Search Engine Exalead Joins ACAP Pilot Project

One-stop search solution for law firm

Semantra Secures Second Round of Funding

Teragram Launches MyGADs Enterprise Edition

Silver Creek Systems and Endeca Unveil Datalens Foundry for Endeca IAP

New tool for litigators

Accelerate e-discovery

Roundtable Discussion: E-Discovery

KMWorld recently hosted a roundtable discussion that focused on e-discovery. Led by KMWorld senior writer Judith Lamont, the roundtable included David Cooper, principal with Marin IT, a network integrator and consulting firm; Johannes Scholtes, CEO of ZyLAB; and Barry Murphy, principal at Forrester Research.

Q Lamont: Barry, can you start us off with a brief definition of e-discovery and explain why it is such a hot issue right now?

A Murphy: The best way to think of e-discovery is the process of collecting, preserving, reviewing and producing electronically stored information in response to a regulatory or legal investigation. It’s gained a lot of intensity since last December, when amendments to the Federal Rules of Civil Procedure took effect. Now, the courts are essentially saying that organizations need to treat all types of information—including e-mail, documents and structured data—as corporate records. Companies need to know where it all is and get to it in a cost-effective way. The heat is on for organizations to start managing their information proactively, but they are just not prepared to do this.

Q Lamont: Why are organizations so unprepared for e-discovery?

A Murphy: First, you’ve got huge volumes of information, and second, most of it is in unmanaged repositories. Only about five percent of information is in managed repositories where the users proactively place it there, tag it with metadata and put it into some kind of classification or taxonomy. The rest of it is "out there" in e-mail, network file servers, desktops and removable media. So the fact that organizations have let information get away from them has created a dire situation where they’re simply not ready, and they’re not going to be ready, in the next year or two, to get their arms around this.

Q Lamont: What do they need to do in order to be more prepared?

A Murphy: Realistically, organizations are going to need a three to five year strategy, and they also need to understand that there is not a silver bullet to fix the situation. They need to develop processes and a supporting technology system. Organizations on the receiving side of this information—whether an agency, private company or the litigation support companies working with them—face similar problems in that they need to organize, search, review and understand the materials. Usually the volume of information is very large, and considerable scrutiny is required to identify key details of interest.

Q Lamont: David, your organization is involved in e-discovery and uses ZyLAB’s eDiscovery Suite. What is it that you do for your clients?

A Cooper: Several years ago, I was approached to put together a system to support a major antitrust litigation case focused on allegations of price fixing. Our client is the plaintiff, and we are receiving large volumes of information—millions of pages—from the defendant. About 54 law firms across the United States are involved in the case. Our client needed a centrally hosted system that allowed everyone to search, tag, retrieve and review the data. We thought ZyLAB was the perfect way to do that. For one thing, we get data in many different formats, including Word documents, PDF files and paper, which ZyLAB can easily handle. And we use ZyLAB’s Web interface, so we don’t have to maintain client software for the users.

Q Lamont: Jan, what prompted ZyLAB to become involved in e-discovery?

A Scholtes: We’ve always been very much involved in what’s now called e-discovery or e-disclosure. Our first customers, back in 1983, were actually lawyers. The initial development of the protocols in our products was heavily funded by the FBI and several law firms. These organizations had a lot of data and a lot of different file formats, and needed a way to organize and search their information. Over the years, we have added records management functionality to do automatic retention and to deploy filing plans, but the searching and gathering of unstructured information, whether it’s paper, e-mail or electronic files from hard disks, is where we have our roots. Once everything is searchable, it’s so much easier to organize the data and then prepare it for e-discovery and legal production. So we cover both sides of e-discovery: the organizations that are responding to requests, and those that need to ingest, organize and interpret the "discovered" content they receive.

Q Lamont: What is the typical scenario for ZyLAB?

A Scholtes: A lot of our projects are typically what Barry described, where an organization is unprepared and facing a court order, or there is an internal investigation. Suddenly they have several hundred gigabytes of data that must be analyzed and searched, and shared with other parties. We have helped a lot of organizations, both corporations and public organizations, handle these very complex cases. And as David described, we also help process discovery information that is received by an organization, which often has to happen in a short time span.

Q Lamont: If an organization has had to scramble to produce information, what happens after the crisis is over?

A Scholtes: What we’ve seen is that if a company or organization has been through a number of these incidents, the departments typically start wanting a long-term solution. It might be the corporate legal department or the director of litigation support, or if you’re talking about the SEC or the FBI, then it’s the investigators. They start using ZyLAB technology as a standard tool for investigation and analysis, gathering of information, production, disclosure and so forth.

Q Lamont: Barry, what do you think a company should do to get on track?

A Murphy: I recommend starting with the worst pain point. If it’s e-mail, investigate the archiving tools. Or if it’s the file system, look at indexing tools to help quickly find and produce materials for e-discovery. Later, organizations can get out in front of the curve with a longer-term strategy, and also get their records management to include types of content beyond their repository of official records. Of course, having the right policies and procedures in place is important. To do this, companies need to formalize the relationship between IT and legal, so that the policies set up for records management are ones that the lawyers feel comfortable arguing in court.

Q Lamont: David, when Marin IT deployed ZyLAB, how did things go? Was it a smooth process?

A Cooper: The ZyLAB part worked very well. We received a lot of the discovery information in hard copy and needed to scan it. The OCR engine worked well, that was all great. The most challenging thing was getting the defendants, the people who were bringing the documents to us, to produce the document with accurate load files. The load file should indicate where the document begins and ends. We needed to work with the people who brought us the documents in order to educate them about how to produce the information.
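Editor's illustration: the load-file problem Cooper describes can be checked mechanically before documents are loaded. The sketch below assumes a hypothetical CSV load file with the columns doc_id, begin_page and end_page; real load-file formats differ by vendor, and nothing here reflects ZyLAB's own format or tooling. It flags documents whose boundaries are reversed, overlap the previous document, or leave a gap in the page sequence.

```python
import csv
import io


def check_load_file(text):
    """Return a list of boundary problems found in a load file.

    Assumed (hypothetical) layout: CSV with the columns
    doc_id, begin_page, end_page, listed in page order.
    """
    problems = []
    prev_end = 0  # last page number covered so far
    for row in csv.DictReader(io.StringIO(text)):
        begin, end = int(row["begin_page"]), int(row["end_page"])
        if end < begin:
            problems.append(f"{row['doc_id']}: ends before it begins")
        if begin <= prev_end:
            problems.append(f"{row['doc_id']}: overlaps previous document")
        elif begin > prev_end + 1:
            problems.append(f"{row['doc_id']}: gap after page {prev_end}")
        prev_end = max(prev_end, end)
    return problems


# Made-up sample data for illustration only.
sample = """doc_id,begin_page,end_page
DOC001,1,3
DOC002,4,4
DOC003,4,9
DOC004,12,11
"""

problems = check_load_file(sample)
for problem in problems:
    print(problem)
```

A check of this kind could be run on each delivery so that faulty load files go back to the producing party before anyone starts reviewing the documents.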

Q Lamont: Lawyers are not always known for their early adoption and their love of technology. What was the user response?

A Cooper: That’s all over the map. We’ve got some people who absolutely love it and use it constantly. Some people have clearly been practicing law a lot longer and are more set in their ways, and they’ll ask their admin assistants to perform the searches. They want some of it produced in paper so they can sit at their desk and read, because it’s the way they’ve always done it and they are most comfortable with paper.

Q Lamont: Did the client for this case express any concerns about having a lot of sensitive information hosted outside the organization?

A Cooper: It was definitely a topic of conversation. We have very strong physical security, with locked cabinets, closed-circuit television and biometric security. Then, of course, we also have IT security, with password protection and Windows-based security and https certificates. But it was interesting to see that the majority of the users were more concerned with physical security.

Q Lamont: How important is it for the data to be available in native format?

A Scholtes: Although the lawyers initially want to start working with native file formats, they want to go to a bit-mapped format as soon as data has been produced for third parties. They know there is a lot of hidden information in those native file formats, like comments and tracked changes. Producing in TIFF and unsearchable PDF strips out this information. It’s also easier to redact in these formats and know the redacted information can never be read.

A Cooper: We do keep the original file in its native format, whether it is a TIFF file, Word, Excel or an e-mail PST file. But it’s true, the lawyers don’t request the native file format very often. Most of them are working with the system, asking for export of certain items, tagging and categorizing. If the data is not bit-mapped, ZyLAB stores it in XML format, and that’s what the users are tagging.

Q Lamont: Why did ZyLAB opt to store the data in XML?

A Scholtes: We were first confronted with having to search very large collections of e-mail back in 1997. There are many different e-mail formats, and messages often arrive with attachments. Our users want 100 percent recall, including messages embedded within another message, and the associated attachments. XML can do this, plus it’s very endurable and sustainable, and unlike many of the e-mail formats, you will be able to access XML files in 10 or 20 years. Using XML is part of our philosophy of having an open architecture.

A Cooper: The open architecture was one of the reasons, an important reason, why we opted to use ZyLAB. Having standards like XML that we could work with, we were able to get all the content into a common format.

Q Lamont: What are some of the broader effects of the open architecture?

A Scholtes: Lawyers often want to work with specialized forensic software products or court presentation tools, and they need to integrate these products with their e-discovery system. For example, InData’s Trial Director and CaseSoft’s CaseMap are popular court presentation tools that people often want to integrate, and ZyLAB can do that.

Q Lamont: Barry, considering how many software products there are for different aspects of e-discovery, do you see consolidation in the future and more end-to-end solutions?

A Murphy: I think there are two moves toward the end-to-end solution. One is on the software side—the software vendors are gaining capabilities, either through acquisition or building out their products, to address the full e-discovery process all the way from collection through review. The other is through the services side. At heart, e-discovery is a process that is ripe for outsourcing. Companies will create models whereby they can host not only the data but also the platforms to create an end-to-end solution, even if it’s not available from a single vendor.

Documentum 6 unveiled

EMC has revealed the Documentum 6 enterprise content management (ECM) platform, which, the company says, enables rapid and flexible development, configuration and deployment of next-generation enterprise content applications.

EMC says that key to the Documentum 6 platform is its new services-based API (application programming interface), as well as new development tools that together revolutionize Documentum-based application development and configuration. EMC also announces its $100,000 Developers' Challenge, which encourages development of new applications based on the Documentum 6 platform.

EMC highlights the following new capabilities of the Documentum 6 platform:

Documentum Enterprise Content Services--a new, Web services-based API that simplifies development and integration with ready-to-use enterprise content services for easy integration with other enterprise applications within a service-oriented architecture (SOA).

Documentum Composer--provides a standards-based platform for development and configuration tools that reduces the need for coding and facilitates composition of applications with reusable elements, making the design, customization, deployment and maintenance of content applications faster and easier.

Documentum Branch Office Caching Services--enables robust global scalability for enterprises with high-performance, remote application requirements by enabling all operations (read, create, edit, version, search) to occur locally to a user, regardless of location.

Asbru CMS Adds further Search Engine Optimization

Asbru Software has released a version of the Asbru Web Content Management system designed to provide control and simplicity for non-technical users while maintaining flexibility and power for developers and web designers. Version 6.5 of the Asbru Web Content Management system adds Google Sitemap functionality, a number of new add-ons, and a new user. The Google Sitemap solution is integrated with the system, so when a website administrator changes the structure of their website through the system’s drag-and-drop user interface, the new website structure is automatically published to a sitemap XML file adhering to the Sitemap protocol supported by Google, Microsoft, and Yahoo!
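Editor's illustration: a sitemap file of the kind described above is a small XML document listing each page URL under the Sitemap protocol namespace. The sketch below shows the general shape using Python's standard library. It is not Asbru's implementation, and the function name and the example.com URLs are made up for the example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemap protocol (sitemaps.org), version 0.9.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML as a string for a list of (url, lastmod) pairs.

    lastmod is an optional date string (YYYY-MM-DD); pass None to omit it.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical site structure for illustration only.
pages = [
    ("http://www.example.com/", "2007-09-05"),
    ("http://www.example.com/products/", None),
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

A content management system would regenerate this file whenever the page tree changes, which is the behavior the announcement describes.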

Other features incorporated into Asbru Web Content Management version 6.5 include: The Multi-level Menu Generator enables non-technical users to design their own horizontal or vertical multi-level navigation menus without any knowledge of HTML or CSS; The new CSS Template Generator enables non-technical users to add CSS-based templates without any knowledge of HTML or CSS; The photo gallery add-on presents photos in a lightbox slideshow; The Pluggable Calendar can be used for display of any standard type of content such as news, events and blog entries and for any type of custom data containing dates. Asbru Web Content Management 6.5 includes functionality that is designed to let web hosting companies and web developers build industry solutions that can be added by non-technical website managers.

(www.asbrusoft.com)

EOS.Web Launches RSS Capability

EOS International has announced enhancements in the EOS.Web quarterly software release. EOS.Web now enables users to receive updated content such as new/updated title lists or new/changed Reference Tracking Questions. These RSS "Web feeds" include an outline of information from the title and links to the full text. This feature is designed for EOS.Web users to stay current with their Web sites instinctively, allowing easier access to changing Web content. Also included with the release are searching enhancements that include displayed highlighted keywords in records within one’s search inquiry. 508 Compliance has been designed to ensure content accessibility and IP Authentication.

(www.eosintl.com)

Search Engine Exalead Joins ACAP Pilot Project

The Automated Content Access Protocol (ACAP) Pilot Project, established to enable owners of content to speed machine-interpretable permissions, has announced that Exalead, a provider of search software for business and the Web, has joined the initiative and will provide concept validation by November 2007.

Exalead offers search technology that adapts to user habits in search queries. Exalead has also recently expanded its vertical search offerings to allow users to search their favorite images and videos, as well as the popular Wikipedia database.

(www.exalead.com, www.the-acap.org, www.wikipedia.org)

One-stop search solution for law firm

A law firm with more than 125 lawyers at offices in Jersey and Guernsey in the Channel Islands, as well as in London, can now search across databases, file systems and Web sites via its intranet. Carey Olsen has deployed ISYS:web 8 from ISYS Search Software for its search features and its ability to connect with Interwoven WorkSite.

The new system is said to enhance Carey Olsen's ability to serve its clients, which include governments, financial institutions, corporations, professional firms, private individuals and more, by providing a common interface to search all bodies of information stored by the firm.

"We see the introduction of ISYS:web as a major step toward having a one-stop solution for accessing knowledge," says Stuart Bush, IT director of Carey Olsen. "We've used ISYS for years and think highly of the pedigree of its search technology. Its ability to connect with Interwoven makes it an even more powerful tool and is what led us to upgrade to Version 8."

According to a recent press release from ISYS, Carey Olsen uses ISYS' on-the-fly categorization feature—which automatically builds relevant categories with each results set—and connects it with Interwoven metadata. That generates meaningful categories with each search that users recognize from their work with [Interwoven] FileSite.

Semantra Secures Second Round of Funding

Semantra, an enterprise search company, has secured $3.8 million in equity funding from current investor CPMG, Inc. The investment will be used for product development and building out Semantra’s go-to-market partnerships. CPMG, Inc. invested $2.3 million in September 2006, bringing its total investment in Semantra to over $6 million. Semantra’s software solution allows users worldwide to retrieve data from their corporate databases by entering inquiries in a familiar search box, using common, everyday language. The solution combines business intelligence and enterprise search.

(www.semantra.com)

Teragram Launches MyGADs Enterprise Edition

Teragram, a provider of multilingual natural language processing technologies, has announced the launch of MyGADs Enterprise Edition, a collaborative technology that allows corporate users to work together on a searchable wiki to share information among groups and retrieve data through multiple formats. Additionally, Teragram's Direct Answers technology allows members of the group to search the stored information, returning the answer a user is seeking. This search and retrieve capability works not only from an Internet browser, but also from mobile devices such as cell phones and PDAs, as well as through common IM services. A group of users can create a GAD page using a simple Web editor within MyGADs.com, and access, edit, update, search for and share documents through Internet browsers, smartphones, mobile PDAs, J2ME-enabled devices and cell phones.

(www.teragram.com/info)

Silver Creek Systems and Endeca Unveil Datalens Foundry for Endeca IAP

Silver Creek Systems, an enterprise product data integration solutions company, and Endeca Technologies, Inc., an enterprise information access software company, have announced the immediate availability of DataLens Foundry for Endeca, a new software module for the Endeca Information Access Platform (IAP) that automatically extracts and standardizes all product data characteristics from one or multiple sources for use in information access applications. The companies also announced a new reseller agreement that will give Endeca the ability to sell the DataLens Foundry directly to new and existing customers. The DataLens Foundry is immediately available for Endeca customers and prospects.

(www.endeca.com)

New tool for litigators

Thomson West has announced Case Evaluator on Westlaw, a new research feature that returns only the desired results unique to each query. West is a business within Thomson.

Case Evaluator enables litigators to build a fact base representing the outcomes of comparable cases--including verdicts, awards, relevant medical and expert witness data, and other criteria--and then generate Westlaw reports that can help sharpen legal strategies for the matter at hand.

Thomson West further explains that Case Evaluator allows litigators to:

evaluate potential cases;
view verdict and settlement trends;
examine trial court motions and memoranda;
review appellate briefs, petitions and decisions;
use medical resources and illustrations; and
view analysis of experts included in verdicts and settlements, as well as excerpts of their testimony.

Accelerate e-discovery

Recommind has introduced Axcelerate eDiscovery, which the company says offers a number of industry firsts, including First-Pass Review of an entire document collection, One-Click Coding of documents and a number of sophisticated tools that significantly expedite the review process, improve attorney efficiency and lower e-discovery costs for clients.

The company explains that by automatically organizing even multi-terabyte document sets by myriad parameters including responsiveness, issue, privilege and concept group, Axcelerate eDiscovery's First-Pass Review makes review organization and document batching simpler, more accurate and quicker than ever before. Simultaneously, Recommind says, Axcelerate's One-Click Coding feature makes a computer-generated judgment--with explicit confidence score--about each document's relevance, responsiveness and privileged nature, significantly speeding the actual review process while concurrently improving accuracy and lowering the risk of missing key documents.

Axcelerate eDiscovery further offers law firms and enterprises more accurate and extensive document culling and filtering of virtually all document types, including both structured and unstructured data, Recommind says. The solution automatically filters duplicates and near-duplicates between and across custodians and parties, and offers contextual e-mail thread analysis, while also applying more than 17 additional filters to a document collection. The product's early case assessment tools then automatically extract key documents, people, phrases and concepts of interest, helping attorneys and paralegals quickly understand the contents of a document collection before the review process has even begun.
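Editor's illustration: duplicate and near-duplicate filtering across custodians can be sketched with two generic techniques, a content hash for exact copies and word-shingle overlap (Jaccard similarity) for near copies. This is not Recommind's method, which the announcement does not describe. The function names, the 0.8 threshold and the sample documents are all assumptions made for the example.

```python
import hashlib


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def shingles(text, k=3):
    """Return the set of k-word sequences in the normalized text."""
    words = normalize(text).split()
    return {tuple(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}


def jaccard(a, b):
    """Share of shingles the two sets have in common (0.0 to 1.0)."""
    return len(a & b) / len(a | b) if a | b else 1.0


def filter_duplicates(docs, threshold=0.8):
    """docs is a list of (doc_id, text) pairs.

    Returns (kept_ids, dropped), where dropped maps each removed doc_id
    to (id of the kept document it matched, "exact" or "near").
    """
    kept = []  # entries are (doc_id, digest, shingle set)
    dropped = {}
    for doc_id, text in docs:
        digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
        sh = shingles(text)
        match = None
        for kept_id, kept_digest, kept_sh in kept:
            if digest == kept_digest:
                match = (kept_id, "exact")
                break
            if jaccard(sh, kept_sh) >= threshold:
                match = (kept_id, "near")
                break
        if match:
            dropped[doc_id] = match
        else:
            kept.append((doc_id, digest, sh))
    return [entry[0] for entry in kept], dropped


# Made-up documents from three hypothetical custodians.
docs = [
    ("smith-001", "Please review the attached pricing schedule before Friday."),
    ("jones-014", "Please review  the attached pricing schedule before FRIDAY."),
    ("lee-007", "Please review the attached pricing schedule before Friday. Thanks"),
    ("lee-008", "Lunch on Thursday?"),
]
kept_ids, dropped = filter_duplicates(docs)
print(kept_ids)
print(dropped)
```

The pairwise comparison here is quadratic in the number of documents; a multi-terabyte collection would need an indexed approach, which is beyond this sketch.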
