KMWorld recently hosted a roundtable discussion that focused on e-discovery. Led by KMWorld senior writer Judith Lamont, the roundtable included David Cooper, principal with Marin IT, a network integrator and consulting firm; Johannes Scholtes, CEO of ZyLAB; and Barry Murphy, principal at Forrester Research.
Q Lamont: Barry, can you start us off with a brief definition of e-discovery and explain why it is such a hot issue right now?
A Murphy: The best way to think of e-discovery is the process of collecting, preserving, reviewing and producing electronically stored information in response to a regulatory or legal investigation. It’s gained a lot of intensity since last December, when amendments to the Federal Rules of Civil Procedure took effect. Now, the courts are essentially saying that organizations need to treat all types of information—including e-mail, documents and structured data—as corporate records. Companies need to know where it all is and get to it in a cost-effective way. The heat is on for organizations to start managing their information proactively, but they are just not prepared to do this.
Q Lamont: Why are organizations so unprepared for e-discovery?
A Murphy: First, you’ve got huge volumes of information, and second, most of it is in unmanaged repositories. Only about five percent of information is in managed repositories, where users proactively place it, tag it with metadata and put it into some kind of classification or taxonomy. The rest of it is "out there" in e-mail, network file servers, desktops and removable media. So the fact that organizations have let information get away from them has created a dire situation where they’re simply not ready, and they’re not going to be ready in the next year or two, to get their arms around this.
Q Lamont: What do they need to do in order to be more prepared?
A Murphy: Realistically, organizations are going to need a three to five year strategy, and they also need to understand that there is not a silver bullet to fix the situation. They need to develop processes and a supporting technology system. Organizations on the receiving side of this information—whether an agency, private company or the litigation support companies working with them—face similar problems in that they need to organize, search, review and understand the materials. Usually the volume of information is very large, and considerable scrutiny is required to identify key details of interest.
Q Lamont: David, your organization is involved in e-discovery and uses ZyLAB’s eDiscovery Suite. What is it that you do for your clients?
A Cooper: Several years ago, I was approached to put together a system to support a major antitrust litigation case focused on allegations of price fixing. Our client is the plaintiff, and we are receiving large volumes of information—millions of pages—from the defendant. About 54 law firms across the United States are involved in the case. Our client needed a centrally hosted system that allowed everyone to search, tag, retrieve and review the data. We thought ZyLAB was the perfect way to do that. For one thing, we get data in many different formats, including Word documents, PDF files and paper, which ZyLAB can easily handle. And we use ZyLAB’s Web interface, so we don’t have to maintain client software for the users.
Q Lamont: Jan, what prompted ZyLAB to become involved in e-discovery?
A Scholtes: We’ve always been very much involved in what’s now called e-discovery or e-disclosure. Our first customers, back in 1983, were actually lawyers. The initial development of the protocols in our products was heavily funded by the FBI and several law firms. These organizations had a lot of data and a lot of different file formats, and needed a way to organize and search their information. Over the years, we have added records management functionality to do automatic retention and to deploy filing plans, but the searching and gathering of unstructured information, whether it’s paper, e-mail or electronic files from hard disks, is where we have our roots. Once everything is searchable, it’s so much easier to organize the data and then prepare it for e-discovery and legal production. So we cover both sides of e-discovery: the organizations that are responding to requests, and those that need to ingest, organize and interpret the "discovered" content they receive.
Q Lamont: What is the typical scenario for ZyLAB?
A Scholtes: A lot of our projects are typically what Barry described, where an organization is unprepared and facing a court order, or there is an internal investigation. Suddenly they have several hundred gigabytes of data that must be analyzed and searched, and shared with other parties. We have helped a lot of organizations, both corporations and public organizations, handle these very complex cases. And as David described, we also help process discovery information that is received by an organization, which often has to happen in a short time span.
Q Lamont: If an organization has had to scramble to produce information, what happens after the crisis is over?
A Scholtes: What we’ve seen is that if a company or organization has been through a number of these incidents, the departments typically start wanting a long-term solution. It might be the corporate legal department or the director of litigation support, or if you’re talking about the SEC or the FBI, then it’s the investigators. They start using ZyLAB technology as a standard tool for investigation and analysis, gathering of information, production, disclosure and so forth.
Q Lamont: Barry, what do you think a company should do to get on track?
A Murphy: I recommend starting with the worst pain point. If it’s e-mail, investigate the archiving tools. Or if it’s the file system, look at indexing tools to help quickly find and produce materials for e-discovery. Later, organizations can get out in front of the curve with a longer-term strategy, and also get their records management to include types of content beyond their repository of official records. Of course, having the right policies and procedures in place is important. To do this, companies need to formalize the relationship between IT and legal, so that the policies set up for records management are ones that the lawyers feel comfortable arguing in court.
Q Lamont: David, when Marin IT deployed ZyLAB, how did things go? Was it a smooth process?
A Cooper: The ZyLAB part worked very well. We received a lot of the discovery information in hard copy and needed to scan it. The OCR engine worked well; that was all great. The most challenging thing was getting the defendants, the people who were bringing the documents to us, to produce the documents with accurate load files. The load file should indicate where each document begins and ends. We needed to work with the people who brought us the documents in order to educate them about how to produce the information.
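Editor's illustration: Cooper's point about load files marking where each document begins and ends can be sketched in a few lines of Python. The layout below (one row per scanned page, with a "Y" flag on the first page of each document) is a simplified, hypothetical format chosen for illustration; the column names, page IDs and paths are assumptions, not the format used in the case described.

```python
import csv
import io

# Hypothetical load file: one row per scanned page.
# Assumed columns: page_id, image_path, doc_break
# doc_break is "Y" on the first page of a document, empty otherwise.
SAMPLE_LOAD_FILE = """\
PAGE0001,images/PAGE0001.tif,Y
PAGE0002,images/PAGE0002.tif,
PAGE0003,images/PAGE0003.tif,Y
PAGE0004,images/PAGE0004.tif,
PAGE0005,images/PAGE0005.tif,
"""


def document_ranges(load_file_text):
    """Return (first_page_id, last_page_id, page_count) for each document."""
    ranges = []
    current = None  # page IDs of the document currently being assembled
    for row in csv.reader(io.StringIO(load_file_text)):
        if not row:
            continue  # skip blank lines
        page_id = row[0]
        doc_break = row[2].strip().upper()
        if doc_break == "Y" or current is None:
            # A new document starts here; close out the previous one.
            if current:
                ranges.append((current[0], current[-1], len(current)))
            current = [page_id]
        else:
            current.append(page_id)
    if current:
        ranges.append((current[0], current[-1], len(current)))
    return ranges


for first, last, count in document_ranges(SAMPLE_LOAD_FILE):
    print(f"{first} to {last}: {count} page(s)")
```

If the break flags are missing or wrong, every page after the error is assigned to the wrong document, which is why an inaccurate load file is costly for the receiving side.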
Q Lamont: Lawyers are not always known for their early adoption and their love of technology. What was the user response?
A Cooper: That’s all over the map. We’ve got some people who absolutely love it and use it constantly. Some people have clearly been practicing law a lot longer and are more set in their ways, and they’ll ask their admin assistants to perform the searches. They want some of it produced in paper so they can sit at their desk and read, because it’s the way they’ve always done it and they are most comfortable with paper.
Q Lamont: Did the client for this case express any concerns about having a lot of sensitive information hosted outside the organization?
A Cooper: It was definitely a topic of conversation. We have very strong physical security, with locked cabinets, closed-circuit television and biometric security. Then, of course, we also have IT security, with password protection and Windows-based security and https certificates. But it was interesting to see that the majority of the users were more concerned with physical security.
Q Lamont: How important is it for the data to be available in native format?
A Scholtes: Although the lawyers initially want to start working with native file formats, they want to go to a bit-mapped format as soon as data has been produced for third parties. They know there is a lot of hidden information in those native file formats, like comments and tracked changes. Producing in TIFF and unsearchable PDF strips out this information. It’s also easier to redact in these formats and know the redacted information can never be read.
A Cooper: We do keep the original file in its native format, whether it is a TIFF file, Word, Excel or an e-mail PST file. But it’s true, the lawyers don’t request the native file format very often. Most of them are working with the system, asking for export of certain items, tagging and categorizing. If the data is not bit-mapped, ZyLAB stores it in XML format, and that’s what the users are tagging.
Q Lamont: Why did ZyLAB opt to store the data in XML?
A Scholtes: We were first confronted with having to search very large collections of e-mail back in 1997. There are many different e-mail formats, and messages often arrive with attachments. Our users want 100 percent recall, including messages embedded within another message, and the associated attachments. XML can do this, plus it’s very durable and sustainable, and unlike many of the e-mail formats, you will be able to access XML files in 10 or 20 years. Using XML is part of our philosophy of having an open architecture.
A Cooper: The open architecture was one of the reasons, an important reason, why we opted to use ZyLAB. Having standards like XML that we could work with, we were able to get all the content into a common format.
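Editor's illustration: the idea of keeping embedded messages and their attachments searchable in XML can be sketched with Python's standard `xml.etree.ElementTree` module. The element and attribute names below (`message`, `subject`, `body`, `attachment`) are hypothetical and are not ZyLAB's actual schema; the sketch only shows why a nested format makes it straightforward to reach a message embedded inside another message.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names for illustration; not ZyLAB's schema.
SAMPLE = """
<message id="m1">
  <subject>Pricing update</subject>
  <body>See the forwarded note.</body>
  <attachment name="prices.xls"/>
  <message id="m2">
    <subject>FW: Q3 pricing</subject>
    <body>Original note text.</body>
    <attachment name="memo.doc"/>
  </message>
</message>
"""


def search_messages(xml_text, term):
    """Return ids of messages, at any nesting depth, mentioning term."""
    root = ET.fromstring(xml_text)
    hits = []
    # iter() visits the root and every nested <message> element.
    for msg in root.iter("message"):
        subject = msg.findtext("subject") or ""
        body = msg.findtext("body") or ""
        if term.lower() in f"{subject} {body}".lower():
            hits.append(msg.get("id"))
    return hits


def list_attachments(xml_text):
    """Return (message_id, attachment_name) pairs for every message."""
    root = ET.fromstring(xml_text)
    return [
        (msg.get("id"), att.get("name"))
        for msg in root.iter("message")
        for att in msg.findall("attachment")  # direct children only
    ]


print(search_messages(SAMPLE, "pricing"))
print(list_attachments(SAMPLE))
```

Because the embedded message is an ordinary child element, a single traversal covers the outer message, the inner message and both attachments without format-specific handling.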
Q Lamont: What are some of the broader effects of the open architecture?
A Scholtes: Lawyers often want to work with specialized forensic software products or court presentation tools, and they need to integrate these products with their e-discovery system. For example, InData’s Trial Director and CaseSoft’s CaseMap are popular court presentation tools that people often want to integrate, and ZyLAB can do that.
Q Lamont: Barry, considering how many software products there are for different aspects of e-discovery, do you see consolidation in the future and more end-to-end solutions?
A Murphy: I think there are two moves toward the end-to-end solution. One is on the software side—the software vendors are gaining capabilities, either through acquisition or building out their products, to address the full e-discovery process all the way from collection through review. The other is through the services side. At heart, e-discovery is a process that is ripe for outsourcing. Companies will create models whereby they can host not only the data but also the platforms to create an end-to-end solution, even if it’s not available from a single vendor.