Aug 13, 2013

Predictive coding and emerging e-discovery tools

The most expensive part of the e-discovery process is the review stage. Technology-assisted review, also known as predictive coding, can help lower costs.

E-discovery – the collection, analysis, preparation, review and production of electronically stored information that may be used in a legal proceeding – is not only complex, but also costly.

The most expensive part of the e-discovery process is the review stage, where every document the organization may have to produce in court or in an investigation must be read. ‘Nearly 75 percent of the e-discovery budget goes toward attorneys reviewing documents,’ says Manfred Gabriel, a principal with KPMG’s forensic technology practice. Increasingly, technology is evolving to meet companies’ endless quest to cut costs. 

In the past, armies of attorneys were needed to sift through documents, but this is no longer the case. Technology-assisted review, also known as predictive coding, uses the expertise of attorneys and machine-learning techniques to automate the prioritization of documents for review, based on how likely they are to be responsive to a particular matter.

Timothy Harkness, a partner with the law firm of Freshfields Bruckhaus Deringer, calls predictive coding ‘perhaps the key development this year’. It uses algorithms in much the same way that Amazon can offer you selections based on what you’ve bought in the past. Attorneys, who are subject matter experts, ‘teach’ the computer what’s relevant. ‘Machine learning gathers data, analyzes it and can make smart decisions about what’s relevant, quickly,’ explains Chris Forstner, head of the strategic discovery and information management group at the law firm of Murphy & McGonigle. ‘The computer does the heavy lifting.’
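The idea of attorneys ‘teaching’ the computer can be illustrated with a small sketch. This is not how any particular vendor’s product works; it is a toy naive Bayes scorer written only to show the workflow described above: attorneys code a seed set, the model scores the remaining documents, and the likeliest responsive documents go to the front of the review queue. The function names, the seed documents and the choice of naive Bayes are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Deliberately crude: lowercase and split on whitespace.
    return text.lower().split()


def train(labeled_docs):
    """Build word counts from (text, is_responsive) pairs coded by attorneys.

    Assumes the seed set contains at least one responsive and one
    non-responsive document.
    """
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, label in labeled_docs:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab


def responsiveness_score(text, model):
    """Return the log-odds that a document is responsive (higher = review sooner)."""
    word_counts, doc_counts, vocab = model
    totals = {label: sum(word_counts[label].values()) for label in (True, False)}
    score = math.log(doc_counts[True] / doc_counts[False])
    for word in tokenize(text):
        if word not in vocab:
            continue  # words never seen in the seed set carry no evidence
        # Laplace (add-one) smoothing avoids zero probabilities.
        p_resp = (word_counts[True][word] + 1) / (totals[True] + len(vocab))
        p_non = (word_counts[False][word] + 1) / (totals[False] + len(vocab))
        score += math.log(p_resp / p_non)
    return score


def prioritize(unreviewed, model):
    """Order unreviewed documents from most to least likely responsive."""
    return sorted(unreviewed, key=lambda d: responsiveness_score(d, model), reverse=True)


# Hypothetical seed set coded by attorneys for an imaginary pricing dispute.
seed = [
    ("merger pricing agreement draft", True),
    ("pricing terms for the merger", True),
    ("lunch menu friday", False),
    ("office party friday lunch", False),
]
model = train(seed)
queue = prioritize(["friday lunch order", "revised merger pricing"], model)
```

In this toy example the document about merger pricing lands ahead of the lunch order in `queue`. In practice the seed set would be far larger, and reviewers would typically check samples of the machine’s decisions rather than accept the ranking on trust.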

By using technology better, corporations can reduce costs and more effectively identify key documents. A number of US courts have endorsed the use of predictive coding, so computer review of documents should be expected to be a part of e-discovery in the future, says Harkness.

In a recent study of Fortune 1000 counsel by FTI Consulting, 57 percent of participants say they believe predictive coding will improve e-discovery and be a mainstream tactic by 2015. According to the study, entitled ‘Advice from counsel’, most respondents express optimism that predictive coding can better automate the document review process and dramatically reduce costs. 

‘New methods that harness advanced technology to conduct large-scale document review with little need for human involvement can save you millions of dollars,’ observes Adam Losey, member of the e-discovery & data management practice at Foley & Lardner.

Harkness recently worked on a case where predictive coding was used. ‘What would have cost $150,000 instead cost $20,000,’ he recalls. ‘Not only was the client happy, but also the associates were happy because they didn’t have to spend weeks and weeks going through duplicative emails and documents.’

Although there is a lot of buzz about predictive coding and its use is growing, the masses still haven’t rushed to jump on the bandwagon. ‘The majority of litigation is still done the old-fashioned way,’ says Losey.

Predictive coding, after all, is a shift away from the traditional ‘eyeballs on documents’ approach and toward relying on a machine for identifying and categorizing documents as either responsive or non-responsive, notes Joel Wuesthoff, director and e-discovery expert in consulting firm Protiviti’s litigation, restructuring and investigations practice. ‘Given the legal duty to perform such identification and analysis as is required to support a representation of a “reasonable inquiry” into a client’s document corpus, many attorneys are reluctant to put their full trust in what they see as an unknown instrument,’ he says. Simply put, old habits die hard.

Challenges and pitfalls

What’s more, a significant number of documents – 20,000 to 50,000, or even more – need to be reviewed before predictive coding becomes a worthwhile option. ‘You need to have a fairly big case for predictive coding to make sense,’ says Losey.

It’s not only case size that makes a difference – predictive coding doesn’t work well in cases where video, audio and pictures come into play, either. ‘They don’t jive with predictive coding,’ Losey says. ‘If you have, for example, a construction or patent litigation with tens of thousands of diagrams and photos, predictive coding may not be much help. It is not a magic button that solves all e-discovery problems.’ 

Nor is predictive coding just about buying the software; a new set of skills is required of the lawyers who use it. Experts say lawyers must be prepared to use a quantitative approach, and should also have an understanding of statistics. ‘The more sophisticated the technology, the more sophisticated the workflows have to be – otherwise it’s garbage in, garbage out,’ says Gabriel.

The value proposition of predictive coding is that it saves money by replacing expensive human reviewers with a mathematically precise and cost-effective technology. But ‘some vendors have seen this as a license to charge rather a lot of money for that technology, as long as it’s less than the cost of human reviewers,’ warns Eddie Sheehy, CEO of Nuix, a provider of information management technologies including e-discovery, electronic investigation and information governance software. Companies should compare pricing before committing to a service.

Despite the challenges, however, predictive coding’s use should only continue to increase. ‘Clients are asking for predictive coding for large-scale litigation and investigations,’ notes Gabriel. ‘If you do it right, you can save a lot of money and mitigate risk.’

Other emerging e-tools

While predictive coding is huge, there are other emerging e-discovery tools. One of the most exciting developments involves coming up with innovative ways to combine existing technology to achieve better results, says Tom Barnett, managing director and e-discovery practice leader for Stroz Friedberg, a digital risk management and investigations firm.

For example, data extraction is a technology that goes beyond simple pattern-matching and provides the ability to make inferences based on a set of rules. ‘So instead of just determining that a given word or group of words exists in a document, data extraction will tell you what kind of document it is – for example, a letter – as well as who wrote it, who received it, what is being discussed, and so on,’ Barnett explains. From that information, various inferences can be made. Does the document discuss a legal issue? Is the document likely to be privileged? ‘Combining data extraction technology with predictive coding holds great promise,’ Barnett adds.
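A rough sketch can make the distinction between pattern-matching and rule-based inference more concrete. The code below is purely illustrative and is not Stroz Friedberg’s method or any product’s API: the field names, the ‘Dear’ heuristic for spotting a letter, the counsel address list and the privilege rule are all hypothetical assumptions.

```python
import re

# Hypothetical list of addresses known to belong to the company's lawyers.
COUNSEL_ADDRESSES = {"counsel@example.com"}


def extract_fields(text):
    """Pull out who wrote a document, who received it and what it is about."""
    fields = {}
    for name in ("From", "To", "Subject"):
        match = re.search(rf"^{name}:[ \t]*(.+)$", text, flags=re.MULTILINE)
        if match:
            fields[name.lower()] = match.group(1).strip()
    # Crude rule: a line starting with a salutation suggests a letter.
    is_letter = re.search(r"^Dear\b", text, flags=re.MULTILINE) is not None
    fields["doc_type"] = "letter" if is_letter else "unknown"
    return fields


def likely_privileged(fields):
    """Infer possible privilege: a lawyer is the author or the recipient."""
    parties = {fields.get("from"), fields.get("to")}
    return bool(parties & COUNSEL_ADDRESSES)


sample = (
    "From: counsel@example.com\n"
    "To: cfo@example.com\n"
    "Subject: Advice on pending claim\n"
    "\n"
    "Dear Pat,\n"
    "Please see my analysis of the claim.\n"
)
fields = extract_fields(sample)
flagged = likely_privileged(fields)
```

The first function only extracts facts from the document; the second applies a rule to those facts to draw an inference. A flag like this would be a prompt for attorney review, not a privilege determination.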

There is also technology that allows for ‘harvesting’ – collecting data remotely from laptops, servers and the like. Email threading, meanwhile, can save a lot of money when reviewing email. ‘With threading, you keep only the last email that has the entire chain of information, instead of a dozen or more individual emails,’ explains Losey. ‘As most cases involve a lot of email, the ability to avoid duplication is very valuable in litigation.’
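The threading idea Losey describes can be sketched in a few lines. This is a simplified illustration, not a description of any real threading tool: it assumes each message already carries a hypothetical `thread_id` and a `sent` timestamp, and that the most recent message in a thread quotes everything before it. Real conversations branch, so actual tools have to do more than this.

```python
from datetime import datetime


def keep_last_in_thread(emails):
    """Keep only the most recent message from each thread."""
    latest = {}
    for email in emails:
        current = latest.get(email["thread_id"])
        if current is None or email["sent"] > current["sent"]:
            latest[email["thread_id"]] = email
    return list(latest.values())


# Hypothetical mailbox: three messages in one chain, one standalone message.
mailbox = [
    {"thread_id": "A", "sent": datetime(2013, 8, 1, 9, 0), "subject": "Contract"},
    {"thread_id": "A", "sent": datetime(2013, 8, 1, 9, 30), "subject": "RE: Contract"},
    {"thread_id": "B", "sent": datetime(2013, 8, 2, 14, 0), "subject": "Invoice"},
    {"thread_id": "A", "sent": datetime(2013, 8, 1, 10, 15), "subject": "RE: RE: Contract"},
]
to_review = keep_last_in_thread(mailbox)
```

Here four messages shrink to two for review: the final reply in the contract chain and the standalone invoice email.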

With early case assessment (ECA), you can get a handle on data even before the discovery process. Legal teams can view potentially damaging documents before devising a discovery plan. ‘For example, you can peek into email before the process starts,’ says Jeff Seymour, principal and national service line leader for the discovery practice at Deloitte. This can be tricky, though. ‘How does it affect operational performance?’ Seymour asks. ‘You don’t want to bring a firm’s email system to its knees. There are also data privacy issues when you’re basically on an early fishing expedition. There are some challenges with ECA.’ 

A computer with quick translation capabilities can give you a rough idea of what is being said, even if it’s in a different language, so that what is essential can be vetted. ‘Even if it’s not exact, it helps cut out some of what you don’t need, which saves money,’ says Harkness.

Forstner offers a last piece of sobering advice: ‘Be careful not to get too caught up in the hype. Work with an expert who has used the technologies before and knows how to run efficient and defensible workflows and processes that produce the right results for you.’

Motorola’s e-discovery journey

Elizabeth Jaworski, director of legal operations for Motorola, gets numerous calls every week from vendors selling e-discovery solutions. ‘They all tell me they can make my life easier and save me a lot of money,’ she says. ‘There are so many companies offering services that it’s daunting.’

Motorola has used clustering and filtering of documents, which have helped tremendously in the company’s efforts to organize its data. ‘We haven’t tracked how much these technologies have saved us but, when you’re organized, there are likely to be savings,’ Jaworski says.

With so many services to choose from, Motorola is in the midst of a project in which it is testing three different approaches on a mock case. One is a rule-based technology, another is technology-assisted review (predictive coding), and the third is solely human review.

‘We are comparing workflows and results,’ Jaworski says. ‘We are not trying to see which technique is best; rather, we are trying to get an understanding of which technology works best for which type of data.’ The goal, she explains, is that after the test project is completed, her team will have the information it needs to help direct its decision making about e-discovery tools. 

‘One size does not fit all in this space,’ she says. ‘You must understand your data and your objectives. For example, do you need to have your eyes on the documents fast? And what are the priorities?’

What’s key, she notes, is that all parties in the process are on the same page. On one occasion Motorola selected a technology because of the vendor’s clustering functionality, ‘but the firm didn’t use it; it never used the bells and whistles,’ she recalls. ‘You need the right project manager.’

Even if figuring out technology is a bit of a process, it’s worth it, insists Jaworski. ‘Nothing is worse than human reviewers,’ she warns. ‘They are costly and have the highest rate of error. Find a way to use technology.’

Sheryl Nance-Nash

Sheryl is a freelance writer whose work has appeared in the New York Times, Forbes.com, ABCNews.com and many others.