Part of the Solution

Idealistic musings about eDiscovery

Category Archives: Education

eDiscovery: The Highly Unscientific Poll Continues

With all the views the eDiscovery acceptance poll in my last post has received, only eight votes (and one comment) have been counted. Since people can vote multiple times, I suspect only two or three people have offered their opinions. (I don’t expect a statistically-valid data set, but c’mon …)

I’m posting this one more time; please share your thoughts. Meanwhile, I’ve got an essay in the works on Ralph Losey’s magnum opus on Predictive Coding 3.0. (I’m reading all 15,000 words so you don’t have to!)

Why Is eDiscovery So Spooky?

With Halloween around the corner, let’s try something different. Here’s a little poll, in which I ask why you think attorneys (as a whole) have been reluctant to embrace eDiscovery.

You may choose more than one option (and the more cynical of you may choose to select all of them), but I especially welcome your comments explaining what you think are the root causes of the profession’s ambivalence (including – especially – if you think adoption and acceptance are proceeding at exactly the right pace).

I’ll use the highly-unscientific results as the basis for a future post. Thanks for taking the time to participate!

RIP, Safe Harbor

“Decision 2000/520 is invalid.”

Those few words were all it took for the European Union Court of Justice (ECJ) to shoot down the “Safe Harbor” data agreement between the US and EU on October 6. To paraphrase the prolix paragraph that preceded those four words, the ECJ ruled that the Safe Harbor agreement notwithstanding, each EU nation still retained its power to review claims of personal breaches of data privacy rights; thus, the agreement has no effect.

From Hogan Lovells:

Safe Harbor was jointly devised by the European Commission and the U.S. Department of Commerce as a framework that would allow US-based organisations [sic] to overcome the restrictions on transfers of personal data from the EU.  Following a dispute between Austrian law student Max Schrems and the Irish Data Protection Commissioner, the [ECJ] was asked to consider whether a data protection supervisory authority was bound by the European Commission’s decision that Safe Harbor provided an adequate level of protection for European data.

Eric Levy summarized the fact situation nicely:

Schrems, an Austrian citizen and a Facebook user since 2008, alleged that Facebook should not be allowed to transfer the personal information of its subscribers from its Irish servers to servers in the US. In the light of revelations made in 2013 by Edward Snowden concerning the activities of United States intelligence services like the NSA, Schrems contended that the law and practices of the United States, including Safe Harbor, offered no real protection against surveillance by the United States of personal data transferred to that country. On October 6, 2015 the ECJ agreed with him.

The New York Times has also written a tight, more complete version of the back story.

According to Hogan Lovells, supra, the death of Safe Harbor means:

  • Transfers of personal data from the EU to the US currently covered by Safe Harbor will be unlawful unless they are suitably authorized by data protection authorities or fit within one of the legal exemptions.
  • Multinationals relying on Safe Harbor as an intra-group compliance tool to legitimize data transfers from EU subsidiaries to their US parent company or other US-based entities within their corporate group will need to implement an alternative mechanism.
  • US-based service providers certified under Safe Harbor to receive data from European customers will need to provide alternative guarantees for those customers to be able to engage their services lawfully.

So, instead of a single EU-wide privacy benchmark to apply when companies send foreign citizens’ personal data back to the US, each EU country can now apply its own standards for data privacy. This is likely to mean that some EU countries will suspend transfer of their citizens’ data to the US altogether.

During discovery, US judges had already shown a rather dismissive attitude toward foreign data privacy rights, so long as that data might prove discoverable in the US court. “I don’t care how hard it might be for you to get that data,” some judges had said, “that’s not my problem. It’s your case, and your data, so do it or face sanctions.” Huron Consulting had summarized:

Thus, U.S. courts where a lawsuit is filed and where the parties have appeared are likely to enforce U.S. rules of procedure regarding requests for discovery of information housed overseas, yet the countries where the information is housed may sanction parties who produce information protected by the privacy rules or without complying with the Hague Convention.

That was the best-case scenario under Safe Harbor. Now, the 28 EU nations previously bound by the agreement are free to apply their own data privacy rules to information housed in computers within their borders.

There is no “effective date” specified in the ECJ’s ruling, implying that Safe Harbor is dead as of now. However, Norton Rose Fulbright suggested prior to the ruling that panic is unnecessary:

If the ECJ finds that [Member State Data Protection Authorities (DPAs)] have the authority to make their own determinations as to whether certain types of transfers under the Safe Harbor are valid, there would be no immediate legal effect on the legality of transfers relying on the Safe Harbor. The Irish proceedings that gave rise to Schrems would continue, and other complaints would likely be filed to seek review by the Irish and other DPAs. While these proceedings could ultimately lead to data transfers being found invalid, this process would take months or years. Meanwhile, the European Commission would have more time to reach a new Safe Harbor agreement with the US, offering the DPAs an opportunity to find that the enhanced framework addresses their concerns.

If you have pending litigation involving electronic data that you thought your clients produced in compliance with their Safe Harbor certification, do your own research and reconsider your collection and production strategies in light of the meager guidance provided by the ECJ and in the references quoted here.

This is gonna get interesting.

Information Governance vs. eDiscovery

I have a new post up on Greg Buckles’ eDJ Blog on the intersection of information governance and eDiscovery.

Thanks, Greg, for the forum!

Why Hasn’t TAR Caught On? Look In The Mirror.

Oh, this is good. If you haven’t already signed up for the ALM Network (it’s free, as is most of their content), it’s worth doing so just to read this post (first of a two-part series) from Geoffrey Vance on Legaltech News. It pins the failure of acceptance of technology-assisted review (TAR) right where it belongs: on attorneys who refuse to get with the program.

As I headed home, I asked myself, how is it—in a world in which we rely on predictive technology to book our travel plans, decide which songs to download and even determine who might be the most compatible on a date—that most legal professionals do not use predictive technology in our everyday client-serving lives?

I’ve been to dozens of panel discussions and CLE events specifically focused on using technology to assist and improve the discovery and litigation processes.  How can it possibly be—after what must be millions of hours of talk, including discussions about a next generation of TAR—that we haven’t really even walked the first-generation TAR walk?

Geoffrey asks why attorneys won’t get with the program. In a comment to the post, John Tredennick of Catalyst lays out the somewhat embarrassing answer:

Aside from the fact that it is new (which is tough for our profession), there is the point that TAR 2.0 can cut reviews by 90% or more (TAR 1.0 isn’t as effective). That means a lot of billable work goes out the window. The legal industry (lawyers and review companies) live and die by the billable hour. When new technology threatens to reduce review billables by a substantial amount, are we surprised that it isn’t embraced? This technology is driven by the corporate counsel, who are paying the discovery bills. As they catch on, and more systems move toward TAR 2.0 simplicity and flexibility, you will see the practice become standard for every review.

Especially with respect to his last sentence, I hope John is right.

If ESI Isn’t Inaccessible, Better Speak Up

I don’t know if I’m more impressed that the author’s name is “Gary Discovery”, or that the wisdom contained in his note is so cogent, but this author cites a new Pennsylvania case in which the judge presumed ESI to be inaccessible where neither party contended otherwise. In this case, the result was that the costs of production shifted to the requesting party.

The requesting party should submit to the court that the ESI sought is accessible to avoid both a presumption of inaccessibility and the possibility of cost-shifting.  Requesting parties should not leave it up to the producing party to bear the burden of showing that the ESI is inaccessible because the courts are now willing to presume this finding if neither party contends otherwise.

Craig Ball, Predictive Coding, and Wordsmithing

Boy, I wish I could write like Craig Ball does.

I have written many articles and blog posts on technology-assisted review, but all my thousands of words cannot communicate my beliefs on the subject as gracefully, powerfully, and concisely as Craig recently put it:

Indeed, there is some cause to believe that the best trained reviewers on the best managed review teams get very close to the performance of technology-assisted review. …

But so what?  Even if you are that good, you can only achieve the same result by reviewing all of the documents in the collection, instead of the 2%-5% of the collection needed to be reviewed using predictive coding.  Thus, even the most inept, ill-managed reviewers cost more than predictive coding; and the best trained and best managed reviewers cost much more than predictive coding.  If human review isn’t better (and it appears to generally be far worse) and predictive coding costs much less and takes less time, where’s the rational argument for human review?

So, um … yeah, what he said.

Technology-Assisted Review: Precision and Recall, in Plain English


In my absence from the blawgosphere, many other commentators have described the metrics of technology-assisted review, or predictive coding, or whatever-we’re-calling-it-today much more eloquently than I could. However, the two primary metrics of TAR, Precision and Recall, still give lots of legal professionals fits as they try to apply them to their own sampling and testing iterations. So, for those of you still struggling with the application of these concepts, here’s my explanation of these metrics (with some inspiration from a co-worker) in plain English:

Let’s imagine that we have a regulation deck of 52 playing cards. We want to locate all of the spade cards as quickly and cheaply as possible. So, we instruct our computer algorithms to:

  1. identify all spades in the deck; and
  2. identify all non-spades in the deck.

With this information, our predictive computer correctly identifies five of the 13 spades in the deck, and correctly identifies all 39 non-spade cards. Because the computer predicted correctly 44 out of 52 times, or with 84.6 percent accuracy, we should be thrilled, right?

Uh … no.

Even though the computer’s predictions were almost 85 percent accurate across the entire deck, we asked the computer to identify the spade cards. Our computer correctly identified five spades, which means that the computer predicted spades with 100 percent Precision. (If the computer had “identified” six spades but one of them had actually been a club, for example, the Precision score would have dropped to 83.3 percent.)

However, look at the bigger picture. The computer identified only five of the 13 spades in the deck, leaving eight spades unaccounted for. This means that the computer’s Recall score – the percentage of documents correctly identified out of all the appropriate documents available – is a pathetic 38.5 percent.

Our 84.6 percent accuracy score won’t help us in front of the judge, and neither will our 100 percent Precision score by itself. The Recall score of 38.5 percent is a failing grade by anyone’s metric.
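For anyone who would rather see the arithmetic than take my word for it, here is a minimal Python sketch of the card example so far. The function name `score` and its parameter names are my own invention for illustration, and the eight spades the computer failed to find are counted as misses; this is not how any particular review platform computes its metrics.

```python
def score(true_pos, false_pos, false_neg, true_neg):
    """Return (accuracy, precision, recall) as percentages."""
    total = true_pos + false_pos + false_neg + true_neg
    accuracy = 100 * (true_pos + true_neg) / total
    # Precision: of the cards the computer called spades, how many were spades?
    precision = 100 * true_pos / (true_pos + false_pos)
    # Recall: of all the spades in the deck, how many did the computer find?
    recall = 100 * true_pos / (true_pos + false_neg)
    return accuracy, precision, recall

# Card example: 5 spades found, 0 non-spades mislabeled as spades,
# 8 spades missed, 39 non-spades correctly set aside.
accuracy, precision, recall = score(true_pos=5, false_pos=0, false_neg=8, true_neg=39)
print(f"accuracy {accuracy:.1f}%, precision {precision:.1f}%, recall {recall:.1f}%")
# prints: accuracy 84.6%, precision 100.0%, recall 38.5%

# The parenthetical case: six cards called spades, one of them really a club.
_, precision_with_club, _ = score(true_pos=5, false_pos=1, false_neg=8, true_neg=38)
print(f"precision with one club {precision_with_club:.1f}%")
# prints: precision with one club 83.3%
```

The same three ratios apply to documents: swap "spade" for "responsive" and the counts for whatever your sample shows.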

But let’s turn this example around. Remember, we also asked the computer to identify all NON-spades in the deck, which it did correctly 39 out of 39 times. As to non-spade cards, both our Precision and Recall scores are a whopping 100 percent – much better than that semi-fictional “accuracy” score listed above.

Analogizing this to document review, rather than having a human review all 52 cards to locate the spades, or relying on the computer to incompletely identify the spades in the deck, let’s run with our highest-scoring metrics and accept the computer’s predictions as to the non-spade cards. Now, instead of 52 cards, we only have to review 13 of them – a savings in review time (and costs) of 75 percent.
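The savings figure is simple enough to check in a few lines of Python. The helper name `review_savings` is hypothetical, and the sketch assumes, as the example does, that every card the computer set aside really is a non-spade.

```python
def review_savings(total_items, items_set_aside):
    """Return (items still needing human review, percent of review avoided)."""
    remaining = total_items - items_set_aside
    savings_pct = 100 * items_set_aside / total_items
    return remaining, savings_pct

# One deck: 52 cards, 39 non-spades accepted on the computer's say-so.
remaining, savings_pct = review_savings(total_items=52, items_set_aside=39)
print(remaining, f"{savings_pct:.0f}%")
# prints: 13 75%
```

The percentage holds no matter how many decks are shuffled together, provided the computer keeps the same hit rate on non-spades.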

This 52-card example may seem overly simplistic, but multiply it by 10,000 decks of cards all shuffled together and suddenly, this exercise begins to look a lot more like the typical document review project. Technology-assisted review can slash huge amounts of time and expense from a document review, as long as we understand the limits of what it can – and cannot, depending on the circumstances – do for us.

Legal Hold Notifications: Is E-Mail Good Enough?

An e-mail exchange this morning with one of our product managers has got me thinking … and that’s always dangerous.

Most enterprises that issue legal hold notifications to their custodian employees use good ol’ e-mail. The hold notification gets pasted into the body of the e-mail, and off it goes, (theoretically) to the recipient. Perhaps the e-mail was sent with a return receipt requested; and if the recipient is feeling generous, that receipt just might come back to prove that the message was received. As for whether the notification ever gets read? Well, we’ll just have to assume the best, won’t we?

Problem is, this isn’t a very practical solution. First, let’s look at the logistics. E-mails are not only subject to one-click deletion, but (at least in Microsoft Outlook and Exchange) can also be subject to custom routing rules. A user with the appropriate software permissions can easily create a rule to route all e-mails from, say, the Office of the General Counsel directly to the “Deleted Items” folder.

E-mail retention on the enterprise level also tends to be subject to shorter data preservation times than other types of electronic documents. If a custodian is on vacation when they get the e-mail, there is a possibility that the e-mail may not be there for download when the custodian gets back.

Finally, even if the e-mail is delivered in a timely manner and doesn’t get deleted, there’s no guarantee that the custodian will actually read it. And even if they do read it, they have the option (again, in Outlook; I don’t know about other systems) of denying the sender’s request to return a receipt.

The point of all this is that the enterprise remains vulnerable to “plausible deniability”. If a custodian can be shown to have read the hold notice, and they then proceed to violate it by spoliating evidence, the enterprise can likely protect itself from liability by arguing that in violating the hold notice, the custodian was acting outside the course and scope of their employment. Without that proof, the enterprise may remain firmly on the hook.

Now, the content of the hold notice itself is probably privileged from discovery under attorney-client privilege and attorney work product privilege. However, the process of issuing that hold notice, and of obtaining proof of receipt, may not be privileged. In the recent case Cannata v. Wyndham Worldwide Corp., 2011 WL 3495987 (D. Nev. Aug. 10, 2011), the court held that the opposing party was entitled to know “what has actually happened in this case, i.e., when and to whom the litigation hold letter was given, what kinds and categories of ESI were included in defendant’s litigation hold letter, and what specific actions defendant’s employees were instructed to take to that end.” (Emphasis mine. I commend to you Dennis Kiker’s excellent discussion of the Cannata case at his blog.)

At Autonomy, our legal hold software notifies the custodian via e-mail that they have a message awaiting them from the GC’s office, and to click on an enclosed link. The link serves a form from our workflow management engine, completely independent from the e-mail, containing the language of the legal hold notification, and requiring an electronic signature as acknowledgment that the form has been received and read. This ensures that the custodian cannot claim, “I didn’t get the e-mailed notice; and if I did, I deleted it; and if I didn’t delete it, I didn’t read it, etc.,” as a way to shift liability for spoliation back onto the enterprise. To me, this seems a MUCH better and safer practice that is more likely to withstand judicial scrutiny. (This is my opinion, not influenced by anyone at Autonomy, and I firmly and personally believe what I have written here.)

The Zubulake V opinion (229 F.R.D. 422, 433 (S.D.N.Y. 2004)) set the standard quite plainly: “[A] party cannot reasonably be trusted to receive the ‘litigation hold’ instruction once and to fully comply with it without the active supervision of counsel.” So why do so many counsel continue to insist that an e-mail “blast” of hold notices is good enough? Food for thought …