Follow me on TwitterMy Tweets
Idealistic musings about eDiscovery
Boy, I wish I could write like Craig Ball does.
I have written many articles and blog posts on technology-assisted review, but all my thousands of words cannot communicate my beliefs on the subject as gracefully, powerfully, and concisely as Craig recently put it:
Indeed, there is some cause to believe that the best trained reviewers on the best managed review teams get very close to the performance of technology-assisted review. …
But so what? Even if you are that good, you can only achieve the same result by reviewing all of the documents in the collection, instead of the 2%-5% of the collection needed to be reviewed using predictive coding. Thus, even the most inept, ill-managed reviewers cost more than predictive coding; and the best trained and best managed reviewers cost much more than predictive coding. If human review isn’t better (and it appears to generally be far worse) and predictive coding costs much less and takes less time, where’s the rational argument for human review?
So, um … yeah, what he said.
In my absence from the blawgosphere, many other commentators have described the metrics of technology-assisted review, or predictive coding, or whatever-we’re-calling-it-today much more eloquently than I could. However, the two primary metrics of TAR, Precision and Recall, still give lots of legal professionals fits as they try to apply it to their own sampling and testing iterations. So, for those of you still struggling with application of these concepts, here’s my explanation of these metrics (with some inspiration from a co-worker) in plain English:
Let’s imagine that we have a regulation deck of 52 playing cards. We want to locate all of the spade cards as quickly and cheaply as possible. So, we instruct our computer algorithms to:
With this information, our predictive computer correctly identifies five of the 13 spades in the deck, and correctly identifies all 39 non-spade cards. Because the computer predicted correctly 44 out of 52 times, or with 84.6 percent accuracy, we should be thrilled, right?
Uh … no.
Even though the computer’s predictions were almost 85 percent accurate across the entire deck, we asked the computer to identify the spade cards. Our computer correctly identified five spades, which means that the computer predicted spades with 100 percent Precision. (If the computer had “identified” six spades but one of them had actually been a club, for example, the Precision score would have dropped to 83.3 percent.)
However, look at the bigger picture. The computer identified only five of the 13 spades in the deck, leaving eight spades unaccounted for. This means that the computer’s Recall score — the percentage of documents correctly identified out of all the appropriate documents available – is a pathetic 38.5 percent.
Our 84.6 percent accuracy score won’t help us in front of the judge, and neither will our 100 percent Precision score by itself. The Recall score of 38.5 percent is a failing grade by anyone’s metric.
But let’s turn this example around. Remember, we also asked the computer to identify all NON-spades in the deck, which it did correctly 39 out of 39 times. As to non-spade cards, both our Precision and Recall scores are a whopping 100 percent – much better than that semi-fictional “accuracy” score listed above.
Analogizing this to document review, rather than having a human review all 52 cards to locate the spades, or rely on the computer to incompletely identify the spades in the deck, let’s run with our highest-scoring metrics and accept the computer’s predictions as to the non-spade cards. Now, instead of 52 cards, we only have to review 13 of them – a savings in review time (and costs) of 75 percent.
This 52-card example may seem overly simplistic, but multiply it by 10,000 decks of cards all shuffled together and suddenly, this exercise begins to look a lot more like the typical document review project. Technology-assisted review can slash huge amounts of time and expense from a document review, as long as we understand the limits of what it can – and cannot, depending on the circumstances – do for us.