A typical Friday night phone call tells you that your client has just identified an additional 80,000 documents that must be considered for disclosure, and the deadline is 48 hours away: that is often the reality of modern disputes. Under the new pilot disclosure rules, which become the new normal from January 2019, parties will have an even greater obligation to trawl their growing data lakes for evidentially relevant material even before particulars of claim come into play.

What electronic devices are you carrying in your handbag or briefcase today? A laptop, a USB stick or two, a mobile phone and a tablet would not be unusual. When faced with a typical data set like this, harried investigators might be tasked with working through as many as three quarters of a million user-generated files – often emails, instant messages and calendar entries – all from a single individual. Data volumes are exploding, and this does not even account for the growth of online data stores that now allow us to retain every document and photo we create. For the busy litigator, insolvency practitioner or forensic accountant faced with a complex investigation, timelines are shorter, costs are being squeezed and matters are more complex.

In the face of a deluge of structured and unstructured data, machine learning is the single most powerful tool available to investigators for extracting evidence within the timescales and budgets required. Lawyers and investigators are rightly curious when its use is suggested. How do we know it works? Can we trust its results? Is it an opaque ‘black box’ that opens us up to unknown unknowns and greater litigation risk? And what about lost billable revenue from having fee earners review documents?

Jan-Mar 2019 Issue

Grant Thornton UK LLP