Drawing Aces: Finding the best hand to bet high on machine learning software for legal
Several U.S. law firms have recently announced machine-learning (ML) software initiatives. Others are undoubtedly in the game but more reticent about exploring the capabilities and benefits of ML algorithms. An earlier column in this series discussed the uses of such software and another their mathematics. Here, we consider what a law firm needs to have in place to play in the transformative world of machine learning software. (For law departments the same fundamentals apply but to simplify the writing this column refers to law firms.)
Drawing on poker metaphors throughout this column, we could say that law firms must deal themselves ACES: Algorithm programmers, Champions, Experts, and Sources of data. Let’s explain those mnemonic “cards” and then even play them.
The Cards to Draw
Champion: Your firm needs a partner who is influential and exudes enthusiasm to push the initiative. Ideally the champion will proselytize machine learning and secure funding. The champion ought to be persuasive, eager to learn new computational tools, and able to convey a vision of how the firm should take advantage of the evolving capabilities of machine learning for legal management.
The champion holds a stronger hand if he or she grasps what machine learning software can do (and what it cannot), and even how it does it. It’s a royal flush if the champion also sits on top of a data trove, the fundamental we discuss next.
Data: Your firm needs to have numbers that will help in management decisions in a format that software can handle. Ideally, the data has been collected in one or more spreadsheets, but database repositories can also contribute. Despite the hype about “big data,” law firms don’t possess such large-scale pools of data. Still, you can actually do useful analyses and, more fruitfully, make predictions with modest amounts of data. For example, with a spreadsheet having details on 50 or more closed cases or matters of a similar type, you fill the straight!
What is also important about legal management data, however, is that it be reasonably “clean.” Clean means it cannot have too many missing values or different styles in cells of the same column. If the fees column has some cells with a dollar sign and other cells without, for example, the software algorithms will probably not perform properly. Or if some cells in the spreadsheet show “NA” when data is not available but others show “—”, you need to clean that. Clean data is also reasonably accurate data (no data-entry error claiming that some associate billed 4,299 hours last year!) and is not pockmarked with bizarre values (called “outliers”).
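To make the cleaning step concrete, here is a minimal Python sketch using only the standard library. The function names, the list of missing-value markers, and the 3,500-hour ceiling are illustrative assumptions, not a standard; a real project would adapt them to the firm's own spreadsheets.

```python
# Hypothetical markers a spreadsheet might use for "not available".
# "\u2014" is the long dash that some cells show instead of "NA".
MISSING_MARKERS = {"", "NA", "N/A", "\u2014", "-"}


def parse_fee(cell):
    """Return the fee as a float, or None when the cell marks missing data."""
    text = str(cell).strip()
    if text in MISSING_MARKERS:
        return None
    # Drop the dollar sign and thousands separators so "$12,500" and "12500"
    # end up in the same numeric style.
    return float(text.replace("$", "").replace(",", ""))


def clean_fees(cells):
    """Normalize every cell of a fees column to a float or None."""
    return [parse_fee(cell) for cell in cells]


def flag_implausible_hours(hours, ceiling=3500):
    """Return annual-hours entries above a ceiling so a person can review them.

    The ceiling is an assumption chosen for illustration only.
    """
    return [h for h in hours if h is not None and h > ceiling]


raw_fees = ["$12,500", "8000", "NA", "\u2014", " $3,250.50 "]
fees = clean_fees(raw_fees)
print(fees)  # [12500.0, 8000.0, None, None, 3250.5]

suspect = flag_implausible_hours([1800, 2100, 4299])
print(suspect)  # [4299]
```

Flagging suspicious values rather than deleting them leaves the judgment call with someone who knows the underlying matters.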
Your firm not only needs a sizeable data set but will also probably want to supplement it. For example, you might combine time-and-billing data with information from your HR system. Useful additions include a variable for lawyers’ years of experience or for the concentration of companies in an industry.
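Supplementing one data set with another usually means matching rows on a shared key. The Python sketch below joins years of experience from an HR export onto billing records; the field names and the sample rows are hypothetical and exist only to show the mechanics.

```python
# Hypothetical rows exported from a time-and-billing system.
matters = [
    {"matter_id": "M-001", "lead_lawyer": "adams", "fees": 42000.0},
    {"matter_id": "M-002", "lead_lawyer": "baker", "fees": 18500.0},
    {"matter_id": "M-003", "lead_lawyer": "chen", "fees": 27750.0},
]

# Hypothetical rows exported from an HR system.
hr_records = [
    {"lawyer": "adams", "years_experience": 14},
    {"lawyer": "baker", "years_experience": 3},
]


def add_experience(matters, hr_records):
    """Attach years of experience to each matter, matching on lawyer name."""
    experience = {row["lawyer"]: row["years_experience"] for row in hr_records}
    enriched = []
    for matter in matters:
        row = dict(matter)  # copy so the original export is left untouched
        # None marks a lawyer missing from the HR export, to be resolved by hand.
        row["years_experience"] = experience.get(matter["lead_lawyer"])
        enriched.append(row)
    return enriched


enriched = add_experience(matters, hr_records)
for row in enriched:
    print(row["matter_id"], row["years_experience"])
```

Keeping unmatched matters in the result, with an empty value, makes gaps between the two systems visible instead of silently dropping rows.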
Subject Matter Expert: Your firm will need at the ML table a lawyer who not only supports the initiative but also qualifies as a “subject matter expert” (SME). This expert can look at the data set and understand the relative importance of pieces of it, what’s missing or odd, and what the firm might learn from it. An SME can translate in-the-trenches reality to the champion and programmer. For instance, looking at a set of information about certain kinds of cases, a subject matter expert could point out that the tenure of the judge (senior, mid-career, newly appointed) seems likely to correlate with the decision. Or an SME might say that the duration of a case is not particularly useful because there are long stretches where neither party takes any action. Even more usefully, an SME could classify matters as successful, unclear, or unsuccessful so that the ML software can tease out patterns and influential variables. This labeling is an important task when you want to classify new matters or have the software figure out the patterns of facts that are more likely to result in success.
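Once an SME has labeled past matters, even a simple tally can test a hunch such as the one about judges' tenure. The Python sketch below counts outcomes by tenure; the labeled rows are invented for illustration, and a tally like this is only a first look, not a substitute for a classification algorithm.

```python
from collections import Counter, defaultdict

# Hypothetical SME-labeled matters: (judge tenure, outcome label).
labeled_matters = [
    ("senior", "successful"),
    ("senior", "successful"),
    ("senior", "unsuccessful"),
    ("mid-career", "successful"),
    ("mid-career", "unclear"),
    ("newly appointed", "unsuccessful"),
    ("newly appointed", "unsuccessful"),
    ("newly appointed", "successful"),
]


def outcome_table(matters):
    """Count outcome labels separately for each level of judge tenure."""
    table = defaultdict(Counter)
    for tenure, outcome in matters:
        table[tenure][outcome] += 1
    return table


def success_rate(table, tenure):
    """Share of matters labeled successful for a tenure, or None if no data."""
    counts = table.get(tenure, Counter())
    total = sum(counts.values())
    return counts["successful"] / total if total else None


table = outcome_table(labeled_matters)
for tenure in table:
    print(tenure, dict(table[tenure]), success_rate(table, tenure))
```

With only a handful of matters per group, differences like these could easily be chance, which is one reason the SME's judgment about what the numbers mean remains essential.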
Programming and IT Support: You can bet that an ardent champion, ample data and an insightful SME aren’t enough. You also need programming, perhaps from a consultant or an employee. Programmers or consultants aren’t cheap, but they are crucial. Also crucial is that any code be work-for-hire, be heavily commented so that someone else can follow the steps and logic, and adhere to the tenets of reproducible research.
Your firm will need to choose software that can carry out ML analyses. Those algorithms exceed the capabilities of Excel, but many other choices exist. This author relies on the open-source R programming language, which has been optimized for statistical analyses and data visualization. Another open-source choice would be Python, and many commercial packages jostle in the market.
You should also deal in your firm’s Chief Information Officer. Partly that is because that person understands the data maintained by the firm and how it can be exported and partly because the person is familiar with programmers and coding efforts.
Playing the Game
OK, even with a powerful champion, fertile data, an SME steeped in the practice, and a knowledgeable programmer, you might as well fold if you can’t bet skillfully, which here means running the project smoothly. Here are a few important techniques to spread understanding, dip your toe in the water, coordinate the effort, market it, and pay for it.
Education: At this early stage of law firms exploring predictive analytics, it is very important to explain what the benefits are and how the firm can achieve those benefits. The domain of data, software, statistics, programming, and algorithms will be mostly unfamiliar within your firm and explanations will be welcome. A partners’ off-site is a good opportunity to raise awareness and attract supporters.
Pilot: As with most change initiatives, your firm should start with a pilot study and learn from it before you roll out a more ambitious project. A practice group that wants to be able to predict results or costs or duration of matters from a subset of its past matters would be a good choice. Or the HR group might apply multiple regression on data to reduce attrition or understand better who makes partner.
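A pilot of this kind might begin with something as plain as a multiple regression. The Python sketch below fits ordinary least squares through the normal equations using only the standard library. The predictors (number of depositions and number of parties), the sample matters, and the function names are all assumptions made for illustration; the durations were generated from a known formula so the fit can be checked, and real matters would be far noisier. In practice a programmer would more likely use an established tool, such as the lm function in R, than hand-written code.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    # Augment the matrix so row operations apply to both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: use the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; cannot fit")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, targets):
    """Fit least squares via the normal equations (X'X) beta = X'y."""
    x = [[1.0] + list(row) for row in rows]  # leading 1.0 is the intercept
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(x, targets)) for i in range(k)]
    return solve(xtx, xty)


def predict(coefs, row):
    """Predicted value for one matter: intercept plus weighted predictors."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


# Synthetic closed matters: (depositions, parties) and duration in months,
# generated as 3 + 1.5 * depositions + 0.5 * parties.
past_matters = [(2, 2), (5, 3), (1, 2), (8, 4), (4, 6), (6, 2)]
durations = [7.0, 12.0, 5.5, 17.0, 12.0, 13.0]

coefs = fit_ols(past_matters, durations)
estimate = predict(coefs, (3, 3))
print([round(c, 3) for c in coefs], round(estimate, 2))
```

Because the sample data follow the formula exactly, the fitted coefficients recover it; with real matters the coefficients would be estimates, and the SME's view on which variables belong in the model matters as much as the arithmetic.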
Client: If you are using data of one or more clients, you don’t have to tell them that you are analyzing their data. Obviously, you may not disclose proprietary data. You might want to wait to share some findings with them until you have useful insights. That said, a client may be able to contribute to your data set information from other law firms handling similar matters, provided the data it supplies has been redacted appropriately. Clients would also probably commend your firm as innovative and progressive. For this reason, the marketing department will want to understand what is going on and suggest how best to make use of the effort and results.
Project Management: If IT, a practice group, HR, Marketing and a champion all have roles in a machine learning initiative, it will likely either bog down or take far too much time and money unless someone coordinates meetings, decisions and timelines.
Funding: Sad, but true: You will need to ante up to find out whether and how your firm can take advantage of machine learning. To continue the metaphor, be patient and match or raise the chips that have been bet.
Machine learning investments will be a bit of an entrepreneurial gamble. The time and money involved are unclear, but the pot (more fees or better firm management decisions) could be rich. Large U.S. companies are investing in artificial intelligence applications of all kinds, so forward-thinking law firms that exploit their legal management data stand to rake in the chips.
Rees Morrison is a principal with Altman Weil. One of his specialties is data analytics for law firms and corporate law departments. Hear more on this topic in a 60-minute Altman Weil webinar, Predictive Analytics: A New Tool for Law Firm Leaders.
This article originally appeared in Law Technology News, February 2017. Copyright 2017. ALM Media Properties, LLC. All rights reserved.