Legal text analytics: some thoughts on where (I think) things stand

Researching the area of artificial intelligence and the law (AI & Law) has currently taken me to the complexities of natural language processing (NLP) applied to legal texts (aka legal text analytics). Trying to understand the extent to which AI can be used to perform automated legal analysis—or, more modestly, to support humans in performing legal analysis—requires (at least) a view of the current possibilities for AI tools to (i) extract information from legal sources (or ‘understand’ them and their relationships), (ii) assess their relevance to a given legal problem and (iii) apply the legal source to provide a legal solution to the problem (or to suggest one for human validation).

Of course, this obviates other issues such as the need for AI to be able to understand the factual situation to formulate the relevant legal problem, to assess or rank different legal solutions where available, or take into account additional aspects such as the likelihood of obtaining a remedy, etc—all of which could be tackled by fields of AI & Law different from legal text analytics. The above also ignores other aspects of ‘understanding’ documents, such as the ability for an algorithm to distinguish factual and legal issues within a legal document (ie a judgment) or to extract basic descriptive information (eg being able to create a citation based on the information in the judgment, or to cluster different types of provisions within a contract or across contracts)—some of which seems to be at hand or soon to be developed on the basis of the recently released Google ‘Document Understanding AI’ tool.

The latest issue of Artificial Intelligence and the Law luckily concentrates on ‘Natural Language Processing for Legal Texts’ and offers some help in trying to understand where things currently stand regarding issues (i) and (ii) above. In this post, I offer some reflections based on my understanding of two of the papers included in the special issue: Nanda et al (2019) and Chalkidis & Kampas (2019). I may have gotten the specific technical details wrong (although I hope not), but I think I got the functional insights.

Establishing relationships between legal sources

One of the problems that legal text analytics is trying to solve concerns establishing relationships between different legal sources—which can be a partial aspect of the need to ‘understand’ them (issue (i) above). This is the main problem discussed in Nanda et al, 'Unsupervised and supervised text similarity systems for automated identification of national implementing measures of European directives' (2019) 27(2) Artificial Intelligence and Law 199-225. In this piece of research, AI is used to establish whether a provision of a national implementing measure (NIM) transposes a specific article of an EU Directive or not. In extremely simplified terms, the researchers train different algorithms to perform text comparison. The researchers work on a closed list of 43 EU Directives and the corresponding Luxembuorgian, Irish and Italian NIMs. The following table plots their results.

Nanda et al (2019: 208, Figure 6).

The table shows that the best AI solution developed by the researchers (the TF-IDF cosine) achieves levels of precision of around 83% for Luxembourg, 77% for Italy and 68% for Ireland. These seem like rather impressive results but a qualitative analysis of their experiment indicates that the significantly better performance for Luxembourgian transposition over Italian or Irish transposition likely results from the fact that Luxembourg tends to largely ‘copy & paste’ EU Directives into national law, whereas the Italian and Irish legislators adopt a more complex approach to the integration of EU rules into their existing legal instruments.

Moreover, it should be noted that the algorithms are working on a very specific issue, as they are only assessing the correspondence between provisions of EU and NIM instruments that were related—that is, they are operating in a closed or walled dataset that does not include NIMs that do not transpose any of the 43 chosen Directives. Once these aspects of the research design are taken into account, there are a number of unanswered questions, such as the precision that the algorithms would have if they had to compare entire NIMs against an open-ended list of EU Directives, or if they were used to screen for transposition rules. While the first issue could probably be answered simply extending the experiment, the second issue would probably require a different type of AI design.

On the whole, my impression after reading this interesting piece of research is that AI is still relatively far from a situation where it can provide reliable answers to the issue of establishing relationships across legal sources, particularly if one thinks of relatively more complex relationships than transposition within the EU context, such as development, modification or repeal of a given set of rules by other (potentially dispersed) rules.

Establishing relationships between legal problems and legal sources

A separate but related issue requires AI to identify legal sources that could be relevant to solve a specific legal problem (issue (ii) above)—that is, the relevant relationship is not across legal sources (as above), but between a legal problem or question and relevant legal sources.

This is covered in part of the literature review included in Chalkidis & Kampas, ‘Deep learning in law: early adaptation and legal word embeddings trained on large corpora‘ (2019) 27(2) Artificial Intelligence and Law 171-198 (see esp 188-194), where they discuss some of the solutions given to the task of the Competition on Legal Information Extraction/Entailment (COLIEE) from 2014 to 2017, which focused ‘on two aspects related to a binary (yes/no) question answering as follows: Phase one of the legal question answering task involves reading a question Q and extract[ing] the legal articles of the Civil Code that are relevant to the question. In phase two the systems should return a yes or no answer if the retrieved articles from phase one entail or not the question Q’.

The paper covers four different attempts at solving the task. It reports that the AI solutions developed to address the two binary questions achieved the following levels of precision: 66.67% (Morimoto et al. (2017)); 63.87% (Kim et al. (2015)); 57.6% (Do et al. (2017)); 53.8% (Nanda et al. (2017)). Once again, these results are rather impressive but some contextualisation may help to assess the extent to which this can be useful in legal practice.

The best AI solution was able to identify relevant provisions that entailed the relevant question 2 out of 3 times. However, the algorithms were once again working on a closed or walled field because they solely had to search for relevant provisions in the Civil Code. One can thus wonder whether algorithms confronted with the entirety of a legal order would be able to reach even close degrees of accuracy.

Some thoughts

Based on the current state of legal text analytics (as far as I can see it), it seems clear that AI is far from being able to perform independent/unsupervised legal analysis and provide automated solutions to legal problems (issue (iii) above) because there are still very significant shortcomings concerning issues of ‘understanding’ natural language legal texts (issue (i)) and adequately relating them to specific legal problems (issue (ii)). That should not be surprising.

However, what also seems clear is that AI is very far from being able to confront the vastness of a legal order and that, much as lawyers themselves, AI tools need to specialise and operate within the narrower boundaries of sub-domains or quite contained legal fields. When that is the case, AI can achieve much higher degrees of precision—see examples of information extraction precision above 90% in Chalkidis & Kampas (2019: 194-196) in projects concerning Chinese credit fraud judgments and Canadian immigration rules.

Therefore, the current state of legal text analytics seems to indicate that AI is (quickly?) reaching a point where algorithms can be used to extract legal information from natural language text sources within a specified legal field (which needs to be established through adequate supervision) in a way that allows it to provide fallible or incomplete lists of potentially relevant rules or materials for a given legal issue. However, this still requires legal experts to complement the relevant searches (to bridge any gaps) and to screen the proposed materials for actual relevance. In that regard, AI does hold the promise of much better results than previous expert systems and information retrieval systems and, where adequately trained, it can support and potentially improve legal research (ie cognitive computing, along the lines developed by Ashley (2017)). However, in my view, there are extremely limited prospects for ‘independent functionality’ of legaltech solutions. I would happily hear arguments to the contrary, though!

New paper: ‘Screening for Cartels’ in Public Procurement: Cheating at Solitaire to Sell Fool’s Gold?

I have uploaded a new paper on SSRN, where I critically assess the bid rigging screening tool published by the UK’s Competition and Markets Authority in 2017. I will be presenting it in a few weeks at the V Annual meeting of the Spanish Academic Network for Competition Law. The abstract is as follows:

Despite growing global interest in the use of algorithmic behavioural screens, big data and machine learning to detect bid rigging in procurement markets, the UK’s Competition and Markets Authority (CMA) was under no obligation to undertake a project in this area, much less to publish a bid-rigging algorithmic screening tool and make it generally available. Yet, in 2017 and under self-imposed pressure, the CMA released ‘Screening for Cartels’ (SfC) as ‘a tool to help procurers screen their tender data for signs of illegal bid-rigging activity’ and has since been trying to raise its profile internationally. There is thus a possibility that the SfC tool is not only used by UK public buyers, but also disseminated and replicated in other jurisdictions seeking to implement ‘tried and tested’ solutions to screen for cartels. This paper argues that such a legal transplant would be undesirable.

In order to substantiate this main claim, and after critically assessing the tool, the paper tracks the origins of the indicators included in the SfC tool to show that its functionality is rather limited as compared with alternative models that were put to the CMA. The paper engages with the SfC tool’s creation process to show how it is the result of poor policy-making based on the material dismissal of the recommendations of the consultants involved in its development, and that this has resulted in the mere illusion that big data and algorithmic screens are being used to detect bid rigging in the UK. The paper also shows that, as a result of the ‘distributed model’ used by the CMA, the algorithms underlying the SfC tool cannot improved through training, the publication of the SfC tool lowers the likelihood of some types of ‘easy to spot cases’ by signalling areas of ‘cartel sophistication’ that can bypass its tests and that, on the whole, the tool is simply not fit for purpose. This situation is detrimental to the public interest because reliance in a defective screening tool can create a false perception of competition for public contracts, and because it leads to immobilism that delays (or prevents) a much-needed engagement with the extant difficulties in developing a suitable algorithmic screen based on proper big data analytics. The paper concludes that competition or procurement authorities willing to adopt the SfC tool would be buying fool’s gold and that the CMA was wrong to cheat at solitaire to expedite the deployment of a faulty tool.

The full citation of the paper is: Sanchez-Graells, Albert, ‘Screening for Cartels’ in Public Procurement: Cheating at Solitaire to Sell Fool’s Gold? (May 3, 2019). Available at SSRN: https://ssrn.com/abstract=3382270

An incomplete overview of (the promises of) GovTech: some thoughts on Engin & Treleaven (2019)

I have just read the interesting paper by Z Engin & P Treleaven, 'Algorithmic Government: Automating Public Services and Supporting Civil Servants in using Data Science Technologies' (2019) 62(3) The Computer Journal 448–460, https://doi.org/10.1093/comjnl/bxy082 (available on open access). The paper offers a very useful, but somehow inaccurate and slightly incomplete, overview of data science automation being deployed by governments world-wide (ie GovTech), including the technologies of artificial intelligence (AI), Internet of Things (IoT), big data, behavioral/predictive analytics, and blockchain. I found their taxonomy of GovTech services particularly thought-provoking.

Source: Engin & Treleaven (2019: 449).

Source: Engin & Treleaven (2019: 449).

In the eyes of a lawyer, the use of the word ‘Government’ to describe all these activities is odd, in particular concerning the category ‘Statutes and Compliance’ (at least on the Statutes part). Moving past that conceptual issue—which reminds us once more of the need for more collaboration between computer scientist and social scientists, including lawyers—the taxonomy still seems difficult to square with an analysis of the use of GovTech for public procurement governance and practice. While some of its aspects could be subsumed as tools to ‘Support Civil Servants’ or under ‘National Public Records’, the transactional aspects of public procurement and the interaction with public contractors seem more difficult to place in this taxonomy (even if the category of ‘National Physical Infrastructure’ is considered). Therefore, either additional categories or more granularity is needed in order to have a more complete view of the type of interactions between technology and public sector activity (broadly defined).

The paper is also very limited regarding LawTech, as it primarily concentrates on online dispute resolution (ODR) mechanisms, which is only a relatively small aspect of the potential impact of data science automation on the practice of law. In that regard, I would recommend reading the (more complex, but very useful) book by K D Ashley, Artificial Intelligence and Legal Analytics. New Tools for Law Practice in the Digital Age (Cambridge, CUP, 2017).

I would thus recommend reading Engin & Treleaven (2019) with an open mind, and using it more as a collection of examples than a closed taxonomy.

Procurement governance and complex technologies: a promising future?

Thanks to the UK’s Procurement Lawyers’ Association (PLA) and in particular Totis Kotsonis, on Wednesday 6 March 2019, I will have the opportunity to present some of my initial thoughts on the potential impact of complex technologies on procurement governance.

In the presentation, I will aim to critically assess the impacts that complex technologies such as blockchain (or smart contracts), artificial intelligence (including big data) and the internet of things could have for public procurement governance and oversight. Taking the main risks of maladministration of the procurement function (corruption, discrimination and inefficiency) on which procurement law is based as the analytical point of departure, the talk will explore the potential improvements of governance that different complex technologies could bring, as well as any new governance risks that they could also generate.

The slides I will use are at the end of this post. Unfortunately, the hyperlinks do not work, so please email me if you are interested in a fully-accessible presentation format (a.sanchez-graells@bristol.ac.uk).

The event is open to non-PLA members. So if you are in London and fancy joining the conversation, please register following the instructions in the PLA’s event page.