© Indic Pacific Legal Research LLP.

For articles published in VISUAL LEGAL ANALYTICA, you may refer to the editorial guidelines for more information.

Integrating the Taxonomies of Law & Coding using Catala

As technology continues to advance at an unprecedented pace, the legal industry has been struggling to keep up. Lawyers and legal scholars have been grappling with how to address the legal and ethical issues arising from the use of emerging technologies such as artificial intelligence and blockchain. One potential solution is to integrate the taxonomies of law and coding.

What exactly does that mean? Well, let's start with a taxonomy. In the context of law and coding, a taxonomy refers to the language and categories used to describe legal concepts and technological concepts, respectively. By integrating these two taxonomies, legal experts can better understand how technological systems function and how they can be regulated, while coders can better understand the legal and ethical implications of their work.

For example, consider the use of artificial intelligence in the legal industry. AI systems can be used to analyze vast amounts of legal data, identify relevant case law, and even assist with legal drafting. However, these systems also raise a host of legal and ethical issues. For example, how can we ensure that AI systems are unbiased and do not perpetuate existing biases in the legal system? How can we ensure that AI systems do not violate privacy rights or other legal protections?

By integrating the taxonomies of law and coding, legal experts can better understand the inner workings of AI systems and identify potential legal and ethical issues. Coders, in turn, can work to develop AI systems that are designed with these issues in mind, ensuring that they are compliant with applicable laws and regulations. Interestingly, there is a paper that addresses this aspect, introducing a programming language for legal documents. It is called Catala.

"Catala: A Programming Language for the Law" is a paper published on arXiv by Denis Merigoux, Nicolas Chataing, and Jonathan Protzenko. This paper discusses how Catala can help bridge the gap between legal and technical experts and provide a framework for creating legal applications that are more efficient and accurate.

The text of the law can be regarded as a set of rules, yet legal professionals and developers face real challenges when they try to turn those rules into a program. Legal language is complex and nuanced, making it difficult for non-experts to understand and apply effectively. Additionally, the lack of standardization in legal language and the variations across jurisdictions can cause further confusion. To address these challenges, the authors introduce Catala, a programming language designed specifically for legal applications. The language is intended to be more accessible to legal professionals while still providing the flexibility and precision needed for technical development. As proposed, it is designed to be human-readable, allowing legal professionals to write rules in a format that is familiar to them. It also includes built-in features such as variables, functions, and logical operators so that more complex legal rules can be written. The authors argue that Catala can help prevent errors and inconsistencies because it makes it natural and easy to express the pattern of a general case followed by exceptions, a pattern that permeates statutory law.

This article offers a critical evaluation of the paper's contribution to artificial intelligence governance.

What is Catala?

The paper "Catala: A Programming Language for the Law" proposes a new domain-specific programming language called Catala, which is specifically designed to handle legal text. The authors argue that traditional programming languages are often unsuitable for dealing with legal concepts and can lead to errors and inconsistencies in legal documents and software applications. Catala aims to address these issues by providing legal professionals with a tool that is easy to use, accurate, and reliable.

The paper begins by outlining the challenges of working with legal text. Legal text is often complex and contains a large amount of jargon and technical language that can be difficult to understand. It is also subject to change over time, which makes it hard to maintain software applications that rely on it. Together, these challenges make it difficult for legal professionals to create accurate and reliable software based on legal text.

The authors then introduce Catala as a solution to these challenges. Catala is designed to be easy to learn and use, with a syntax that is similar to natural language. This makes it easier for legal professionals to write and understand code. The language also includes a number of features that are specifically designed to help reduce errors and improve the accuracy of legal documents.

One of the key features of Catala is its ability to handle legal concepts such as time, events, and obligations. These concepts are often difficult to represent in traditional programming languages, which can lead to errors and inconsistencies in legal documents. Catala provides a number of features, such as natural language expressions, that make it easier to represent legal concepts accurately.

Another important feature of Catala is its ability to handle legal uncertainty. Legal text often contains ambiguity and uncertainty, and traditional programming languages are not well-suited to handling these issues. Catala includes features that allow legal professionals to represent uncertainty in a clear and consistent way.

Figure 1: Courtesy: Catala: A Programming Language for the Law

Before going into the programming language, we have to understand the structure and logic of how laws are written, which is very different from the control flow we use in a programming language (Figure 1). The law, as written, frequently changes the meaning of earlier definitions and then reinterprets them in ways that do not map onto the control flow of conventional programming languages.

The Purpose & Use of Catala

The authors provide several examples of how Catala can be used. One example is the creation of legal chatbots. Chatbots are becoming increasingly popular in the legal industry, as they can help to provide legal advice and support to clients. However, creating chatbots that are accurate and reliable can be a challenge. Catala provides a framework for creating legal chatbots that are accurate, reliable, and easy to use.

Figure 2: Courtesy: Catala: A Programming Language for the Law

An example is Section 121 of the US Internal Revenue Code (Figure 2), which determines how much of the proceeds from the sale of a house is exempt from tax. The general structure is that the law enumerates the conditions under which this exclusion can be applied, but then follows them up with a number of conditions that either modify or nullify the exclusion. The point is that you cannot simply read the first paragraph of the law, which specifies a straightforward dollar amount for the exclusion, and understand the entirety of it. You have to read the entire statute, because paragraphs further down change the meaning of this paragraph and the conditions under which it is applicable.

Figure 3: Courtesy: Catala: A Programming Language for the Law

For example, the next paragraph in Figure 3 modifies this first paragraph in place and says that, under certain conditions, the exclusion can be more than what the first paragraph specified. Further paragraphs go deeper and deeper into corner cases of exclusions and conditions, and this is a very common pattern in the way laws are written. This pattern of stating the general case first and then specifying all the special cases after it is called default logic. The variant that laws most commonly use is known as prioritized default logic, where default values are guarded by conditions and followed by a number of special cases that take precedence over them.
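To make the idea of prioritized default logic more concrete, here is a minimal sketch in Python rather than in Catala. It is not how Catala itself is implemented; the names `Default`, `evaluate`, and `EXCLUSION_LIMIT` are hypothetical, and the joint-return exception is deliberately simplified (the statute attaches further conditions to the higher limit).

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Default:
    """One guarded value: it applies only when `condition(facts)` is true."""
    priority: int                      # a higher number overrides a lower one
    condition: Callable[[dict], bool]  # guard evaluated against the facts
    value: int


def evaluate(defaults: list[Default], facts: dict) -> Optional[int]:
    """Return the value of the highest-priority default whose guard holds.

    Raises ValueError when several applicable defaults share the top
    priority but disagree, because the encoded rules would then conflict.
    """
    applicable = [d for d in defaults if d.condition(facts)]
    if not applicable:
        return None
    top = max(d.priority for d in applicable)
    winners = {d.value for d in applicable if d.priority == top}
    if len(winners) > 1:
        raise ValueError("conflicting rules at the same priority")
    return winners.pop()


# Section 121-style limit: the general case first, then an exception.
# Simplified assumption: the only fact considered is whether a joint
# return is filed.
EXCLUSION_LIMIT = [
    Default(priority=0, condition=lambda f: True, value=250_000),
    Default(priority=1, condition=lambda f: f.get("joint_return", False), value=500_000),
]

print(evaluate(EXCLUSION_LIMIT, {"joint_return": False}))  # 250000
print(evaluate(EXCLUSION_LIMIT, {"joint_return": True}))   # 500000
```

The general rule never has to mention its exceptions: a later paragraph of the statute becomes one more entry in the list with a higher priority, which mirrors how the text of the law itself is organised.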

Let's take a look at the language itself.

The very first thing we have to do is encode the data the statute talks about (refer to Figure 3), before we get to the issue of how anything is computed. In the tax law example in Figure 3, we have to encode time periods with starting and ending dates, money in the form of the gain from the sale of the residence, and certain conditions. At this point these are all just declarations (refer to Figure 4).

Figure 4: Courtesy: Catala: A Programming Language for the Law

This looks very verbose but that was an explicit design choice. Why?

The syntax in Figure 4 was designed in close collaboration with lawyers and they preferred more verbose keywords which improved readability; for them it was very important that Catala is understandable for lawyers.
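For readers who do not have the figure in front of them, the kind of declarations involved can be sketched in Python. This is only an analogy and not Catala syntax; the class and field names (`Period`, `SaleOfResidence`, and so on) are my own assumptions, chosen to echo the time periods and money amounts mentioned above.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Period:
    """A time period with a starting date and an ending date."""
    begin: date
    end: date

    def __post_init__(self) -> None:
        if self.end < self.begin:
            raise ValueError("a period cannot end before it begins")

    def days(self) -> int:
        """Length of the period in days."""
        return (self.end - self.begin).days


@dataclass(frozen=True)
class SaleOfResidence:
    """Pure declarations: the data the statute refers to, with no rules yet."""
    date_of_sale: date
    gain: Decimal  # money is kept as Decimal to avoid floating-point rounding
    periods_of_ownership: tuple[Period, ...]
    periods_of_use_as_principal_residence: tuple[Period, ...]


# Hypothetical figures, for illustration only.
sale = SaleOfResidence(
    date_of_sale=date(2021, 6, 1),
    gain=Decimal("180000.00"),
    periods_of_ownership=(Period(date(2015, 1, 1), date(2021, 6, 1)),),
    periods_of_use_as_principal_residence=(Period(date(2018, 1, 1), date(2021, 6, 1)),),
)
print(sale.gain, len(sale.periods_of_ownership))
```

Nothing is computed here. As in Catala, the declarations only fix the vocabulary that the rules will later use.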

Once we get past the declarations of things like time periods and amounts of money, the next two important concepts in Catala are scope and context.

A scope roughly outlines the structure of the law, while the context tells us which values are to be determined later, depending on the exact situation. Intuitively, the scopes of a law or legal instrument can be thought of as functions, and contexts can be thought of as their parameters and local variables.

Figure 5: Courtesy: Catala: A Programming Language for the Law

The example in Figure 5 shows how the scope of a legal instrument is framed. The code states the conditions that a single person needs to satisfy to get this exclusion: a requirement on ownership and a requirement on usage. If both of those requirements are satisfied, then the requirements for this context are fulfilled.

Figure 5 is, in effect, an executable rendition of this part of the law. If you provide the inputs, which in this case are the gain from the sale of the property and the various time periods of your ownership and residence, Catala's interpreter will compute the amount to be excluded from your income.
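The "scope as a function" intuition can be sketched in Python as follows. This is an illustrative approximation and not the code from the paper: the function names are hypothetical, years are approximated as 365 days, the periods are assumed not to overlap, and only the basic $250,000 limit for a single person is modelled.

```python
from datetime import date, timedelta
from decimal import Decimal

DAYS_IN_TWO_YEARS = 2 * 365   # assumption: a year is approximated as 365 days
DAYS_IN_FIVE_YEARS = 5 * 365
LIMIT_SINGLE_PERSON = Decimal("250000")


def days_within_window(periods, window_start: date, window_end: date) -> int:
    """Total days of the given (begin, end) periods that fall inside the window.

    Assumes the periods do not overlap one another.
    """
    total = 0
    for begin, end in periods:
        lo, hi = max(begin, window_start), min(end, window_end)
        if hi > lo:
            total += (hi - lo).days
    return total


def section_121_single_person(gain: Decimal, date_of_sale: date,
                              ownership, usage) -> Decimal:
    """The scope: inputs come in as parameters, the excluded amount comes out."""
    window_start = date_of_sale - timedelta(days=DAYS_IN_FIVE_YEARS)
    # Local variables play the role of the scope's context.
    ownership_met = days_within_window(ownership, window_start, date_of_sale) >= DAYS_IN_TWO_YEARS
    usage_met = days_within_window(usage, window_start, date_of_sale) >= DAYS_IN_TWO_YEARS
    requirements_met = ownership_met and usage_met
    if not requirements_met:
        return Decimal("0")
    return min(gain, LIMIT_SINGLE_PERSON)


# Hypothetical taxpayer: owned since 2015, lived there since 2018, sold in 2021.
excluded = section_121_single_person(
    gain=Decimal("300000"),
    date_of_sale=date(2021, 6, 1),
    ownership=[(date(2015, 1, 1), date(2021, 6, 1))],
    usage=[(date(2018, 1, 1), date(2021, 6, 1))],
)
print(excluded)  # 250000
```

The difference from Catala is that this function fixes the order of evaluation by hand, whereas in Catala the definitions can follow the order of the statute and the tooling works out the order in which to evaluate them.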

The interpreter takes on the task of analysing the dependencies between variables and assigning values to them in the correct order. Finding a cycle during this analysis is treated as an error, because the law is not supposed to contain cyclic reasoning.
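The ordering and cycle check described here can be illustrated with a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later); it shows the general technique, not Catala's actual implementation, and the variable names in `SECTION_121_DEPENDENCIES` are assumptions made for the example.

```python
from graphlib import CycleError, TopologicalSorter


def evaluation_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Order variables so that each one comes after everything it depends on."""
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as err:
        # graphlib stores the offending cycle as the second element of err.args
        cycle = " -> ".join(err.args[1])
        raise ValueError(f"cyclic definition in the encoded rules: {cycle}") from err


# Each key depends on the variables in its set (hypothetical names).
SECTION_121_DEPENDENCIES = {
    "requirements_ownership_met": {"periods_of_ownership", "date_of_sale"},
    "requirements_usage_met": {"periods_of_use", "date_of_sale"},
    "requirements_met": {"requirements_ownership_met", "requirements_usage_met"},
    "amount_excluded": {"requirements_met", "gain"},
}

print(evaluation_order(SECTION_121_DEPENDENCIES))

# A rule that (indirectly) depends on itself is rejected.
try:
    evaluation_order({"a": {"b"}, "b": {"a"}})
except ValueError as err:
    print(err)
```

The exact order printed may vary between runs, but every variable always appears after the variables it depends on.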

As the programming language is still being developed, feedback from attempted use cases should help to improve it. Below, I have written a piece of illustrative, Catala-inspired pseudocode based on Section 54 of the Income Tax Act, 1961 (India). It does not follow Catala's actual syntax; it is merely an attempt to express the provision in executable form, inspired by the code for Section 121 of the US Internal Revenue Code of 1986 and by the resources I could find on GitHub. Let's understand how it works.

program Section54TaxDeduction;
rule Section54DeductionApplies(
  property: ResidentialProperty,
  salePrice: Currency,
  purchasePrice: Currency,
  dateOfSale: Date,
  dateOfPurchase: Date
) {
  // Check if the property is a residential property
  require property.isResidentialProperty();
  // Check if the property was held for more than 2 years
  require dateOfSale > dateOfPurchase.addYears(2);
  // Calculate the capital gain on the sale of the property
  let capitalGain = salePrice - purchasePrice;
  // Check if the capital gain is greater than zero
  require capitalGain > 0;
  // Calculate the amount of deduction that can be claimed under Section 54
  let deductionAmount = min(capitalGain, amountInvestedInNewProperty());
  // Print the amount of deduction that can be claimed
  println("You can claim a deduction of " + deductionAmount + " under Section 54.");
}
function amountInvestedInNewProperty(): Currency {
  // Calculate the amount invested in a new residential property
  // This function should return the amount invested in the new property by the taxpayer
  // within the specified time limits as per Section 54 of the Income Tax Act
  // This can include the cost of the property, as well as expenses incurred on the transfer of property
  // For example, stamp duty, registration fees, legal fees, etc.
  // Placeholder: replace with the actual amount invested
  return 0;
}

Now, this program defines a rule called "Section54DeductionApplies" that takes in the inputs needed to determine whether a taxpayer is eligible for a tax deduction under Section 54 of the Income Tax Act, 1961. The rule checks whether the property is a residential property, whether it was held for more than 2 years, and whether the capital gain on the sale of the property is greater than zero. If all conditions are satisfied, the program calculates the amount of deduction that can be claimed under Section 54 and prints the result.

The program also defines a function called "amountInvestedInNewProperty" that is meant to calculate the amount invested by the taxpayer in a new residential property within the time limits specified in Section 54 of the Income Tax Act, 1961. As written, it is a placeholder that returns 0, and it must be implemented to return the amount actually invested by the taxpayer in a new property.

Please note that this is just an example program, and it may need to be modified or extended to suit the specific requirements of a given situation. Additionally, it is important to consult with a qualified tax professional or legal expert to ensure compliance with all applicable laws and regulations.
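Since the pseudocode above cannot be run as it stands, the same logic can be sketched in Python for anyone who wishes to experiment with it. This is a simplified illustration that mirrors the pseudocode and not the full provision: the names (`Section54Inputs`, `section_54_deduction`, `add_years`) are my own, the amount invested is passed in as an input instead of being computed, and a failed condition yields a deduction of zero rather than stopping the program.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Section54Inputs:
    is_residential_property: bool
    sale_price: Decimal
    purchase_price: Decimal
    date_of_purchase: date
    date_of_sale: date
    amount_invested_in_new_property: Decimal


def add_years(d: date, years: int) -> date:
    """Same calendar day `years` later; 29 February falls back to 28 February."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def section_54_deduction(inputs: Section54Inputs) -> Decimal:
    """Deduction under the simplified rule; zero when a condition fails."""
    if not inputs.is_residential_property:
        return Decimal("0")
    # Held for more than 2 years, as in the pseudocode
    if inputs.date_of_sale <= add_years(inputs.date_of_purchase, 2):
        return Decimal("0")
    capital_gain = inputs.sale_price - inputs.purchase_price
    if capital_gain <= 0:
        return Decimal("0")
    return min(capital_gain, inputs.amount_invested_in_new_property)


# Hypothetical figures, for illustration only.
example = Section54Inputs(
    is_residential_property=True,
    sale_price=Decimal("9000000"),
    purchase_price=Decimal("5000000"),
    date_of_purchase=date(2015, 4, 1),
    date_of_sale=date(2021, 4, 1),
    amount_invested_in_new_property=Decimal("3000000"),
)
print(section_54_deduction(example))  # 3000000
```

Here the capital gain is 4,000,000 and the amount invested is 3,000,000, so the deduction is capped at the amount invested.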

Figure 6: Courtesy: Catala: A Programming Language for the Law

Figure 6 (that is Figure 16 from the main paper) shows some promising results from a user study undertaken among lawyers.

  • The lawyers were asked whether they could understand the code they had been given to read. Most of them said yes.

  • They were also asked if they can associate the code to the meaning of the law. Again, most of them said yes.

  • When they were asked if they could certify whether the code does exactly what the law says, the answers were mixed. The authors hypothesized that this could be because the questions were put to a group of French lawyers, whereas the law that had been encoded was part of the US tax law regime.

As a case study, the authors encoded all of the "byzantine French family benefits" in Catala and provided a web interface. They then checked the output of the Catala implementation against the official state-sponsored simulator. The results matched except for one discrepancy, which pointed to a bug in the official simulator that was later fixed.

This was a quick look at a new programming language that is designed to encode laws, and at the default logic on which it is based. A language like Catala seems most feasible in a technical and procedure-oriented area such as taxation law, which is why the Section 121 example is quite relatable.

Concluding Review

In conclusion, the paper does outline the benefits of using Catala. From my reading of the paper, this programming language certainly has the potential to reduce errors and inconsistencies in legal documents; it can also improve the accuracy and reliability of legal software applications. Furthermore, Catala can help to reduce the time and costs associated with creating and maintaining legal software applications. While Catala is still in its early stages of development, the authors believe that it has the potential to transform the way legal professionals work with legal text, which is appreciated. That said, the paper could benefit from a more critical evaluation of the potential limitations and challenges of using Catala in legal practice. For example, the paper does not address the potential difficulties in implementing Catala in legal practice, such as the need for legal professionals to learn a new programming language. Additionally, the paper does not provide an evaluation of the performance of Catala in real-world legal scenarios.