About

CaseLaw is about using previous judgements to deliver new ones. It's about links between documents. At HUDOC, each document is accompanied by a list of documents that are referenced (Strasbourg Caselaw).

If we look at references from the opposite side, we can determine how a specific document is referenced by others (we call these citations). A quantitative analysis of a document's citations may help to "calculate" the caselaw "value" of that document.
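As a rough illustration of the idea, citations can be obtained by inverting the references map. The sketch below uses made-up document ids and a hypothetical `build_citations` helper; it is not the project's actual code.

```python
from collections import defaultdict

# Hypothetical sample data: document id -> ids of the documents it references.
references = {
    "doc_a": ["doc_b", "doc_c"],
    "doc_b": ["doc_c"],
    "doc_c": [],
}

def build_citations(references):
    """Invert a references map: cited document id -> ids of the documents citing it."""
    citations = defaultdict(list)
    for source, targets in references.items():
        for target in targets:
            citations[target].append(source)
    return dict(citations)

citations = build_citations(references)
# A simple quantitative measure: the number of citations per document.
citation_counts = {doc: len(citations.get(doc, [])) for doc in references}
```

Here `doc_c` ends up with two citations, `doc_b` with one and `doc_a` with none.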

The ability to list the citations of a document is one of the key features of the project: it improves navigation across documents and tells us in which context a document is being used.

What has been done

  • All documents (English/French) available in HTML format downloaded from HUDOC up to December 2011 (some docs missing due to download issues)
  • Text processor: split the document into sections and identify references to other documents at paragraph level
  • Search engine: fulltext + faceting
  • Statistical module: filtering + pivoting + live grid

Plans for the future

  • develop document viewer
  • user accounts: bookmark document paragraphs, comments, user groups
  • notifications: when new documents appear with references to a monitored document. Search results subscription.
  • analytical tools
  • native iPhone/Android applications: save documents for offline work.

1. Search: finding documents

The search page displays found documents on the left. The active filter and the available facets are located on the right side.

Items inside a facet can be ordered alphabetically or by the total number of documents. When two or more values are selected for the Violated and Non-Violated facets, a chain icon will appear next to the facet title. It denotes the search mode for the facet: OR or AND. OR mode finds documents that feature any of the selected values; AND mode finds documents that feature all selected values. In the screenshot below, documents matching Article 2 OR Article 3 will be returned. Click the chain icon to switch to AND mode.
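The difference between the two modes can be sketched as a set operation. The function name `filter_by_facet`, the field name `violated` and the sample documents are assumptions made for illustration only; the real search engine is not shown here.

```python
def filter_by_facet(documents, field, selected, mode="OR"):
    """Return the documents whose `field` values match the selected facet values.

    OR mode: at least one selected value is present.
    AND mode: all selected values are present.
    """
    selected = set(selected)
    if not selected:
        return list(documents)
    if mode == "AND":
        return [d for d in documents if selected <= set(d[field])]
    return [d for d in documents if selected & set(d[field])]

# Hypothetical documents; only the "violated" facet is shown.
documents = [
    {"title": "Doc 1", "violated": ["2"]},
    {"title": "Doc 2", "violated": ["2", "3"]},
    {"title": "Doc 3", "violated": ["6"]},
]

or_titles = [d["title"] for d in filter_by_facet(documents, "violated", ["2", "3"], "OR")]
and_titles = [d["title"] for d in filter_by_facet(documents, "violated", ["2", "3"], "AND")]
```

With Article 2 and Article 3 selected, OR mode keeps "Doc 1" and "Doc 2", while AND mode keeps only "Doc 2".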

Found documents are displayed in a compact mode, showing only important information: Importance, date, violated & non-violated articles, body, type of document and the total number of citations.

Clicking the show/hide document card button will expand the document by showing more information. The citations are grouped by violated & non-violated articles and a heatmap is generated. The stronger the color, the more citations for that specific article. Move your mouse over an article and a hint will be displayed with the number of documents (these are actually judgements, because decisions don't feature violations).
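One possible way to derive the color strength is to count the citing judgements per article and scale the counts by the largest one. This is only a sketch under that assumption: `heat_levels`, the field names and the sample data are hypothetical, and the real heatmap may use a different scale.

```python
from collections import Counter

def heat_levels(citing_judgements, field="violated"):
    """Map each article to (number of citing judgements, intensity from 0 to 1)."""
    counts = Counter(
        article
        for judgement in citing_judgements
        for article in set(judgement[field])
    )
    peak = max(counts.values(), default=0)
    return {article: (n, n / peak) for article, n in counts.items()}

# Hypothetical citing judgements of one document.
citing = [
    {"violated": ["5-3", "6-1"]},
    {"violated": ["6-1"]},
    {"violated": []},
]
levels = heat_levels(citing)
```

The first element of each pair is the number shown in the hint; the second could drive the color of the cell.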

2. Document: the page of a decision/judgement

The page of a document is comprised of several blocks:



Let's open BROGAN AND OTHERS v. THE UNITED KINGDOM.

Expand the first document in the References list: "BOUAMAR v. BELGIUM". Notice the color of the judgement's attributes: common articles, conclusions and keywords are positioned at the top in black, the rest follow in gray. This helps distinguish how a given document differs from the main one. The same principle applies to the other lists: Citations and Similar documents.

The list of referenced documents is always within reasonable limits: typically 20-30 references, not hundreds. For this reason all references can be displayed (notice the "Show all" link at the bottom of the References block).

The number of citations can be large. The most cited judgement is FRYDLENDER v. FRANCE with 2790 citations. Although you may scroll through the whole citation list, we've provided a breakdown by keyword, article, conclusion, country and year. Notice that the breakdown by article doesn't mean violated/non-violated: these are the articles mentioned in the document notes at HUDOC. Violated/non-violated articles can be found in the Conclusion breakdown.

Similar documents are listed below the citations. Let's define what similarity means. The table below depicts two documents and their properties (articles, keywords, conclusions). Overall, these documents have 11 things in common (4 articles, 4 keywords and 3 conclusions), so we say that the similarity degree = 11. We calculate the similarity degree of a given document with respect to all other documents in the database, sort the list in descending order of similarity degree and show the top 10.

Articles
  Document 1: 13; 15; 5-2; 5-3; 5-4; 5-5
  Document 2: 13; 5-1-c; 5-2; 5-4; 5-5; 41
  Similarity: 4

Keywords
  Document 1:
  • (art. 5-1) lawful arrest or detention
  • (art. 5-1) liberty of person
  • (art. 5-2) prompt information
  • (art. 5-3) trial within a reasonable time
  • (art. 5-5) compensation
  • (art. 15-1) derogation
  Document 2:
  • (art. 5-1) security of person
  • (art. 5-1) lawful arrest or detention
  • (art. 5-1) liberty of person
  • (art. 5-2) prompt information
  • (art. 5-4) take proceedings
  • (art. 5-5) compensation
  • (art. 13) effective remedy
  Similarity: 4

Conclusions
  Document 1:
  • Just satisfaction reserved
  • No violation of Art. 5-1
  • No violation of Art. 5-4
  • Not necessary to examine Art. 13
  • Violation of Art. 5-3
  • Violation of Art. 5-5
  Document 2:
  • Just satisfaction reserved
  • Violation of Art. 5-5
  • No violation of Art. 5-2
  • Not necessary to examine Art. 13
  • Not necessary to examine Art. 5-4
  • Violation of Art. 5-1
  Similarity: 3

Total similarity: 11
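The computation can be sketched as a sum of set intersections over the three properties. The data below is copied from the table (the assignment of list items to Document 1 and Document 2 is our reading of it); the function names `similarity_degree` and `most_similar` are hypothetical and not taken from the project's code.

```python
doc1 = {
    "articles": {"13", "15", "5-2", "5-3", "5-4", "5-5"},
    "keywords": {
        "(art. 5-1) lawful arrest or detention", "(art. 5-1) liberty of person",
        "(art. 5-2) prompt information", "(art. 5-3) trial within a reasonable time",
        "(art. 5-5) compensation", "(art. 15-1) derogation",
    },
    "conclusions": {
        "Just satisfaction reserved", "No violation of Art. 5-1",
        "No violation of Art. 5-4", "Not necessary to examine Art. 13",
        "Violation of Art. 5-3", "Violation of Art. 5-5",
    },
}
doc2 = {
    "articles": {"13", "5-1-c", "5-2", "5-4", "5-5", "41"},
    "keywords": {
        "(art. 5-1) security of person", "(art. 5-1) lawful arrest or detention",
        "(art. 5-1) liberty of person", "(art. 5-2) prompt information",
        "(art. 5-4) take proceedings", "(art. 5-5) compensation",
        "(art. 13) effective remedy",
    },
    "conclusions": {
        "Just satisfaction reserved", "Violation of Art. 5-5",
        "No violation of Art. 5-2", "Not necessary to examine Art. 13",
        "Not necessary to examine Art. 5-4", "Violation of Art. 5-1",
    },
}

def similarity_degree(a, b, fields=("articles", "keywords", "conclusions")):
    """Count the attribute values that two documents have in common."""
    return sum(len(a[field] & b[field]) for field in fields)

def most_similar(target, candidates, top=10):
    """Return up to `top` (degree, title) pairs, most similar first."""
    scored = [(similarity_degree(target, doc), title) for title, doc in candidates.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top]

degree = similarity_degree(doc1, doc2)
ranking = most_similar(doc1, {"Document 2": doc2})
```

For the two documents above, `degree` is 4 + 4 + 3 = 11, which matches the total in the table.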

It takes some time to compute the similarity list. A "Generate" button is displayed if the list is not ready yet. Wait a few seconds after clicking it. Once generated, the list will be displayed the next time the document is accessed. Notice the icons next to some entries: they denote that a similar document is also a Citation or a Reference. To see how similar two documents are, expand the document card and see how many attributes are displayed in black.

3. Statistics: filter and pivot data, heatmap visualization

The statistics page uses the pivot table approach with filtering: you first choose the filter criteria, then the attributes by which to generate the report. The cells in the table are live: you can click a cell and a search page will open with exactly the documents represented by that cell. Read the Help section on the Statistics page.
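A minimal sketch of this filter-then-pivot idea follows. Each cell keeps the ids of its documents, which is what would make a cell "live". The `pivot` helper, the field names and the sample documents are assumptions for illustration and do not reflect the actual statistics module.

```python
from collections import defaultdict

def pivot(documents, row_field, col_field, predicate=lambda doc: True):
    """Group the ids of the matching documents into cells keyed by (row, column)."""
    cells = defaultdict(list)
    for doc in documents:
        if predicate(doc):
            cells[(doc[row_field], doc[col_field])].append(doc["id"])
    return dict(cells)

# Hypothetical documents, not real HUDOC data.
documents = [
    {"id": 1, "state": "Italy", "year": 2002, "importance": 3},
    {"id": 2, "state": "Italy", "year": 2002, "importance": 1},
    {"id": 3, "state": "United Kingdom", "year": 2001, "importance": 1},
]

# State x Year report over all documents.
report = pivot(documents, "state", "year")
# The same report, filtered to Importance=1.
important = pivot(documents, "state", "year", lambda doc: doc["importance"] == 1)
```

The size of each cell's list is the number shown in the table, and the list itself is the set of documents a click on that cell would open.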

The table below shows all ECHR documents by State and Year. It's too big to fit here, so go to the State x Year report page. For your visual convenience, a more compact heatmap is displayed below the table. Cells in the heatmap are also clickable.

Let's generate a report for all documents that contain the word "evictions" in the text: report for: evictions. From the report we see that the highest number of documents is for Italy, 2002.

What if we are interested only in documents of the 1st level of importance? See the report for: evictions, Importance=1. In this case Italy has only one document, while United Kingdom, 2001 has the highest number: 5.

The examples above illustrated the use of filters to build a report. Once a report is built, we want to see how a specific row compares to other rows. An example will explain this: here is a report for: Importance=2.

It is easy to spot that the number of documents for Russia and Turkey has increased substantially since 2004.
Now suppose we wonder how a specific country compares to other countries. Click Moldova: we see that it has had more documents since 2005. For an even better picture, sort the report by the Total column: the states that overall have more documents than Moldova will be displayed above it, moving the "red" part of the heatmap to the top and the "green" part to the bottom.

The examples above illustrate basic usage of the stats module. Here is a more complex one: Let's see if there are any patterns for judgements where Article 2 was found violated by mapping them in a Violated x Non-Violated matrix:
Judgements, Article 2 violated

The stats give us the following information:

  • There are 138 such judgements
  • In 80 of these judgements Article 2 was Violated as well as Non-Violated
  • In most of the judgements, Article 14 was Non-Violated (41 judgements)
  • Article 2 is violated more frequently in conjunction with Article 13 (37 judgements)
  • When a judgement finds violations of both Article 2 and Article 13, Article 3 is most often found Non-Violated
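A Violated x Non-Violated matrix of this kind can be sketched by counting, for every judgement, each pair of a violated and a non-violated article. The judgements below are invented for illustration and are unrelated to the 138 real ones; `violation_matrix` is a hypothetical helper.

```python
from collections import Counter
from itertools import product

def violation_matrix(judgements):
    """Count judgements per (violated article, non-violated article) pair."""
    matrix = Counter()
    for judgement in judgements:
        pairs = product(set(judgement["violated"]), set(judgement["non_violated"]))
        matrix.update(pairs)
    return matrix

# Hypothetical judgements; an article may appear on both sides,
# as in the "Violated as well as Non-Violated" case mentioned above.
judgements = [
    {"violated": ["2", "13"], "non_violated": ["2", "3"]},
    {"violated": ["2"], "non_violated": ["14"]},
    {"violated": ["2", "13"], "non_violated": ["3"]},
]
matrix = violation_matrix(judgements)
```

In this toy data the cell (Article 2 violated, Article 3 non-violated) counts two judgements.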

Roadmap

The functionality described above doesn't reveal what would be possible in the context of one single document. That is our target for the future: to provide efficient and easy to use tools for interacting with the text of a document.

1. User accounts: comments, favourites, bookmarks, user groups

  • The ability to add comments for a document or even a specific paragraph of a document
  • Comments may be Private, Public or Shared with other users
  • Bookmark paragraphs within a document
  • Add a document to the favourites list

2. Document viewer: explore references/cited paragraphs

It was mentioned that documents are parsed at the paragraph level. The figure below demonstrates the output of a parsed document; click the picture to get the full text of CASE OF ILAŞCU AND OTHERS v. MOLDOVA AND RUSSIA.

The parser determines the different section types and splits the text into separate paragraphs (note that each paragraph is enclosed in a dark rectangle). Each section (header, conclusions, dissenting opinions, footer) has its own color, which shows that the parser understands the structure of a document. References are transformed into links: clicking a reference may open the referenced document or load only the referenced paragraph on the fly.
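To give a feel for paragraph-level parsing, here is a deliberately simplified sketch. It assumes that paragraphs start with "N. " at the beginning of a line and that references look like "Name v. State, § N"; both patterns are our assumptions for this example, and the real parser has to deal with many more formats.

```python
import re

# Assumed paragraph marker: a number followed by a dot at the start of a line.
PARAGRAPH_START = re.compile(r"^(\d+)\.\s+", re.MULTILINE)
# Assumed reference shape: "<Applicant> v. <State>, § <paragraph number>".
REFERENCE = re.compile(r"([A-Z][\w.' -]*? v\. [\w' ]+?), § (\d+)")

def split_paragraphs(text):
    """Return {paragraph number: paragraph text}."""
    matches = list(PARAGRAPH_START.finditer(text))
    paragraphs = {}
    for current, following in zip(matches, matches[1:] + [None]):
        end = following.start() if following else len(text)
        paragraphs[int(current.group(1))] = text[current.end():end].strip()
    return paragraphs

def find_references(paragraph):
    """Return (document title, paragraph number) pairs found in one paragraph."""
    return [(title, int(number)) for title, number in REFERENCE.findall(paragraph)]

# Invented sample text.
text = (
    "1. The applicant was arrested in 1984.\n"
    "2. The Court recalls its case-law "
    "(see Brogan and Others v. the United Kingdom, § 58).\n"
)
paragraphs = split_paragraphs(text)
found = find_references(paragraphs[2])
```

Each detected pair can then be turned into a link from the citing paragraph to the referenced paragraph.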

As a result of automatic parsing, we can list the documents (and even their paragraphs) that reference a specific paragraph of a given document. A working example can be found at IHRDA CaseLaw Analyser for ACHPR decisions and other mechanisms. IHRDA CaseLaw Analyser is a HURIDOCS project.

Open the decision Commission nationale des droits de l'Homme et des libertes / Chad and notice the blue arrows on paragraphs 21 and 25. They indicate that other documents reference these specific paragraphs. Click the blue arrow of the 21st paragraph and 3 documents will appear. At the right side of each document the corresponding paragraphs will be shown. Clicking them will display the specific paragraphs that reference the 21st paragraph of the opened document.
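The lookup behind such an arrow could be sketched as an index keyed by (document, paragraph). The link tuples, document ids and paragraph numbers below are invented, and `index_by_cited_paragraph` is a hypothetical name.

```python
from collections import defaultdict

# Hypothetical parsed links:
# (citing document, citing paragraph, cited document, cited paragraph).
links = [
    ("doc_x", 40, "chad", 21),
    ("doc_y", 12, "chad", 21),
    ("doc_y", 15, "chad", 25),
]

def index_by_cited_paragraph(links):
    """Map (cited document, cited paragraph) -> {citing document: [citing paragraphs]}."""
    index = defaultdict(lambda: defaultdict(list))
    for citing_doc, citing_par, cited_doc, cited_par in links:
        index[(cited_doc, cited_par)][citing_doc].append(citing_par)
    return {key: dict(value) for key, value in index.items()}

index = index_by_cited_paragraph(links)
citing_paragraph_21 = index[("chad", 21)]
```

A paragraph gets an arrow when its key is present in the index; the value lists the citing documents together with their citing paragraphs.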

3. Graphs: visualize document or even paragraph links

To give an overview of the relationships between documents, a citation graph is built for every document. Imagine a document represented by a circle whose diameter is proportional to the number of citations (the bigger the circle, the more citations a document has). All citations of this document are represented as other circles around the main one, and so on: citations of citations constitute the second "line".
Click the image to open a "live" graph: move your mouse over circles to see the title of the document and the total number of Citations and References. Clicking a circle will open the corresponding document. The legend:
  • Red: importance level 1
  • Green: level 2
  • White: level 3
The graph above is for the judgement ILASCU AND OTHERS v. MOLDOVA AND RUSSIA. The judgement itself has 115 citations, and we recursively "expand" each circle by showing its citations (N.B. a document is displayed only once in the graph). So we end up with a lot of circles in the graph, which makes it more of a nice visualization than a practical tool. A logical step would be to exclude (or include) documents based on some rules. The next graph includes only judgements in which Article 3 and Article 5 were found violated and P1-1 non-violated.
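The recursive expansion with the "displayed only once" rule can be sketched as a breadth-first traversal. The `expand_citation_graph` function, its `include` rule and the tiny citations map are assumptions for illustration; the actual graph builder may work differently.

```python
from collections import deque

def expand_citation_graph(root, citations, include=lambda doc: True):
    """Breadth-first expansion of citations.

    Returns {document id: line}, with the root on line 0, its citations on
    line 1, citations of citations on line 2, and so on. A document is
    added only once, on the first line where it is reached.
    """
    lines = {root: 0}
    queue = deque([root])
    while queue:
        doc = queue.popleft()
        for citing in citations.get(doc, []):
            if citing not in lines and include(citing):
                lines[citing] = lines[doc] + 1
                queue.append(citing)
    return lines

# Hypothetical map: document id -> ids of the documents citing it.
citations = {"root": ["a", "b"], "a": ["b", "c"], "b": [], "c": []}

lines = expand_citation_graph("root", citations)
# Circle size: proportional to the number of citations of each document.
sizes = {doc: len(citations.get(doc, [])) for doc in lines}
# Example of an exclusion rule: leave out document "a" (and what is reachable only through it).
filtered = expand_citation_graph("root", citations, include=lambda doc: doc != "a")
```

The `include` rule is the place where a filter such as "only judgements with specific violated articles" would plug in.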

The "noise" in the graph can be reduced by excluding documents with zero citations (the small dots). We're not yet sure these graphs will be used by lawyers. In our opinion they may help to:

  • show how the "knowledge" within a document spreads over time through citations.
  • compare the significance of two or more documents (in a specific context, for example specific violated articles) at a deeper level by showing the propagation of citations.
  • navigate documents in a visual and compact way.

4. Tracking/monitoring: new search results, documents of an application, users, new citations

The following features seem to be useful:

  • get notifications when a document is referenced by a newly added one.
  • subscribe to advanced search results: get by email/rss new documents from a specific country, in which specific articles were found violated and which contain a phrase/words.
  • get notifications when your friends are adding public/shared comments for documents.
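The first feature could be sketched as a match between the references of a newly added document and each user's list of monitored documents. None of this exists yet: `notifications_for`, the user names and the document ids are hypothetical, and delivery by email/rss is left out.

```python
def notifications_for(new_document, watchers):
    """Return {user: [monitored document ids referenced by the new document]}."""
    referenced = set(new_document["references"])
    result = {}
    for user, monitored in watchers.items():
        hits = sorted(referenced & set(monitored))
        if hits:
            result[user] = hits
    return result

# Hypothetical users and the documents they monitor.
watchers = {"alice": ["doc_1", "doc_2"], "bob": ["doc_3"]}
# Hypothetical newly added document with its parsed references.
new_document = {"id": "doc_9", "references": ["doc_2", "doc_7"]}

pending = notifications_for(new_document, watchers)
```

Only "alice" would be notified here, because the new document references one of the documents she monitors.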

5. Monetize project: freemium mode (pay for advanced features)

We plan to build a sustainable project. We haven't yet thought about private funding, investors, etc., because we feel we have enough resources to launch this startup and to feel a "financial impact" either directly (through Pro accounts, where a user pays for advanced features) or indirectly (advertisement, new contacts, etc.).

Conclusions

There is a lot of knowledge behind the relationships between documents. For the ECHR collection, we group the citations list by different fields, the most important being the Violated/Non-Violated articles. By transforming a static text document into a set of sections/paragraphs that have links with paragraphs from other documents, we are actually giving "life" to these texts, making them more accessible (easy navigation) and useful (exposing knowledge based on relationships).

The principles described here may be applied to any document collection and are not limited to caselaw.