Center for Legal Data Science

Useful Databases

The following is a list of freely accessible online databases that contain various legal data.

Swiss Datasets

Name Link Description Legal Area
SCD (Swiss Federal Supreme Court Dataset)

zenodo.org

The Swiss Federal Supreme Court Dataset (SCD) provides a record of all 116,000+ cases decided by the Swiss Federal Supreme Court since 2007. The SCD includes 31 variables that document basic case information, the court composition, the area of law, the appealed judgment, the parties, the case outcome, citations, and publication status. The dataset will be updated quarterly until at least 2025 to include the latest judgments and possible expansions. Size: 116,650 cases (expanding in future versions).

Geering Florian, Merane Jakob. (2023), Swiss Federal Supreme Court Dataset (SCD), Zenodo: https://doi.org/10.5281/zenodo.7793043.

All
Swiss Open Government Data

opendata.swiss

Opendata.swiss is a central portal for open, freely accessible data from the Swiss authorities (Open Government Data, OGD). The portal guarantees users simple and secure central access to data of the federal government, cantons and municipalities. If there is a public interest, data from third parties (federally affiliated companies as well as private actors commissioned by the federal, cantonal or municipal authorities) may also be published. Size: 8792 datasets (as of 13 April 2023).

All
FSCS (Swiss court judgments)

Github

zenodo.org

Multilingual (German, French, and Italian), diachronic (2000-2020) corpus of 85,000 cases from the Federal Supreme Court of Switzerland (FSCS).

Niklaus Joel, Chalkidis Ilias and Stürmer Matthias (2021) “SwissJudgmentPrediction”. Proceedings of the 2021 Natural Legal Language Processing Workshop (NLLP), Zenodo: 10.5281/zenodo.5529712.

All
Coercive Measures Database

Link

SRF surveyed the number of court decisions on coercive measures in all Swiss cantons in 2017, broken down by rejections and approvals and by area (pre-trial detention, preventive detention, alternative measures and secret surveillance measures). Applications that were partially approved were counted as approvals. Both orders and extensions were counted as pre-trial detention (but not those relating to release from custody).

A shortened period of pre-trial detention is reported by the coercive measures court (Zwangsmassnahmengericht, ZMG) as a partial approval; this was counted as an approval. In the cantons of Fribourg and Geneva, the figures for pre-trial detention are reported together with preventive detention and alternative measures. Although almost all cantons responded, only 18 cantons recorded how many applications were approved or rejected. These include five of the six most populous cantons, but the canton of Zurich is missing. In the canton of St. Gallen, the Regional Court of Wil was unable to provide any figures. Around 60 per cent of the Swiss population live in the cantons whose data were analysed.

Criminal Law
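
The counting rule described for the Coercive Measures Database (partial approvals count as approvals, tallied per area) can be sketched as follows. This is only an illustration under stated assumptions: the outcome labels, the `tally` function and the sample rows are hypothetical and do not come from the SRF data.

```python
from collections import Counter

# Hypothetical outcome labels; the survey's actual raw categories
# are not given in the description above.
APPROVED, PARTIAL, REJECTED = "approved", "partially_approved", "rejected"

def tally(decisions):
    """Count approvals and rejections per area.

    Partially approved applications are counted as approvals,
    mirroring the rule stated in the database description.
    """
    counts = {}
    for area, outcome in decisions:
        if outcome not in (APPROVED, PARTIAL, REJECTED):
            raise ValueError(f"unknown outcome: {outcome!r}")
        bucket = "rejections" if outcome == REJECTED else "approvals"
        counts.setdefault(area, Counter())[bucket] += 1
    return counts

# Invented sample rows, for illustration only.
sample = [
    ("pre-trial detention", APPROVED),
    ("pre-trial detention", PARTIAL),   # e.g. a shortened detention period
    ("pre-trial detention", REJECTED),
    ("secret surveillance measures", APPROVED),
]
result = tally(sample)
```

Folding partial approvals into approvals keeps the tally binary, which matches how the survey reports its figures but hides how often courts shortened a requested detention period.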
       

 

ECtHR Datasets

Name Link Description Legal Area
ECtHR

Github

Dropbox 

Judicial decisions of the European Court of Human Rights. Test20 (Art. 2-8, 10-14, 18): 804 decisions; test_violations (Art. 2-8, 10-14, 18): 4054 decisions; Train (Art. 2-14, 18): 8441 decisions.

Medvedeva Masha, Vols Michel, Wieling Martijn (2019): Github: https://github.com/masha-medvedeva/ECtHR_crystal_ball.

Human Rights Law
European Court of Human Rights Database (ECHRdb) Version 1.0

washington.edu

The ECHRdb is available as a set of downloadable raw data files in .csv, Excel and Stata formats, enabling researchers to analyze the complete set of variables included in the database using a wide variety of data analysis software. The database includes over 70 variables detailing judicial decision patterns (subject matter, violation rate, defendant country, etc.) and organization participation and effects (organization identification, participation rates, types of participation, amicus participation, domestic legal change, etc.). The data are available as three datasets: Comprehensive, Judgment Centered and Participation Centered. These three datasets enable researchers to examine all the variables (Comprehensive) or narrow their analysis to focus on the judgment level patterns (Judgment Centered) or organization participation patterns (Participation Centered). Released 2017. Size: 15147 judgments.

Cichowski & E. Chrun (2017). European Court of Human Rights Database (ECHRdb), Version 1.0 Release 2017: http://depts.washington.edu/echrdb/.

Human Rights Law
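
As a rough illustration of the judgment-level analysis that a CSV release such as the ECHRdb's allows, the sketch below computes a violation rate per defendant country. The column names `respondent` and `violation` and the inline sample rows are hypothetical placeholders, not the ECHRdb's actual variable names or data; the real names are in the database's codebook.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-in for a downloaded CSV file. The column names and
# rows are invented for illustration and are NOT taken from the ECHRdb.
SAMPLE_CSV = """respondent,violation
A,1
A,0
B,1
B,1
"""

def violation_rate_by_country(csv_file):
    """Return {country: share of judgments that found a violation}."""
    found = defaultdict(int)   # judgments finding a violation
    total = defaultdict(int)   # all judgments per country
    for row in csv.DictReader(csv_file):
        country = row["respondent"]
        total[country] += 1
        found[country] += int(row["violation"])
    return {c: found[c] / total[c] for c in total}

rates = violation_rate_by_country(io.StringIO(SAMPLE_CSV))
```

With a real download, the `io.StringIO` object would be replaced by an opened file handle, and the column names adjusted to those documented in the codebook.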

 

EU Datasets

Name Link Information Legal Area
CJEU Database (IUROPA)

Website

Download Tool

Database on the Court of Justice of the EU (CJEU): IUROPA is a multidisciplinary platform for research on judicial politics in the European Union (EU). It includes detailed information on cases, judgments and actors involved in the judicial processes of the CJEU.

European Law
JRC-Acquis

JRC-Acquis Overview

Dataset on emm4u.eu

The Acquis Communautaire (AC) is the total body of European Union (EU) law applicable in the EU Member States. This collection of legislative texts changes continuously and currently comprises selected texts written between the 1950s and now. The EU has 27 Member States and 23 official languages. The Acquis Communautaire texts comprise material in these languages, with the exception of Irish translations. Size: 463792 texts.

Steinberger Ralf, Pouliquen Bruno, Widiger Anna, Ignat Camelia, Erjavec Tomaž, Tufis Dan, Varga Dániel (2006).

European Law

 

US Datasets

Name Link Information Legal Area
US Supreme Court Database

scdb.wustl.edu

Legacy Database (1791-1945)

Modern Database (1946-2022)

Information about every case decided by the US Supreme Court between 1791 and today.

Size by Database: Legacy Database Case Centered Data (19681); Legacy Database Justice Centered Data (172213); Modern Database Case Centered Data (13780); Modern Database Justice Centered Data (123447).

Harold J. Spaeth, Lee Epstein, Andrew D. Martin, Jeffrey A. Segal, Theodore J. Ruger, and Sara C. Benesh. 2022 Supreme Court Database, Version 2022 Release 01, URL: http://Supremecourtdatabase.or.

All

WCLD: Curated Large Dataset of Criminal Cases from Wisconsin Circuit Courts

Github

The WCLD is a curated large dataset of 1.5 million criminal cases from circuit courts in the U.S. state of Wisconsin. The creators used reliable public data from 1970 to 2020 to curate attributes like prior criminal counts and recidivism outcomes. The dataset contains a large number of samples from five racial groups, in addition to information like sex and age (at judgment and first offense). Other attributes in this dataset include neighborhood characteristics obtained from census data, detailed types of offense, charge severity, case decisions, sentence lengths, and year of filing. It also provides pseudo-identifiers for judge, county and zipcode.

Elliott Ash, Naman Goel, Nianyun Li, Claudia Marangon, Peiyao Sun, WCLD: Curated Large Dataset of Criminal Cases from Wisconsin Circuit Courts, 37th Conference on Neural Information Processing Systems (NeurIPS 2023).

Criminal Law

 

Various Datasets

Name Link Description Legal Area State/Organization
Corpus des Deutschen Bundesrechts

Github

zenodo.org

The Corpus of German Federal Law (C-DBR) is a comprehensive collection of the consolidated versions of all laws and ordinances at the federal level. The dataset uses www.gesetze-im-internet.de of the Federal Ministry of Justice as its source. Size: 6687 federal laws and ordinances of the FRG.

Fobbe, Sean. (2023). Corpus des Deutschen Bundesrechts (C-DBR) (2023-01-05), Zenodo: https://doi.org/10.5281/zenodo.7494474.

All Germany
ITC and ICJ Datasets

Overview and book excerpts

Datasets in R Data Format (Version 2.0.0 or later)

Codebook ICJ 

Codebook ITC 

Trademark

ITC, ICJ and Trademark Datasets in: An Introduction to Empirical Legal Research.

ICJ: This database contains information on individual judges' votes in the International Court of Justice (n=1,560). There are 103 cases from 1947 to 2003. Information about the nature and outcome of the case as well as a number of measures describing the country of the applicant, respondent, and judge are provided. These data were extracted from a larger dataset compiled by Eric A. Posner and Miguel de Figueiredo and available (along with a codebook) on Eric Voeten’s website.

ITC: This dataset contains information on defendants brought up on charges in front of the International Criminal Tribunal for Rwanda (n=50), the International Criminal Tribunal for Yugoslavia (n=160), and the Special Court for Sierra Leone (n=8) from 1994 to 2010. Data are available on the number of guilty counts, type of guilty counts (war crimes, genocide, crimes against humanity), mitigating and aggravating factors at sentencing, and the length of sentence (both initially and on appeal). These data are excerpted from an ongoing database maintained by James David Meernik. The full database and accompanying codebook (excerpted, as applicable, below) are available at: http://www.psci.unt.edu/ meernik/International%20Criminal%20Tribunals%20Website.htm.

Trademark: These data are from a survey conducted by an expert witness in a trademark case brought in 2007 in the Southern District of New York. The Trademark Trial and Appeal Board denied Victoria’s Secret’s application to register the mark SO SEXY for its hair care products based on the objections of Sexy Hair Concepts, LLC. Victoria’s Secret appealed this decision and one of their expert witnesses designed a survey to explore whether the word SEXY had attained a secondary meaning in relation to hair care products.

Size: ICJ dataset: 1560 (individual judges' votes in the ICJ). ITC dataset: 218 (information on defendants at the international criminal tribunals).

Epstein Lee, Martin Andrew D. (2014): An introduction to empirical legal research. Oxford University Press.

International Law International Court of Justice, International Criminal Tribunal for Yugoslavia

The Swedish High Court (SeHC) Package

Github

The package 'sehc' primarily contains a database on the Swedish high courts, more specifically the Supreme Court (“Högsta domstolen” or “HD”) and the Supreme Administrative Court (“Högsta förvaltningsdomstolen”, previously “Regeringsrätten”, or “HFD”). It contains data on both the judgments (2482 cases) of the Supreme Court (presented for example in cases, opinions), as well as on the individual Justices that have served on the Supreme Court and the Supreme Administrative Court (presented in the table appointments). It also contains a number of handy functions for manipulating datasets and combining variables from multiple datasets to conduct common types of analysis.

Lindholm, Johan; Derlén, Mattias; Naurin, Daniel (2023): Swedish High Court Database (version 0.9.1, 1 May 2023). DOI: 10.5281/zenodo.7883860.

All Sweden
Corpus of Resolutions: UN Security Council (CR-UNSC)

Zenodo

The Corpus of Resolutions: UN Security Council (CR-UNSC) collects and presents for the first time in human and machine-readable form all resolutions, drafts, and meeting records of the UN Security Council, including detailed metadata, as published by the UN Digital Library and revised by the authors.

The current version collects resolutions 1 (1946) through 2722 (2024) in all six official UN languages (English, French, Spanish, Arabic, Chinese, Russian), with drafts and meeting records in English and a large number of features.

Fobbe Seán, Gasbarri Lorenzo, Ridi Niccoló: Corpus of Resolutions: UN Security Council (CR-UNSC)

International Law UN Security Council