Linkage of hospital records and death certificates by a search engine and machine learning: training and test set data

Citation

Cossin, Sebastien (2022), Linkage of hospital records and death certificates by a search engine and machine learning: training and test set data, Dryad, Dataset, https://doi.org/10.5061/dryad.1ns1rn8sj

Abstract

INTRODUCTION: Vital status is of central importance to hospital clinical research. However, hospital information systems record only in-hospital death information. Recently, the French government released a publicly available dataset containing death-certificate data for over 25 million individuals. The objective of this study was to link French death certificates to the Bordeaux University Hospital records to complete the vital status information.

MATERIALS AND METHODS: Our linkage strategy combined a search engine, used to reduce the number of comparisons, with machine-learning algorithms. The overall pipeline was evaluated on a file assembled from 3,565 in-hospital deaths and 15,000 living persons.
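The two-stage strategy described above can be illustrated with a minimal sketch. It is not the authors' implementation: the record fields, the exact-match index on last name (standing in for the search engine), and the string-similarity score (standing in for the trained machine-learning models) are all assumptions made for illustration.

```python
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical death-certificate records; field names are assumptions.
death_certificates = [
    {"id": "d1", "last_name": "MARTIN", "first_name": "JEAN", "birth_date": "1940-03-12"},
    {"id": "d2", "last_name": "MARTIN", "first_name": "JEANNE", "birth_date": "1952-07-01"},
    {"id": "d3", "last_name": "DURAND", "first_name": "PAUL", "birth_date": "1938-11-30"},
]

# Stage 1 (stand-in for the search engine): index certificates by last name
# so each hospital record is compared only with a few candidates.
index = defaultdict(list)
for cert in death_certificates:
    index[cert["last_name"]].append(cert)

def candidates(record):
    """Return the certificates that share the record's last name."""
    return index.get(record["last_name"], [])

# Stage 2 (stand-in for the machine-learning models): score each candidate
# pair with the mean string similarity over the compared fields.
def similarity(a, b):
    fields = ("last_name", "first_name", "birth_date")
    return sum(SequenceMatcher(None, a[f], b[f]).ratio() for f in fields) / len(fields)

def best_match(record):
    """Return (score, certificate id) of the best candidate, or (0.0, None)."""
    scored = [(similarity(record, c), c["id"]) for c in candidates(record)]
    return max(scored, default=(0.0, None))

hospital_record = {"last_name": "MARTIN", "first_name": "JEAN", "birth_date": "1940-03-12"}
score, cert_id = best_match(hospital_record)
```

In this toy example only two of the three certificates are scored for the hospital record, which is the point of the first stage: candidate retrieval keeps the number of pairwise comparisons small before any scoring takes place.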

RESULTS: The recall and precision of our linkage strategy were 97.5% and 99.97%, respectively, at the upper threshold, and 99.4% and 98.9% at the lower threshold.
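To show how precision and recall shift between an upper and a lower decision threshold, here is a small sketch. The scores, labels, and threshold values are invented for illustration and are not taken from the dataset; the function name `precision_recall` is likewise hypothetical.

```python
from typing import List, Tuple

def precision_recall(scores: List[float], labels: List[int], threshold: float) -> Tuple[float, float]:
    """Compute (precision, recall) when pairs scoring >= threshold are linked."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return precision, recall

# Made-up match scores and ground truth (1 = true link, 0 = not a link).
scores = [0.99, 0.95, 0.80, 0.60, 0.55, 0.20, 0.10]
labels = [1, 1, 1, 1, 0, 0, 0]

upper = precision_recall(scores, labels, 0.9)  # stricter: fewer links accepted
lower = precision_recall(scores, labels, 0.5)  # looser: more links accepted
```

With these toy values the stricter threshold links fewer pairs, which raises precision and lowers recall, while the looser threshold does the opposite. This is the same direction of trade-off as the figures reported above.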

CONCLUSION: In this article, we demonstrated the feasibility of accurately linking hospital records with death certificates using a search engine and machine learning.

Usage Notes

README file included