CompERBench: A collection of 21 complete benchmark tasks for entity matching. CompERBench: Complementing Entity Matching Benchmark Tasks

Entity matching is the task of determining which records from different data sources describe the same real-world entity. It is an important task for data integration and has been the focus of many research works. A large number of entity matching benchmark tasks have been developed and made publicly available for evaluating, comparing, and reproducing matching methods and for showing their strengths. However, the lack of fixed development and test sets, of correspondence sets that include both matching and non-matching record pairs, and of baseline results hinders reproducibility and comparability. In an effort to enhance the reproducibility and comparability of matching methods, we complement existing benchmark tasks for entity matching with fixed development and test sets. We provide 21 complete benchmark tasks for entity matching for public download. The selected tasks are highly diverse and include data sets that differ in size, number of attributes, density, and attribute data types, as well as in the number of sources from which they originate.
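To illustrate why a fixed test set with both matching and non-matching pairs matters, the following is a minimal sketch of how a matcher could be scored against such a set. The records, pair labels, Jaccard token similarity, threshold and function names are all hypothetical assumptions chosen for illustration; they are not taken from CompERBench and do not reflect its file layout or baseline methods.

```python
from typing import List, Tuple

# Hypothetical toy records from two sources (not CompERBench data).
source_a = {
    "a1": "apple iphone 11 64gb black",
    "a2": "samsung galaxy s10 128gb",
}
source_b = {
    "b1": "iphone 11 apple 64gb",
    "b2": "galaxy s10 samsung 128gb blue",
    "b3": "nokia 3310 classic",
}

# A fixed test set of labeled pairs: (id in A, id in B, is_match).
# It contains both matching and non-matching pairs.
test_pairs: List[Tuple[str, str, bool]] = [
    ("a1", "b1", True),
    ("a2", "b2", True),
    ("a1", "b2", False),
    ("a2", "b3", False),
]


def jaccard(left: str, right: str) -> float:
    """Token-level Jaccard similarity of two whitespace-separated strings."""
    left_tokens, right_tokens = set(left.split()), set(right.split())
    union = left_tokens | right_tokens
    return len(left_tokens & right_tokens) / len(union) if union else 0.0


def predict(a_id: str, b_id: str, threshold: float = 0.5) -> bool:
    """Assumed toy matcher: a pair matches if similarity reaches the threshold."""
    return jaccard(source_a[a_id], source_b[b_id]) >= threshold


def evaluate(pairs: List[Tuple[str, str, bool]],
             threshold: float = 0.5) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) of the toy matcher on labeled pairs."""
    tp = fp = fn = 0
    for a_id, b_id, is_match in pairs:
        predicted = predict(a_id, b_id, threshold)
        if predicted and is_match:
            tp += 1
        elif predicted and not is_match:
            fp += 1
        elif not predicted and is_match:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


if __name__ == "__main__":
    print(evaluate(test_pairs))
```

Because every method is scored on the same labeled pairs, results obtained this way can be compared directly, which is not possible when each evaluation draws its own split.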

Identifier
DOI https://doi.org/10.7801/348
Related Identifier References https://doi.org/10.1145/3340531.3412781
Related Identifier IsDocumentedBy https://madoc.bib.uni-mannheim.de/id/eprint/57249
Metadata Access https://api.datacite.org/dois/10.7801/348
Provenance
Creator Primpeli, Anna; Bizer, Christian
Publisher Mannheim University Library
Publication Year 2020
OpenAccess true
Representation
Resource Type Dataset
Format application/zip
Size 132649259
Version 1
Discipline Social Sciences