CoDEx: A Comprehensive Knowledge Graph Completion Benchmark

We present CoDEx, a set of knowledge graph Completion Datasets Extracted from Wikidata and Wikipedia that improve upon existing knowledge graph completion benchmarks in scope and level of difficulty. In terms of scope, CoDEx comprises three knowledge graphs varying in size and structure, multilingual descriptions of entities and relations, and tens of thousands of hard negative triples that are plausible but verified to be false...

To characterize CoDEx, we contribute thorough empirical analyses and benchmarking experiments. First, we analyze each CoDEx dataset in terms of logical relation patterns. Next, we report baseline link prediction and triple classification results on CoDEx for five extensively tuned embedding models. Finally, we differentiate CoDEx from a popular link prediction benchmark by showing that CoDEx covers more diverse and interpretable content, and contains fewer relation patterns that can be covered by trivial frequency-based rules. Data, code, and pretrained models are available at https://github.com/tsafavi/codex.
