CLEVRER-Humans: Describing physical and causal events the human way

Mao, Jiayuan¹; Yang, Xuelin²; Zhang, Xikun²; Goodman, Noah²; Wu, Jiajun²

Published Oct 25, 2022; Updated May 12, 2023 on Dryad. https://doi.org/10.5061/dryad.5tb2rbp7c

Data files

Oct 25, 2022 version files (2.25 MB)

causal_cloze.json (253.37 KB)
README.md (2.21 KB)
train_ceg_data.p (621.90 KB)
train_question.json (147.55 KB)
val_ceg_data.p (1.14 MB)
val_question.json (90.16 KB)

May 12, 2023 version files (3.05 MB)

causal_cloze.json (253.37 KB)
README.md (2.29 KB)
train_ceg_data.p (1.23 MB)
train_question.json (310.64 KB)
valid_ceg_data.p (1.18 MB)
valid_question.json (77.73 KB)
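
For readers who want to inspect the files listed above, the following is a minimal Python sketch of one way to load them. It assumes that the `.p` extension denotes a Python pickle, which the listing itself does not confirm, and it makes no assumption about the internal schema of any file; the bundled README.md is the authoritative description. The helper names `load_dataset_file` and `summarize` are hypothetical and not part of the dataset.

```python
import json
import pickle
from pathlib import Path


def load_dataset_file(path):
    """Load a .json file with json, or a .p file with pickle (assumed format)."""
    path = Path(path)
    if path.suffix == ".json":
        with path.open("r", encoding="utf-8") as f:
            return json.load(f)
    if path.suffix == ".p":
        # Assumption: ".p" means a Python pickle. Only unpickle files you trust,
        # since unpickling can execute arbitrary code.
        with path.open("rb") as f:
            return pickle.load(f)
    raise ValueError(f"Unrecognized file extension: {path.suffix!r}")


def summarize(obj):
    """Describe the top-level container without assuming its schema."""
    if isinstance(obj, dict):
        return f"dict with {len(obj)} keys"
    if isinstance(obj, (list, tuple)):
        return f"{type(obj).__name__} with {len(obj)} items"
    return type(obj).__name__


if __name__ == "__main__":
    # File names taken from the May 12, 2023 version; adjust the directory as needed.
    data_dir = Path(".")
    names = [
        "causal_cloze.json",
        "train_question.json",
        "valid_question.json",
        "train_ceg_data.p",
        "valid_ceg_data.p",
    ]
    for name in names:
        file_path = data_dir / name
        if file_path.exists():
            print(name, "->", summarize(load_dataset_file(file_path)))
```

If the pickled objects were saved with custom classes or third-party libraries, loading them may additionally require those packages to be installed. For the Oct 25, 2022 version, the validation files are named `val_ceg_data.p` and `val_question.json` instead.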

Abstract

Building machines that can reason about physical events and their causal relationships is crucial for flexible interaction with the physical world. However, most existing physical and causal reasoning benchmarks are based exclusively on synthetically generated events and synthetic natural language descriptions of causal relationships. This design raises two issues. First, there is a lack of diversity in both event types and natural language descriptions; second, causal relationships based on manually defined heuristics differ from human judgments. To address both shortcomings, we present the CLEVRER-Humans benchmark, a video reasoning dataset for causal judgment of physical events with human labels. We employ two techniques to improve data collection efficiency: first, a novel iterative event cloze task to elicit a new representation of events in videos, which we term Causal Event Graphs (CEGs); second, a data augmentation technique based on neural language generative models. We convert the collected CEGs into questions and answers to be consistent with prior work. Finally, we study a collection of baseline approaches for CLEVRER-Humans question-answering, highlighting the substantial challenges posed by our benchmark.
