- Description:
The HellaSwag dataset is a benchmark for commonsense NLI. Each example includes a context and a set of candidate endings that complete the context.
Additional Documentation: Explore on Papers With Code
Homepage: https://rowanzellers.com/hellaswag/
Source code: tfds.text.Hellaswag
Versions:
- 0.0.1: No release notes.
- 1.0.0: Adding separate splits for in-domain and out-of-domain validation/test sets.
- 1.1.0 (default): Another split dimension for source (wikihow vs activitynet).
Download size: 68.18 MiB
Dataset size: 107.45 MiB
Auto-cached (documentation): Yes
Splits:
| Split | Examples |
|---|---|
| 'test' | 10,003 |
| 'test_ind_activitynet' | 1,870 |
| 'test_ind_wikihow' | 3,132 |
| 'test_ood_activitynet' | 1,651 |
| 'test_ood_wikihow' | 3,350 |
| 'train' | 39,905 |
| 'train_activitynet' | 14,740 |
| 'train_wikihow' | 25,165 |
| 'validation' | 10,042 |
| 'validation_ind_activitynet' | 1,809 |
| 'validation_ind_wikihow' | 3,192 |
| 'validation_ood_activitynet' | 1,434 |
| 'validation_ood_wikihow' | 3,607 |
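
Any of the named splits above can be requested individually through `tfds.load`. A minimal sketch, assuming the `tensorflow-datasets` and `tensorflow` packages are installed:

```python
import tensorflow_datasets as tfds

# Load the default config (1.1.0) training split together with dataset metadata.
# Any split name from the table above (e.g. 'validation_ind_wikihow') works here.
ds, info = tfds.load('hellaswag', split='train', with_info=True)

print(info.splits['train'].num_examples)  # 39,905, matching the table above
```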
- Feature structure:
FeaturesDict({
'activity_label': Text(shape=(), dtype=string),
'context': Text(shape=(), dtype=string),
'endings': Sequence(Text(shape=(), dtype=string)),
'label': int32,
'source_id': Text(shape=(), dtype=string),
'split_type': Text(shape=(), dtype=string),
})
- Feature documentation:
| Feature | Class | Shape | Dtype | Description |
|---|---|---|---|---|
| | FeaturesDict | | | |
| activity_label | Text | | string | |
| context | Text | | string | |
| endings | Sequence(Text) | (None,) | string | |
| label | Tensor | | int32 | |
| source_id | Text | | string | |
| split_type | Text | | string | |
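
A minimal sketch of reading these fields from a loaded split (field names follow the feature structure documented above; string features are returned as bytes and need decoding):

```python
import tensorflow_datasets as tfds

ds = tfds.load('hellaswag', split='validation')

# Inspect a single example as NumPy values.
for example in tfds.as_numpy(ds.take(1)):
    context = example['context'].decode('utf-8')               # the premise text
    endings = [e.decode('utf-8') for e in example['endings']]  # candidate completions
    label = int(example['label'])                               # index of the correct ending
    print(context)
    print(endings[label])
```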
Supervised keys (See as_supervised doc): None
Figure (tfds.show_examples): Not supported.
Examples (tfds.as_dataframe):
- Citation:
@inproceedings{zellers2019hellaswag,
title={HellaSwag: Can a Machine Really Finish Your Sentence?},
author={Zellers, Rowan and Holtzman, Ari and Bisk, Yonatan and Farhadi, Ali and Choi, Yejin},
booktitle ={Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics},
year={2019}
}