grounded_scan
Grounded SCAN (gSCAN) is a synthetic dataset for evaluating compositional
generalization in situated language understanding. gSCAN pairs natural language
instructions with action sequences, and requires the agent to interpret
instructions within the context of a grid-based visual navigation environment.
More information can be found at:

- For the `compositional_splits` and the `target_length_split`:
  https://github.com/LauraRuis/groundedSCAN
- For the `spatial_relation_splits`:
  https://github.com/google-research/language/tree/master/language/gscan/data

Feature structure:
FeaturesDict({
    'command': Sequence(Text(shape=(), dtype=string)),
    'manner': Text(shape=(), dtype=string),
    'meaning': Sequence(Text(shape=(), dtype=string)),
    'referred_target': Text(shape=(), dtype=string),
    'situation': FeaturesDict({
        'agent_direction': int32,
        'agent_position': FeaturesDict({
            'column': int32,
            'row': int32,
        }),
        'direction_to_target': Text(shape=(), dtype=string),
        'distance_to_target': int32,
        'grid_size': int32,
        'placed_objects': Sequence({
            'object': FeaturesDict({
                'color': Text(shape=(), dtype=string),
                'shape': Text(shape=(), dtype=string),
                'size': int32,
            }),
            'position': FeaturesDict({
                'column': int32,
                'row': int32,
            }),
            'vector': Text(shape=(), dtype=string),
        }),
        'target_object': FeaturesDict({
            'object': FeaturesDict({
                'color': Text(shape=(), dtype=string),
                'shape': Text(shape=(), dtype=string),
                'size': int32,
            }),
            'position': FeaturesDict({
                'column': int32,
                'row': int32,
            }),
            'vector': Text(shape=(), dtype=string),
        }),
    }),
    'target_commands': Sequence(Text(shape=(), dtype=string)),
    'verb_in_command': Text(shape=(), dtype=string),
})
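The structure above maps directly onto the nested dictionaries yielded by TFDS. Below is a minimal sketch of loading the default config and inspecting one example; it assumes `tensorflow_datasets` is installed and the dataset can be downloaded and prepared locally (the split and feature names are taken from this page).

```python
import tensorflow_datasets as tfds

# Load the default config (grounded_scan/compositional_splits).
# The first call downloads and prepares the dataset, which may take a while.
ds = tfds.load('grounded_scan', split='train')

for example in ds.take(1):
    # 'command' and 'target_commands' are variable-length string sequences.
    command = [w.decode() for w in example['command'].numpy()]
    target = [w.decode() for w in example['target_commands'].numpy()]
    situation = example['situation']
    print('command:         ', ' '.join(command))
    print('target_commands: ', ' '.join(target))
    print('grid size:       ', int(situation['grid_size'].numpy()))
    print('agent (row, col):',
          int(situation['agent_position']['row'].numpy()),
          int(situation['agent_position']['column'].numpy()))
```

Since the dataset defines no supervised keys, each element is a plain feature dictionary rather than an (input, label) pair.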
| Feature | Class | Shape | Dtype | Description |
|------------------------------------------|----------------|---------|--------|-------------|
| | FeaturesDict | | | |
| command | Sequence(Text) | (None,) | string | |
| manner | Text | | string | |
| meaning | Sequence(Text) | (None,) | string | |
| referred_target | Text | | string | |
| situation | FeaturesDict | | | |
| situation/agent_direction | Tensor | | int32 | |
| situation/agent_position | FeaturesDict | | | |
| situation/agent_position/column | Tensor | | int32 | |
| situation/agent_position/row | Tensor | | int32 | |
| situation/direction_to_target | Text | | string | |
| situation/distance_to_target | Tensor | | int32 | |
| situation/grid_size | Tensor | | int32 | |
| situation/placed_objects | Sequence | | | |
| situation/placed_objects/object | FeaturesDict | | | |
| situation/placed_objects/object/color | Text | | string | |
| situation/placed_objects/object/shape | Text | | string | |
| situation/placed_objects/object/size | Tensor | | int32 | |
| situation/placed_objects/position | FeaturesDict | | | |
| situation/placed_objects/position/column | Tensor | | int32 | |
| situation/placed_objects/position/row | Tensor | | int32 | |
| situation/placed_objects/vector | Text | | string | |
| situation/target_object | FeaturesDict | | | |
| situation/target_object/object | FeaturesDict | | | |
| situation/target_object/object/color | Text | | string | |
| situation/target_object/object/shape | Text | | string | |
| situation/target_object/object/size | Tensor | | int32 | |
| situation/target_object/position | FeaturesDict | | | |
| situation/target_object/position/column | Tensor | | int32 | |
| situation/target_object/position/row | Tensor | | int32 | |
| situation/target_object/vector | Text | | string | |
| target_commands | Sequence(Text) | (None,) | string | |
| verb_in_command | Text | | string | |
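Nested Sequence features such as `situation/placed_objects` come back as dictionaries whose leaves carry one entry per placed object. The sketch below, again assuming a local TFDS setup, shows one way to iterate over them; it uses `tfds.as_numpy` to strip the TensorFlow tensor wrappers.

```python
import tensorflow_datasets as tfds

ds = tfds.load('grounded_scan', split='dev')
example = next(iter(tfds.as_numpy(ds.take(1))))

objects = example['situation']['placed_objects']
# Each leaf is a NumPy array with one entry per placed object.
colors = [c.decode() for c in objects['object']['color']]
shapes = [s.decode() for s in objects['object']['shape']]
sizes = objects['object']['size']
rows = objects['position']['row']
cols = objects['position']['column']
for color, shape, size, r, c in zip(colors, shapes, sizes, rows, cols):
    print(f'size-{size} {color} {shape} at (row={r}, col={c})')

# The target object uses the same nested layout, without the leading dimension.
target = example['situation']['target_object']
print('target:', target['object']['color'].decode(),
      target['object']['shape'].decode())
```

Citation: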
@inproceedings{NEURIPS2020_e5a90182,
author = {Ruis, Laura and Andreas, Jacob and Baroni, Marco and Bouchacourt, Diane and Lake, Brenden M},
booktitle = {Advances in Neural Information Processing Systems},
editor = {H. Larochelle and M. Ranzato and R. Hadsell and M. F. Balcan and H. Lin},
pages = {19861--19872},
publisher = {Curran Associates, Inc.},
title = {A Benchmark for Systematic Generalization in Grounded Language Understanding},
url = {https://proceedings.neurips.cc/paper/2020/file/e5a90182cc81e12ab5e72d66e0b46fe3-Paper.pdf},
volume = {33},
year = {2020}
}
@inproceedings{qiu-etal-2021-systematic,
title = "Systematic Generalization on g{SCAN}: {W}hat is Nearly Solved and What is Next?",
author = "Qiu, Linlu and
Hu, Hexiang and
Zhang, Bowen and
Shaw, Peter and
Sha, Fei",
booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
month = nov,
year = "2021",
address = "Online and Punta Cana, Dominican Republic",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2021.emnlp-main.166",
doi = "10.18653/v1/2021.emnlp-main.166",
pages = "2180--2188",
}
grounded_scan/compositional_splits (default config)

Config description: Examples for compositional generalization.
| Split | Examples |
|-------------------|----------|
| `'adverb_1'` | 112,880 |
| `'adverb_2'` | 38,582 |
| `'contextual'` | 11,460 |
| `'dev'` | 3,716 |
| `'situational_1'` | 88,642 |
| `'situational_2'` | 16,808 |
| `'test'` | 19,282 |
| `'train'` | 367,933 |
| `'visual'` | 37,436 |
| `'visual_easier'` | 18,718 |
grounded_scan/target_length_split

Config description: Examples for generalizing to larger target lengths.
| Split | Examples |
|--------------------|----------|
| `'dev'` | 1,821 |
| `'target_lengths'` | 198,588 |
| `'test'` | 37,784 |
| `'train'` | 180,301 |
grounded_scan/spatial_relation_splits

Config description: Examples for spatial relation reasoning.
| Split | Examples |
|-------------------------|----------|
| `'dev'` | 2,617 |
| `'referent'` | 30,492 |
| `'relation'` | 6,285 |
| `'relative_position_1'` | 41,576 |
| `'relative_position_2'` | 41,529 |
| `'test'` | 28,526 |
| `'train'` | 259,088 |
| `'visual'` | 62,250 |
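The three builder configs are selected with the usual `dataset/config` naming, and any split name from the tables above can be passed to `split=`. A short sketch, under the same local TFDS assumptions as the earlier examples:

```python
import tensorflow_datasets as tfds

# Held-out generalization splits of the default compositional config.
adverb_1 = tfds.load('grounded_scan/compositional_splits', split='adverb_1')

# Longer-target generalization split of the target_length_split config.
target_lengths = tfds.load('grounded_scan/target_length_split',
                           split='target_lengths')

# Spatial relation reasoning split of the spatial_relation_splits config.
relation = tfds.load('grounded_scan/spatial_relation_splits', split='relation')

# Cardinalities should match the example counts listed in the tables above.
for name, ds in [('adverb_1', adverb_1),
                 ('target_lengths', target_lengths),
                 ('relation', relation)]:
    print(name, int(ds.cardinality().numpy()))
```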