# gref
**Warning:** Manual download required. See instructions below.
**Description**: The Google RefExp dataset is a collection of text
descriptions of objects in images; it builds on the publicly available
MS-COCO dataset. Whereas the image captions in MS-COCO apply to the entire
image, this dataset focuses on text descriptions that uniquely identify a
single object or region within an image. For details, see the paper
Generation and Comprehension of Unambiguous Object Descriptions.
**Additional documentation**: [Explore on Papers With Code](https://paperswithcode.com/dataset/google-refexp)

**Homepage**: https://github.com/mjhucla/Google_Refexp_toolbox

**Source code**: [`tfds.vision_language.gref.Gref`](https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/vision_language/gref/gref.py)

**Versions**: `1.0.0` (default): Initial release.

**Download size**: `Unknown size`

**Dataset size**: `4.60 GiB`

**Manual download instructions**: This dataset requires you to download the
source data manually into `download_config.manual_dir` (defaults to
`~/tensorflow_datasets/downloads/manual/`). Follow the instructions at
https://github.com/mjhucla/Google_Refexp_toolbox to download and pre-process
the data into an aligned format with COCO. The manual directory must contain
two files and one folder:

- google_refexp_train_201511_coco_aligned_catg.json
- google_refexp_val_201511_coco_aligned_catg.json
- coco_train2014/

The coco_train2014 folder contains all of the COCO 2014 training images.
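Once the files are in place, the dataset can be prepared and loaded with the
standard TFDS builder API. A minimal sketch, assuming the default manual
directory noted above:

    import tensorflow_datasets as tfds

    # Point the builder at the directory holding the two aligned JSON files
    # and the coco_train2014/ folder.
    builder = tfds.builder('gref')
    builder.download_and_prepare(
        download_config=tfds.download.DownloadConfig(
            manual_dir='~/tensorflow_datasets/downloads/manual/'))
    ds = builder.as_dataset(split='train')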
**Splits**:

| Split          | Examples |
|----------------|----------|
| `'train'`      | 24,698   |
| `'validation'` | 4,650    |
**Feature structure**:

    FeaturesDict({
        'image': Image(shape=(None, None, 3), dtype=uint8),
        'image/id': int64,
        'objects': Sequence({
            'area': int64,
            'bbox': BBoxFeature(shape=(4,), dtype=float32),
            'id': int64,
            'label': int64,
            'label_name': ClassLabel(shape=(), dtype=int64, num_classes=80),
            'refexp': Sequence({
                'raw': Text(shape=(), dtype=string),
                'referent': Text(shape=(), dtype=string),
                'refexp_id': int64,
                'tokens': Sequence(Text(shape=(), dtype=string)),
            }),
        }),
    })
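A minimal inspection sketch of this structure, assuming the dataset has
already been prepared as described in the manual download instructions above.
Exactly how nested fields come back (dense vs. `tf.RaggedTensor`) can depend
on the TFDS version:

    import tensorflow_datasets as tfds

    ds = tfds.load('gref', split='validation')
    for example in ds.take(1):
        image = example['image']          # uint8 tensor of shape (h, w, 3)
        objects = example['objects']
        print(objects['bbox'].shape)      # (num_objects, 4)
        # Each object carries its own list of referring expressions; doubly
        # nested fields are typically returned as tf.RaggedTensor.
        print(objects['refexp']['raw'])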
**Feature documentation**:

| Feature                  | Class          | Shape           | Dtype   | Description |
|--------------------------|----------------|-----------------|---------|-------------|
|                          | FeaturesDict   |                 |         |             |
| image                    | Image          | (None, None, 3) | uint8   |             |
| image/id                 | Tensor         |                 | int64   |             |
| objects                  | Sequence       |                 |         |             |
| objects/area             | Tensor         |                 | int64   |             |
| objects/bbox             | BBoxFeature    | (4,)            | float32 |             |
| objects/id               | Tensor         |                 | int64   |             |
| objects/label            | Tensor         |                 | int64   |             |
| objects/label_name       | ClassLabel     |                 | int64   |             |
| objects/refexp           | Sequence       |                 |         |             |
| objects/refexp/raw       | Text           |                 | string  |             |
| objects/refexp/referent  | Text           |                 | string  |             |
| objects/refexp/refexp_id | Tensor         |                 | int64   |             |
| objects/refexp/tokens    | Sequence(Text) | (None,)         | string  |             |
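The `bbox` values follow the standard `tfds.features.BBox` convention: four
floats normalized to [0, 1] in `[ymin, xmin, ymax, xmax]` order. A small
sketch for recovering pixel coordinates (the helper name is ours, not part
of the dataset API):

    import tensorflow as tf

    def bbox_to_pixels(bbox, image):
        # bbox holds normalized [ymin, xmin, ymax, xmax]; scale by the
        # image's height and width to recover pixel coordinates.
        h = tf.cast(tf.shape(image)[0], tf.float32)
        w = tf.cast(tf.shape(image)[1], tf.float32)
        ymin, xmin, ymax, xmax = tf.unstack(bbox)
        return ymin * h, xmin * w, ymax * h, xmax * w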
**Citation**:

    @inproceedings{mao2016generation,
      title={Generation and Comprehension of Unambiguous Object Descriptions},
      author={Mao, Junhua and Huang, Jonathan and Toshev, Alexander and Camburu, Oana and Yuille, Alan and Murphy, Kevin},
      booktitle={CVPR},
      year={2016}
    }