# cityscapes
Warning: Manual download required. See instructions below.
Cityscapes is a dataset consisting of diverse urban street scenes recorded in 50
different cities at varying times of the year, together with ground truths for
several vision tasks including semantic segmentation, instance-level
segmentation (TODO), and stereo pair disparity inference.
For segmentation tasks (default split, accessible via
'cityscapes/semantic_segmentation'), Cityscapes provides dense pixel-level
annotations for 5,000 images at 1024 x 2048 resolution, pre-split into training
(2,975), validation (500), and test (1,525) sets. Label annotations for
segmentation tasks span 30+ classes commonly encountered in driving
scene perception. Detailed label information may be found here:
https://github.com/mcordts/cityscapesScripts/blob/master/cityscapesscripts/helpers/labels.py#L52-L99
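The labels.py table linked above maps each raw label id to a train id used for the common 19-class evaluation protocol. As an illustrative sketch (the mapping entries below are only a small subset of the full table; see labels.py for all rows), remapping raw ids to train ids can look like this:

```python
# Subset of the id -> trainId mapping from cityscapesScripts' labels.py.
# The full table has ~34 rows; ids not used for evaluation map to 255.
ID_TO_TRAIN_ID = {
    7: 0,    # road
    8: 1,    # sidewalk
    11: 2,   # building
    23: 10,  # sky
    24: 11,  # person
    26: 13,  # car
}
IGNORE_TRAIN_ID = 255

def to_train_ids(raw_ids):
    """Remap a flat list of raw label ids to train ids (unmapped -> 255)."""
    return [ID_TO_TRAIN_ID.get(i, IGNORE_TRAIN_ID) for i in raw_ids]

print(to_train_ids([7, 26, 0]))  # -> [0, 13, 255]
```

In practice the same lookup would be applied elementwise to the full (1024, 2048, 1) `segmentation_label` array rather than to a Python list.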
Cityscapes also provides coarse-grained segmentation annotations (accessible via
'cityscapes/semantic_segmentation_extra') for 19,998 images in a 'train_extra'
split, which may prove useful for pretraining or data-heavy models.
Besides segmentation, Cityscapes also provides stereo image pairs and ground
truths for disparity inference tasks on both the normal and extra splits
(accessible via 'cityscapes/stereo_disparity' and
'cityscapes/stereo_disparity_extra' respectively).
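The disparity maps are stored as encoded image pixels rather than raw disparity values. Assuming the encoding described in the Cityscapes download README (not stated on this page: a pixel value p > 0 encodes disparity as (p - 1) / 256, and p == 0 marks an invalid measurement), decoding a single pixel might look like:

```python
def decode_disparity(p):
    """Decode one encoded disparity pixel value.

    Assumes the Cityscapes README convention: p == 0 means 'no valid
    measurement'; otherwise the true disparity is (p - 1) / 256.
    """
    if p == 0:
        return None  # invalid / missing measurement
    return (p - 1) / 256.0

print(decode_disparity(0))    # -> None
print(decode_disparity(257))  # -> 1.0
```

Check this convention against the README shipped with the disparity zips before relying on it.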
Ignored examples:

- For 'cityscapes/stereo_disparity_extra':
  - troisdorf_000000000073 {*} images (no disparity map present)
Warning: this dataset requires users to set up a login and password in order to
get the files.

- **Homepage**: https://www.cityscapes-dataset.com

- **Manual download instructions**: This dataset requires you to download the
  source data manually into `download_config.manual_dir` (defaults to
  `~/tensorflow_datasets/downloads/manual/`). Download the files from
  https://www.cityscapes-dataset.com/login/ (registration required). For the
  basic config (semantic_segmentation) you must download
  'leftImg8bit_trainvaltest.zip' and 'gtFine_trainvaltest.zip'. Other configs
  require additional files; please see the code for more details.
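Once the required zips are in place, each config can be loaded by name. A minimal sketch, assuming `tensorflow_datasets` is installed and the manual download step above has been completed:

```python
# The four builder configs exposed by this dataset.
CONFIGS = (
    "cityscapes/semantic_segmentation",        # default config, fine labels
    "cityscapes/semantic_segmentation_extra",  # coarse labels, train_extra split
    "cityscapes/stereo_disparity",
    "cityscapes/stereo_disparity_extra",
)

def load_split(config, split="train"):
    """Load one Cityscapes split; only works after the manual download."""
    import tensorflow_datasets as tfds  # deferred import: heavy dependency
    return tfds.load(config, split=split)

# Example (requires the downloaded zips in the manual directory):
# ds = load_split(CONFIGS[0], split="validation")
```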
@inproceedings{Cordts2016Cityscapes,
  title={The Cityscapes Dataset for Semantic Urban Scene Understanding},
  author={Cordts, Marius and Omran, Mohamed and Ramos, Sebastian and Rehfeld, Timo and Enzweiler, Markus and Benenson, Rodrigo and Franke, Uwe and Roth, Stefan and Schiele, Bernt},
  booktitle={Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2016}
}
cityscapes/semantic_segmentation (default config)
-------------------------------------------------

- **Config description**: Cityscapes semantic segmentation dataset.

- **Dataset size**: `10.86 GiB`

- **Splits**:

| Split          | Examples |
|----------------|----------|
| `'test'`       | 1,525    |
| `'train'`      | 2,975    |
| `'validation'` | 500      |

- **Feature structure**:

    FeaturesDict({
        'image_id': Text(shape=(), dtype=string),
        'image_left': Image(shape=(1024, 2048, 3), dtype=uint8),
        'segmentation_label': Image(shape=(1024, 2048, 1), dtype=uint8),
    })
- **Feature documentation**:

| Feature            | Class        | Shape           | Dtype  | Description |
|--------------------|--------------|-----------------|--------|-------------|
|                    | FeaturesDict |                 |        |             |
| image_id           | Text         |                 | string |             |
| image_left         | Image        | (1024, 2048, 3) | uint8  |             |
| segmentation_label | Image        | (1024, 2048, 1) | uint8  |             |
cityscapes/semantic_segmentation_extra
--------------------------------------

- **Config description**: Cityscapes semantic segmentation dataset with
  train_extra split and coarse labels.

- **Dataset size**: `51.92 GiB`

- **Splits**:

| Split           | Examples |
|-----------------|----------|
| `'train'`       | 2,975    |
| `'train_extra'` | 19,998   |
| `'validation'`  | 500      |

- **Feature structure**:

    FeaturesDict({
        'image_id': Text(shape=(), dtype=string),
        'image_left': Image(shape=(1024, 2048, 3), dtype=uint8),
        'segmentation_label': Image(shape=(1024, 2048, 1), dtype=uint8),
    })
- **Feature documentation**:

| Feature            | Class        | Shape           | Dtype  | Description |
|--------------------|--------------|-----------------|--------|-------------|
|                    | FeaturesDict |                 |        |             |
| image_id           | Text         |                 | string |             |
| image_left         | Image        | (1024, 2048, 3) | uint8  |             |
| segmentation_label | Image        | (1024, 2048, 1) | uint8  |             |
cityscapes/stereo_disparity
---------------------------

- **Config description**: Cityscapes stereo image and disparity maps dataset.

- **Dataset size**: `25.03 GiB`

- **Splits**:

| Split          | Examples |
|----------------|----------|
| `'test'`       | 1,525    |
| `'train'`      | 2,975    |
| `'validation'` | 500      |

- **Feature structure**:

    FeaturesDict({
        'disparity_map': Image(shape=(1024, 2048, 1), dtype=uint8),
        'image_id': Text(shape=(), dtype=string),
        'image_left': Image(shape=(1024, 2048, 3), dtype=uint8),
        'image_right': Image(shape=(1024, 2048, 3), dtype=uint8),
    })
- **Feature documentation**:

| Feature       | Class        | Shape           | Dtype  | Description |
|---------------|--------------|-----------------|--------|-------------|
|               | FeaturesDict |                 |        |             |
| disparity_map | Image        | (1024, 2048, 1) | uint8  |             |
| image_id      | Text         |                 | string |             |
| image_left    | Image        | (1024, 2048, 3) | uint8  |             |
| image_right   | Image        | (1024, 2048, 3) | uint8  |             |
cityscapes/stereo_disparity_extra
---------------------------------

- **Config description**: Cityscapes stereo image and disparity maps dataset
  with train_extra split.

- **Dataset size**: `119.18 GiB`

- **Splits**:

| Split           | Examples |
|-----------------|----------|
| `'train'`       | 2,975    |
| `'train_extra'` | 19,997   |
| `'validation'`  | 500      |

- **Feature structure**:

    FeaturesDict({
        'disparity_map': Image(shape=(1024, 2048, 1), dtype=uint8),
        'image_id': Text(shape=(), dtype=string),
        'image_left': Image(shape=(1024, 2048, 3), dtype=uint8),
        'image_right': Image(shape=(1024, 2048, 3), dtype=uint8),
    })
- **Feature documentation**:

| Feature       | Class        | Shape           | Dtype  | Description |
|---------------|--------------|-----------------|--------|-------------|
|               | FeaturesDict |                 |        |             |
| disparity_map | Image        | (1024, 2048, 1) | uint8  |             |
| image_id      | Text         |                 | string |             |
| image_left    | Image        | (1024, 2048, 3) | uint8  |             |
| image_right   | Image        | (1024, 2048, 3) | uint8  |             |
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License , and code samples are licensed under the Apache 2.0 License . For details, see the Google Developers Site Policies . Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2022-12-06 UTC.