cifar10_1
The CIFAR-10.1 dataset is a new test set for CIFAR-10. CIFAR-10.1 contains
roughly 2,000 new test images that were sampled after multiple years of research
on the original CIFAR-10 dataset. The data collection for CIFAR-10.1 was
designed to minimize distribution shift relative to the original dataset. We
describe the creation of CIFAR-10.1 in the paper "Do CIFAR-10 Classifiers
Generalize to CIFAR-10?". The images in CIFAR-10.1 are a subset of the
TinyImages dataset. There are currently two versions of the CIFAR-10.1 dataset:
v4 and v6.
Homepage: https://github.com/modestyachts/CIFAR-10.1
Source code: tfds.image_classification.Cifar10_1
Versions: 1.1.0 (default): No release notes.
Auto-cached: Yes
Supervised keys: ('image', 'label')
FeaturesDict({
'image': Image(shape=(32, 32, 3), dtype=uint8),
'label': ClassLabel(shape=(), dtype=int64, num_classes=10),
})
| Feature | Class        | Shape       | Dtype | Description |
|---------|--------------|-------------|-------|-------------|
|         | FeaturesDict |             |       |             |
| image   | Image        | (32, 32, 3) | uint8 |             |
| label   | ClassLabel   |             | int64 |             |
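The feature structure above suggests how the test split might be loaded. The following is a minimal sketch, assuming `tensorflow_datasets` is installed and the data can be downloaded; the helper names `dataset_name` and `load_test_split` are hypothetical and not part of TFDS.

```python
from typing import Any, Tuple

# Config names and test-split sizes as listed on this page.
CONFIGS = {"v4": 2021, "v6": 2000}


def dataset_name(config: str = "v4") -> str:
    """Build the TFDS name string, e.g. 'cifar10_1/v4' (hypothetical helper)."""
    if config not in CONFIGS:
        raise ValueError(
            f"unknown config {config!r}; expected one of {sorted(CONFIGS)}"
        )
    return f"cifar10_1/{config}"


def load_test_split(config: str = "v4") -> Tuple[Any, Any]:
    """Load the 'test' split as (image, label) pairs, plus dataset metadata."""
    # Imported lazily so dataset_name() works without TensorFlow installed.
    import tensorflow_datasets as tfds

    return tfds.load(
        dataset_name(config),
        split="test",        # the only split this dataset provides
        as_supervised=True,  # yield (image, label) tuples
        with_info=True,      # also return the DatasetInfo object
    )


# Example usage (downloads the data on first call):
#   ds, info = load_test_split("v4")
#   for image, label in ds.take(1):
#       print(image.shape, image.dtype, int(label))
```

With `as_supervised=True`, each element should be an `(image, label)` pair matching the feature structure: a `(32, 32, 3)` `uint8` image and an `int64` label.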
@article{recht2018cifar10.1,
author = {Benjamin Recht and Rebecca Roelofs and Ludwig Schmidt and Vaishaal Shankar},
title = {Do CIFAR-10 Classifiers Generalize to CIFAR-10?},
year = {2018},
note = {\url{https://arxiv.org/abs/1806.00451} },
}
@article{torralba2008tinyimages,
author = {Antonio Torralba and Rob Fergus and William T. Freeman},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
title = {80 Million Tiny Images: A Large Data Set for Nonparametric Object and Scene Recognition},
year = {2008},
volume = {30},
number = {11},
pages = {1958-1970}
}
cifar10_1/v4 (default config)
Config description: It is the first version of our dataset on which we
tested any classifier. As mentioned above, this makes the v4 dataset
independent of the classifiers we evaluate. The numbers reported in the main
sections of our paper use this version of the dataset. It was built from the
top 25 TinyImages keywords for each class, which led to a slight class
imbalance. The largest difference is that ships make up only 8% of the test
set instead of 10%. v4 contains 2,021 images.
Download size: 5.93 MiB
Dataset size: 4.46 MiB
Splits:
| Split  | Examples |
|--------|----------|
| 'test' | 2,021    |
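The imbalance quoted in the v4 description can be turned into approximate image counts. The sketch below is plain arithmetic on the figures given above (2,021 images, 10 classes, ships at 8%). Because the 8% share is a rounded figure, the resulting counts are estimates, not exact values from the dataset.

```python
TOTAL_V4 = 2021   # test images in v4, as stated above
NUM_CLASSES = 10  # CIFAR-10 classes


def approx_count(share: float, total: int = TOTAL_V4) -> int:
    """Approximate number of images for a class holding `share` of the set."""
    return round(share * total)


balanced = approx_count(1 / NUM_CLASSES)  # count if every class held 10%
ships = approx_count(0.08)                # estimate from the rounded 8% share
shortfall = balanced - ships              # ship images "missing" vs. balance

print(balanced, ships, shortfall)
```

Under these assumptions, the ship class falls short of a balanced share by roughly 40 images.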

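cifar10_1/v6
Config description: It is derived from a slightly improved keyword
allocation that is exactly class balanced. This version of the dataset
corresponds to the results in Appendix D of our paper. v6 contains 2,000
images.
Download size: 5.87 MiB
Dataset size: 4.40 MiB
Splits:
| Split  | Examples |
|--------|----------|
| 'test' | 2,000    |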
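v6 is described as exactly class balanced, which for 2,000 images over 10 classes would mean 200 images per class. The following sketch shows one way such a property could be checked on a list of integer labels. The function name `is_exactly_balanced` is hypothetical, and the labels in the usage lines are synthetic stand-ins, not data read from the dataset.

```python
from collections import Counter
from typing import Iterable


def is_exactly_balanced(labels: Iterable[int], num_classes: int = 10) -> bool:
    """Return True if every class in 0..num_classes-1 occurs equally often."""
    counts = Counter(labels)
    # Every class must be present, and no label may fall outside the range.
    if set(counts) != set(range(num_classes)):
        return False
    # All per-class counts must be identical.
    return len(set(counts.values())) == 1


# Synthetic stand-in for v6 labels: 200 examples for each of 10 classes.
synthetic_labels = [c for c in range(10) for _ in range(200)]
print(len(synthetic_labels), is_exactly_balanced(synthetic_labels))
```

On real data, the labels could be collected by iterating over the loaded test split before passing them to the check.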

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2024-06-01 UTC.