Description:
This dataset contains ILSVRC-2012 (ImageNet) validation images annotated with
multi-class labels from
"Evaluating Machine Accuracy on ImageNet",
ICML, 2020. The multi-class labels were reviewed by a panel of experts
extensively trained in the intricacies of fine-grained class distinctions in the
ImageNet class hierarchy (see paper for more details). Compared to the original
labels, these expert-reviewed multi-class labels enable a more semantically
coherent evaluation of accuracy.
Version 3.0.0 of this dataset contains more corrected labels from
"When does dough become a bagel? Analyzing the remaining mistakes on ImageNet"
(https://arxiv.org/abs/2205.04596), as well as the ImageNet-Major (ImageNet-M)
68-example split under 'imagenet_m'.

Only 20,000 of the 50,000 ImageNet validation images have multi-label
annotations. The set of multi-labels was first generated by a testbed of 67
trained ImageNet models, and then each individual model prediction was manually
annotated by the experts as either correct (the label is correct for the
image), wrong (the label is incorrect for the image), or unclear (no
consensus was reached among the experts).
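For concreteness, these annotation fields can be inspected directly once the
dataset has been prepared. This is a minimal sketch; the field names follow the
dataset's feature structure listed further below:

import tensorflow_datasets as tfds

# Load the annotated validation split (requires the manual download
# described below).
ds = tfds.load('imagenet2012_multilabel', split='validation')

# Inspect the expert annotations attached to a single example.
for example in ds.take(1):
    print(example['file_name'].numpy())             # validation image file name
    print(example['correct_multi_labels'].numpy())  # class indices judged correct
    print(example['wrong_multi_labels'].numpy())    # class indices judged incorrect
    print(example['unclear_multi_labels'].numpy())  # class indices with no expert consensus
    print(example['is_problematic'].numpy())        # True if the panel flagged the image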
Additionally, during annotation, the expert panel identified a set of
problematic images. An image was problematic if it met any of the below
criteria:
- The original ImageNet label (top-1 label) was incorrect or unclear
- Image was a drawing, painting, sketch, cartoon, or computer-rendered
- Image was excessively edited
- Image had inappropriate content
The problematic images are included in this dataset but should be ignored when
computing multi-label accuracy. Additionally, since the initial set of 20,000
annotations is class-balanced, but the set of problematic images is not, we
recommend computing the per-class accuracies and then averaging them. We also
recommend counting a prediction as correct if it is marked as correct or unclear
(i.e., being lenient with the unclear labels).
One possible way of doing this is with the following NumPy code:
import tensorflow_datasets as tfds

ds = tfds.load('imagenet2012_multilabel', split='validation')

# We assume that predictions is a dictionary from file_name to a class index
# between 0 and 999.

num_correct_per_class = {}
num_images_per_class = {}

for example in ds:
    # We ignore all problematic images
    if example['is_problematic'].numpy():
        continue

    # The label of the image in ImageNet
    cur_class = example['original_label'].numpy()

    # If we haven't processed this class yet, set the counters to 0
    if cur_class not in num_correct_per_class:
        num_correct_per_class[cur_class] = 0
        assert cur_class not in num_images_per_class
        num_images_per_class[cur_class] = 0

    num_images_per_class[cur_class] += 1

    # Get the prediction for this image
    cur_pred = predictions[example['file_name'].numpy()]

    # We count a prediction as correct if it is marked as correct or unclear
    # (i.e., we are lenient with the unclear labels)
    if (cur_pred in example['correct_multi_labels'].numpy()
            or cur_pred in example['unclear_multi_labels'].numpy()):
        num_correct_per_class[cur_class] += 1

# Check that we have collected accuracy data for each of the 1,000 classes
num_classes = 1000
assert len(num_correct_per_class) == num_classes
assert len(num_images_per_class) == num_classes

# Compute the per-class accuracies and then average them
final_avg = 0
for cid in range(num_classes):
    assert cid in num_correct_per_class
    assert cid in num_images_per_class
    final_avg += num_correct_per_class[cid] / num_images_per_class[cid]
final_avg /= num_classes
Homepage:
https://github.com/modestyachts/evaluating_machine_accuracy_on_imagenet
Source code:
tfds.datasets.imagenet2012_multilabel.Builder
Versions:
- 1.0.0: Initial release.
- 2.0.0: Fixed ILSVRC2012_img_val.tar file.
- 3.0.0 (default): Corrected labels and ImageNet-M split.
Download size: 191.13 MiB
Dataset size: 2.50 GiB
Manual download instructions: This dataset requires you to
download the source data manually into download_config.manual_dir
(defaults to ~/tensorflow_datasets/downloads/manual/):
manual_dir should contain the ILSVRC2012_img_val.tar file.
You need to register on http://www.image-net.org/download-images in order
to get the link to download the dataset.
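As a sketch of how TFDS can be pointed at the manually downloaded archive, the
builder can be prepared with an explicit manual_dir (this assumes the default
manual directory; adjust the path if yours differs):

import os
import tensorflow_datasets as tfds

# Assumes ILSVRC2012_img_val.tar has already been placed in the manual
# directory (the TFDS default is shown here).
manual_dir = os.path.expanduser('~/tensorflow_datasets/downloads/manual/')
builder = tfds.builder('imagenet2012_multilabel')
builder.download_and_prepare(
    download_config=tfds.download.DownloadConfig(manual_dir=manual_dir))
ds = builder.as_dataset(split='validation')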
Auto-cached: No
Splits:

Split          | Examples
---------------|---------
'imagenet_m'   | 68
'validation'   | 20,000

Feature structure:

FeaturesDict({
    'correct_multi_labels': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=1000)),
    'file_name': Text(shape=(), dtype=string),
    'image': Image(shape=(None, None, 3), dtype=uint8),
    'is_problematic': bool,
    'original_label': ClassLabel(shape=(), dtype=int64, num_classes=1000),
    'unclear_multi_labels': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=1000)),
    'wrong_multi_labels': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=1000)),
})

Supervised keys: ('image', 'correct_multi_labels')
Citation:

@article{shankar2019evaluating,
  title={Evaluating Machine Accuracy on {ImageNet}},
  author={Vaishaal Shankar* and Rebecca Roelofs* and Horia Mania and Alex Fang and Benjamin Recht and Ludwig Schmidt},
  journal={ICML},
  year={2020},
  note={\url{http://proceedings.mlr.press/v119/shankar20c.html}}
}
@article{ImageNetChallenge,
  title={{ImageNet} large scale visual recognition challenge},
  author={Olga Russakovsky and Jia Deng and Hao Su and Jonathan Krause and Sanjeev Satheesh and Sean Ma and Zhiheng Huang and Andrej Karpathy and Aditya Khosla and Michael Bernstein and Alexander C. Berg and Fei-Fei Li},
  journal={International Journal of Computer Vision},
  year={2015},
  note={\url{https://arxiv.org/abs/1409.0575}}
}
@inproceedings{ImageNet,
  author={Jia Deng and Wei Dong and Richard Socher and Li-Jia Li and Kai Li and Li Fei-Fei},
  booktitle={Conference on Computer Vision and Pattern Recognition (CVPR)},
  title={{ImageNet}: A large-scale hierarchical image database},
  year={2009},
  note={\url{http://www.image-net.org/papers/imagenet_cvpr09.pdf}}
}
@article{vasudevan2022does,
  title={When does dough become a bagel? Analyzing the remaining mistakes on ImageNet},
  author={Vasudevan, Vijay and Caine, Benjamin and Gontijo-Lopes, Raphael and Fridovich-Keil, Sara and Roelofs, Rebecca},
  journal={arXiv preprint arXiv:2205.04596},
  year={2022}
}
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Missing the information I need","missingTheInformationINeed","thumb-down"],["Too complicated / too many steps","tooComplicatedTooManySteps","thumb-down"],["Out of date","outOfDate","thumb-down"],["Samples / code issue","samplesCodeIssue","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2022-12-10 UTC."],[],[],null,["# imagenet2012_multilabel\n\n\u003cbr /\u003e\n\n| **Warning:** Manual download required. See instructions below.\n\n- **Description**:\n\nThis dataset contains ILSVRC-2012 (ImageNet) validation images annotated with\nmulti-class labels from\n[\"Evaluating Machine Accuracy on ImageNet\"](http://proceedings.mlr.press/v119/shankar20c/shankar20c.pdf),\nICML, 2020. The multi-class labels were reviewed by a panel of experts\nextensively trained in the intricacies of fine-grained class distinctions in the\nImageNet class hierarchy (see paper for more details). Compared to the original\nlabels, these expert-reviewed multi-class labels enable a more semantically\ncoherent evaluation of accuracy.\n\nVersion 3.0.0 of this dataset contains more corrected labels from\n[\"When does dough become a bagel? Analyzing the remaining mistakes on ImageNet](https://arxiv.org/abs/2205.04596)\nas well as the ImageNet-Major (ImageNet-M) 68-example split under 'imagenet-m'.\n\nOnly 20,000 of the 50,000 ImageNet validation images have multi-label\nannotations. The set of multi-labels was first generated by a testbed of 67\ntrained ImageNet models, and then each individual model prediction was manually\nannotated by the experts as either `correct` (the label is correct for the\nimage),`wrong` (the label is incorrect for the image), or `unclear` (no\nconsensus was reached among the experts).\n\nAdditionally, during annotation, the expert panel identified a set of\n*problematic images*. An image was problematic if it met any of the below\ncriteria:\n\n- The original ImageNet label (top-1 label) was incorrect or unclear\n- Image was a drawing, painting, sketch, cartoon, or computer-rendered\n- Image was excessively edited\n- Image had inappropriate content\n\nThe problematic images are included in this dataset but should be ignored when\ncomputing multi-label accuracy. Additionally, since the initial set of 20,000\nannotations is class-balanced, but the set of problematic images is not, we\nrecommend computing the per-class accuracies and then averaging them. 
We also\nrecommend counting a prediction as correct if it is marked as correct or unclear\n(i.e., being lenient with the unclear labels).\n\nOne possible way of doing this is with the following NumPy code: \n\n import tensorflow_datasets as tfds\n\n ds = tfds.load('imagenet2012_multilabel', split='validation')\n\n # We assume that predictions is a dictionary from file_name to a class index between 0 and 999\n\n num_correct_per_class = {}\n num_images_per_class = {}\n\n for example in ds:\n # We ignore all problematic images\n if example['is_problematic'].numpy():\n continue\n\n # The label of the image in ImageNet\n cur_class = example['original_label'].numpy()\n\n # If we haven't processed this class yet, set the counters to 0\n if cur_class not in num_correct_per_class:\n num_correct_per_class[cur_class] = 0\n assert cur_class not in num_images_per_class\n num_images_per_class[cur_class] = 0\n\n num_images_per_class[cur_class] += 1\n\n # Get the predictions for this image\n cur_pred = predictions[example['file_name'].numpy()]\n\n # We count a prediction as correct if it is marked as correct or unclear\n # (i.e., we are lenient with the unclear labels)\n if cur_pred is in example['correct_multi_labels'].numpy() or cur_pred is in example['unclear_multi_labels'].numpy():\n num_correct_per_class[cur_class] += 1\n\n # Check that we have collected accuracy data for each of the 1,000 classes\n num_classes = 1000\n assert len(num_correct_per_class) == num_classes\n assert len(num_images_per_class) == num_classes\n\n # Compute the per-class accuracies and then average them\n final_avg = 0\n for cid in range(num_classes):\n assert cid in num_correct_per_class\n assert cid in num_images_per_class\n final_avg += num_correct_per_class[cid] / num_images_per_class[cid]\n final_avg /= num_classes\n\n- **Homepage** :\n \u003chttps://github.com/modestyachts/evaluating_machine_accuracy_on_imagenet\u003e\n\n- **Source code** :\n [`tfds.datasets.imagenet2012_multilabel.Builder`](https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/datasets/imagenet2012_multilabel/imagenet2012_multilabel_dataset_builder.py)\n\n- **Versions**:\n\n - `1.0.0`: Initial release.\n - `2.0.0`: Fixed ILSVRC2012_img_val.tar file.\n - **`3.0.0`** (default): Corrected labels and ImageNet-M split.\n- **Download size** : `191.13 MiB`\n\n- **Dataset size** : `2.50 GiB`\n\n- **Manual download instructions** : This dataset requires you to\n download the source data manually into `download_config.manual_dir`\n (defaults to `~/tensorflow_datasets/downloads/manual/`): \n\n manual_dir should contain `ILSVRC2012_img_val.tar` file.\n You need to register on \u003chttp://www.image-net.org/download-images\u003e in order\n to get the link to download the dataset.\n\n- **Auto-cached**\n ([documentation](https://www.tensorflow.org/datasets/performances#auto-caching)):\n No\n\n- **Splits**:\n\n| Split | Examples |\n|----------------|----------|\n| `'imagenet_m'` | 68 |\n| `'validation'` | 20,000 |\n\n- **Feature structure**:\n\n FeaturesDict({\n 'correct_multi_labels': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=1000)),\n 'file_name': Text(shape=(), dtype=string),\n 'image': Image(shape=(None, None, 3), dtype=uint8),\n 'is_problematic': bool,\n 'original_label': ClassLabel(shape=(), dtype=int64, num_classes=1000),\n 'unclear_multi_labels': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=1000)),\n 'wrong_multi_labels': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=1000)),\n })\n\n- **Feature 
documentation**:\n\n| Feature | Class | Shape | Dtype | Description |\n|----------------------|----------------------|-----------------|--------|-------------|\n| | FeaturesDict | | | |\n| correct_multi_labels | Sequence(ClassLabel) | (None,) | int64 | |\n| file_name | Text | | string | |\n| image | Image | (None, None, 3) | uint8 | |\n| is_problematic | Tensor | | bool | |\n| original_label | ClassLabel | | int64 | |\n| unclear_multi_labels | Sequence(ClassLabel) | (None,) | int64 | |\n| wrong_multi_labels | Sequence(ClassLabel) | (None,) | int64 | |\n\n- **Supervised keys** (See\n [`as_supervised` doc](https://www.tensorflow.org/datasets/api_docs/python/tfds/load#args)):\n `('image', 'correct_multi_labels')`\n\n- **Figure**\n ([tfds.show_examples](https://www.tensorflow.org/datasets/api_docs/python/tfds/visualization/show_examples)):\n\n- **Examples** ([tfds.as_dataframe](https://www.tensorflow.org/datasets/api_docs/python/tfds/as_dataframe)):\n\nDisplay examples... \n\n- **Citation**:\n\n @article{shankar2019evaluating,\n title={Evaluating Machine Accuracy on ImageNet},\n author={Vaishaal Shankar* and Rebecca Roelofs* and Horia Mania and Alex Fang and Benjamin Recht and Ludwig Schmidt},\n journal={ICML},\n year={2020},\n note={\\url{http://proceedings.mlr.press/v119/shankar20c.html} }\n }\n @article{ImageNetChallenge,\n title={ {ImageNet} large scale visual recognition challenge},\n author={Olga Russakovsky and Jia Deng and Hao Su and Jonathan Krause\n and Sanjeev Satheesh and Sean Ma and Zhiheng Huang and Andrej Karpathy and Aditya Khosla and Michael Bernstein and\n Alexander C. Berg and Fei-Fei Li},\n journal={International Journal of Computer Vision},\n year={2015},\n note={\\url{https://arxiv.org/abs/1409.0575} }\n }\n @inproceedings{ImageNet,\n author={Jia Deng and Wei Dong and Richard Socher and Li-Jia Li and Kai Li and Li Fei-Fei},\n booktitle={Conference on Computer Vision and Pattern Recognition (CVPR)},\n title={ {ImageNet}: A large-scale hierarchical image database},\n year={2009},\n note={\\url{http://www.image-net.org/papers/imagenet_cvpr09.pdf} }\n }\n @article{vasudevan2022does,\n title={When does dough become a bagel? Analyzing the remaining mistakes on ImageNet},\n author={Vasudevan, Vijay and Caine, Benjamin and Gontijo-Lopes, Raphael and Fridovich-Keil, Sara and Roelofs, Rebecca},\n journal={arXiv preprint arXiv:2205.04596},\n year={2022}\n }"]]