# s3o4d
- **Description**:

The dataset was first described in the "Stanford 3D Objects" section of the paper
[Disentangling by Subspace Diffusion](https://arxiv.org/abs/2006.12982). The
data consists of 100,000 renderings each of the Bunny and Dragon objects from
the
[Stanford 3D Scanning Repository](http://graphics.stanford.edu/data/3Dscanrep/).
More objects may be added in the future, but only the Bunny and Dragon are used
in the paper. Each object is rendered with an illumination direction sampled
uniformly from the 2-sphere and a 3D rotation sampled uniformly at random. The
true latent states are provided as NumPy arrays along with the images. The
lighting is given as a unit-norm 3-vector, while the rotation is provided both
as a quaternion and as a 3x3 orthogonal matrix.
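As a minimal sketch of how the images and latent states can be accessed (assuming a standard TensorFlow Datasets installation; the split names match the table further below):

```python
import tensorflow_datasets as tfds

# Load the Bunny training split; each example is a dict containing the
# rendered image and its true latent state (illumination and pose).
ds = tfds.load('s3o4d', split='bunny_train')

for example in ds.take(1):
    print(example['image'].shape)         # (256, 256, 3), uint8
    print(example['illumination'].shape)  # (3,), unit-norm float32
    print(example['pose_quat'].shape)     # (4,), unit quaternion
    print(example['pose_mat'].shape)      # (3, 3), rotation matrix
```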
There are many similarities between S3O4D and existing ML benchmark datasets
like [NORB](https://cs.nyu.edu/~ylclab/data/norb-v1.0/),
[3D Chairs](https://github.com/mathieuaubry/seeing3Dchairs),
[3D Shapes](https://github.com/deepmind/3d-shapes) and many others, which also
include renderings of a set of objects under different pose and illumination
conditions. However, none of these existing datasets include the *full manifold*
of rotations in 3D; most include only a subset of changes to elevation and
azimuth. S3O4D images are sampled uniformly and independently from the full
space of rotations and illuminations, meaning the dataset contains objects that
are upside down and illuminated from behind or underneath. We believe that this
makes S3O4D uniquely suited for research on generative models where the latent
space has non-trivial topology, as well as for general manifold learning methods
where the curvature of the manifold is important.
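The sampling scheme itself is straightforward to reproduce. The sketch below is illustrative only (plain NumPy, not the authors' rendering code): a normalized 3D Gaussian sample gives a uniform illumination direction on the 2-sphere, and a normalized 4D Gaussian sample gives a uniform unit quaternion, i.e. a rotation drawn uniformly from SO(3).

```python
import numpy as np

def sample_illumination(rng):
    """Uniform point on the 2-sphere: normalize a 3D Gaussian sample."""
    v = rng.normal(size=3)
    return v / np.linalg.norm(v)

def sample_rotation(rng):
    """Uniform rotation in SO(3): a normalized 4D Gaussian sample is a
    uniformly distributed unit quaternion (w, x, y, z)."""
    q = rng.normal(size=4)
    q /= np.linalg.norm(q)
    w, x, y, z = q
    # Standard conversion from a unit quaternion to a rotation matrix.
    rot = np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])
    return q, rot

rng = np.random.default_rng(0)
light = sample_illumination(rng)
quat, rot = sample_rotation(rng)
```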
- **Additional Documentation**:
  [Explore on Papers With Code](https://paperswithcode.com/dataset/s3o4d)

- **Homepage**:
  <https://github.com/deepmind/deepmind-research/tree/master/geomancer#stanford-3d-objects-for-disentangling-s3o4d>

- **Source code**:
  [`tfds.datasets.s3o4d.Builder`](https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/datasets/s3o4d/s3o4d_dataset_builder.py)

- **Versions**:
  - **`1.0.0`** (default): Initial release.

- **Download size**: `911.68 MiB`

- **Dataset size**: `1.01 GiB`

- **Auto-cached** ([documentation](https://www.tensorflow.org/datasets/performances#auto-caching)): No

- **Splits**:

| Split            | Examples |
|------------------|----------|
| `'bunny_test'`   | 20,000   |
| `'bunny_train'`  | 80,000   |
| `'dragon_test'`  | 20,000   |
| `'dragon_train'` | 80,000   |
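The four splits compose with the standard TFDS split API, so the two objects can be combined or subsampled without any custom code; a small sketch:

```python
import tensorflow_datasets as tfds

# Combine both objects' training splits into one dataset.
train = tfds.load('s3o4d', split='bunny_train+dragon_train')

# Or take the first 10% of the Dragon test split.
dragon_sub = tfds.load('s3o4d', split='dragon_test[:10%]')
```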
- **Feature structure**:

```python
FeaturesDict({
    'illumination': Tensor(shape=(3,), dtype=float32),
    'image': Image(shape=(256, 256, 3), dtype=uint8),
    'label': ClassLabel(shape=(), dtype=int64, num_classes=2),
    'pose_mat': Tensor(shape=(3, 3), dtype=float32),
    'pose_quat': Tensor(shape=(4,), dtype=float32),
})
```
- **Feature documentation**:

| Feature      | Class        | Shape         | Dtype   | Description |
|--------------|--------------|---------------|---------|-------------|
|              | FeaturesDict |               |         |             |
| illumination | Tensor       | (3,)          | float32 |             |
| image        | Image        | (256, 256, 3) | uint8   |             |
| label        | ClassLabel   |               | int64   |             |
| pose_mat     | Tensor       | (3, 3)        | float32 |             |
| pose_quat    | Tensor       | (4,)          | float32 |             |

- **Supervised keys** (See
  [`as_supervised` doc](https://www.tensorflow.org/datasets/api_docs/python/tfds/load#args)):
  `None`
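As a sanity check on the latent annotations described above (unit-norm illumination, unit quaternion, orthogonal rotation matrix), a short verification sketch using `tfds.as_numpy`:

```python
import numpy as np
import tensorflow_datasets as tfds

ds = tfds.load('s3o4d', split='bunny_test')
for ex in tfds.as_numpy(ds.take(100)):
    # Illumination and quaternion should be unit-norm.
    assert np.isclose(np.linalg.norm(ex['illumination']), 1.0, atol=1e-4)
    assert np.isclose(np.linalg.norm(ex['pose_quat']), 1.0, atol=1e-4)
    # The pose matrix should be orthogonal: R @ R.T == I.
    R = ex['pose_mat']
    assert np.allclose(R @ R.T, np.eye(3), atol=1e-4)
```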

- **Citation**:

```bibtex
@article{pfau2020disentangling,
  title={Disentangling by Subspace Diffusion},
  author={Pfau, David and Higgins, Irina and Botev, Aleksandar and Racani\`ere, S{\'e}bastian},
  journal={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2020}
}
```