stanford_kuka_multimodal_dataset_converted_externally_to_rlds
Kuka iiwa peg insertion with force feedback
| Split   | Examples |
|---------|----------|
| 'train' | 3,000    |
FeaturesDict({
    'episode_metadata': FeaturesDict({
    }),
    'steps': Dataset({
        'action': Tensor(shape=(4,), dtype=float32, description=Robot action, consists of [3x EEF position, 1x gripper open/close].),
        'discount': Scalar(shape=(), dtype=float32, description=Discount if provided, default to 1.),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'language_embedding': Tensor(shape=(512,), dtype=float32, description=Kona language embedding. See https://tfhub.dev/google/universal-sentence-encoder-large/5),
        'language_instruction': Text(shape=(), dtype=string),
        'observation': FeaturesDict({
            'contact': Tensor(shape=(50,), dtype=float32, description=Robot contact information.),
            'depth_image': Tensor(shape=(128, 128, 1), dtype=float32, description=Main depth camera observation.),
            'ee_forces_continuous': Tensor(shape=(50, 6), dtype=float32, description=Robot end-effector forces.),
            'ee_orientation': Tensor(shape=(4,), dtype=float32, description=Robot end-effector orientation quaternion.),
            'ee_orientation_vel': Tensor(shape=(3,), dtype=float32, description=Robot end-effector orientation velocity.),
            'ee_position': Tensor(shape=(3,), dtype=float32, description=Robot end-effector position.),
            'ee_vel': Tensor(shape=(3,), dtype=float32, description=Robot end-effector velocity.),
            'ee_yaw': Tensor(shape=(4,), dtype=float32, description=Robot end-effector yaw.),
            'ee_yaw_delta': Tensor(shape=(4,), dtype=float32, description=Robot end-effector yaw delta.),
            'image': Image(shape=(128, 128, 3), dtype=uint8, description=Main camera RGB observation.),
            'joint_pos': Tensor(shape=(7,), dtype=float32, description=Robot joint positions.),
            'joint_vel': Tensor(shape=(7,), dtype=float32, description=Robot joint velocities.),
            'optical_flow': Tensor(shape=(128, 128, 2), dtype=float32, description=Optical flow.),
            'state': Tensor(shape=(8,), dtype=float32, description=Robot proprioceptive information, [7x joint pos, 1x gripper open/close].),
        }),
        'reward': Scalar(shape=(), dtype=float32, description=Reward if provided, 1 on final step for demos.),
    }),
})
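The nested structure above (episodes containing a 'steps' sequence) can be walked as in the following minimal sketch. It assumes the standard `tfds.load` entry point and that the dataset has already been prepared locally; `load_episodes` and `summarize_episode` are hypothetical helper names, not part of TFDS. The summary helper only relies on the 'reward', 'is_last', and 'is_terminal' fields listed above, so it can be tried on plain dictionaries without downloading anything.

```python
DATASET_NAME = "stanford_kuka_multimodal_dataset_converted_externally_to_rlds"


def load_episodes(split="train", data_dir=None):
    """Return a tf.data.Dataset of episodes.

    Assumption: tensorflow_datasets is installed and the dataset has been
    prepared under data_dir. The import is lazy so the helper below can be
    used without TensorFlow.
    """
    import tensorflow_datasets as tfds

    return tfds.load(DATASET_NAME, split=split, data_dir=data_dir)


def summarize_episode(steps):
    """Summarize one episode given an iterable of step dictionaries."""
    num_steps = 0
    total_reward = 0.0
    ended_terminal = False
    for step in steps:
        num_steps += 1
        total_reward += float(step["reward"])
        # Only the step flagged is_last decides how the episode ended.
        if bool(step["is_last"]):
            ended_terminal = bool(step["is_terminal"])
    return {
        "num_steps": num_steps,
        "total_reward": total_reward,
        "ended_terminal": ended_terminal,
    }


# Stand-in steps with made-up values, used only to exercise the helper.
fake_steps = [
    {"reward": 0.0, "is_last": False, "is_terminal": False},
    {"reward": 1.0, "is_last": True, "is_terminal": True},
]
print(summarize_episode(fake_steps))
```

With real data, the same helper would typically be called as `summarize_episode(episode["steps"].as_numpy_iterator())` for each `episode` yielded by `load_episodes()`; that usage is an assumption based on the usual RLDS layout and is not executed here.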
| Feature | Class | Shape | Dtype | Description |
|---|---|---|---|---|
| | FeaturesDict | | | |
| episode_metadata | FeaturesDict | | | |
| steps | Dataset | | | |
| steps/action | Tensor | (4,) | float32 | Robot action, consists of [3x EEF position, 1x gripper open/close]. |
| steps/discount | Scalar | | float32 | Discount if provided, default to 1. |
| steps/is_first | Tensor | | bool | |
| steps/is_last | Tensor | | bool | |
| steps/is_terminal | Tensor | | bool | |
| steps/language_embedding | Tensor | (512,) | float32 | Kona language embedding. See https://tfhub.dev/google/universal-sentence-encoder-large/5 |
| steps/language_instruction | Text | | string | Language Instruction. |
| steps/observation | FeaturesDict | | | |
| steps/observation/contact | Tensor | (50,) | float32 | Robot contact information. |
| steps/observation/depth_image | Tensor | (128, 128, 1) | float32 | Main depth camera observation. |
| steps/observation/ee_forces_continuous | Tensor | (50, 6) | float32 | Robot end-effector forces. |
| steps/observation/ee_orientation | Tensor | (4,) | float32 | Robot end-effector orientation quaternion. |
| steps/observation/ee_orientation_vel | Tensor | (3,) | float32 | Robot end-effector orientation velocity. |
| steps/observation/ee_position | Tensor | (3,) | float32 | Robot end-effector position. |
| steps/observation/ee_vel | Tensor | (3,) | float32 | Robot end-effector velocity. |
| steps/observation/ee_yaw | Tensor | (4,) | float32 | Robot end-effector yaw. |
| steps/observation/ee_yaw_delta | Tensor | (4,) | float32 | Robot end-effector yaw delta. |
| steps/observation/image | Image | (128, 128, 3) | uint8 | Main camera RGB observation. |
| steps/observation/joint_pos | Tensor | (7,) | float32 | Robot joint positions. |
| steps/observation/joint_vel | Tensor | (7,) | float32 | Robot joint velocities. |
| steps/observation/optical_flow | Tensor | (128, 128, 2) | float32 | Optical flow. |
| steps/observation/state | Tensor | (8,) | float32 | Robot proprioceptive information, [7x joint pos, 1x gripper open/close]. |
| steps/reward | Scalar | | float32 | Reward if provided, 1 on final step for demos. |
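The descriptions of steps/action and steps/observation/state say that both vectors pack a gripper value after the positional entries. The sketch below unpacks them under the assumption that the components are stored in the order the descriptions list them (position or joint entries first, gripper last); the source does not confirm the ordering, units, or whether the EEF position is absolute or a delta. `split_action` and `split_state` are hypothetical helper names.

```python
def split_action(action):
    """Split a 4-element action into EEF position and gripper command.

    Assumption: layout is [x, y, z, gripper], following the order given in
    the feature description.
    """
    if len(action) != 4:
        raise ValueError(f"expected 4 action values, got {len(action)}")
    return {
        "eef_position": [float(v) for v in action[:3]],
        "gripper": float(action[3]),
    }


def split_state(state):
    """Split an 8-element state into joint positions and gripper value.

    Assumption: layout is [7 joint positions, gripper], following the order
    given in the feature description.
    """
    if len(state) != 8:
        raise ValueError(f"expected 8 state values, got {len(state)}")
    return {
        "joint_pos": [float(v) for v in state[:7]],
        "gripper": float(state[7]),
    }


# Made-up values, for illustration only.
print(split_action([0.1, 0.2, 0.3, 1.0]))
print(split_state([0.0] * 7 + [1.0]))
```

Both helpers accept plain lists or NumPy arrays, since they only use `len`, slicing, and `float`.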
@inproceedings{lee2019icra,
title={Making sense of vision and touch: Self-supervised learning of multimodal representations for contact-rich tasks},
author={Lee, Michelle A and Zhu, Yuke and Srinivasan, Krishnan and Shah, Parth and Savarese, Silvio and Fei-Fei, Li and Garg, Animesh and Bohg, Jeannette},
booktitle={2019 IEEE International Conference on Robotics and Automation (ICRA)},
year={2019},
url={https://arxiv.org/abs/1810.10191}
}
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies.
Last updated 2024-12-11 UTC.
Homepage: https://sites.google.com/view/visionandtouch
Source code: tfds.robotics.rtx.StanfordKukaMultimodalDatasetConvertedExternallyToRlds (https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/robotics/rtx/rtx.py)
Versions: 0.1.0 (default): Initial release.
Download size: Unknown size
Dataset size: 31.98 GiB
Auto-cached (documentation: https://www.tensorflow.org/datasets/performances#auto-caching): No
Supervised keys (see as_supervised doc: https://www.tensorflow.org/datasets/api_docs/python/tfds/load#args): None
Figure (tfds.show_examples: https://www.tensorflow.org/datasets/api_docs/python/tfds/visualization/show_examples): Not supported.