wit

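Description:

The Wikipedia-based Image Text (WIT) Dataset is a large multimodal multilingual dataset. WIT is composed of a curated set of 37.6 million entity-rich image-text examples with 11.5 million unique images across 108 Wikipedia languages. Its size enables WIT to be used as a pretraining dataset for multimodal machine learning models.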

Additional Documentation: Explore on Papers With Code
Homepage: https://github.com/google-research-datasets/wit/
Source code: tfds.vision_language.wit.Wit
Versions:
- 1.0.0: Initial release. It loads the WIT dataset from https://storage.googleapis.com/gresearch/wit/
- 1.1.0 (default): Added `val` and `test` splits.
Download size: 25.20 GiB
Dataset size: 81.17 GiB
Auto-cached (documentation): No
Splits:

Split	Examples
`'test'`	210,166
`'train'`	37,046,386
`'val'`	261,024
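
The split sizes above can be used to reason about how much data a given split or sub-split covers. The following is a minimal sketch, assuming the dataset is registered in TFDS under the name `wit` (inferred from the source path `tfds.vision_language.wit.Wit`). The names `SPLIT_SIZES`, `split_fraction`, `subsplit`, and `load_wit` are hypothetical helpers introduced here for illustration. `tensorflow_datasets` is imported lazily inside `load_wit`, so the arithmetic helpers run without it.

```python
# Split sizes copied from the table above (version 1.1.0).
SPLIT_SIZES = {"test": 210_166, "train": 37_046_386, "val": 261_024}


def split_fraction(split: str) -> float:
    """Return the share of all examples that fall in `split`."""
    return SPLIT_SIZES[split] / sum(SPLIT_SIZES.values())


def subsplit(split: str, percent: int) -> str:
    """Build a TFDS slicing string such as 'train[:1%]'."""
    if split not in SPLIT_SIZES:
        raise ValueError(f"unknown split: {split!r}")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    return f"{split}[:{percent}%]"


def load_wit(split: str, version: str = "1.1.0"):
    """Load one split of WIT; preparing the data downloads about 25 GiB."""
    import tensorflow_datasets as tfds  # lazy import: only needed here

    # Assumption: the dataset is registered under the name 'wit'.
    return tfds.load(f"wit:{version}", split=split)
```

For example, `load_wit(subsplit("train", 1))` would read roughly the first 1% of the training examples. As far as we understand, the slice is applied at read time, so the full dataset still has to be downloaded and prepared first.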

Feature structure:

FeaturesDict({
    'attribution_passes_lang_id': bool,
    'caption_alt_text_description': Text(shape=(), dtype=string),
    'caption_attribution_description': Text(shape=(), dtype=string),
    'caption_reference_description': Text(shape=(), dtype=string),
    'context_page_description': Text(shape=(), dtype=string),
    'context_section_description': Text(shape=(), dtype=string),
    'hierarchical_section_title': Text(shape=(), dtype=string),
    'image_url': Text(shape=(), dtype=string),
    'is_main_image': bool,
    'language': Text(shape=(), dtype=string),
    'mime_type': Text(shape=(), dtype=string),
    'original_height': int32,
    'original_width': int32,
    'page_changed_recently': bool,
    'page_title': Text(shape=(), dtype=string),
    'page_url': Text(shape=(), dtype=string),
    'section_title': Text(shape=(), dtype=string),
})
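
Because each example is a flat dictionary with the keys listed above, selecting usable image-text pairs comes down to ordinary dictionary access. The sketch below is illustrative only. It assumes that examples are read through `tfds.as_numpy` (so text arrives as `bytes`), that missing captions are stored as empty strings, and that `language` holds codes such as `en`. The fallback order in `CAPTION_FIELDS` and the helper names are hypothetical choices made here, not something the dataset prescribes.

```python
from typing import Mapping, Optional

# Assumed preference order; not prescribed by the dataset itself.
CAPTION_FIELDS = (
    "caption_reference_description",
    "caption_attribution_description",
    "caption_alt_text_description",
)


def _as_str(value) -> str:
    """Decode bytes (as returned by tfds.as_numpy) into a Python str."""
    return value.decode("utf-8") if isinstance(value, bytes) else str(value)


def pick_caption(example: Mapping[str, object]) -> Optional[str]:
    """Return the first non-empty caption field, or None if all are empty."""
    for field in CAPTION_FIELDS:
        text = _as_str(example.get(field, b"")).strip()
        if text:
            return text
    return None


def keep_example(example: Mapping[str, object], language: str = "en") -> bool:
    """Keep main images in the requested language that have some caption."""
    return (
        _as_str(example["language"]) == language
        and bool(example["is_main_image"])
        and pick_caption(example) is not None
    )
```

A typical use would be a generator expression such as `(ex for ex in tfds.as_numpy(ds) if keep_example(ex))`, where `ds` is a loaded WIT split.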

Feature documentation:

Feature	Class	Dtype
	FeaturesDict
attribution_passes_lang_id	Tensor	bool
caption_alt_text_description	Text	string
caption_attribution_description	Text	string
caption_reference_description	Text	string
context_page_description	Text	string
context_section_description	Text	string
hierarchical_section_title	Text	string
image_url	Text	string
is_main_image	Tensor	bool
language	Text	string
mime_type	Text	string
original_height	Tensor	int32
original_width	Tensor	int32
page_changed_recently	Tensor	bool
page_title	Text	string
page_url	Text	string
section_title	Text	string

Supervised keys (see as_supervised doc): None
Figure (tfds.show_examples): Not supported.
Examples (tfds.as_dataframe):
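
Since the supervised keys are `None`, `as_supervised=True` cannot be used with this dataset, and any (input, target) pairing has to be built by hand. The sketch below shows one possible way to do that, together with a small preview helper based on `tfds.as_dataframe`. The function names and the default choice of `image_url` as input and `caption_reference_description` as target are assumptions made for illustration, as is the dataset name `wit`.

```python
from typing import Any, Mapping, Tuple


def to_supervised_pair(
    example: Mapping[str, Any],
    input_key: str = "image_url",
    target_key: str = "caption_reference_description",
) -> Tuple[Any, Any]:
    """Map a feature dict to an (input, target) tuple."""
    return example[input_key], example[target_key]


def preview(num_examples: int = 4):
    """Return a pandas DataFrame with the first few validation examples."""
    import tensorflow_datasets as tfds  # lazy import; triggers the download

    # Assumption: the dataset is registered under the name 'wit'.
    ds, info = tfds.load("wit", split="val", with_info=True)
    return tfds.as_dataframe(ds.take(num_examples), info)
```

With a loaded split `ds`, the pairing can be applied as `ds.map(to_supervised_pair)`. Note that the dataset stores image URLs rather than pixel data, so fetching the images themselves is a separate step.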

Citation:

@article{srinivasan2021wit,
  title={WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning},
  author={Srinivasan, Krishna and Raman, Karthik and Chen, Jiecao and Bendersky, Michael and Najork, Marc},
  journal={arXiv preprint arXiv:2103.01913},
  year={2021}
}
