tatoeba

Description:

This data is extracted from the Tatoeba corpus, dated Saturday, 17 November 2018.

For each language, we have selected 1,000 English sentences and their translations, if available. Please check the Artetxe and Schwenk paper cited below for a description of the languages, their families and scripts, as well as baseline results.

Please note that the English sentences are not identical for all language pairs. This means that the results are not directly comparable across languages.
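
Because each language pair draws its own sample of English sentences, the English sides of two configs may only partly overlap. The following is a minimal sketch of how one might measure that overlap. The function name `english_overlap` is hypothetical, and the sentences are made-up toy data, not dataset content.

```python
def english_overlap(pairs_a, pairs_b):
    """Return the Jaccard overlap of the English sides of two pair lists.

    Each argument is a list of (english_sentence, translation) tuples.
    """
    english_a = {en for en, _ in pairs_a}
    english_b = {en for en, _ in pairs_b}
    union = english_a | english_b
    if not union:
        return 0.0
    return len(english_a & english_b) / len(union)


# Toy data for illustration only; not taken from the dataset.
toy_de = [("Hello.", "Hallo."), ("Thank you.", "Danke."), ("Good night.", "Gute Nacht.")]
toy_fr = [("Hello.", "Bonjour."), ("See you soon.", "À bientôt."), ("Good night.", "Bonne nuit.")]

overlap = english_overlap(toy_de, toy_fr)
print(f"English-side overlap: {overlap:.2f}")
```

An overlap well below 1.0 between two configs means a score on one pair is computed over different test sentences than a score on the other, which is why per-language results should be read side by side with care.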

Homepage: http://opus.nlpl.eu/Tatoeba.php
Source code: tfds.datasets.tatoeba.Builder
Versions:
- 1.0.0 (default): Initial release.
Auto-cached (documentation): Yes
Feature structure:

FeaturesDict({
    'source_language': Text(shape=(), dtype=string),
    'source_sentence': Text(shape=(), dtype=string),
    'target_language': Text(shape=(), dtype=string),
    'target_sentence': Text(shape=(), dtype=string),
})
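
When a dataset is iterated as NumPy values (for instance through tfds.as_numpy), Text features typically arrive as UTF-8 encoded bytes rather than Python strings. The sketch below decodes one example with the four fields listed above. The helper name `decode_example` and the sample values are hypothetical and chosen only for illustration.

```python
FIELDS = ("source_language", "source_sentence", "target_language", "target_sentence")


def decode_example(example):
    """Decode the four Text fields of one example from bytes to str."""
    decoded = {}
    for name in FIELDS:
        value = example[name]
        # Bytes are decoded as UTF-8; values that are already str pass through.
        decoded[name] = value.decode("utf-8") if isinstance(value, bytes) else value
    return decoded


# Made-up example in the shape of the FeaturesDict; not real dataset content.
raw = {
    "source_language": b"en",
    "source_sentence": b"Have a nice day.",
    "target_language": b"de",
    "target_sentence": "Schönen Tag.".encode("utf-8"),
}

example = decode_example(raw)
print(example["target_sentence"])
```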

Feature documentation:

Feature	Class	Dtype
	FeaturesDict
source_language	Text	string
source_sentence	Text	string
target_language	Text	string
target_sentence	Text	string

Supervised keys (see as_supervised doc): None
Figure (tfds.show_examples): Not supported.
Citation:

@article{tatoeba,
          title={Massively Multilingual Sentence Embeddings for Zero-Shot
                   Cross-Lingual Transfer and Beyond},
          author={Artetxe, Mikel and Schwenk, Holger},
          journal={arXiv:1812.10464v2},
          year={2018}
}

@InProceedings{TIEDEMANN12.463,
  author = {J{\"o}rg Tiedemann},
  title = {Parallel Data, Tools and Interfaces in OPUS},
  booktitle = {Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)},
  year = {2012},
  month = {may},
  date = {23-25},
  address = {Istanbul, Turkey},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Mehmet Ugur Dogan and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {978-2-9517408-7-7},
  language = {english}
}

tatoeba/tatoeba_af (default config)

Download size: 58.24 KiB
Dataset size: 162.74 KiB
Splits:

Split	Examples
`'train'`	1,000

Examples (tfds.as_dataframe):
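
Each config on this page is named tatoeba/tatoeba_ followed by a language code. The following is a minimal sketch of loading one config and previewing it with tfds.as_dataframe, assuming tensorflow_datasets (and pandas) are installed and the data can be downloaded. The helper names `config_name` and `preview` are hypothetical.

```python
def config_name(language_code):
    """Build the TFDS name for one language config, e.g. 'tatoeba/tatoeba_af'."""
    return f"tatoeba/tatoeba_{language_code}"


def preview(language_code, n=5):
    """Load the 'train' split of one config and return its first n rows as a DataFrame."""
    # Imported lazily so that config_name() works without TensorFlow installed.
    import tensorflow_datasets as tfds

    ds, info = tfds.load(config_name(language_code), split="train", with_info=True)
    return tfds.as_dataframe(ds.take(n), info)


print(config_name("af"))
```

Calling `preview("af")` would trigger the download on first use. Since the supervised keys are None, examples come back as dictionaries keyed by the four feature names rather than as (input, label) tuples.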

tatoeba/tatoeba_ar

Download size: 70.95 KiB
Dataset size: 175.46 KiB
Splits:

Split	Examples
`'train'`	1,000

Examples (tfds.as_dataframe):

tatoeba/tatoeba_bg

Download size: 99.88 KiB
Dataset size: 204.64 KiB
Splits:

Split	Examples
`'train'`	1,000

Examples (tfds.as_dataframe):

tatoeba/tatoeba_bn

Download size: 89.55 KiB
Dataset size: 194.24 KiB
Splits:

Split	Examples
`'train'`	1,000

Examples (tfds.as_dataframe):

tatoeba/tatoeba_de

Download size: 103.09 KiB
Dataset size: 207.93 KiB
Splits:

Split	Examples
`'train'`	1,000

Examples (tfds.as_dataframe):

tatoeba/tatoeba_el

Download size: 77.11 KiB
Dataset size: 181.65 KiB
Splits:

Split	Examples
`'train'`	1,000

Examples (tfds.as_dataframe):

tatoeba/tatoeba_es

Download size: 70.57 KiB
Dataset size: 175.12 KiB
Splits:

Split	Examples
`'train'`	1,000

Examples (tfds.as_dataframe):

tatoeba/tatoeba_et

Download size: 58.33 KiB
Dataset size: 162.85 KiB
Splits:

Split	Examples
`'train'`	1,000

Examples (tfds.as_dataframe):

tatoeba/tatoeba_eu

Download size: 64.52 KiB
Dataset size: 169.02 KiB
Splits:

Split	Examples
`'train'`	1,000

Examples (tfds.as_dataframe):

tatoeba/tatoeba_fa

Download size: 91.52 KiB
Dataset size: 196.15 KiB
Splits:

Split	Examples
`'train'`	1,000

Examples (tfds.as_dataframe):

tatoeba/tatoeba_fi

Download size: 73.90 KiB
Dataset size: 178.47 KiB
Splits:

Split	Examples
`'train'`	1,000

Examples (tfds.as_dataframe):

tatoeba/tatoeba_fr

Download size: 78.14 KiB
Dataset size: 182.68 KiB
Splits:

Split	Examples
`'train'`	1,000

Examples (tfds.as_dataframe):

tatoeba/tatoeba_he

Download size: 81.54 KiB
Dataset size: 186.15 KiB
Splits:

Split	Examples
`'train'`	1,000

Examples (tfds.as_dataframe):

tatoeba/tatoeba_hi

Download size: 119.69 KiB
Dataset size: 224.89 KiB
Splits:

Split	Examples
`'train'`	1,000

Examples (tfds.as_dataframe):

tatoeba/tatoeba_hu

Download size: 67.27 KiB
Dataset size: 171.78 KiB
Splits:

Split	Examples
`'train'`	1,000

Examples (tfds.as_dataframe):

tatoeba/tatoeba_id

Download size: 73.09 KiB
Dataset size: 177.61 KiB
Splits:

Split	Examples
`'train'`	1,000

Examples (tfds.as_dataframe):

tatoeba/tatoeba_it

Download size: 64.29 KiB
Dataset size: 168.81 KiB
Splits:

Split	Examples
`'train'`	1,000

Examples (tfds.as_dataframe):

tatoeba/tatoeba_ja

Download size: 90.90 KiB
Dataset size: 195.53 KiB
Splits:

Split	Examples
`'train'`	1,000

Examples (tfds.as_dataframe):

tatoeba/tatoeba_jv

Download size: 13.59 KiB
Dataset size: 35.01 KiB
Splits:

Split	Examples
`'train'`	205

Examples (tfds.as_dataframe):

tatoeba/tatoeba_ka

Download size: 70.47 KiB
Dataset size: 148.67 KiB
Splits:

Split	Examples
`'train'`	746

Examples (tfds.as_dataframe):

tatoeba/tatoeba_kk

Download size: 46.07 KiB
Dataset size: 106.25 KiB
Splits:

Split	Examples
`'train'`	575

Examples (tfds.as_dataframe):

tatoeba/tatoeba_ko

Download size: 77.28 KiB
Dataset size: 181.88 KiB
Splits:

Split	Examples
`'train'`	1,000

Examples (tfds.as_dataframe):

tatoeba/tatoeba_ml

Download size: 92.50 KiB
Dataset size: 165.14 KiB
Splits:

Split	Examples
`'train'`	687

Examples (tfds.as_dataframe):

tatoeba/tatoeba_mr

Download size: 98.19 KiB
Dataset size: 202.96 KiB
Splits:

Split	Examples
`'train'`	1,000

Examples (tfds.as_dataframe):

tatoeba/tatoeba_nl

Download size: 71.55 KiB
Dataset size: 176.10 KiB
Splits:

Split	Examples
`'train'`	1,000

Examples (tfds.as_dataframe):

tatoeba/tatoeba_pt

Download size: 73.42 KiB
Dataset size: 177.95 KiB
Splits:

Split	Examples
`'train'`	1,000

Examples (tfds.as_dataframe):

tatoeba/tatoeba_ru

Download size: 90.30 KiB
Dataset size: 194.92 KiB
Splits:

Split	Examples
`'train'`	1,000

Examples (tfds.as_dataframe):

tatoeba/tatoeba_sw

Download size: 19.99 KiB
Dataset size: 60.75 KiB
Splits:

Split	Examples
`'train'`	390

Examples (tfds.as_dataframe):

tatoeba/tatoeba_ta

Download size: 38.52 KiB
Dataset size: 70.93 KiB
Splits:

Split	Examples
`'train'`	307

Examples (tfds.as_dataframe):

tatoeba/tatoeba_te

Download size: 24.55 KiB
Dataset size: 49.07 KiB
Splits:

Split	Examples
`'train'`	234

Examples (tfds.as_dataframe):

tatoeba/tatoeba_th

Download size: 61.72 KiB
Dataset size: 119.32 KiB
Splits:

Split	Examples
`'train'`	548

Examples (tfds.as_dataframe):

tatoeba/tatoeba_tl

Download size: 66.54 KiB
Dataset size: 171.04 KiB
Splits:

Split	Examples
`'train'`	1,000

Examples (tfds.as_dataframe):

tatoeba/tatoeba_tr

Download size: 70.20 KiB
Dataset size: 174.70 KiB
Splits:

Split	Examples
`'train'`	1,000

Examples (tfds.as_dataframe):

tatoeba/tatoeba_ur

Download size: 86.63 KiB
Dataset size: 191.20 KiB
Splits:

Split	Examples
`'train'`	1,000

Examples (tfds.as_dataframe):

tatoeba/tatoeba_vi

Download size: 89.26 KiB
Dataset size: 193.89 KiB
Splits:

Split	Examples
`'train'`	1,000

Examples (tfds.as_dataframe):

tatoeba/tatoeba_zh

Download size: 67.32 KiB
Dataset size: 171.85 KiB
Splits:

Split	Examples
`'train'`	1,000

Examples (tfds.as_dataframe):

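
Most configs listed above hold 1,000 examples in their single 'train' split, while eight hold fewer. As a convenience sketch, the per-config counts from this page can be gathered into one mapping. The variable names are chosen for illustration and are not part of TFDS.

```python
# Configs whose 'train' split has fewer than 1,000 examples, as listed above.
SMALL_CONFIGS = {
    "jv": 205, "ka": 746, "kk": 575, "ml": 687,
    "sw": 390, "ta": 307, "te": 234, "th": 548,
}

# Configs whose 'train' split has exactly 1,000 examples, as listed above.
FULL_CONFIGS = [
    "af", "ar", "bg", "bn", "de", "el", "es", "et", "eu", "fa",
    "fi", "fr", "he", "hi", "hu", "id", "it", "ja", "ko", "mr",
    "nl", "pt", "ru", "tl", "tr", "ur", "vi", "zh",
]

# Merge both groups into one language-code -> example-count mapping.
train_sizes = {code: 1000 for code in FULL_CONFIGS}
train_sizes.update(SMALL_CONFIGS)

total_examples = sum(train_sizes.values())
smallest = min(train_sizes, key=train_sizes.get)

print(f"{len(train_sizes)} configs, {total_examples} examples in total")
print(f"Smallest config: tatoeba_{smallest} ({train_sizes[smallest]} examples)")
```

Scores on the smaller configs rest on far fewer sentence pairs, so they are likely to be noisier than scores on the 1,000-example configs.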