xed_en_fi

en_annotated

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xed_en_fi/en_annotated')
  • Description:
A multilingual fine-grained emotion dataset. The dataset consists of human-annotated Finnish (25k) and English (30k) sentences.
Plutchik's core emotions are used to annotate the dataset, with the addition of neutral, to create a multilabel multiclass
dataset. The dataset is carefully evaluated using language-specific BERT models and SVMs to
show that XED performs on par with other similar datasets and is therefore a useful tool for
sentiment analysis and emotion detection.
  • License: Creative Commons Attribution 4.0 International License (CC-BY)
  • Version: 1.1.0
  • Splits:
Split      Examples
'train'    17528
  • Features:
{
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 9,
            "names": [
                "neutral",
                "anger",
                "anticipation",
                "disgust",
                "fear",
                "joy",
                "sadness",
                "surprise",
                "trust"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
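
Because `labels` in this configuration is a variable-length sequence of class IDs, a common preprocessing step is to turn each example's IDs into a fixed-size multi-hot vector. The following is a minimal sketch of that step in plain Python. The helper names `to_multi_hot` and `to_names` and the sample ID list are hypothetical illustrations, not part of TFDS or the dataset. Only the label names and their order are taken from the feature spec above.

```python
from typing import List, Sequence

# Label names, in the order given by the feature spec above.
LABEL_NAMES = [
    "neutral", "anger", "anticipation", "disgust", "fear",
    "joy", "sadness", "surprise", "trust",
]


def to_multi_hot(label_ids: Sequence[int],
                 num_classes: int = len(LABEL_NAMES)) -> List[int]:
    """Convert a list of class IDs into a 0/1 vector of length num_classes."""
    vec = [0] * num_classes
    for i in label_ids:
        if not 0 <= i < num_classes:
            raise ValueError(f"label id {i} is outside [0, {num_classes})")
        vec[i] = 1
    return vec


def to_names(label_ids: Sequence[int]) -> List[str]:
    """Map class IDs to their human-readable label names."""
    return [LABEL_NAMES[i] for i in label_ids]


# Hypothetical label list for one sentence (not real data from the dataset).
example_labels = [1, 4]
print(to_names(example_labels))      # ['anger', 'fear']
print(to_multi_hot(example_labels))  # [0, 1, 0, 0, 1, 0, 0, 0, 0]
```

A multi-hot target of this shape is what a sigmoid-output multilabel classifier would typically expect. Whether that suits a given model is a design choice outside the scope of this page.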

en_neutral

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xed_en_fi/en_neutral')
  • Description:
A multilingual fine-grained emotion dataset. The dataset consists of human-annotated Finnish (25k) and English (30k) sentences.
Plutchik's core emotions are used to annotate the dataset, with the addition of neutral, to create a multilabel multiclass
dataset. The dataset is carefully evaluated using language-specific BERT models and SVMs to
show that XED performs on par with other similar datasets and is therefore a useful tool for
sentiment analysis and emotion detection.
  • License: Creative Commons Attribution 4.0 International License (CC-BY)
  • Version: 1.1.0
  • Splits:
Split      Examples
'train'    9675
  • Features:
{
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "num_classes": 9,
        "names": [
            "neutral",
            "anger",
            "anticipation",
            "disgust",
            "fear",
            "joy",
            "sadness",
            "surprise",
            "trust"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}
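
In this configuration `labels` is a single `ClassLabel`, whereas the annotated configurations wrap the `ClassLabel` in a `Sequence`. The sketch below shows one way to tell the two layouts apart by inspecting the `_type` field of a feature spec. The function name `class_label_spec` is a hypothetical helper, and the JSON is a copy of the spec above, embedded so the example runs on its own.

```python
import json
from typing import Any, Dict, List, Tuple

# Copy of the feature spec shown above for this configuration.
NEUTRAL_FEATURES = json.loads("""
{
    "sentence": {"dtype": "string", "id": null, "_type": "Value"},
    "labels": {
        "num_classes": 9,
        "names": ["neutral", "anger", "anticipation", "disgust", "fear",
                  "joy", "sadness", "surprise", "trust"],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}
""")


def class_label_spec(features: Dict[str, Any]) -> Tuple[bool, List[str]]:
    """Return (is_multi_label, label_names) for a feature spec dict."""
    labels = features["labels"]
    if labels["_type"] == "Sequence":
        # Multi-label layout: the ClassLabel is nested under "feature".
        return True, labels["feature"]["names"]
    if labels["_type"] == "ClassLabel":
        # Single-label layout: one class ID per sentence.
        return False, labels["names"]
    raise ValueError(f"unexpected labels type: {labels['_type']!r}")


is_multi, names = class_label_spec(NEUTRAL_FEATURES)
print(is_multi, len(names))
```

Code that handles several configurations can branch on the first returned value instead of hard-coding which configuration uses which layout.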

fi_annotated

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xed_en_fi/fi_annotated')
  • Description:
A multilingual fine-grained emotion dataset. The dataset consists of human-annotated Finnish (25k) and English (30k) sentences.
Plutchik's core emotions are used to annotate the dataset, with the addition of neutral, to create a multilabel multiclass
dataset. The dataset is carefully evaluated using language-specific BERT models and SVMs to
show that XED performs on par with other similar datasets and is therefore a useful tool for
sentiment analysis and emotion detection.
  • License: Creative Commons Attribution 4.0 International License (CC-BY)
  • Version: 1.1.0
  • Splits:
Split      Examples
'train'    14449
  • Features:
{
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 9,
            "names": [
                "neutral",
                "anger",
                "anticipation",
                "disgust",
                "fear",
                "joy",
                "sadness",
                "surprise",
                "trust"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

fi_neutral

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xed_en_fi/fi_neutral')
  • Description:
A multilingual fine-grained emotion dataset. The dataset consists of human-annotated Finnish (25k) and English (30k) sentences.
Plutchik's core emotions are used to annotate the dataset, with the addition of neutral, to create a multilabel multiclass
dataset. The dataset is carefully evaluated using language-specific BERT models and SVMs to
show that XED performs on par with other similar datasets and is therefore a useful tool for
sentiment analysis and emotion detection.
  • License: Creative Commons Attribution 4.0 International License (CC-BY)
  • Version: 1.1.0
  • Splits:
Split      Examples
'train'    10794
  • Features:
{
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "num_classes": 9,
        "names": [
            "neutral",
            "anger",
            "anticipation",
            "disgust",
            "fear",
            "joy",
            "sadness",
            "surprise",
            "trust"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}
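
The four configurations share one naming pattern, `huggingface:xed_en_fi/<config>`, and each exposes only a `train` split. The sketch below collects the configuration names and example counts listed on this page, builds the TFDS name for a configuration, and wraps `tfds.load` in a small function. The helper names (`tfds_name`, `examples_per_language`, `load_config`) are hypothetical. `load_config` assumes `tensorflow_datasets` and its Hugging Face support are installed, and it downloads data when called.

```python
from typing import Dict

# 'train' example counts copied from the split tables on this page.
TRAIN_EXAMPLES: Dict[str, int] = {
    "en_annotated": 17528,
    "en_neutral": 9675,
    "fi_annotated": 14449,
    "fi_neutral": 10794,
}


def tfds_name(config: str) -> str:
    """Build the TFDS dataset name for one of the listed configurations."""
    if config not in TRAIN_EXAMPLES:
        raise ValueError(f"unknown config {config!r}; "
                         f"expected one of {sorted(TRAIN_EXAMPLES)}")
    return f"huggingface:xed_en_fi/{config}"


def examples_per_language() -> Dict[str, int]:
    """Sum the listed 'train' counts by language prefix ('en' or 'fi')."""
    totals: Dict[str, int] = {}
    for config, count in TRAIN_EXAMPLES.items():
        lang = config.split("_")[0]
        totals[lang] = totals.get(lang, 0) + count
    return totals


def load_config(config: str, split: str = "train"):
    """Load one configuration; the import is lazy so the helpers above
    work even when tensorflow_datasets is not installed."""
    import tensorflow_datasets as tfds
    return tfds.load(tfds_name(config), split=split)


print(tfds_name("fi_neutral"))
print(examples_per_language())
```

Calling `load_config("en_annotated")` would be equivalent to the `tfds.load` command shown for that configuration, with the split named explicitly.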