forest_fires
This is a regression task, where the aim is to predict the burned area of forest
fires, in the northeast region of Portugal, by using meteorological and other
data.
Data Set Information:
In [Cortez and Morais, 2007], the output 'area' was first transformed with a
ln(x+1) function. Then, several Data Mining methods were applied. After fitting
the models, the outputs were post-processed with the inverse of the ln(x+1)
transform. Four different input setups were used. The experiments were conducted
using 30 runs of 10-fold cross-validation. Two regression metrics were
measured: MAD and RMSE. A Gaussian support vector machine (SVM) fed with only 4
direct weather conditions (temp, RH, wind and rain) obtained the best MAD value:
12.71 ± 0.01 (mean and 95% confidence interval using a Student's
t-distribution). The best RMSE was attained by the naive mean predictor. An
analysis of the regression error curve (REC) shows that the SVM model predicts
more examples within a lower admitted error. In effect, the SVM model is better
at predicting small fires, which are the majority.
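The transform and the two metrics described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the helper names and the toy `areas` values are assumptions made for the example, and MAD is taken here to mean the mean absolute deviation between targets and predictions. It compares a naive mean predictor with a constant fitted in ln(x+1) space and mapped back with the inverse transform.

```python
import math


def to_log(area):
    """Forward transform ln(x + 1) applied to a burned area in ha."""
    return math.log1p(area)


def from_log(value):
    """Inverse transform exp(y) - 1, mapping a prediction back to ha."""
    return math.expm1(value)


def mad(y_true, y_pred):
    """Mean absolute deviation between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error between targets and predictions."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


# Toy burned areas (hypothetical, not taken from the dataset): mostly small
# fires plus one large one, mimicking the skew towards 0.0.
areas = [0.0, 0.0, 0.5, 2.0, 10.0, 200.0]

# Naive mean predictor in the original space.
mean_area = sum(areas) / len(areas)
mean_pred = [mean_area] * len(areas)

# Constant predictor fitted in ln(x + 1) space, then mapped back.
log_mean = sum(to_log(a) for a in areas) / len(areas)
log_pred = [from_log(log_mean)] * len(areas)

print("mean predictor: MAD=%.2f RMSE=%.2f" % (mad(areas, mean_pred), rmse(areas, mean_pred)))
print("log predictor:  MAD=%.2f RMSE=%.2f" % (mad(areas, log_pred), rmse(areas, log_pred)))
```

On this toy data the log-space constant has the lower MAD while the mean predictor has the lower RMSE, which mirrors the qualitative pattern reported above; the numbers themselves say nothing about the actual dataset.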
Attribute Information:
For more information, read [Cortez and Morais, 2007].
- X - x-axis spatial coordinate within the Montesinho park map: 1 to 9
- Y - y-axis spatial coordinate within the Montesinho park map: 2 to 9
- month - month of the year: 'jan' to 'dec'
- day - day of the week: 'mon' to 'sun'
- FFMC - FFMC index from the FWI system: 18.7 to 96.20
- DMC - DMC index from the FWI system: 1.1 to 291.3
- DC - DC index from the FWI system: 7.9 to 860.6
- ISI - ISI index from the FWI system: 0.0 to 56.10
- temp - temperature in degrees Celsius: 2.2 to 33.30
- RH - relative humidity in %: 15.0 to 100
- wind - wind speed in km/h: 0.40 to 9.40
- rain - outside rain in mm/m2: 0.0 to 6.4
- area - the burned area of the forest (in ha): 0.00 to 1090.84 (this output
variable is very skewed towards 0.0, thus it may make sense to model with
the logarithm transform).
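The documented ranges lend themselves to a simple sanity check on incoming records. The sketch below is one possible way to do that; the function name `out_of_range` and the sample record are hypothetical, and the bounds are copied from the attribute list above and treated as inclusive.

```python
# Inclusive bounds copied from the attribute list above.
NUMERIC_RANGES = {
    "X": (1, 9),
    "Y": (2, 9),
    "FFMC": (18.7, 96.2),
    "DMC": (1.1, 291.3),
    "DC": (7.9, 860.6),
    "ISI": (0.0, 56.1),
    "temp": (2.2, 33.3),
    "RH": (15.0, 100.0),
    "wind": (0.4, 9.4),
    "rain": (0.0, 6.4),
    "area": (0.0, 1090.84),
}
MONTHS = ("jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec")
DAYS = ("mon", "tue", "wed", "thu", "fri", "sat", "sun")


def out_of_range(record):
    """Return the sorted names of attributes that are missing or out of range."""
    problems = []
    for name, (low, high) in NUMERIC_RANGES.items():
        value = record.get(name)
        if value is None or not (low <= value <= high):
            problems.append(name)
    if record.get("month") not in MONTHS:
        problems.append("month")
    if record.get("day") not in DAYS:
        problems.append("day")
    return sorted(problems)


# Illustrative record (values chosen for the example only).
record = {
    "X": 7, "Y": 5, "month": "mar", "day": "fri",
    "FFMC": 86.2, "DMC": 26.2, "DC": 94.3, "ISI": 5.1,
    "temp": 8.2, "RH": 51, "wind": 6.7, "rain": 0.0, "area": 0.0,
}
print(out_of_range(record))
```

A record with an implausible value, for example `temp` set to 50, would be reported by name rather than silently accepted.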
Splits:

| Split   | Examples |
|---------|----------|
| 'train' | 517      |
Feature structure:

    FeaturesDict({
        'area': float32,
        'features': FeaturesDict({
            'DC': float32,
            'DMC': float32,
            'FFMC': float32,
            'ISI': float32,
            'RH': float32,
            'X': uint8,
            'Y': uint8,
            'day': ClassLabel(shape=(), dtype=int64, num_classes=7),
            'month': ClassLabel(shape=(), dtype=int64, num_classes=12),
            'rain': float32,
            'temp': float32,
            'wind': float32,
        }),
    })
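Because each example is a nested dictionary with the target `area` at the top level and the twelve inputs under `features`, a model that expects a flat vector needs a small conversion step. The following is a minimal sketch of that step using plain Python dictionaries; `flatten_example` and the sample values are hypothetical, and the integer codes for `day` and `month` are placeholders since the label-to-index mapping is not given here.

```python
def flatten_example(example):
    """Split a nested example into (feature names, feature values, target).

    Names are sorted so that every example yields the same column order.
    """
    features = example["features"]
    names = sorted(features)
    values = [float(features[name]) for name in names]
    return names, values, float(example["area"])


# Hypothetical example shaped like the feature structure above.
example = {
    "area": 0.0,
    "features": {
        "DC": 94.3, "DMC": 26.2, "FFMC": 86.2, "ISI": 5.1, "RH": 51.0,
        "X": 7, "Y": 5,
        "day": 4,    # placeholder class index (7 classes)
        "month": 2,  # placeholder class index (12 classes)
        "rain": 0.0, "temp": 8.2, "wind": 6.7,
    },
}

names, values, target = flatten_example(example)
print(names)
print(values, target)
```

Sorting the keys is a deliberate choice: it gives a stable column order that does not depend on how the dictionary was built. Note that `day` and `month` are categorical, so casting their class indices to floats is only a starting point; a one-hot or cyclical encoding may suit many models better.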
Feature documentation:

| Feature        | Class        | Shape | Dtype   | Description |
|----------------|--------------|-------|---------|-------------|
|                | FeaturesDict |       |         |             |
| area           | Tensor       |       | float32 |             |
| features       | FeaturesDict |       |         |             |
| features/DC    | Tensor       |       | float32 |             |
| features/DMC   | Tensor       |       | float32 |             |
| features/FFMC  | Tensor       |       | float32 |             |
| features/ISI   | Tensor       |       | float32 |             |
| features/RH    | Tensor       |       | float32 |             |
| features/X     | Tensor       |       | uint8   |             |
| features/Y     | Tensor       |       | uint8   |             |
| features/day   | ClassLabel   |       | int64   |             |
| features/month | ClassLabel   |       | int64   |             |
| features/rain  | Tensor       |       | float32 |             |
| features/temp  | Tensor       |       | float32 |             |
| features/wind  | Tensor       |       | float32 |             |
@misc{Dua:2019 ,
author = "Dua, Dheeru and Graff, Casey",
year = "2017",
title = "{UCI} Machine Learning Repository",
url = "http://archive.ics.uci.edu/ml",
institution = "University of California, Irvine, School of Information and Computer Sciences" }
@article{cortez2007data,
title={A data mining approach to predict forest fires using meteorological data},
author={Cortez, Paulo and Morais, Anibal de Jesus Raimundo},
year={2007},
publisher={Associa{\c{c} }{\~a}o Portuguesa para a Intelig{\^e}ncia Artificial (APPIA)}
}
Last updated 2022-11-23 UTC.
- Homepage: https://archive.ics.uci.edu/ml/datasets/Forest+Fires
- Source code: tfds.structured.ForestFires (https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/structured/forest_fires.py)
- Versions: 0.0.1 (default): No release notes.
- Download size: 24.88 KiB
- Dataset size: 162.07 KiB
- Auto-cached (https://www.tensorflow.org/datasets/performances#auto-caching): Yes
- Supervised keys (see the as_supervised doc, https://www.tensorflow.org/datasets/api_docs/python/tfds/load#args): ('area', 'features')
- Figure (tfds.show_examples): Not supported.