- Description:
 
This is the data set used for the Third International Knowledge Discovery and Data Mining Tools Competition, which was held in conjunction with KDD-99, the Fifth International Conference on Knowledge Discovery and Data Mining. The competition task was to build a network intrusion detector: a predictive model capable of distinguishing between 'bad' connections, called intrusions or attacks, and 'good' normal connections. The database contains a standard set of data to be audited, which includes a wide variety of intrusions simulated in a military network environment.
Homepage: https://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html
Source code: tfds.datasets.kddcup99.Builder
Versions:
1.0.0: Initial release.
1.0.1 (default): Fixes parsing of boolean fields land, logged_in, root_shell, is_hot_login and is_guest_login.
Download size: 18.62 MiB
Dataset size: 5.25 GiB
Auto-cached (documentation): No
Splits:
| Split | Examples |
|---|---|
| 'test' | 311,029 |
| 'train' | 4,898,431 |
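
With nearly five million training examples, it can help to request only a small slice of a split while experimenting. The following is a minimal sketch, assuming `tensorflow_datasets` is installed; `slice_spec` and `load_slice` are hypothetical helper names, not part of the library. Note that the full dataset is still downloaded and prepared on first use.

```python
# Example counts taken from the Splits table.
SPLIT_SIZES = {"test": 311_029, "train": 4_898_431}


def slice_spec(split: str, percent: int) -> str:
    """Build a TFDS split-slicing string such as 'train[:1%]'."""
    if split not in SPLIT_SIZES:
        raise ValueError(f"unknown split: {split!r}")
    if not 0 < percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    return f"{split}[:{percent}%]"


def load_slice(split: str = "train", percent: int = 1):
    """Load the first `percent` percent of a split (hypothetical helper)."""
    # Imported lazily so the helpers above work without TFDS installed.
    import tensorflow_datasets as tfds

    return tfds.load("kddcup99", split=slice_spec(split, percent))


if __name__ == "__main__":
    print(slice_spec("train", 1))
    print(sum(SPLIT_SIZES.values()), "examples across both splits")
```

Calling `load_slice("train", 1)` would return a `tf.data.Dataset` whose elements are feature dictionaries.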
- Feature structure:
 
FeaturesDict({
    'count': int32,
    'diff_srv_rate': float32,
    'dst_bytes': int32,
    'dst_host_count': int32,
    'dst_host_diff_srv_rate': float32,
    'dst_host_rerror_rate': float32,
    'dst_host_same_src_port_rate': float32,
    'dst_host_same_srv_rate': float32,
    'dst_host_serror_rate': float32,
    'dst_host_srv_count': int32,
    'dst_host_srv_diff_host_rate': float32,
    'dst_host_srv_rerror_rate': float32,
    'dst_host_srv_serror_rate': float32,
    'duration': int32,
    'flag': ClassLabel(shape=(), dtype=int64, num_classes=11),
    'hot': int32,
    'is_guest_login': bool,
    'is_hot_login': bool,
    'label': ClassLabel(shape=(), dtype=int64, num_classes=40),
    'land': bool,
    'logged_in': bool,
    'num_access_files': int32,
    'num_compromised': int32,
    'num_failed_logins': int32,
    'num_file_creations': int32,
    'num_outbound_cmds': int32,
    'num_root': int32,
    'num_shells': int32,
    'protocol_type': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'rerror_rate': float32,
    'root_shell': bool,
    'same_srv_rate': float32,
    'serror_rate': float32,
    'service': ClassLabel(shape=(), dtype=int64, num_classes=71),
    'src_bytes': int32,
    'srv_count': int32,
    'srv_diff_host_rate': float32,
    'srv_rerror_rate': float32,
    'srv_serror_rate': float32,
    'su_attempted': int32,
    'urgent': int32,
    'wrong_fragment': int32,
})
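
The structure above mixes integer, float, boolean and categorical (ClassLabel) features, so a model usually needs them flattened into one numeric vector. The sketch below shows one possible encoding, assuming each example has already been converted to plain Python or NumPy scalar values (for instance via `tfds.as_numpy`); `one_hot` and `to_vector` are hypothetical helpers, and one-hot encoding is only one of several reasonable choices. Under this encoding a full example would flatten to 123 values: 38 scalar features plus 11 + 3 + 71 one-hot slots.

```python
# Categorical (ClassLabel) inputs and their class counts,
# copied from the feature structure.
CATEGORICAL = {"flag": 11, "protocol_type": 3, "service": 71}
TARGET = "label"


def one_hot(index: int, num_classes: int) -> list[float]:
    """Return a one-hot list with a 1.0 at position `index`."""
    if not 0 <= index < num_classes:
        raise ValueError(f"index {index} out of range for {num_classes} classes")
    vec = [0.0] * num_classes
    vec[index] = 1.0
    return vec


def to_vector(example: dict) -> list[float]:
    """Flatten one decoded example into a list of floats.

    Features are visited in sorted name order so the layout is stable.
    The target ('label') is skipped.
    """
    vec: list[float] = []
    for name in sorted(example):
        if name == TARGET:
            continue
        value = example[name]
        if name in CATEGORICAL:
            vec.extend(one_hot(int(value), CATEGORICAL[name]))
        else:
            # int32, float32 and bool features become a single float each.
            vec.append(float(value))
    return vec
```

The count-like integer features (for example src_bytes) span very different ranges, so some form of scaling would likely be needed before training; that step is left out here.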
- Feature documentation:
 
| Feature | Class | Shape | Dtype | Description |
|---|---|---|---|---|
| | FeaturesDict | | | |
| count | Tensor | | int32 | |
| diff_srv_rate | Tensor | | float32 | |
| dst_bytes | Tensor | | int32 | |
| dst_host_count | Tensor | | int32 | |
| dst_host_diff_srv_rate | Tensor | | float32 | |
| dst_host_rerror_rate | Tensor | | float32 | |
| dst_host_same_src_port_rate | Tensor | | float32 | |
| dst_host_same_srv_rate | Tensor | | float32 | |
| dst_host_serror_rate | Tensor | | float32 | |
| dst_host_srv_count | Tensor | | int32 | |
| dst_host_srv_diff_host_rate | Tensor | | float32 | |
| dst_host_srv_rerror_rate | Tensor | | float32 | |
| dst_host_srv_serror_rate | Tensor | | float32 | |
| duration | Tensor | | int32 | |
| flag | ClassLabel | | int64 | |
| hot | Tensor | | int32 | |
| is_guest_login | Tensor | | bool | |
| is_hot_login | Tensor | | bool | |
| label | ClassLabel | | int64 | |
| land | Tensor | | bool | |
| logged_in | Tensor | | bool | |
| num_access_files | Tensor | | int32 | |
| num_compromised | Tensor | | int32 | |
| num_failed_logins | Tensor | | int32 | |
| num_file_creations | Tensor | | int32 | |
| num_outbound_cmds | Tensor | | int32 | |
| num_root | Tensor | | int32 | |
| num_shells | Tensor | | int32 | |
| protocol_type | ClassLabel | | int64 | |
| rerror_rate | Tensor | | float32 | |
| root_shell | Tensor | | bool | |
| same_srv_rate | Tensor | | float32 | |
| serror_rate | Tensor | | float32 | |
| service | ClassLabel | | int64 | |
| src_bytes | Tensor | | int32 | |
| srv_count | Tensor | | int32 | |
| srv_diff_host_rate | Tensor | | float32 | |
| srv_rerror_rate | Tensor | | float32 | |
| srv_serror_rate | Tensor | | float32 | |
| su_attempted | Tensor | | int32 | |
| urgent | Tensor | | int32 | |
| wrong_fragment | Tensor | | int32 | |
Supervised keys (See as_supervised doc): None
Figure (tfds.show_examples): Not supported.
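
Because no supervised keys are defined, examples arrive as a single feature dictionary rather than as `(input, label)` pairs. One way to get pairs is to split the dictionary yourself. This is a minimal sketch, assuming `tensorflow_datasets` is installed and that 'label' is the intended target; `to_supervised` and `load_supervised` are hypothetical helper names.

```python
def to_supervised(example: dict, target: str = "label"):
    """Split a feature dict into (inputs, target) without mutating it."""
    inputs = {name: value for name, value in example.items() if name != target}
    return inputs, example[target]


def load_supervised(split: str = "train"):
    """Load a split and map every example to an (inputs, label) pair."""
    # Imported lazily so to_supervised can be used and tested on its own.
    import tensorflow_datasets as tfds

    ds = tfds.load("kddcup99", split=split)
    # tf.data passes each dict-structured element as one argument.
    return ds.map(to_supervised)
```

The same `to_supervised` function works on plain Python dictionaries, which makes it easy to check outside a TensorFlow pipeline.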
- Citation:
 
@misc{Dua:2019 ,
  author = "Dua, Dheeru and Graff, Casey",
  year = 2017,
  title = "{UCI} Machine Learning Repository",
  url = "http://archive.ics.uci.edu/ml",
  institution = "University of California, Irvine, School of Information and
Computer Sciences"
}