Datasets for distributed denial-of-service detection in healthcare internet of things environments

Research output: Contribution to journal › Article › peer-review

Abstract

The growing number of Internet of Things (IoT) devices in healthcare settings raises critical security concerns, particularly in defending against Distributed Denial-of-Service (DDoS) attacks. These attacks can cause operational downtime in IoT environments. To mitigate DDoS attacks, advanced defense-in-depth strategies and well-labeled datasets are required for Healthcare-IoT (H-IoT), IoT, and other distributed computing contexts. This article presents two labeled datasets, UL-ECE-MQTT-DDoS-H-IoT2025 and UL-ECE-UDP-DDoS-H-IoT2025, generated by simulating realistic traffic patterns under both normal and DDoS conditions using the Cooja and ns-3 simulators. In Cooja, the raw dataset records healthcare-specific Message Queuing Telemetry Transport (MQTT) traffic (e.g., a simulated oxygen level of 100%) randomly generated by emulated H-IoT sensors. It also includes message counts and network metadata that enable detailed analysis across both the application and network layers. In ns-3, the raw data comprises 5G-enabled H-IoT network traces from all nodes, capturing timestamps, payload size, and User Datagram Protocol (UDP) header details. Existing benchmark datasets mainly consist of generic network traffic attributes, including packet IDs, protocol types, and timestamps. The proposed datasets address this gap by incorporating H-IoT-specific communication parameters that closely resemble real-world conditions, such as node-level message counts and monitoring frequencies. This inclusion provides a realistic representation of communication patterns for security and performance research in H-IoT. The datasets enable detailed analysis of key features for detecting DDoS threats, including UDP flood variants extending beyond the H-IoT domain. This characteristic makes them directly usable for developing, testing, and comparing machine learning (ML) and deep learning (DL) models across diverse IoT security contexts.
The MQTT-based dataset is derived from a 5-hour simulation run using the Cooja simulator, which emulates wearable sensors such as body temperature, heart rate, and oxygen saturation. In this setup, normal H-IoT nodes transmit data to the server at 60-second intervals, while DDoS-affected nodes publish data at 20-second intervals to simulate a higher transmission frequency. The UDP-based dataset is derived from a 120-second simulation conducted using the ns-3 simulator, which simulates a 5G-enabled H-IoT environment. In this scenario, normal and malicious nodes transmit data at 124 kbps and 248 kbps, respectively. Both datasets are processed from raw simulation logs converted into structured CSV files using Python scripts. The CSV files contain features such as timestamp, payload size, message frequency, and node-level communication statistics. The UL-ECE-MQTT-DDoS-H-IoT2025 and UL-ECE-UDP-DDoS-H-IoT2025 datasets contain approximately 20,080 and 99,887 records, respectively. The primary objective of creating these datasets is to enhance security in healthcare IoT ecosystems by enabling robust detection of advanced cyber threats. In line with this objective, the datasets support the development of ML/DL-based cybersecurity mechanisms. In addition, this resource forms a foundation for future research, motivating the creation of new datasets for emerging attack scenarios.
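The publishing intervals described above (60 seconds for normal nodes, 20 seconds for DDoS-affected nodes) suggest one simple feature a reader might derive from the CSV files: the mean gap between consecutive messages per node. The following is a minimal sketch of that idea, not the authors' processing scripts. The column names `node_id` and `timestamp`, the midpoint threshold of 40 seconds, and the sample rows are all assumptions made for illustration; the published CSV schema may differ.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical column names; the published CSV schema may differ.
NODE_COL = "node_id"
TIME_COL = "timestamp"

NORMAL_INTERVAL_S = 60.0  # from the abstract: normal nodes publish every 60 s
ATTACK_INTERVAL_S = 20.0  # from the abstract: DDoS-affected nodes publish every 20 s
# Assumed decision boundary: the midpoint between the two intervals.
THRESHOLD_S = (NORMAL_INTERVAL_S + ATTACK_INTERVAL_S) / 2


def mean_intervals(rows):
    """Return {node_id: mean gap in seconds between consecutive messages}."""
    times = defaultdict(list)
    for row in rows:
        times[row[NODE_COL]].append(float(row[TIME_COL]))
    result = {}
    for node, stamps in times.items():
        stamps.sort()
        gaps = [b - a for a, b in zip(stamps, stamps[1:])]
        if gaps:  # a node with a single message has no interval
            result[node] = mean(gaps)
    return result


def flag_fast_publishers(rows, threshold=THRESHOLD_S):
    """Return sorted node ids whose mean interval falls below the threshold."""
    return sorted(n for n, gap in mean_intervals(rows).items() if gap < threshold)


def expected_messages(duration_s, interval_s):
    """Messages one node sends in duration_s at a fixed interval (no loss)."""
    return int(duration_s // interval_s)


if __name__ == "__main__":
    # Tiny synthetic sample, not taken from the datasets.
    sample = io.StringIO(
        "node_id,timestamp\n"
        "n1,0\nn1,60\nn1,120\n"
        "n2,0\nn2,20\nn2,40\nn2,60\n"
    )
    rows = list(csv.DictReader(sample))
    print(mean_intervals(rows))
    print(flag_fast_publishers(rows))
    print(expected_messages(5 * 3600, 60), expected_messages(5 * 3600, 20))
```

Under these idealized assumptions (fixed intervals, no packet loss), a single node would send 300 messages over a 5-hour run at 60-second intervals and 900 at 20-second intervals. A fixed threshold is only a baseline; the ML/DL models the datasets are intended for would typically combine this feature with payload size and the other recorded statistics.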

Original language: English
Article number: 112222
Journal: Data in Brief
Volume: 63
DOIs
Publication status: Published - Dec 2025

Keywords

  • Anomaly detection
  • Cooja simulator
  • Cybersecurity
  • IoT traffic features
  • Machine Learning for cyber defense
  • Network simulation
  • Ns-3 simulator
  • Traffic analysis
