This page gives an overview of all benchmark datasets used across the four frameworks, with access instructions and usage notes. We prioritize well-established public datasets that appear in IEEE TIE and TII articles and in PHM conference papers.
Predictive Maintenance Datasets
Used in: Industrial-Predictive-Maintenance
NASA CMAPSS / N-CMAPSS
| Property | Value |
| --- | --- |
| Source | NASA PCoE |
| Domain | Turbofan engine degradation simulation |
| Subsets | FD001–FD004 (CMAPSS), DS01–DS08 (N-CMAPSS 2021) |
| Signals | 21 sensor channels + 3 operational settings |
| Task | Remaining Useful Life (RUL) regression |
| Size | ~70K engine cycles (CMAPSS) |
| Access | Free download |
Note: N-CMAPSS (2021) is the updated dataset used in recent IEEE TII papers and the PHM 2021 challenge. It differs from the original CMAPSS in fault modes and sensor resolution. Always specify which version you use when comparing published results.
Download via the provided loader:
python datasets/cmapss_loader.py # auto-downloads from NASA PCoE
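For the RUL regression task, the CMAPSS training trajectories run to failure, so a common convention is to label each row with the number of cycles remaining until that engine's last recorded cycle, optionally capped to give a piecewise-linear target. The sketch below illustrates that convention on toy data. The function name `rul_labels` and the `cap` parameter are illustrative assumptions and are not part of `cmapss_loader.py`, which may label data differently.

```python
import numpy as np

def rul_labels(unit_ids, cycles, cap=None):
    """Per-row RUL = (last observed cycle of the engine) - (current cycle).

    Hypothetical helper for illustration; assumes every engine in the
    input runs to failure, as in the CMAPSS training files.
    """
    unit_ids = np.asarray(unit_ids)
    cycles = np.asarray(cycles)
    rul = np.empty(len(cycles), dtype=float)
    for unit in np.unique(unit_ids):
        mask = unit_ids == unit
        rul[mask] = cycles[mask].max() - cycles[mask]
    if cap is not None:
        # Optional piecewise-linear target: constant early in life, then linear.
        rul = np.minimum(rul, cap)
    return rul

# Toy example: engine 1 runs for 5 cycles, engine 2 for 3 cycles.
unit_ids = [1, 1, 1, 1, 1, 2, 2, 2]
cycles = [1, 2, 3, 4, 5, 1, 2, 3]
print(rul_labels(unit_ids, cycles))
print(rul_labels(unit_ids, cycles, cap=3))
```

Whether a cap is used, and its value, changes the reported error, so it is worth stating alongside the dataset version when comparing results.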
IMS Bearing (University of Cincinnati)
| Property | Value |
| --- | --- |
| Source | NASA PCoE |
| Domain | Rolling element bearing run-to-failure |
| Signals | Vibration from 4 bearings at 20 kHz |
| Task | Anomaly detection / RUL estimation |
| Size | ~1 GB raw (3 run-to-failure tests) |
| Access | Free download |
Paderborn University Bearing Dataset
| Property | Value |
| --- | --- |
| Source | KAt DataCenter |
| Domain | Rolling element bearing (electrical motor) |
| Signals | Vibration + motor current |
| Conditions | 32 (12 artificial + 14 real + 6 healthy) |
| Task | Fault classification |
| Size | ~32 GB |
| Access | Free (registration required) |
Note: To produce results comparable with published baselines, follow the train/test split protocol defined for the Paderborn dataset. See the paderborn_loader.py documentation.
CWRU Bearing Dataset
| Property | Value |
| --- | --- |
| Source | Case Western Reserve University |
| Domain | Rolling element bearing |
| Signals | Vibration at drive end and fan end |
| Conditions | 4 motor loads (0–3 HP); 4 fault diameters |
| Task | Fault classification |
| Access | Free download |
Note: Motor load conditions (0–3 HP) significantly affect signal characteristics. Published results must specify which load conditions were used in the train/test split.
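One way to make the load condition explicit is to split samples by motor load instead of at random, for example training on some loads and testing on a held-out load. The sketch below assumes each sample carries an integer load label in HP. The helper `split_by_load` and the toy labels are hypothetical and do not reflect the framework's loaders.

```python
import numpy as np

def split_by_load(loads, train_loads, test_loads):
    """Return (train_idx, test_idx) selected by motor load label.

    Hypothetical helper: `loads` holds one load value (HP) per sample.
    """
    if set(train_loads) & set(test_loads):
        raise ValueError("train and test load conditions must not overlap")
    loads = np.asarray(loads)
    train_idx = np.flatnonzero(np.isin(loads, train_loads))
    test_idx = np.flatnonzero(np.isin(loads, test_loads))
    return train_idx, test_idx

# Toy metadata: eight samples recorded at loads 0-3 HP.
loads = np.array([0, 1, 2, 3, 0, 1, 2, 3])
train_idx, test_idx = split_by_load(loads, train_loads=[0, 1, 2], test_loads=[3])
print("train:", train_idx, "test:", test_idx)
```

Reporting the `train_loads` and `test_loads` values next to the accuracy figure is usually enough for a reader to reproduce the split.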
Time-Series AI Datasets
Used in: Industrial-Time-Series-AI
ETT (Electricity Transformer Temperature)
| Property | Value |
| --- | --- |
| Source | GitHub (ETDataset) |
| Domain | Power grid (electricity transformer) |
| Subsets | ETTh1, ETTh2 (hourly), ETTm1, ETTm2 (15-min) |
| Features | 7 (1 target + 6 covariates) |
| Task | Multivariate long-horizon forecasting |
| Access | Free download |
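For multivariate long-horizon forecasting, the series is typically split chronologically and then cut into sliding (input, target) windows. The sketch below shows the windowing step on random data with 7 features. The function name `make_windows`, the 80/20 split, and the window lengths are illustrative assumptions, not the framework's settings.

```python
import numpy as np

def make_windows(series, input_len, horizon):
    """Cut a (time, features) array into sliding input/target windows."""
    series = np.asarray(series)
    n = len(series) - input_len - horizon + 1
    if n <= 0:
        raise ValueError("series is shorter than input_len + horizon")
    X = np.stack([series[i:i + input_len] for i in range(n)])
    Y = np.stack([series[i + input_len:i + input_len + horizon] for i in range(n)])
    return X, Y

# Stand-in for an ETT file: 200 time steps, 7 features.
data = np.random.default_rng(0).normal(size=(200, 7))

# Chronological split (no shuffling), so the test period follows the training period.
split = int(0.8 * len(data))
X_train, Y_train = make_windows(data[:split], input_len=96, horizon=24)
print(X_train.shape, Y_train.shape)
```

Windows are built after the split so that no training window overlaps the test period.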
PSM (Pooled Server Metrics)
| Property | Value |
| --- | --- |
| Domain | Server infrastructure metrics |
| Features | 25 |
| Task | Anomaly detection |
| Access | Free (via download script) |
SMAP / MSL (NASA Telemetry)
| Property | Value |
| --- | --- |
| Source | NASA JPL |
| Domain | Spacecraft / Mars Science Laboratory telemetry |
| Features | Multi-channel |
| Task | Anomaly detection |
| Access | Free download |
SWaT / WADI (Singapore iTrust)
| Property | Value |
| --- | --- |
| Source | iTrust, SUTD Singapore |
| Domain | Secure Water Treatment / Water Distribution ICS |
| Features | 51 (SWaT) / 123 (WADI) |
| Task | Anomaly detection + forecasting |
| Access | Request required (data agreement with iTrust) |
Note: SWaT and WADI require submitting a data-access request form to iTrust SUTD. The framework includes synthetic versions of both datasets to allow running benchmarks immediately without the access request.
# Synthetic versions — run immediately
python benchmarks/run_benchmark.py --task anomaly
# Real SWaT/WADI — after obtaining access
python datasets/download_datasets.py --datasets swat wadi
Power Electronics Datasets
Used in: AI-Power-Electronics-Diagnostics
Kaggle Electric Motor Temperature
| Property | Value |
| --- | --- |
| Source | Kaggle (wkirgsn) |
| Domain | Permanent Magnet Synchronous Motor (PMSM) drive |
| Features | 13 (motor currents, voltages, ambient temp, speed, torque) |
| Task | Temperature prediction and thermal anomaly detection |
| Size | ~1.3M time steps (about 185 hours of recordings at 2 Hz) |
| Access | Free (Kaggle account required) |
| Reference | Kirchgässner et al., IEEE IEMDC 2019 |
# Prerequisites: pip install kaggle and configure ~/.kaggle/kaggle.json
python datasets/download_scripts/setup_datasets.py --dataset motor_temp
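Before running the download script, it can help to confirm that the Kaggle credentials file is in place. The sketch below checks that `~/.kaggle/kaggle.json` exists and, on POSIX systems, that it is not readable by other users, a condition the Kaggle client warns about. The helper `check_kaggle_credentials` is hypothetical and is not part of `setup_datasets.py`.

```python
import os
import stat
from pathlib import Path

def check_kaggle_credentials(path=None):
    """Return (ok, message) for the Kaggle API credentials file."""
    path = Path(path) if path else Path.home() / ".kaggle" / "kaggle.json"
    if not path.is_file():
        return False, f"credentials file not found: {path}"
    mode = stat.S_IMODE(path.stat().st_mode)
    # Permission bits for group/others are only meaningful on POSIX systems.
    if os.name == "posix" and mode & 0o077:
        return False, f"{path} is accessible by other users; run: chmod 600 {path}"
    return True, f"credentials found: {path}"

ok, message = check_kaggle_credentials()
print(message)
```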
Synthetic Inverter and Motor Drive Signals
| Property | Value |
| --- | --- |
| Type | Physics-informed simulation |
| Inverter faults | 9 classes (IGBT T1–T6 open, short, DC undervoltage, normal) |
| Motor faults | 5 classes (phase loss, ITSC, bearing, overtemp, normal) |
| Signals | 3-phase voltage and current waveforms |
| Access | Built-in (generated with numpy/scipy, no download needed) |
from datasets.synthetic import InverterFaultSimulator
sim = InverterFaultSimulator()
X, y = sim.generate_dataset(n_per_class=300, window_size=1024)
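To give a feel for what such synthetic signals can look like, the sketch below generates balanced three-phase currents with numpy and applies a crude open-switch signature by removing the positive half-cycle of one phase. This is a simplified illustration under stated assumptions and is not the implementation of `InverterFaultSimulator`. The function `three_phase_currents` and all of its parameters are hypothetical.

```python
import numpy as np

def three_phase_currents(n_samples=1024, fs=10_000.0, f0=50.0,
                         amplitude=10.0, open_upper_phase=None):
    """Return (t, i_abc) with i_abc of shape (n_samples, 3).

    open_upper_phase: index (0, 1, or 2) of the phase whose upper switch is
    assumed open. Crude approximation: that phase can no longer carry
    positive current, so its positive half-cycle is clipped to zero.
    """
    t = np.arange(n_samples) / fs
    shifts = np.array([0.0, -2 * np.pi / 3, 2 * np.pi / 3])
    i_abc = amplitude * np.sin(2 * np.pi * f0 * t[:, None] + shifts)
    if open_upper_phase is not None:
        i_abc[:, open_upper_phase] = np.minimum(i_abc[:, open_upper_phase], 0.0)
    return t, i_abc

_, healthy = three_phase_currents()
_, faulty = three_phase_currents(open_upper_phase=0)
print("healthy phase A mean:", healthy[:, 0].mean())
print("faulty phase A mean: ", faulty[:, 0].mean())
```

Clipping a single phase breaks the zero-sum relationship between the three currents that an isolated-neutral load would enforce, so a physics-informed simulator would need to redistribute current across the remaining phases. The sketch only conveys the resulting DC offset in the faulty phase.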
Smart Manufacturing Datasets
Used in: Smart-Manufacturing-AI
MVTec Anomaly Detection (MVTec AD)
| Property | Value |
| --- | --- |
| Source | MVTec Software |
| Domain | Industrial surfaces and objects |
| Categories | 15 (bottle, cable, capsule, carpet, grid, hazelnut, leather, metal nut, pill, screw, tile, toothbrush, transistor, wood, zipper) |
| Images | 5,354 (training + test) |
| Labels | Binary (normal/defective) + pixel-level segmentation masks |
| Task | Anomaly detection + defect segmentation |
| Size | ~4.9 GB |
| Access | Non-commercial research license (see MVTec terms) |
| Reference | Bergmann et al., CVPR 2019 |
License note: MVTec AD is provided for non-commercial research use only and cannot be redistributed. The dataset loaders point to the official MVTec download page.
NEU Surface Defect Database
| Property | Value |
| --- | --- |
| Source | Northeastern University |
| Domain | Hot-rolled steel strip surface defects |
| Classes | 6 (crazing, inclusion, patches, pitted surface, rolled-in scale, scratches) |
| Images | 1,800 (300 per class), 200 × 200 px grayscale |
| Task | Defect classification |
| Size | ~36 MB |
| Access | Free download |
Robot Sensor Data (Synthetic)
| Property | Value |
| --- | --- |
| Type | Simulated 6-DOF robot joint sensor streams |
| Channels | 12 (joint angles, velocities, torques) |
| Classes | Normal / Fault |
| Access | Built-in (generated locally, no download needed) |
Dataset Access Summary
| Dataset | Access Type | Notes |
| --- | --- | --- |
| NASA CMAPSS / N-CMAPSS | Free | Auto-download via loader |
| IMS Bearing | Free | Manual download from NASA PCoE |
| Paderborn Bearing | Free + registration | Follow their split protocol |
| CWRU Bearing | Free | Specify load condition in results |
| ETT | Free | Auto-download via script |
| PSM / SMAP / MSL | Free | Via download script |
| SWaT / WADI | Request required | iTrust SUTD data agreement |
| Kaggle Motor Temp | Free | Kaggle account + API key |
| MVTec AD | Research license | Non-commercial only |
| NEU Surface Defect | Free | Direct download |
| Synthetic (PED + SM) | Built-in | No download needed |