Electroencephalogram (EEG) data is crucial for diagnosing mental health
conditions but is costly and time-consuming to collect at scale. Synthetic data
generation offers a promising solution to augment datasets for machine learning
applications. However, generating high-quality synthetic EEG that preserves
emotional and mental health signals remains challenging. This study proposes a
method combining correlation analysis and random sampling to generate realistic
synthetic EEG data.
We first analyze interdependencies between EEG frequency bands using
correlation analysis. Guided by this structure, we generate synthetic samples
via random sampling. Samples with high correlation to real data are retained
and evaluated through distribution analysis and classification tasks. A Random
Forest model trained to distinguish synthetic from real EEG performs at chance
level, indicating high fidelity.
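The pipeline described above (band-level correlation analysis, random sampling, then retention of samples that correlate strongly with real data) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the five-band feature layout, the per-band Gaussian sampler, the 0.9 retention threshold, and the names `band_correlations` and `generate_synthetic` are all hypothetical, and `toy_real` holds made-up placeholder numbers rather than EEG measurements.

```python
import random
from itertools import combinations
from math import sqrt

# Assumed feature layout: one power value per frequency band.
BANDS = ["delta", "theta", "alpha", "beta", "gamma"]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def band_correlations(real):
    """Pairwise correlation between band columns across all real samples."""
    cols = list(zip(*real))
    return {
        (BANDS[i], BANDS[j]): pearson(cols[i], cols[j])
        for i, j in combinations(range(len(cols)), 2)
    }


def generate_synthetic(real, n_samples, threshold=0.9, seed=0, max_tries=100_000):
    """Draw random candidates and keep those that correlate with a real sample.

    Assumption: each band is sampled independently from a Gaussian fitted to
    the real data; the source does not specify the sampling distribution.
    """
    rng = random.Random(seed)
    cols = list(zip(*real))
    means = [sum(c) / len(c) for c in cols]
    stds = [sqrt(sum((v - m) ** 2 for v in c) / len(c)) for c, m in zip(cols, means)]
    kept = []
    for _ in range(max_tries):
        if len(kept) == n_samples:
            break
        candidate = [rng.gauss(m, s) for m, s in zip(means, stds)]
        # Retain the candidate only if its band profile tracks some real sample.
        if max(pearson(candidate, row) for row in real) >= threshold:
            kept.append(candidate)
    return kept


# Placeholder band-power profiles (illustrative values only).
toy_real = [
    [10.0, 6.0, 8.0, 4.0, 2.0],
    [11.0, 5.5, 7.5, 4.2, 1.8],
    [9.5, 6.5, 8.4, 3.9, 2.1],
    [10.4, 6.2, 7.9, 4.1, 2.2],
    [9.8, 5.8, 8.2, 3.8, 1.9],
    [10.7, 6.1, 7.7, 4.3, 2.0],
]

structure = band_correlations(toy_real)
synthetic = generate_synthetic(toy_real, n_samples=5)
```

Because each band is sampled independently here, the sketch relies on the retention filter alone to reject implausible profiles; a sampler that drew bands jointly according to `structure` would be a closer reading of "guided by this structure".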
The generated synthetic data closely match the statistical and structural
properties of the original EEG, with similar correlation coefficients and no
significant differences in PERMANOVA tests. This method provides a scalable,
privacy-preserving approach for augmenting EEG datasets, enabling more
efficient model training in mental health research.
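For readers unfamiliar with PERMANOVA, the test compares between-group and within-group dispersion in a distance matrix and obtains a p-value by permuting the group labels. The sketch below is a generic one-way version using squared Euclidean distances. The distance measure, the permutation count, and the randomly generated toy groups are assumptions made for illustration and are not taken from the study.

```python
import random
from itertools import combinations
from math import dist


def pseudo_f(d2, labels):
    """One-way PERMANOVA pseudo-F from a matrix of squared distances."""
    n = len(labels)
    groups = sorted(set(labels))
    # Total sum of squares: all pairwise squared distances divided by N.
    ss_total = sum(d2[i][j] for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares: the same quantity computed per group.
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(d2[i][j] for i, j in combinations(idx, 2)) / len(idx)
    ss_among = ss_total - ss_within
    a = len(groups)
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(samples, labels, n_perm=199, seed=0):
    """Return (pseudo-F, permutation p-value) for the given grouping."""
    rng = random.Random(seed)
    n = len(samples)
    d2 = [[dist(samples[i], samples[j]) ** 2 for j in range(n)] for i in range(n)]
    observed = pseudo_f(d2, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(d2, shuffled) >= observed:
            hits += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return observed, (hits + 1) / (n_perm + 1)


# Toy stand-ins: two groups drawn from the same distribution, plus a shifted one.
rng = random.Random(42)
real_like = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(10)]
synth_like = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(10)]
shifted = [[v + 10 for v in row] for row in synth_like]
labels = ["real"] * 10 + ["synthetic"] * 10

f_same, p_same = permanova(real_like + synth_like, labels)
f_shift, p_shift = permanova(real_like + shifted, labels)
```

A large p-value for the real-versus-synthetic grouping means the test found no evidence of a difference, which is weaker than demonstrating equivalence. The shifted group is included only to show that the statistic does react when the two sets genuinely differ.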
Download PDF: 2504.16143v1