Assessment of mental workload in real-world conditions is key to ensuring the performance of workers executing tasks that demand sustained attention. Previous literature has employed electroencephalography (EEG) to this end. However, EEG correlates of mental workload vary across subjects and levels of physical strain, making it difficult to devise models that perform reliably across users. The field of domain adaptation (DA) aims at developing methods that allow for generalization across different domains by learning domain-invariant representations. Such DA methods, however, rely on the so-called covariate shift assumption, which typically does not hold for EEG-based applications. As such, in this paper we propose a way to measure the statistical (marginal and conditional) shift observed in data obtained from different users, and we use this measure to quantitatively assess the effectiveness of different adaptation strategies. In particular, we use EEG data collected from individuals performing a mental task while running on a treadmill and explore the effects of different normalization strategies commonly used to mitigate cross-subject variability. We show the effects that different normalization schemes have on statistical shifts and their relationship with the accuracy of mental workload prediction, as assessed on participants unseen at training time.
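As an illustration of the kind of normalization strategy mentioned above, the following is a minimal sketch of per-subject z-scoring, one commonly used way to reduce cross-subject variability. The abstract does not state which schemes were compared, so this choice, the function name `zscore_per_subject`, and the toy data are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def zscore_per_subject(features, subjects):
    """Standardize a 1-D feature separately within each subject.

    Hypothetical helper: `features` and `subjects` are parallel lists,
    one feature value and one subject label per recording.
    """
    # Group the feature values by subject label.
    groups = {}
    for value, subject in zip(features, subjects):
        groups.setdefault(subject, []).append(value)

    # Per-subject mean and population standard deviation.
    stats = {s: (mean(v), pstdev(v)) for s, v in groups.items()}

    normalized = []
    for value, subject in zip(features, subjects):
        mu, sigma = stats[subject]
        # A constant feature within a subject carries no information.
        normalized.append((value - mu) / sigma if sigma > 0 else 0.0)
    return normalized


# Toy data (assumed): two subjects whose features differ only in offset and scale.
features = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]
subjects = ["s1", "s1", "s1", "s2", "s2", "s2"]
normalized = zscore_per_subject(features, subjects)
```

After this step both toy subjects share the same normalized values, which is the sense in which such a scheme can reduce the marginal shift between users; whether it also reduces conditional shift is an empirical question.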
Electroencephalography (EEG) is a complex signal and can require several years of training to be correctly interpreted. Recently, deep learning (DL) has shown great promise in helping make sense of EEG signals due to its capacity to learn good feature representations from raw data. Whether DL truly presents advantages as compared to more traditional EEG processing approaches, however, remains an open question. In this work, we review 156 papers that apply DL to EEG, published between January 2010 and July 2018, and spanning different application domains such as epilepsy, sleep, brain-computer interfacing, and cognitive and affective monitoring. We extract trends and highlight interesting approaches in order to inform future research and formulate recommendations. Various data items were extracted for each study pertaining to 1) the data, 2) the preprocessing methodology, 3) the DL design choices, 4) the results, and 5) the reproducibility of the experiments. Our analysis reveals that the amount of EEG data used across studies varies from less than ten minutes to thousands of hours. As for the model, 40% of the studies used convolutional neural networks (CNNs), while 14% used recurrent neural networks (RNNs), most often with a total of 3 to 10 layers. Moreover, almost one-half of the studies trained their models on raw or preprocessed EEG time series. Finally, the median gain in accuracy of DL approaches over traditional baselines was 5.4% across all relevant studies. More importantly, however, we noticed studies often suffer from poor reproducibility: a majority of papers would be hard or impossible to reproduce given the unavailability of their data and code. To help the field progress, we provide a list of recommendations for future studies and we make our summary table of DL and EEG papers available and invite the community to contribute.
Datasets for biosignals, such as electroencephalogram (EEG) and electrocardiogram (ECG), often have noisy labels and a limited number of subjects (<100). To handle these challenges, we propose a self-supervised approach based on contrastive learning to model biosignals with a reduced reliance on labeled data and with fewer subjects. In this regime of limited labels and subjects, intersubject variability negatively impacts model performance. Thus, we introduce subject-aware learning through (1) a subject-specific contrastive loss, and (2) adversarial training to promote subject-invariance during self-supervised learning. We also develop a number of time-series data augmentation techniques to be used with the contrastive loss for biosignals. Our method is evaluated on publicly available datasets of two different biosignals with different tasks: EEG decoding and ECG anomaly detection. The embeddings learned using self-supervision yield competitive classification results compared to entirely supervised methods. We show that subject-invariance improves representation quality for these tasks, and observe that the subject-specific loss increases performance when fine-tuning with supervised labels.
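For readers unfamiliar with contrastive objectives, here is a minimal sketch of a generic InfoNCE-style loss for a single anchor. It is not the subject-specific loss proposed in the abstract, whose exact form is not given there; the function names, the temperature value, and the toy embeddings are assumptions, and the vectors are assumed to be non-zero.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss: -log softmax score of the positive pair."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[0]


# Toy embeddings (assumed): the positive is an augmented view of the anchor.
anchor = [1.0, 0.0]
good = info_nce(anchor, [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
bad = info_nce(anchor, [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
```

The loss is small when the positive view is closer to the anchor than the negatives (`good`) and large otherwise (`bad`). A subject-aware variant would presumably change how negatives are chosen, for example by drawing them from the same subject, but that detail is an assumption here.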
We consider a general statistical estimation problem wherein binary labels across different observations are not independent conditioned on their feature vectors, but dependent, capturing settings where, e.g., these observations are collected on a spatial domain, a temporal domain, or a social network, which induce dependencies. We model these dependencies in the language of Markov Random Fields and, importantly, allow these dependencies to be substantial, i.e., we do not assume that the Markov Random Field capturing these dependencies is in high temperature. As our main contribution we provide algorithms and statistically efficient estimation rates for this model, giving several instantiations of our bounds in logistic regression, sparse logistic regression, and neural network settings with dependent data. Our estimation guarantees follow from novel results for estimating the parameters (i.e., external fields and interaction strengths) of Ising models from a single sample. We evaluate our estimation approach on real networked data, showing that it outperforms standard regression approaches that ignore dependencies, across three text classification datasets: Cora, Citeseer and Pubmed.
In modern building infrastructures, the opportunity to devise adaptive and unsupervised data-driven health monitoring systems is gaining in popularity due to the large availability of data from low-cost sensors with internetworking capabilities. In particular, deep learning provides the tools for processing and analyzing this unprecedented amount of data efficiently. The main purpose of this paper is to combine recent advances in deep learning (DL) and statistical analysis for structural health monitoring (SHM) to develop an accurate classification tool able to discriminate among different acoustic emission events (cracks) by identifying tensile, shear and mixed modes. The application of DL in SHM systems is described using the concept of Bidirectional Long Short-Term Memory. We investigated effective event descriptors to capture the unique characteristics of the different types of modes. Among them, Spectral Kurtosis and Spectral L2/L1 Norm exhibit distinctive behavior and contributed effectively to the learning process. This classification will contribute to unambiguously detecting incipient damage, which is advantageous for realizing predictive maintenance. Tests on experimental results confirm that this method achieves accurate classification (92%) of crack events and can impact the design of future SHM technologies.
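To give a feel for one of the descriptors named above, the sketch below computes an L2/L1 norm ratio of a magnitude spectrum, a common measure of how concentrated a spectrum is. The abstract does not define its Spectral L2/L1 Norm, so this definition, the naive DFT, the function names, and the toy signals are assumptions rather than the authors' method.

```python
import cmath
import math


def magnitude_spectrum(signal):
    """One-sided magnitude spectrum via a naive O(n^2) DFT."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n // 2 + 1)
    ]


def l2_l1_ratio(values):
    """Ratio ||v||_2 / ||v||_1: near 1 for a peaky vector, small for a flat one."""
    l1 = sum(abs(v) for v in values)
    l2 = math.sqrt(sum(v * v for v in values))
    return l2 / l1 if l1 > 0 else 0.0


# Toy signals (assumed): a pure tone and a single impulse, 64 samples each.
n = 64
tone = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
impulse = [1.0] + [0.0] * (n - 1)

tone_ratio = l2_l1_ratio(magnitude_spectrum(tone))
impulse_ratio = l2_l1_ratio(magnitude_spectrum(impulse))
```

The tone concentrates its energy in one frequency bin, so its ratio is close to 1, while the impulse has a flat spectrum over 33 bins and a ratio of 1/sqrt(33). A descriptor of this kind can therefore separate narrowband from broadband events, which is one plausible reason it helps a classifier.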
The photoplethysmogram (PPG) is increasingly used to monitor the cardiovascular system under ambulatory conditions. Wearable devices like smartwatches use PPG to allow long-term unobtrusive monitoring of heart rate in free-living conditions. PPG-based heart rate measurement is unfortunately highly susceptible to motion artifacts, particularly when measured from the wrist. Traditional machine learning and deep learning approaches rely on tri-axial accelerometer data along with PPG to perform heart rate estimation. The conventional learning-based approaches have not addressed the need for device-specific modeling due to differences in hardware design among PPG devices. In this paper, we propose a novel end-to-end deep learning model to perform heart rate (HR) estimation using an 8-second input PPG signal. We evaluate the proposed model on the IEEE SPC 2015 dataset, achieving a mean absolute error of 3.36 ± 4.1 BPM for HR estimation on 12 subjects without requiring patient-specific training. We also studied the feasibility of applying transfer learning along with sparse retraining from a comprehensive in-house PPG dataset for heart rate estimation across PPG devices with different hardware designs.
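The evaluation setup described above can be sketched as follows: a recording is cut into 8-second windows, and estimates are scored by mean absolute error in BPM. The 8-second window length comes from the abstract; the 2-second step, the 125 Hz sampling rate, the function names, and the toy numbers are assumptions for illustration only.

```python
def sliding_windows(signal, fs_hz, window_s=8.0, step_s=2.0):
    """Split a 1-D signal into fixed-length, possibly overlapping windows."""
    win = int(round(window_s * fs_hz))
    step = int(round(step_s * fs_hz))
    # Only full windows are kept; a trailing partial window is dropped.
    return [signal[i:i + win] for i in range(0, len(signal) - win + 1, step)]


def mean_absolute_error(predicted, reference):
    """Mean absolute difference between two equal-length sequences."""
    if len(predicted) != len(reference):
        raise ValueError("sequences must have the same length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


# Toy recording (assumed): 20 seconds of placeholder samples at 125 Hz.
fs_hz = 125
recording = [0.0] * (20 * fs_hz)
windows = sliding_windows(recording, fs_hz)

# Toy heart-rate estimates and references in BPM (assumed values).
mae = mean_absolute_error([70.0, 82.0, 90.0], [72.0, 80.0, 90.0])
```

Each window would be passed to the model to produce one heart-rate estimate, and the per-window errors are then averaged. Reporting the error as a mean with a spread, as in the abstract, additionally requires the standard deviation of the absolute errors.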