Earthquakes can be detected by matching spatial patterns or phase properties from 1-D seismic waves. Current earthquake detection methods, such as waveform correlation and template matching, have difficulty detecting anomalous earthquakes that are not similar to other earthquakes. In recent years, machine-learning techniques for earthquake detection have emerged as an active new research direction. In this paper, we develop a novel earthquake detection method based on dictionary learning. Our detection method first generates rich features via signal processing and statistical methods and then employs feature selection techniques to choose the features that carry the most significant information. Based on these selected features, we build a dictionary for distinguishing earthquake events from non-earthquake events. To evaluate the performance of our dictionary-based detection method, we test it on a labquake dataset from Penn State University, which contains 3,357,566 time series data points recorded at a 400 MHz sampling rate. A total of 1,000 earthquake events are manually labeled, and the length of these events varies from 74 to 7151 data points. Through comparison with other detection methods, we show that our detection method, which incorporates feature selection and dictionary learning, achieves an 80.1% prediction accuracy and outperforms baseline methods including template matching (TM) and the support vector machine (SVM).
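As a rough illustration of the classification step only, the sketch below learns one dictionary per class from selected features and assigns a new event to the class whose dictionary reconstructs it with the lowest error. The scikit-learn components, atom counts, and sparsity levels are assumptions for illustration, not the authors' actual pipeline.

```python
# Minimal per-class dictionary classification sketch (assumed setup, not the
# paper's exact method): X holds pre-extracted feature vectors, y binary labels.
import numpy as np
from sklearn.decomposition import DictionaryLearning
from sklearn.feature_selection import SelectKBest, f_classif

def train(X, y, n_atoms=32, n_keep=20):
    """Select the most informative features, then learn one dictionary per class."""
    selector = SelectKBest(f_classif, k=n_keep).fit(X, y)
    Xs = selector.transform(X)
    dicts = {}
    for label in np.unique(y):
        dl = DictionaryLearning(n_components=n_atoms, transform_algorithm="omp",
                                transform_n_nonzero_coefs=5, random_state=0)
        dicts[label] = dl.fit(Xs[y == label])
    return selector, dicts

def predict(selector, dicts, X):
    """Assign each sample to the class whose dictionary reconstructs it best."""
    Xs = selector.transform(X)
    labels = sorted(dicts)
    errors = []
    for label in labels:
        codes = dicts[label].transform(Xs)            # sparse codes
        recon = codes @ dicts[label].components_      # reconstruction
        errors.append(np.linalg.norm(Xs - recon, axis=1))
    return np.array(labels)[np.argmin(np.vstack(errors), axis=0)]
```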
Recent observational studies have revealed that earthquakes can be classified into several categories. Each category might be characterized by unique statistical features in its time series, but present understanding is still limited owing to their nonlinear and nonstationary nature. Here we utilize complex network theory to shed new light on the statistical properties of earthquake time series. We investigate two kinds of time series, magnitude and inter-event time (IET), for three categories of earthquakes: regular earthquakes, earthquake swarms, and tectonic tremors. Following the visibility-graph criterion, each earthquake time series is mapped into a complex network by treating each seismic event as a node and determining the links between events. Contrary to common belief, we find that the magnitude time series are not statistically equivalent to random time series. The IET series exhibit correlations similar to fractional Brownian motion for all categories of earthquakes. Furthermore, we show that the time series of the three categories can be distinguished by the topology of the associated visibility graph. Analysis of the assortativity coefficient also reveals that swarms are more intermittent than tremors.
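For readers unfamiliar with the mapping, the sketch below builds a natural visibility graph from a short event series (magnitudes or inter-event times) and reports its degree assortativity; the O(n²) loop and the networkx calls are illustrative choices, not the paper's implementation.

```python
# Natural visibility graph sketch: samples become nodes, and two samples are
# linked if the straight line between them stays above every intermediate sample.
import networkx as nx

def visibility_graph(series):
    n = len(series)
    g = nx.Graph()
    g.add_nodes_from(range(n))
    for i in range(n):
        for j in range(i + 1, n):
            visible = all(
                series[k] < series[j] + (series[i] - series[j]) * (j - k) / (j - i)
                for k in range(i + 1, j)
            )
            if visible:
                g.add_edge(i, j)
    return g

# Topological summaries of the kind used to compare earthquake categories
g = visibility_graph([2.1, 3.4, 1.8, 4.0, 2.6, 3.1])   # toy magnitude series
print(nx.degree_assortativity_coefficient(g))
```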
Seismic data quality is vital to geophysical applications, so methods of data recovery, including denoising and interpolation, are common initial steps in the seismic data processing flow. We present a method to perform simultaneous interpolation and denoising that is based on double-sparsity dictionary learning, extending previous work that addressed denoising only. The original double-sparsity dictionary learning algorithm is modified to track traces with missing data by defining a masking operator that is integrated into the sparse representation of the dictionary. A weighted low-rank approximation algorithm is adopted to handle dictionary updating as a sparse-recovery optimization problem constrained by the masking operator. Compared to traditional sparse transforms with fixed dictionaries, which lack the ability to adapt to complex data structures, the double-sparsity dictionary learning method learns the signal adaptively from selected patches of the corrupted seismic data while preserving compact forward and inverse transform operators. Numerical experiments on synthetic seismic data indicate that this new method preserves more subtle features in the dataset without introducing pseudo-Gibbs artifacts when compared to other directional multiscale transform methods such as curvelets.
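The role of the masking operator can be illustrated in isolation: each patch is sparse-coded against only the dictionary rows where data were actually recorded, and the full patch is then reconstructed, filling the gaps. The fixed DCT dictionary and the orthogonal matching pursuit call below are stand-ins for the learned double-sparsity dictionary and the weighted low-rank update described above.

```python
# Masked sparse recovery sketch (illustrative stand-in for the learned dictionary).
import numpy as np
from scipy.fft import idct
from sklearn.linear_model import orthogonal_mp

def dct_dictionary(patch_len, n_atoms):
    """Overcomplete 1-D DCT dictionary whose columns are unit-norm atoms."""
    D = idct(np.eye(n_atoms), norm="ortho", axis=0)[:patch_len]
    return D / np.linalg.norm(D, axis=0)

def recover_patch(patch, mask, D, n_nonzero=5):
    """Fit a sparse code on the observed samples only, reconstruct all samples.

    patch: 1-D array of length patch_len (missing entries may hold any value)
    mask:  boolean array, True where the trace was actually recorded
    """
    coef = orthogonal_mp(D[mask], patch[mask], n_nonzero_coefs=n_nonzero)
    return D @ coef   # denoised and interpolated patch
```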
High-dimensional data arise in many real-world applications such as biology, computer vision, and social networks. Feature selection approaches are devised to confront the challenges of high-dimensional data, with the aim of enabling efficient learning and reducing model complexity. Because labeling such datasets is difficult, a variety of approaches perform feature selection in an unsupervised setting by exploiting important characteristics of the data. In this paper, we introduce a novel unsupervised feature selection approach that applies dictionary learning ideas in a low-rank representation. Dictionary learning in a low-rank representation not only provides a new representation but also preserves feature correlations. Spectral analysis is then employed to preserve sample similarities. Finally, a unified objective function for unsupervised feature selection is proposed, with sparsity induced by an $\ell_{2,1}$-norm regularization. Furthermore, an efficient numerical algorithm is designed to solve the corresponding optimization problem. We demonstrate the performance of the proposed method on a variety of standard datasets from different applied domains. Our experimental findings reveal that the proposed method outperforms state-of-the-art algorithms.
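As a minimal illustration of the $\ell_{2,1}$-norm and the feature ranking it induces, the snippet below scores features by the $\ell_2$ norms of the rows of a coefficient matrix; how that matrix is obtained (dictionary learning in a low-rank representation combined with spectral analysis) is not reproduced here.

```python
# l2,1-norm and row-norm feature ranking sketch (W is assumed to be a learned
# feature-by-component coefficient matrix; its learning step is omitted).
import numpy as np

def l21_norm(W):
    """||W||_{2,1} = sum over rows i of ||w_i||_2."""
    return np.linalg.norm(W, axis=1).sum()

def rank_features(W, k):
    """Indices of the k features with the largest row norms (most important)."""
    return np.argsort(-np.linalg.norm(W, axis=1))[:k]
```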
Longitudinal dispersion (LD) is the dominant process of scalar transport in natural streams. Accurate prediction of the LD coefficient ($D_l$) can substantially improve related simulations. Emerging machine learning (ML) techniques provide a self-adaptive tool for this problem. However, most existing studies rely on an unvalidated four-variable feature set obtained through simple theoretical deduction, and few studies have examined its reliability and rationality. Moreover, owing to the lack of systematic comparison, the proper choice of ML model in different scenarios remains unclear. In this study, the Feature Gradient selector was first adopted to distill locally optimal feature sets directly from multivariate data. Then, a globally optimal feature set (channel width, flow velocity, channel slope, and cross-sectional area) was proposed through numerical comparison of the distilled local optima using representative ML models. The channel slope is identified as the key parameter for predicting $D_l$. Further, we designed a weighted evaluation metric that enables comprehensive model comparison. With a simple linear model as the baseline, a benchmark of single and ensemble learning models was provided, and the advantages and disadvantages of the methods involved were discussed. Results show that the support vector machine performs significantly better than the other models, whereas the decision tree is not suitable for this problem because of its poor generalization ability. Notably, simple models show superiority over complicated ones on this low-dimensional problem, owing to their better balance between regression accuracy and generalization.
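A benchmark of this kind can be skeletonized as follows, comparing a linear baseline with single and ensemble learners by cross-validated R²; the hyperparameters and the plain R² score are placeholders for the study's weighted evaluation metric, and the feature matrix is assumed to hold the four variables named above.

```python
# Illustrative model benchmark skeleton; not the study's exact configuration.
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

def benchmark(X, y):
    """Compare baseline, single, and ensemble regressors with 5-fold CV R^2.

    X: columns = channel width, flow velocity, channel slope, cross-sectional area
    y: observed LD coefficient values
    """
    models = {
        "linear (baseline)": LinearRegression(),
        "svm": make_pipeline(StandardScaler(), SVR(C=10.0)),
        "decision tree": DecisionTreeRegressor(max_depth=5, random_state=0),
        "random forest": RandomForestRegressor(n_estimators=200, random_state=0),
    }
    return {name: cross_val_score(m, X, y, cv=5, scoring="r2").mean()
            for name, m in models.items()}
```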
We present a novel approach for resolving modes of rupture directivity in large populations of earthquakes. A seismic spectral decomposition technique is used to first produce relative measurements of radiated energy for earthquakes in a spatially-compact cluster. The azimuthal distribution of energy for each earthquake is then assumed to result from one of several distinct modes of rupture propagation. Rather than fitting a kinematic rupture model to determine the most likely mode of rupture propagation, we instead treat the modes as latent variables and learn them with a Gaussian mixture model. The mixture model simultaneously determines the number of events that best identify with each mode. The technique is demonstrated on four datasets in California with several thousand earthquakes. We show that the datasets naturally decompose into distinct rupture propagation modes that correspond to different rupture directions, and the fault plane is unambiguously identified for all cases. We find that these small earthquakes exhibit unilateral ruptures 53-74% of the time on average. The results provide important observational constraints on the physics of earthquakes and faults.
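The latent-mode step can be sketched as follows: each earthquake's azimuthal distribution of relative radiated energy is treated as a feature vector and clustered with a Gaussian mixture, with the number of modes chosen here by BIC as an assumed stand-in for the paper's model-selection procedure.

```python
# Gaussian-mixture sketch for latent rupture-directivity modes (assumed setup).
from sklearn.mixture import GaussianMixture

def fit_modes(energy, max_modes=6, random_state=0):
    """energy: array of shape (n_events, n_azimuth_bins) of relative energies."""
    best = None
    for k in range(1, max_modes + 1):
        gmm = GaussianMixture(n_components=k, covariance_type="full",
                              random_state=random_state).fit(energy)
        if best is None or gmm.bic(energy) < best.bic(energy):
            best = gmm
    return best, best.predict(energy)   # mixture model and per-event mode labels
```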