Genetic programming (GP) is the state of the art in automated financial feature construction. It represents features as reverse Polish expressions and then evolves them. However, with the development of deep learning, more powerful feature extraction tools are available. This paper proposes the Alpha Discovery Neural Network (ADNN), a tailored neural network structure that can automatically construct diversified financial technical indicators based on prior knowledge. We make three main contributions. First, we use domain knowledge in quantitative trading to design the sampling rules and the objective function. Second, we replace genetic programming with pre-training and model pruning, which make the evolution process more efficient. Third, the feature extractors in ADNN can be swapped for different feature extractors to produce different functions. The experimental results show that ADNN can construct more informative and diversified features than GP, which effectively enriches the current factor pool. The fully connected network and the recurrent network are better at extracting information from financial time series than the convolutional neural network. In practice, features constructed by ADNN can always improve the revenue, Sharpe ratio, and maximum drawdown of multi-factor strategies, compared with investment strategies without these factors.
Instead of constructing factors manually based on traditional and behavioural finance analysis, academic researchers and quantitative investment managers have in recent years leveraged Genetic Programming (GP) as an automatic feature construction tool, which builds reverse Polish mathematical expressions over trading data to form new factors. However, with the development of deep learning, more powerful feature extraction tools are available. This paper proposes Neural Network-based Automatic Factor Construction (NNAFC), a tailored neural network framework that can automatically construct diversified financial factors based on financial domain knowledge and a variety of neural network structures. The experimental results show that NNAFC can construct more informative and diversified factors than GP, effectively enriching the current factor pool. For the current market, both fully connected and recurrent neural network structures are better at extracting information from financial time series than convolutional neural network structures. Moreover, new factors constructed by NNAFC can always improve the return, Sharpe ratio, and maximum drawdown of a multi-factor quantitative investment strategy, because they introduce more information and diversification into the existing factor pool.
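To make the GP representation concrete, the following is a minimal sketch of how a factor written as a reverse Polish expression can be evaluated over trading data. The operator table, the `eval_rpn` helper, and the column names (`close`, `open`, `volume`) are hypothetical and chosen only for illustration; the abstract does not specify which operators or inputs the GP baseline actually uses.

```python
import numpy as np

# Hypothetical operator tables (assumption: the real GP operator set is not given).
BINARY_OPS = {
    "add": np.add,
    "sub": np.subtract,
    "mul": np.multiply,
    # Protected division: a zero denominator yields NaN instead of an error.
    "div": lambda a, b: a / np.where(b == 0, np.nan, b),
}
UNARY_OPS = {"neg": np.negative, "abs": np.abs}


def eval_rpn(tokens, data):
    """Evaluate a reverse Polish token list against named data columns."""
    stack = []
    for tok in tokens:
        if tok in BINARY_OPS:
            right = stack.pop()  # operands come off the stack in reverse order
            left = stack.pop()
            stack.append(BINARY_OPS[tok](left, right))
        elif tok in UNARY_OPS:
            stack.append(UNARY_OPS[tok](stack.pop()))
        elif tok in data:
            stack.append(np.asarray(data[tok], dtype=float))
        else:
            stack.append(float(tok))  # anything else is treated as a numeric constant
    if len(stack) != 1:
        raise ValueError("malformed reverse Polish expression")
    return stack[0]


# Toy trading data (illustrative values only).
data = {
    "close": [10.0, 11.0, 12.0],
    "open": [9.0, 11.0, 10.0],
    "volume": [100.0, 200.0, 400.0],
}

# Infix form: (close - open) / open, i.e. the intraday return.
factor = eval_rpn(["close", "open", "sub", "open", "div"], data)
print(factor)
```

Because every expression is a flat token list, mutation and crossover in GP reduce to editing or splicing token sequences, which is one reason this representation is convenient for evolutionary search.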
We propose a method for gene expression based analysis of cancer phenotypes incorporating network biology knowledge through unsupervised construction of computational graphs. The structural construction of the computational graphs is driven by the use of topological clustering algorithms on protein-protein networks which incorporate inductive biases stemming from network biology research in protein complex discovery. This structurally constrains the hypothesis space over the possible computational graph factorisation whose parameters can then be learned through supervised or unsupervised task settings. The sparse construction of the computational graph enables the differential protein complex activity analysis whilst also interpreting the individual contributions of genes/proteins involved in each individual protein complex. In our experiments analysing a variety of cancer phenotypes, we show that the proposed methods outperform SVM, Fully-Connected MLP, and Randomly-Connected MLPs in all tasks. Our work introduces a scalable method for incorporating large interaction networks as prior knowledge to drive the construction of powerful computational models amenable to introspective study.
In recent years, Bitcoin price prediction has attracted the interest of researchers and investors. However, the accuracy achieved in previous studies is not good enough. Machine learning and deep learning methods have been shown to have strong predictive ability in this area. This paper proposes a method that combines Ensemble Empirical Mode Decomposition (EEMD) with a deep learning method, long short-term memory (LSTM), to forecast the next-day Bitcoin price.
Chemical reactions occur in energy, environmental, biological, and many other natural systems, and the inference of the reaction networks is essential to understand and design the chemical processes in engineering and life sciences. Yet, revealing the reaction pathways for complex systems and processes is still challenging due to the lack of knowledge of the involved species and reactions. Here, we present a neural network approach that autonomously discovers reaction pathways from the time-resolved species concentration data. The proposed Chemical Reaction Neural Network (CRNN), by design, satisfies the fundamental physics laws, including the Law of Mass Action and the Arrhenius Law. Consequently, the CRNN is physically interpretable such that the reaction pathways can be interpreted, and the kinetic parameters can be quantified simultaneously from the weights of the neural network. The inference of the chemical pathways is accomplished by training the CRNN with species concentration data via stochastic gradient descent. We demonstrate the successful implementations and the robustness of the approach in elucidating the chemical reaction pathways of several chemical engineering and biochemical systems. The autonomous inference by the CRNN approach precludes the need for expert knowledge in proposing candidate networks and addresses the curse of dimensionality in complex systems. The physical interpretability also makes the CRNN capable of not only fitting the data for a given system but also developing knowledge of unknown pathways that could be generalized to similar chemical systems.
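As an illustration of why a network built on these two laws is interpretable, the sketch below computes a single reaction rate in log space, where the Law of Mass Action and the Arrhenius Law combine into a linear function followed by an exponential. This is only a sketch under assumptions: the function name `reaction_rate`, its parameters, and the unmodified Arrhenius form k = A exp(-Ea / (R T)) are illustrative choices, not the architecture described by the authors.

```python
import numpy as np

R = 8.314  # universal gas constant in J/(mol K)


def reaction_rate(conc, orders, ln_A, Ea, T):
    """Rate of one reaction: r = A * exp(-Ea / (R T)) * prod_i conc_i ** orders_i.

    In log space this is a weighted sum of ln(conc) plus a bias, so the
    reaction orders play the role of weights and ln k plays the role of a bias.
    Concentrations must be strictly positive for the logarithm to be defined.
    """
    conc = np.asarray(conc, dtype=float)
    orders = np.asarray(orders, dtype=float)
    ln_k = ln_A - Ea / (R * T)  # Arrhenius Law in logarithmic form
    return float(np.exp(orders @ np.log(conc) + ln_k))  # Law of Mass Action


# Hypothetical reaction A + B -> C, first order in A and B, with made-up parameters.
rate = reaction_rate(conc=[2.0, 3.0], orders=[1.0, 1.0], ln_A=0.0, Ea=5000.0, T=300.0)
print(rate)
```

Read this way, a learned weight near 1 on a species suggests first-order participation in the reaction, and a weight near 0 suggests the species is not a reactant, which is the sense in which pathways and kinetic parameters can be read off the weights.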
Stock price prediction is a challenging task, but machine learning methods have recently been used successfully for this purpose. In this paper, we extract over 270 hand-crafted features (factors) inspired by technical and quantitative analysis and test their validity for short-term mid-price movement prediction. We focus on a wrapper feature selection method using entropy, least-mean-squares, and linear discriminant analysis. We also build a new quantitative feature based on adaptive logistic regression for online learning, which is consistently selected first by the majority of the proposed feature selection methods. This study examines the best combination of features using high-frequency limit order book data from Nasdaq Nordic. Our results suggest that sorting methods and classifiers can be used in such a way that one can reach the best performance with a combination of only very few advanced hand-crafted features.