أوراق بحثية, رسائل ماجستير ودكتوراه منشورة من قبل Jonathan M. Garibaldi

A Novel Meta Learning Framework for Feature Selection using Data Synthesis and Fuzzy Similarity

94 - Zixiao Shen , Xin Chen , Jonathan M. Garibaldi 2020

This paper presents a novel meta learning framework for feature selection (FS) based on fuzzy similarity. The proposed method aims to recommend the best FS method from four candidate FS methods for any given dataset. This is achieved by firstly const ructing a large training data repository using data synthesis. Six meta features that represent the characteristics of the training dataset are then extracted. The best FS method for each of the training datasets is used as the meta label. Both the meta features and the corresponding meta labels are subsequently used to train a classification model using a fuzzy similarity measure based framework. Finally the trained model is used to recommend the most suitable FS method for a given unseen dataset. This proposed method was evaluated based on eight public datasets of real-world applications. It successfully recommended the best method for five datasets and the second best method for one dataset, which outperformed any of the four individual FS methods. Besides, the proposed method is computationally efficient for algorithm selection, leading to negligible additional time for the feature selection process. Thus, the paper contributes a novel method for effectively recommending which feature selection method to use for any new given dataset.

التعلم الآلي التعلم الالي

A Novel Weighted Combination Method for Feature Selection using Fuzzy Sets

97 - Zixiao Shen , Xin Chen , Jonathan M. Garibaldi 2020

In this paper, we propose a novel weighted combination feature selection method using bootstrap and fuzzy sets. The proposed method mainly consists of three processes, including fuzzy sets generation using bootstrap, weighted combination of fuzzy set s and feature ranking based on defuzzification. We implemented the proposed method by combining four state-of-the-art feature selection methods and evaluated the performance based on three publicly available biomedical datasets using five-fold cross validation. Based on the feature selection results, our proposed method produced comparable (if not better) classification accuracies to the best of the individual feature selection methods for all evaluated datasets. More importantly, we also applied standard deviation and Pearsons correlation to measure the stability of the methods. Remarkably, our combination method achieved significantly higher stability than the four individual methods when variations and size reductions were introduced to the datasets.

التعلم الآلي التعلم الالي

Performance Optimization of a Fuzzy Entropy based Feature Selection and Classification Framework

86 - Zixiao Shen , Xin Chen , Jonathan M. Garibaldi 2020

In this paper, based on a fuzzy entropy feature selection framework, different methods have been implemented and compared to improve the key components of the framework. Those methods include the combinations of three ideal vector calculations, three maximal similarity classifiers and three fuzzy entropy functions. Different feature removal orders based on the fuzzy entropy values were also compared. The proposed method was evaluated on three publicly available biomedical datasets. From the experiments, we concluded the optimized combination of the ideal vector, similarity classifier and fuzzy entropy function for feature selection. The optimized framework was also compared with other six classical filter-based feature selection methods. The proposed method was ranked as one of the top performers together with the Correlation and ReliefF methods. More importantly, the proposed method achieved the most stable performance for all three datasets when the features being gradually removed. This indicates a better feature ranking performance than the other compared methods.

التعلم الآلي التعلم الالي

Supervised Adverse Drug Reaction Signalling Framework Imitating Bradford Hills Causality Considerations

257 - Jenna Marie Reps , Jonathan M. Garibaldi , Uwe Aickelin 2016

Big longitudinal observational medical data potentially hold a wealth of information and have been recognised as potential sources for gaining new drug safety knowledge. Unfortunately there are many complexities and underlying issues when analysing l ongitudinal observational data. Due to these complexities, existing methods for large-scale detection of negative side effects using observational data all tend to have issues distinguishing between association and causality. New methods that can better discriminate causal and non-causal relationships need to be developed to fully utilise the data. In this paper we propose using a set of causality considerations developed by the epidemiologist Bradford Hill as a basis for engineering features that enable the application of supervised learning for the problem of detecting negative side effects. The Bradford Hill considerations look at various perspectives of a drug and outcome relationship to determine whether it shows causal traits. We taught a classifier to find patterns within these perspectives and it learned to discriminate between association and causality. The novelty of this research is the combination of supervised learning and Bradford Hills causality considerations to automate the Bradford Hills causality assessment. We evaluated the framework on a drug safety gold standard know as the observational medical outcomes partnerships nonspecified association reference set. The methodology obtained excellent discriminate ability with area under the curves ranging between 0.792-0.940 (existing method optimal: 0.73) and a mean average precision of 0.640 (existing method optimal: 0.141). The proposed features can be calculated efficiently and be readily updated, making the framework suitable for big observational data.

الذكاء الاصطناعي

Adaptive Data Communication Interface: A User-Centric Visual Data Interpretation Framework

118 - Grazziela P. Figueredo , Christian Wagner , Jonathan M. Garibaldi 2016

In this position paper, we present ideas about creating a next generation framework towards an adaptive interface for data communication and visualisation systems. Our objective is to develop a system that accepts large data sets as inputs and provid es user-centric, meaningful visual information to assist owners to make sense of their data collection. The proposed framework comprises four stages: (i) the knowledge base compilation, where we search and collect existing state-ofthe-art visualisation techniques per domain and user preferences; (ii) the development of the learning and inference system, where we apply artificial intelligence techniques to learn, predict and recommend new graphic interpretations (iii) results evaluation; and (iv) reinforcement and adaptation, where valid outputs are stored in our knowledge base and the system is iteratively tuned to address new demands. These stages, as well as our overall vision, limitations and possible challenges are introduced in this article. We also discuss further extensions of this framework for other knowledge discovery tasks.

تفاعل الإنسان والحاسوب

Augmented Neural Networks for Modelling Consumer Indebtness

111 - Alexandros Ladas , Jonathan M. Garibaldi , Rodrigo Scarpel 2014

Consumer Debt has risen to be an important problem of modern societies, generating a lot of research in order to understand the nature of consumer indebtness, which so far its modelling has been carried out by statistical models. In this work we show that Computational Intelligence can offer a more holistic approach that is more suitable for the complex relationships an indebtness dataset has and Linear Regression cannot uncover. In particular, as our results show, Neural Networks achieve the best performance in modelling consumer indebtness, especially when they manage to incorporate the significant and experimentally verified results of the Data Mining process in the model, exploiting the flexibility Neural Networks offer in designing their topology. This novel method forms an elaborate framework to model Consumer indebtness that can be extended to any other real world application.

الهندسة الحاسوبية، المالية،العلوم التعلم الآلي الحوسبة العصبية والتطورية

Tuning a Multiple Classifier System for Side Effect Discovery using Genetic Algorithms

100 - Jenna M. Reps , Uwe Aickelin , Jonathan M. Garibaldi 2014

In previous work, a novel supervised framework implementing a binary classifier was presented that obtained excellent results for side effect discovery. Interestingly, unique side effects were identified when different binary classifiers were used wi thin the framework, prompting the investigation of applying a multiple classifier system. In this paper we investigate tuning a side effect multiple classifying system using genetic algorithms. The results of this research show that the novel framework implementing a multiple classifying system trained using genetic algorithms can obtain a higher partial area under the receiver operating characteristic curve than implementing a single classifier. Furthermore, the framework is able to detect side effects efficiently and obtains a low false positive rate.

التعلم الآلي الهندسة الحاسوبية، المالية،العلوم

Attributes for Causal Inference in Longitudinal Observational Databases

417 - Jenna Reps , Jonathan M. Garibaldi , Uwe Aickelin 2014

The pharmaceutical industry is plagued by the problem of side effects that can occur anytime a prescribed medication is ingested. There has been a recent interest in using the vast quantities of medical data available in longitudinal observational da tabases to identify causal relationships between drugs and medical events. Unfortunately the majority of existing post marketing surveillance algorithms measure how dependant or associated an event is on the presence of a drug rather than measuring causality. In this paper we investigate potential attributes that can be used in causal inference to identify side effects based on the Bradford-Hill causality criteria. Potential attributes are developed by considering five of the causality criteria and feature selection is applied to identify the most suitable of these attributes for detecting side effects. We found that attributes based on the specificity criterion may improve side effect signalling algorithms but the experiment and dosage criteria attributes investigated in this paper did not offer sufficient additional information.

الهندسة الحاسوبية، المالية،العلوم التعلم الآلي

Signalling Paediatric Side Effects using an Ensemble of Simple Study Designs

64 - Jenna M. Reps , Jonathan M. Garibaldi , Uwe Aickelin 2014

Background: Children are frequently prescribed medication off-label, meaning there has not been sufficient testing of the medication to determine its safety or effectiveness. The main reason this safety knowledge is lacking is due to ethical restrict ions that prevent children from being included in the majority of clinical trials. Objective: The objective of this paper is to investigate whether an ensemble of simple study designs can be implemented to signal acutely occurring side effects effectively within the paediatric population by using historical longitudinal data. The majority of pharmacovigilance techniques are unsupervised, but this research presents a supervised framework. Methods: Multiple measures of association are calculated for each drug and medical event pair and these are used as features that are fed into a classiffier to determine the likelihood of the drug and medical event pair corresponding to an adverse drug reaction. The classiffier is trained using known adverse drug reactions or known non-adverse drug reaction relationships. Results: The novel ensemble framework obtained a false positive rate of 0:149, a sensitivity of 0:547 and a specificity of 0:851 when implemented on a reference set of drug and medical event pairs. The novel framework consistently outperformed each individual simple study design. Conclusion: This research shows that it is possible to exploit the mechanism of causality and presents a framework for signalling adverse drug reactions effectively.

التعلم الآلي الهندسة الحاسوبية، المالية،العلوم

A Novel Semi-Supervised Algorithm for Rare Prescription Side Effect Discovery

85 - Jenna Reps , Jonathan M. Garibaldi , Uwe Aickelin 2014

Drugs are frequently prescribed to patients with the aim of improving each patients medical state, but an unfortunate consequence of most prescription drugs is the occurrence of undesirable side effects. Side effects that occur in more than one in a thousand patients are likely to be signalled efficiently by current drug surveillance methods, however, these same methods may take decades before generating signals for rarer side effects, risking medical morbidity or mortality in patients prescribed the drug while the rare side effect is undiscovered. In this paper we propose a novel computational meta-analysis framework for signalling rare side effects that integrates existing methods, knowledge from the web, metric learning and semi-supervised clustering. The novel framework was able to signal many known rare and serious side effects for the selection of drugs investigated, such as tendon rupture when prescribed Ciprofloxacin or Levofloxacin, renal failure with Naproxen and depression associated with Rimonabant. Furthermore, for the majority of the drug investigated it generated signals for rare side effects at a more stringent signalling threshold than existing methods and shows the potential to become a fundamental part of post marketing surveillance to detect rare side effects.

التعلم الآلي الهندسة الحاسوبية، المالية،العلوم

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد