ترغب بنشر مسار تعليمي؟ اضغط هنا

Towards an Interpretable Data-driven Trigger System for High-throughput Physics Facilities

53   0   0.0 ( 0 )
 نشر من قبل Yuxin Chen
 تاريخ النشر 2021
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Data-intensive science is increasingly reliant on real-time processing capabilities and machine learning workflows, in order to filter and analyze the extreme volumes of data being collected. This is especially true at the energy and intensity frontiers of particle physics where bandwidths of raw data can exceed 100 Tb/s of heterogeneous, high-dimensional data sourced from hundreds of millions of individual sensors. In this paper, we introduce a new data-driven approach for designing and optimizing high-throughput data filtering and trigger systems such as those in use at physics facilities like the Large Hadron Collider (LHC). Concretely, our goal is to design a data-driven filtering system with a minimal run-time cost for determining which data event to keep, while preserving (and potentially improving upon) the distribution of the output as generated by the hand-designed trigger system. We introduce key insights from interpretable predictive modeling and cost-sensitive learning in order to account for non-local inefficiencies in the current paradigm and construct a cost-effective data filtering and trigger model that does not compromise physics coverage.



قيم البحث

اقرأ أيضاً

The use of sophisticated machine learning models for critical decision making is faced with a challenge that these models are often applied as a black-box. This has led to an increased interest in interpretable machine learning, where post hoc interp retation presents a useful mechanism for generating interpretations of complex learning models. In this paper, we propose a novel approach underpinned by an extended framework of Bayesian networks for generating post hoc interpretations of a black-box predictive model. The framework supports extracting a Bayesian network as an approximation of the black-box model for a specific prediction. Compared to the existing post hoc interpretation methods, the contribution of our approach is three-fold. Firstly, the extracted Bayesian network, as a probabilistic graphical model, can provide interpretations about not only what input features but also why these features contributed to a prediction. Secondly, for complex decision problems with many features, a Markov blanket can be generated from the extracted Bayesian network to provide interpretations with a focused view on those input features that directly contributed to a prediction. Thirdly, the extracted Bayesian network enables the identification of four different rules which can inform the decision-maker about the confidence level in a prediction, thus helping the decision-maker assess the reliability of predictions learned by a black-box model. We implemented the proposed approach, applied it in the context of two well-known public datasets and analysed the results, which are made available in an open-source repository.
High-Throughput materials discovery involves the rapid synthesis, measurement, and characterization of many different but structurally-related materials. A key problem in materials discovery, the phase map identification problem, involves the determi nation of the crystal phase diagram from the materials composition and structural characterization data. We present Phase-Mapper, a novel AI platform to solve the phase map identification problem that allows humans to interact with both the data and products of AI algorithms, including the incorporation of human feedback to constrain or initialize solutions. Phase-Mapper affords incorporation of any spectral demixing algorithm, including our novel solver, AgileFD, which is based on a convolutive non-negative matrix factorization algorithm. AgileFD can incorporate constraints to capture the physics of the materials as well as human feedback. We compare three solver variants with previously proposed methods in a large-scale experiment involving 20 synthetic systems, demonstrating the efficacy of imposing physical constrains using AgileFD. Phase-Mapper has also been used by materials scientists to solve a wide variety of phase diagrams, including the previously unsolved Nb-Mn-V oxide system, which is provided here as an illustrative example.
Neural embedding-based machine learning models have shown promise for predicting novel links in knowledge graphs. Unfortunately, their practical utility is diminished by their lack of interpretability. Recently, the fully interpretable, rule-based al gorithm AnyBURL yielded highly competitive results on many general-purpose link prediction benchmarks. However, current approaches for aggregating predictions made by multiple rules are affected by redundancies. We improve upon AnyBURL by introducing the SAFRAN rule application framework, which uses a novel aggregation approach called Non-redundant Noisy-OR that detects and clusters redundant rules prior to aggregation. SAFRAN yields new state-of-the-art results for fully interpretable link prediction on the established general-purpose benchmarks FB15K-237, WN18RR and YAGO3-10. Furthermore, it exceeds the results of multiple established embedding-based algorithms on FB15K-237 and WN18RR and narrows the gap between rule-based and embedding-based algorithms on YAGO3-10.
116 - Mark Strikman 2007
We outline several directions for future investigations of the three-dimensional structure of nucleon, including multiparton correlations, color transparency, and branching processes at hadron colliders and at hadron factories. We also find evidence that pQCD regime for non-vacuum Regge trajectories sets in for $-tge 1 {GeV}^2$ leading to nearly t-independent trajectories.
Cardiovascular diseases and heart failures in particular are the main cause of non-communicable disease mortality in the world. Constant patient monitoring enables better medical treatment as it allows practitioners to react on time and provide the a ppropriate treatment. Telemedicine can provide constant remote monitoring so patients can stay in their homes, only requiring medical sensing equipment and network connections. A limiting factor for telemedical centers is the amount of patients that can be monitored simultaneously. We aim to increase this amount by implementing a decision support system. This paper investigates a machine learning model to estimate a risk score based on patient vital parameters that allows sorting all cases every day to help practitioners focus their limited capacities on the most severe cases. The model we propose reaches an AUCROC of 0.84, whereas the baseline rule-based model reaches an AUCROC of 0.73. Our results indicate that the usage of deep learning to improve the efficiency of telemedical centers is feasible. This way more patients could benefit from better health-care through remote monitoring.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا