Daniel C. Elton (2020)
Artificial intelligence has made great strides since the deep learning revolution, but AI systems still struggle to extrapolate outside of their training data and adapt to new situations. For inspiration we look to the domain of science, where scientists have been able to develop theories which show remarkable ability to extrapolate and sometimes predict the existence of phenomena which have never been observed before. According to David Deutsch, this type of extrapolation, which he calls reach, is due to scientific theories being hard to vary. In this work we investigate Deutsch's hard-to-vary principle and how it relates to more formalized principles in deep learning such as the bias-variance trade-off and Occam's razor. We distinguish internal variability, how much a model/theory can be varied internally while still yielding the same predictions, from external variability, which is how much a model must be varied to accurately predict new, out-of-distribution data. We discuss how to measure internal variability using the size of the Rashomon set and how to measure external variability using Kolmogorov complexity. We explore what role hard-to-vary explanations play in intelligence by looking at the human brain and distinguish two learning systems in the brain. The first system operates similarly to deep learning and likely underlies most of perception and motor control, while the second is a more creative system capable of generating hard-to-vary explanations of the world. We argue that figuring out how to replicate this second system, which is capable of generating hard-to-vary explanations, is a key challenge which needs to be solved in order to realize artificial general intelligence. We make contact with the framework of Popperian epistemology, which rejects induction and asserts that knowledge generation is an evolutionary process which proceeds through conjecture and refutation.
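As an illustration of the internal-variability measurement mentioned above, here is a minimal Python sketch that estimates the size of the Rashomon set for a toy model class by sampling candidate models and counting the fraction whose loss lies within a tolerance of the best loss. The data, model class, and tolerance are illustrative assumptions, not taken from the paper.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy data: y depends linearly on x plus noise.
    x = rng.uniform(-1, 1, size=200)
    y = 2.0 * x + rng.normal(scale=0.1, size=200)

    def loss(w):
        """Mean squared error of the linear model y_hat = w * x."""
        return np.mean((y - w * x) ** 2)

    # Sample candidate models (slopes) from a broad prior.
    candidates = rng.uniform(-5, 5, size=10_000)
    losses = np.array([loss(w) for w in candidates])

    # Rashomon set: all models whose loss is within epsilon of the optimum.
    epsilon = 0.05
    rashomon = candidates[losses <= losses.min() + epsilon]

    # The Rashomon ratio estimates internal variability: the fraction of
    # the hypothesis space that yields (nearly) the same predictions.
    print(f"Rashomon ratio: {len(rashomon) / len(candidates):.3f}")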
Pathological science occurs when well-intentioned scientists spend extended time and resources studying a phenomenon that isn't real. Researchers who get caught up in pathological science are usually following the scientific method and performing careful experiments, but they get tricked by nature. The study of water has had several protracted episodes of pathological science, a few of which are still ongoing. We discuss four areas of pathological water science: polywater, the Mpemba effect, Pollack's fourth phase of water, and the effects of static magnetic fields on water. Some common water-specific issues emerge, such as the contamination and confounding of experiments by dissolved solutes and nanobubbles. General issues also emerge, such as imprecision in defining what is being studied, bias towards confirmation rather than falsification, and poor standards for reproducibility. We hope this work helps researchers avoid wasting valuable time and resources pursuing pathological science.
We present a novel method for small bowel segmentation in which a cylindrical topological constraint based on persistent homology is applied. To address the touching issue, which could break the applied constraint, we propose to augment a network with an additional branch that predicts an inner cylinder of the small bowel. Since the inner cylinder is free of the touching issue, a cylindrical shape constraint applied to this augmented branch guides the network to generate a topologically correct segmentation. For strict evaluation, we acquired an abdominal computed tomography dataset with dense segmentation ground-truths. The proposed method showed clear improvements over the baseline method in terms of four different metrics, and the improvements were statistically significant according to a paired t-test.
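A minimal PyTorch sketch of the two-branch idea described above: a shared backbone with one head for the full small bowel mask and a second head for the touching-free inner cylinder, to which the topological constraint would be applied. The architecture here is a stand-in (the paper uses a 3D U-Net) and all layer sizes are illustrative.

    import torch
    import torch.nn as nn

    class TwoBranchSegNet(nn.Module):
        """Shared backbone with two heads: full small-bowel mask and inner cylinder."""
        def __init__(self, in_ch=1, feat=16):
            super().__init__()
            self.backbone = nn.Sequential(
                nn.Conv3d(in_ch, feat, 3, padding=1), nn.ReLU(),
                nn.Conv3d(feat, feat, 3, padding=1), nn.ReLU(),
            )
            self.seg_head = nn.Conv3d(feat, 1, 1)  # full segmentation
            self.cyl_head = nn.Conv3d(feat, 1, 1)  # touching-free inner cylinder

        def forward(self, x):
            h = self.backbone(x)
            return torch.sigmoid(self.seg_head(h)), torch.sigmoid(self.cyl_head(h))

    net = TwoBranchSegNet()
    seg, cyl = net(torch.randn(1, 1, 16, 64, 64))
    # A topology-aware (e.g., persistent-homology-based) loss would be applied
    # to `cyl`, where the cylindrical constraint holds, plus Dice/CE on `seg`.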
Calcified plaque in the aorta and pelvic arteries is associated with coronary artery calcification and is a strong predictor of heart attack. Current calcified plaque detection models show poor generalizability to different domains (i.e., pre-contrast vs. post-contrast CT scans). Many recent works have shown how cross-domain object detection can be improved using an image translation model which translates between domains using a single shared latent space. However, while current image translation models do a good job of preserving global and intermediate-level structures, they often have trouble preserving tiny structures. In medical imaging applications, preserving small structures is important since these structures can carry information which is highly relevant for disease diagnosis. Recent works on image reconstruction show that complex real-world images are better reconstructed using a union-of-subspaces approach. Since small image patches are used to train the image translation model, it makes sense to enforce that each patch be represented by a linear combination of subspaces which may correspond to the different parts of the body present in that patch. Motivated by this, we propose an image translation network using a shared union-of-subspaces constraint and show that our approach preserves subtle structures (plaques) better than the conventional method. We further applied our method to a cross-domain plaque detection task and show significant improvement compared to the state-of-the-art method.
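One way a union-of-subspaces constraint on a shared latent space could be realized is sketched below: a latent code is projected onto several learned subspaces and recombined with soft mixing weights. This layer, its dimensions, and the gating scheme are assumptions for illustration, not the paper's exact formulation.

    import torch
    import torch.nn as nn

    class UnionOfSubspacesLatent(nn.Module):
        """Project an encoder latent onto K learned subspaces and recombine.

        Each subspace (dim d) might correspond to a tissue type present in a
        patch; the code becomes a mixture of its subspace projections.
        """
        def __init__(self, latent_dim=128, n_subspaces=4, sub_dim=16):
            super().__init__()
            # Basis vectors for each subspace, shape (K, latent_dim, d).
            self.bases = nn.Parameter(torch.randn(n_subspaces, latent_dim, sub_dim) * 0.1)
            self.gate = nn.Linear(latent_dim, n_subspaces)  # mixing weights

        def forward(self, z):
            coeffs = torch.einsum('kld,bl->bkd', self.bases, z)   # project onto each subspace
            recons = torch.einsum('kld,bkd->bkl', self.bases, coeffs)
            w = torch.softmax(self.gate(z), dim=-1)               # (B, K)
            return torch.einsum('bk,bkl->bl', w, recons)          # constrained latent

    layer = UnionOfSubspacesLatent()
    z = torch.randn(8, 128)
    z_constrained = layer(z)  # would replace the shared latent in both translators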
Daniel C. Elton (2020)
The ability to explain decisions made by AI systems is highly sought after, especially in domains where human lives are at stake such as medicine or autonomous vehicles. While it is often possible to approximate the input-output relations of deep neural networks with a few human-understandable rules, the discovery of the double descent phenomenon suggests that such approximations do not accurately capture the mechanism by which deep neural networks work. Double descent indicates that deep neural networks typically operate by smoothly interpolating between data points rather than by extracting a few high-level rules. As a result, neural networks trained on complex real-world data are inherently hard to interpret and prone to failure if asked to extrapolate. To show how we might be able to trust AI despite these problems, we introduce the concept of self-explaining AI. Self-explaining AIs are capable of providing a human-understandable explanation of each decision along with confidence levels for both the decision and explanation. For this approach to work, it is important that the explanation actually be related to the decision, ideally capturing the mechanism used to arrive at the decision. Finally, we argue it is important that deep learning based systems include a warning light based on techniques from applicability domain analysis to warn the user if a model is asked to extrapolate outside its training distribution. For a video presentation of this talk see https://www.youtube.com/watch?v=Py7PVdcu7WY .
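A minimal sketch of one standard way such a warning light could be implemented, using k-nearest-neighbor distances in feature space as an applicability domain check. The features, k, and percentile threshold are illustrative assumptions, not the paper's prescription.

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    class ApplicabilityDomainMonitor:
        """Warn when an input falls outside the training distribution,
        approximated by k-nearest-neighbor distance in feature space."""
        def __init__(self, train_features, k=5, percentile=99):
            self.nn = NearestNeighbors(n_neighbors=k).fit(train_features)
            d, _ = self.nn.kneighbors(train_features)
            self.threshold = np.percentile(d.mean(axis=1), percentile)

        def check(self, x):
            d, _ = self.nn.kneighbors(x.reshape(1, -1))
            return d.mean() <= self.threshold  # False -> extrapolation warning

    train = np.random.randn(1000, 32)  # stand-in for penultimate-layer features
    monitor = ApplicabilityDomainMonitor(train)
    print(monitor.check(np.random.randn(32)))       # in-distribution: True
    print(monitor.check(10 + np.random.randn(32)))  # far from training data: False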
The vertebral levels of the spine provide a useful coordinate system when making measurements of plaque, muscle, fat, and bone mineral density. Correctly classifying vertebral levels with high accuracy is challenging due to the similar appearance of each vertebra, the curvature of the spine, and the possibility of anomalies such as fractured vertebrae, implants, lumbarization of the sacrum, and sacralization of L5. The goal of this work is to develop a system that can accurately and robustly identify the L1 level in large heterogeneous datasets. The first approach we study uses a 3D U-Net to segment the L1 vertebra directly, using the entire scan volume to provide context. We also tested models for two-class segmentation of L1 and T12 and a three-class segmentation of L1, T12, and the rib attached to T12. By increasing the number of training examples to 249 scans using pseudo-segmentations from an in-house segmentation tool, we were able to achieve 98% accuracy in identifying the L1 vertebra, with an average error of 4.5 mm in the craniocaudal direction. We next developed an algorithm which performs iterative instance segmentation and classification of the entire spine with a 3D U-Net. We found the instance-based approach was able to yield better segmentations of nearly the entire spine, but had lower classification accuracy for L1.
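The iterative instance segmentation idea might be organized as in the following sketch, which segments one vertebra at a time while feeding back a memory mask of already-segmented vertebrae. The `model` interface and loop details are hypothetical.

    import numpy as np

    def segment_spine_iteratively(volume, model, max_vertebrae=25):
        """Sketch of iterative instance segmentation of the spine.

        `model` is a hypothetical callable taking the CT volume plus a
        memory mask of already-segmented vertebrae and returning the next
        vertebra's binary mask, or None when no vertebra remains.
        """
        instances = []
        memory = np.zeros_like(volume)
        for _ in range(max_vertebrae):
            mask = model(volume, memory)
            if mask is None:
                break
            instances.append(mask)
            memory = np.maximum(memory, mask)  # feed back what is segmented so far
        # Vertebral labels (..., T12, L1, ...) can then be assigned by counting
        # instances from an anatomical anchor such as the sacrum.
        return instances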
The existence of the exclusion zone (EZ), a layer of water in which plastic microspheres are repelled from hydrophilic surfaces, has now been independently demonstrated by several groups. A better understanding of the mechanisms which generate EZs would help with understanding the possible importance of EZs in biology and in engineering applications such as filtration and microfluidics. Here we review the experimental evidence for EZ phenomena in water and the major theories that have been proposed. We review experimental results from birefringence, neutron radiography, nuclear magnetic resonance, and other studies. Pollack and others have theorized that water in the EZ has a different structure than bulk water, and that this accounts for the EZ. We present several alternative explanations for EZs and argue that Schurr's theory based on diffusiophoresis presents a compelling alternative explanation for the core EZ phenomenon. Among other things, Schurr's theory makes predictions about the growth of the EZ with time which have been confirmed by Florea et al. and others. We also touch on several possible confounding factors that make experimentation on EZs difficult, such as charged surface groups, dissolved solutes, and adsorbed nanobubbles.
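Diffusiophoretic transport is generally expected to produce diffusive growth, i.e., the EZ front should advance roughly as the square root of time. A small sketch of how that prediction can be tested by fitting a power law to front-position measurements; the data here are synthetic, standing in for measurements like those of Florea et al.

    import numpy as np
    from scipy.optimize import curve_fit

    # Synthetic EZ front positions vs. time (illustrative, not measured data).
    rng = np.random.default_rng(0)
    t = np.linspace(10, 600, 30)                      # seconds
    L = 12.0 * np.sqrt(t) + rng.normal(0, 5, t.size)  # microns

    def power_law(t, a, b):
        return a * t**b

    (a, b), _ = curve_fit(power_law, t, L, p0=(1.0, 0.5))
    # Diffusiophoresis-driven growth predicts an exponent b close to 0.5.
    print(f"fitted exponent b = {b:.2f}")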
The heat transfer properties of the organic molecular crystal $\alpha$-RDX were studied using three phonon-based thermal conductivity models. It was found that the widely used Peierls-Boltzmann model for thermal transport in crystalline materials breaks down for $\alpha$-RDX. We show this breakdown is due to a large degree of anharmonicity that leads to a dominance of diffusive-like carriers. Despite being developed for disordered systems, the Allen-Feldman theory of thermal conductivity actually gives the best description of thermal transport. This is likely because diffusive carriers contribute over 95% of the thermal conductivity in $\alpha$-RDX. The dominance of diffusive carriers is larger than previously observed in other fully ordered crystalline systems. These results indicate that van der Waals-bonded organic crystalline solids conduct heat in a manner more akin to amorphous materials than to simple atomic crystals.
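The contrast between the two pictures can be summarized by their conductivity formulas: the Peierls-Boltzmann model sums propagating-phonon contributions, kappa = (1/V) * sum_i C_i v_i^2 tau_i, while the Allen-Feldman theory sums diffusive-mode contributions, kappa = (1/V) * sum_i C_i D_i. A sketch with illustrative per-mode values (not actual alpha-RDX data):

    import numpy as np

    rng = np.random.default_rng(0)

    # Illustrative per-mode quantities (not actual alpha-RDX values).
    n_modes = 1000
    C = np.full(n_modes, 1.38e-23)                   # per-mode heat capacity, J/K (classical limit)
    v = np.abs(rng.normal(500, 150, n_modes))        # phonon group velocities, m/s
    tau = np.abs(rng.normal(1e-12, 3e-13, n_modes))  # phonon lifetimes, s
    D = np.abs(rng.normal(1e-7, 3e-8, n_modes))      # Allen-Feldman mode diffusivities, m^2/s
    V = 1.0e-27                                      # unit cell volume, m^3

    # Peierls-Boltzmann: heat carried by propagating phonons.
    kappa_pb = (C * v**2 * tau).sum() / V

    # Allen-Feldman: heat carried by diffusive (non-propagating) modes.
    kappa_af = (C * D).sum() / V

    print(f"kappa (Peierls-Boltzmann): {kappa_pb:.2f} W/m/K")
    print(f"kappa (Allen-Feldman):     {kappa_af:.2f} W/m/K")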
In the space of only a few years, deep generative modeling has revolutionized how we think of artificial creativity, yielding autonomous systems which produce original images, music, and text. Inspired by these successes, researchers are now applying deep generative modeling techniques to the generation and optimization of molecules; in our review we found 45 papers on the subject published in the past two years. These works point to a future where such systems will be used to generate lead molecules, greatly reducing the resources spent downstream synthesizing and characterizing bad leads in the lab. In this review we survey the increasingly complex landscape of models and representation schemes that have been proposed. The four classes of techniques we describe are recursive neural networks, autoencoders, generative adversarial networks, and reinforcement learning. After first discussing some of the mathematical fundamentals of each technique, we draw high-level connections and comparisons with other techniques and expose the pros and cons of each. Several important high-level themes emerge as a result of this work, including the shift away from the SMILES string representation of molecules towards more sophisticated representations such as graph grammars and 3D representations, the importance of reward function design, the need for better standards for benchmarking and testing, and the benefits of adversarial training and reinforcement learning over maximum-likelihood-based training.
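As a concrete example of the first technique class, here is a minimal character-level recurrent model over SMILES strings in PyTorch. The vocabulary, architecture, and sizes are toy assumptions for illustration, not any specific model from the review.

    import torch
    import torch.nn as nn

    # Toy character vocabulary for SMILES strings (illustrative subset).
    vocab = ['^', '$', 'C', 'N', 'O', '(', ')', '=', '1', '2']
    stoi = {ch: i for i, ch in enumerate(vocab)}

    class SmilesRNN(nn.Module):
        """Character-level language model: predicts the next SMILES token."""
        def __init__(self, n_tokens, hidden=64):
            super().__init__()
            self.embed = nn.Embedding(n_tokens, hidden)
            self.gru = nn.GRU(hidden, hidden, batch_first=True)
            self.out = nn.Linear(hidden, n_tokens)

        def forward(self, tokens, h=None):
            x, h = self.gru(self.embed(tokens), h)
            return self.out(x), h

    model = SmilesRNN(len(vocab))
    # Training would maximize the likelihood of real SMILES; generation then
    # samples one character at a time from '^' until '$' is produced.
    seq = torch.tensor([[stoi['^'], stoi['C'], stoi['C'], stoi['O']]])
    logits, _ = model(seq)  # logits over the next character at each step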
The number of scientific journal articles and reports being published about energetic materials every year is growing exponentially, and therefore extracting relevant information and actionable insights from the latest research is becoming a considerable challenge. In this work we explore how techniques from natural language processing and machine learning can be used to automatically extract chemical insights from large collections of documents. We first describe how to download and process documents from a variety of sources: journal articles, conference proceedings (including NTREM), the US Patent & Trademark Office, and the Defense Technical Information Center archive on archive.org. We present a custom NLP pipeline which uses open source NLP tools to identify the names of chemical compounds and relates them to function words (underwater, rocket, pyrotechnic) and property words (elastomer, non-toxic). After explaining how word embeddings work, we compare the utility of two popular word embeddings: word2vec and GloVe. Chemical-chemical and chemical-application relationships are obtained by doing computations with word vectors. We show that word embeddings capture latent information about energetic materials, so that related materials appear close together in the word embedding space.
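A small sketch of the word-embedding step using gensim's word2vec implementation, with a toy tokenized corpus standing in for the processed document collection; the sentences, vocabulary, and queries are illustrative.

    from gensim.models import Word2Vec

    # Tiny illustrative corpus of tokenized sentences (a real pipeline would
    # feed millions of sentences from the processed document collection).
    corpus = [
        ['rdx', 'is', 'a', 'high', 'explosive', 'used', 'in', 'detonators'],
        ['hmx', 'is', 'a', 'high', 'explosive', 'with', 'high', 'density'],
        ['black', 'powder', 'is', 'used', 'in', 'pyrotechnic', 'devices'],
    ] * 50  # repeat so the toy model has something to learn

    model = Word2Vec(corpus, vector_size=32, window=5, min_count=1, epochs=20, seed=0)

    # Related materials should appear close together in embedding space.
    print(model.wv.most_similar('rdx', topn=3))

    # Chemical-application relations via vector arithmetic, e.g.
    # vec('rdx') - vec('explosive') + vec('pyrotechnic') ~ a pyrotechnic material.
    print(model.wv.most_similar(positive=['rdx', 'pyrotechnic'],
                                negative=['explosive'], topn=3))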