
A Multi-Pass Approach to Large-Scale Connectomics

Published by Yaron Meirovitch
Publication date: 2016
Language: English





The field of connectomics faces unprecedented big data challenges. To reconstruct neuronal connectivity, automated pixel-level segmentation is required for petabytes of streaming electron microscopy data. Existing algorithms provide relatively good accuracy but are unacceptably slow, and would require years to extract connectivity graphs from even a single cubic millimeter of neural tissue. Here we present a viable real-time solution, a multi-pass pipeline optimized for shared-memory multicore systems, capable of processing data at near the terabyte-per-hour pace of multi-beam electron microscopes. The pipeline makes an initial fast-pass over the data, and then makes a second slow-pass to iteratively correct errors in the output of the fast-pass. We demonstrate the accuracy of a sparse slow-pass reconstruction algorithm and suggest new methods for detecting morphological errors. Our fast-pass approach posed many algorithmic challenges, including the design and implementation of novel shallow convolutional neural nets and the parallelization of watershed and object-merging techniques. We use it to reconstruct, from image stack to skeletons, the full dataset of Kasthuri et al. (463 GB capturing 120,000 cubic microns) in a matter of hours on a single multicore machine rather than the weeks it has taken in the past on much larger distributed systems.
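The two-pass structure described above can be sketched in miniature. The following is an illustrative toy, not the paper's implementation: the fast pass's shallow CNN and parallel watershed are stood in for by a simple boundary threshold plus connected-components labeling, the slow pass's error correction is reduced to flagging implausibly small fragments, and all function names are hypothetical.

```python
import numpy as np
from scipy import ndimage

def fast_pass(volume, threshold=0.5):
    """Fast pass (toy): cheap threshold + connected components,
    standing in for a shallow CNN boundary map + parallel watershed."""
    foreground = volume > threshold
    labels, n_segments = ndimage.label(foreground)
    return labels, n_segments

def slow_pass(labels, min_size=4):
    """Slow pass (toy): revisit suspicious output of the fast pass.
    Here 'suspicious' simply means tiny fragments, which are merged
    into the background as a stand-in for iterative error correction."""
    corrected = labels.copy()
    for seg_id in range(1, labels.max() + 1):
        mask = labels == seg_id
        if mask.sum() < min_size:
            corrected[mask] = 0
    return corrected

# A small synthetic "image": one real object and one speck of noise.
vol = np.zeros((8, 8))
vol[1:4, 1:4] = 1.0   # plausible neurite cross-section
vol[6, 6] = 1.0       # noise fragment
labels, n = fast_pass(vol)
corrected = slow_pass(labels, min_size=4)
```

The point of the split is that the cheap first pass keeps up with the microscope's data rate, while the expensive second pass only touches the (much smaller) set of questionable segments.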




Read also

Imaging methods used in modern neuroscience experiments are quickly producing large amounts of data capable of providing increasing amounts of knowledge about neuroanatomy and function. A great deal of information in these datasets is relatively unexplored and untapped. One of the bottlenecks in knowledge extraction is that often there is no feedback loop between the knowledge produced (e.g., graph, density estimate, or other statistic) and the earlier stages of the pipeline (e.g., acquisition). We thus advocate for the development of sample-to-knowledge discovery pipelines that one can use to optimize acquisition and processing steps with a particular end goal (i.e., piece of knowledge) in mind. We therefore propose that optimization takes place not just within each processing stage but also between adjacent (and non-adjacent) steps of the pipeline. Furthermore, we explore the existing categories of knowledge representation and models to motivate the types of experiments and analysis needed to achieve the ultimate goal. To illustrate this approach, we provide an experimental paradigm to answer questions about large-scale synaptic distributions through a multimodal approach combining X-ray microtomography and electron microscopy.
Reconstructing a map of neuronal connectivity is a critical challenge in contemporary neuroscience. Recent advances in high-throughput serial section electron microscopy (EM) have produced massive 3D image volumes of nanoscale brain tissue for the first time. The resolution of EM allows for individual neurons and their synaptic connections to be directly observed. Recovering neuronal networks by manually tracing each neuronal process at this scale is unmanageable, and therefore researchers are developing automated image processing modules. Thus far, state-of-the-art algorithms focus only on the solution to a particular task (e.g., neuron segmentation or synapse identification). In this manuscript we present the first fully automated images-to-graphs pipeline (i.e., a pipeline that begins with an imaged volume of neural tissue and produces a brain graph without any human interaction). To evaluate overall performance and select the best parameters and methods, we also develop a metric to assess the quality of the output graphs. We evaluate a set of algorithms and parameters, searching possible operating points to identify the best available brain graph for our assessment metric. Finally, we deploy a reference end-to-end version of the pipeline on a large, publicly available data set. This provides a baseline result and framework for community analysis and future algorithm development and testing. All code and data derivatives have been made publicly available toward eventually unlocking new biofidelic computational primitives and understanding of neuropathologies.
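The final "graphs" step of such an images-to-graphs pipeline can be illustrated with a small sketch. This is not the manuscript's code: the list of (pre, post) synapse pairs and the function name are assumptions for illustration, standing in for the output of the upstream segmentation and synapse-detection modules.

```python
import networkx as nx

def synapses_to_graph(synapse_pairs):
    """Toy final pipeline stage: collapse detected synapses, given as
    (presynaptic_neuron_id, postsynaptic_neuron_id) pairs, into a
    directed brain graph with edge weights = synapse counts."""
    g = nx.DiGraph()
    for pre, post in synapse_pairs:
        if g.has_edge(pre, post):
            g[pre][post]["weight"] += 1
        else:
            g.add_edge(pre, post, weight=1)
    return g

# Hypothetical detections: neuron 1 synapses twice onto 2, once 2 -> 3.
brain_graph = synapses_to_graph([(1, 2), (1, 2), (2, 3)])
```

A graph-level assessment metric of the kind the abstract describes would then be computed on `brain_graph` rather than on intermediate pixel-level output.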
Modern technologies are enabling scientists to collect extraordinary amounts of complex and sophisticated data across a huge range of scales like never before. With this onslaught of data, we can allow the focal point to shift towards answering the question of how we can analyze and understand the massive amounts of data in front of us. Unfortunately, lack of standardized sharing mechanisms and practices often make reproducing or extending scientific results very difficult. With the creation of data organization structures and tools which drastically improve code portability, we now have the opportunity to design such a framework for communicating extensible scientific discoveries. Our proposed solution leverages these existing technologies and standards, and provides an accessible and extensible model for reproducible research, called science in the cloud (sic). Exploiting scientific containers, cloud computing and cloud data services, we show the capability to launch a computer in the cloud and run a web service which enables intimate interaction with the tools and data presented. We hope this model will inspire the community to produce reproducible and, importantly, extensible results which will enable us to collectively accelerate the rate at which scientific breakthroughs are discovered, replicated, and extended.
Etay Ziv, 2004
Exploiting recent developments in information theory, we propose, illustrate, and validate a principled information-theoretic algorithm for module discovery and resulting measure of network modularity. This measure is an order parameter (a dimensionless number between 0 and 1). Comparison is made to other approaches to module-discovery and to quantifying network modularity using Monte Carlo generated Erdos-like modular networks. Finally, the Network Information Bottleneck (NIB) algorithm is applied to a number of real world networks, including the social network of coauthors at the APS March Meeting 2004.
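To make the "dimensionless number between 0 and 1" concrete, here is a small example using the standard Newman-Girvan modularity as computed by networkx. This is a stand-in for intuition only, not the paper's NIB measure: a network made of two cliques joined by a single bridge edge scores well inside the (0, 1) range when partitioned along its obvious modules.

```python
import networkx as nx
from networkx.algorithms.community import modularity

# Two 4-cliques joined by a single bridge edge: a clearly modular network.
g = nx.disjoint_union(nx.complete_graph(4), nx.complete_graph(4))
g.add_edge(0, 4)  # bridge between the two modules

# Score the natural partition: nodes {0..3} vs nodes {4..7}.
q = modularity(g, [set(range(4)), set(range(4, 8))])
```

For this graph q = 12/13 - 1/2, roughly 0.42: high enough to reflect the two-clique structure, but bounded away from 1 by the bridge.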
Identification and quantification of condition-specific transcripts using RNA-Seq is vital in transcriptomics research. While initial efforts using mathematical or statistical modeling of read counts or per-base exonic signal have been successful, they may suffer from model overfitting since not all the reference transcripts in a database are expressed under a specific biological condition. Standard shrinkage approaches, such as Lasso, shrink all the transcript abundances to zero in a non-discriminative manner. Thus it does not necessarily yield the set of condition-specific transcripts. Informed shrinkage approaches, using the observed exonic coverage signal, are thus desirable. Motivated by ubiquitous uncovered exonic regions in RNA-Seq data, termed as naked exons, we propose a new computational approach that first filters out the reference transcripts not supported by splicing and paired-end reads, and then fits a new mathematical model of per-base exonic coverage signal and the underlying transcript structure. We introduce a tuning parameter to penalize the specific regions of the selected transcripts that were not supported by the naked exons. Our approach compares favorably with the selected competing methods in terms of both time complexity and accuracy using simulated and real-world data. Our method is implemented in SAMMate, a GUI software suite freely available from http://sammate.sourceforge.net
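The abstract's contrast between standard and informed shrinkage can be illustrated, in a deliberately simplified form, with ordinary Lasso regression on synthetic data (this sketch uses scikit-learn, not SAMMate or real RNA-Seq coverage signal): the L1 penalty drives the coefficients of unexpressed candidates toward zero, but it does so non-discriminatively, biasing the truly expressed ones by the same amount.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Toy design: 5 candidate "transcripts" (features), only 2 truly expressed.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)

# Uniform L1 shrinkage: zeroes the noise features, but also shrinks
# the two true abundances toward zero by roughly the penalty amount.
model = Lasso(alpha=0.5).fit(X, y)
```

An informed shrinkage approach of the kind proposed above would instead penalize only the transcript regions contradicted by the observed coverage (the naked exons), leaving well-supported abundances unbiased.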
