Advancements in Big Data Processing in the ATLAS and CMS Experiments

Published by Alexandre Vaniachine

Publication date: 2013

Research field: Informatics Engineering

Language: English

Author: A.V. Vaniachine

Distributed, Parallel, and Cluster Computing; Databases; High Energy Physics - Experiment

The ever-increasing volumes of scientific data present new challenges for distributed computing and Grid technologies. The emerging Big Data revolution drives exploration in scientific fields including nanotechnology, astrophysics, high-energy physics, biology and medicine. New initiatives are transforming data-driven scientific fields, enabling massive data analysis in new ways. In petascale data processing, scientists deal with datasets, not individual files. As a result, a task (comprised of many jobs) became the unit of petascale data processing on the Grid. Splitting a large data processing task into jobs enabled fine-granularity checkpointing, analogous to the splitting of a large file into smaller TCP/IP packets during data transfers. Transferring large data in small packets achieves reliability through automatic re-sending of dropped TCP/IP packets. Similarly, transient job failures on the Grid can be recovered by automatic retries to achieve reliable six sigma production quality in petascale data processing on the Grid. The computing experience of the ATLAS and CMS experiments provides a foundation for the reliability engineering needed to scale up Grid technologies for data processing beyond the petascale.
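The retry argument in the abstract can be illustrated with a minimal Python sketch. It is not the ATLAS or CMS production system: the function names, the independent-failure model, and the target of 3.4 failures per million (the commonly quoted six sigma figure) are assumptions made here for illustration only.

```python
import math
import random

# Assumption: the conventional "six sigma" target of 3.4 defects per million.
SIX_SIGMA_DEFECT_RATE = 3.4e-6


def attempts_needed(p_fail, target=SIX_SIGMA_DEFECT_RATE):
    """Smallest n with p_fail**n <= target, assuming independent transient failures."""
    if not 0.0 < p_fail < 1.0:
        raise ValueError("p_fail must be strictly between 0 and 1")
    return math.ceil(math.log(target) / math.log(p_fail))


def run_job(rng, p_fail):
    """Hypothetical job: succeeds unless a simulated transient failure occurs."""
    return rng.random() >= p_fail


def run_task(n_jobs, p_fail, max_attempts, seed=0):
    """Run a task as n_jobs independent jobs, retrying each failed job automatically.

    Returns the ids of jobs that still failed after max_attempts,
    and the total number of attempts spent across all jobs.
    """
    rng = random.Random(seed)
    failed_jobs = []
    total_attempts = 0
    for job_id in range(n_jobs):
        for _ in range(max_attempts):
            total_attempts += 1
            if run_job(rng, p_fail):
                break  # job succeeded; only this job's work was ever at risk
        else:
            failed_jobs.append(job_id)  # every attempt failed
    return failed_jobs, total_attempts


if __name__ == "__main__":
    n = attempts_needed(0.1)
    failed, attempts = run_task(n_jobs=10_000, p_fail=0.1, max_attempts=n)
    print(f"attempts per job allowed: {n}")
    print(f"jobs still failed: {len(failed)}, total attempts: {attempts}")
```

Under this simplified model, a job that fails 10% of the time needs at most six attempts for its residual failure probability to drop below the assumed target, and a failure costs only the rerun of one job rather than the whole task. This mirrors the packet analogy in the abstract.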

68 - A. Vaniachine, D. Malon 2003

For efficiency of the large production tasks distributed worldwide, it is essential to provide shared production management tools comprised of integratable and interoperable services. To enhance the ATLAS DC1 production toolkit, we introduced and tested a Virtual Data services component. For each major data transformation step identified in the ATLAS data processing pipeline (event generation, detector simulation, background pile-up and digitization, etc.), the Virtual Data Cookbook (VDC) catalogue encapsulates the specific data transformation knowledge and the validated parameter settings that must be provided before the data transformation is invoked. To provide local-remote transparency during DC1 production, the VDC database server delivered, in a controlled way, both the validated production parameters and the templated production recipes for thousands of event generation and detector simulation jobs around the world, simplifying the production management solutions.

Distributed, Parallel, and Cluster Computing; Databases

The Future is Big Graphs! A Community View on Graph Processing Systems

113 - Sherif Sakr, Angela Bonifati, Hannes Voigt 2020

Graphs are by nature unifying abstractions that can leverage interconnectedness to represent, explore, predict, and explain real- and digital-world phenomena. Although real users and consumers of graph instances and graph workloads understand these abstractions, future problems will require new abstractions and systems. What needs to happen in the next decade for big graph processing to continue to succeed?

Distributed, Parallel, and Cluster Computing; Databases

Unified System for Processing Real and Simulated Data in the ATLAS Experiment

94 - Mikhail Borodin, Kaushik De, Jose Garcia Navarro 2015

The physics goals of the next Large Hadron Collider run include high precision tests of the Standard Model and searches for new physics. These goals require detailed comparison of data with computational models simulating the expected data behavior. To highlight the role that modeling and simulation play in future scientific discovery, we report on use cases and experience with a unified system built to process both real and simulated data of growing volume and variety.

Distributed, Parallel, and Cluster Computing; High Energy Physics - Experiment; Instrumentation and Detectors

Storage Solutions for Big Data Systems: A Qualitative Study and Comparison

205 - Samiya Khan, Xiufeng Liu, Syed Arshad Ali 2019

Big data systems development is full of challenges in view of the variety of application areas and domains that this technology promises to serve. Typically, fundamental design decisions involved in big data systems design include choosing appropriate storage and computing infrastructures. In this age of heterogeneous systems that integrate different technologies into an optimized solution to a specific real-world problem, big data systems are no exception. As far as the storage aspect of any big data system is concerned, the primary facet is the storage infrastructure, and NoSQL seems to be the right technology to fulfill its requirements. However, every big data application has variable data characteristics, and thus the corresponding data fits into a different data model. This paper presents a feature and use case analysis and comparison of the four main data models, namely document-oriented, key-value, graph and wide-column. Moreover, a feature analysis of 80 NoSQL solutions is provided, elaborating on the criteria and points that a developer must consider when making a choice. Typically, big data storage needs to communicate with the execution engine and other processing and visualization technologies to create a comprehensive solution. This brings the second facet of big data storage, big data file formats, into the picture. The second half of the paper compares the advantages, shortcomings and possible use cases of available big data file formats for Hadoop, which is the foundation for most big data computing technologies. Decentralized storage and blockchain are seen as the next generation of big data storage, and their challenges and future prospects are also discussed.

Distributed, Parallel, and Cluster Computing; Databases

Probabilistic Skyline Query Processing over Uncertain Data Streams in Edge Computing Environments

88 - Chuan-Chi Lai, Chuan-Ming Liu, Yan-Lin Chen 2020

With the advancement of technology, data is generated in our lives faster and faster, and the amount of data that various applications need to process has become extremely large. Therefore, we need to put more effort into analyzing data and extracting valuable information. Cloud computing used to be a good technology for solving a large number of data analysis problems. However, in the era of the Internet of Things (IoT), transmitting sensing data back to the cloud for centralized data analysis incurs high wireless communication and network transmission costs. To solve these problems, edge computing has become a promising solution. In this paper, we propose a new algorithm for processing probabilistic skyline queries over uncertain data streams in an edge computing environment. We use the concept of a second skyline set to filter data that is unlikely to be in the skyline result. In addition, the edge server only sends the information needed to update the global analysis results on the cloud server, which greatly reduces the amount of data transmitted over the network. The results show that our proposed method not only reduces the response time by more than 50% compared with the brute force method on two-dimensional data but also maintains the leading processing speed on high-dimensional data.

Distributed, Parallel, and Cluster Computing; Databases; Data Structures and Algorithms
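As background for the skyline query mentioned in the abstract above, here is a minimal Python sketch of a plain (deterministic) skyline. It is not the authors' probabilistic algorithm and does not model uncertain data or the second skyline set. The function names, the "smaller is better" convention, and the two-edge-server example are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every dimension and strictly better
    in at least one (assumption: smaller values are better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points: List[Point]) -> List[Point]:
    """Brute-force skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def global_skyline(edge_datasets: List[List[Point]]) -> List[Point]:
    """Each hypothetical edge server sends only its local skyline;
    the cloud merges these candidates and filters them once more."""
    candidates: List[Point] = []
    for local_points in edge_datasets:
        candidates.extend(skyline(local_points))
    return skyline(candidates)


if __name__ == "__main__":
    edge_a = [(1, 5), (3, 3), (4, 4)]
    edge_b = [(2, 2), (5, 1), (6, 6)]
    print(global_skyline([edge_a, edge_b]))
```

Because dominance is transitive, any point dropped by a local skyline is also dominated in the full dataset, so merging local skylines yields the same result as a centralized computation while sending fewer points over the network.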
