In high energy physics experiments, Conditions Data is commonly understood as all data needed for reconstruction besides the event data itself. This includes all sorts of slowly evolving data, such as detector alignment, calibration and robustness, and data from the detector control system. Every Conditions Data Object is associated with a time interval of validity and a version. In addition, it is often useful to tag collections of Conditions Data Objects together. These issues have already been investigated, and a data model has been proposed and used in different implementations based on commercial DBMSs, both at CERN and for the BaBar experiment. The special case of the ATLAS complex trigger, which requires online access to calibration and alignment data, poses new challenges that have to be met with a flexible and customizable solution more in line with Open Source components. Motivated by the ATLAS challenges, we have developed an alternative implementation based on an Open Source RDBMS. Several issues were investigated and are described in this paper: (1) the best way to map the conditions data model onto the relational database concept, considering what are foreseen as the most frequent queries; (2) the clustering model best suited to address the scalability problem; and (3) the extensive tests that were performed. The very promising results from these tests are attracting attention from the HEP community and driving further developments.
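The mapping of conditions data onto relational tables can be illustrated with a minimal sketch. It is not the schema developed in the paper: the table names, column names, the folder path, and the use of SQLite (chosen only so the example is self-contained) are all assumptions. Each object carries an interval of validity and a version, and a separate table tags collections of objects.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical schema: one row per Conditions Data Object, plus a tag table.
    conn.executescript("""
        CREATE TABLE condition_object (
            id          INTEGER PRIMARY KEY,
            folder      TEXT    NOT NULL,
            valid_since INTEGER NOT NULL,
            valid_till  INTEGER NOT NULL,
            version     INTEGER NOT NULL,
            payload     TEXT    NOT NULL
        );
        CREATE INDEX idx_iov
            ON condition_object (folder, valid_since, valid_till);
        CREATE TABLE tag (
            tag_name  TEXT    NOT NULL,
            object_id INTEGER NOT NULL REFERENCES condition_object(id),
            PRIMARY KEY (tag_name, object_id)
        );
    """)


def store(conn, folder, since, till, payload):
    """Insert an object valid for since <= t < till; return its row id."""
    # Assumed versioning rule: one more than the highest version among
    # objects in the same folder whose validity interval overlaps this one.
    (version,) = conn.execute(
        "SELECT COALESCE(MAX(version), 0) + 1 FROM condition_object "
        "WHERE folder = ? AND valid_since < ? AND valid_till > ?",
        (folder, till, since),
    ).fetchone()
    cur = conn.execute(
        "INSERT INTO condition_object "
        "(folder, valid_since, valid_till, version, payload) "
        "VALUES (?, ?, ?, ?, ?)",
        (folder, since, till, version, payload),
    )
    return cur.lastrowid


def tag_object(conn, tag_name, object_id):
    conn.execute("INSERT INTO tag (tag_name, object_id) VALUES (?, ?)",
                 (tag_name, object_id))


def find(conn, folder, time, tag=None):
    """Return the payload valid at `time`: latest version, or the tagged one."""
    if tag is None:
        sql = ("SELECT payload FROM condition_object "
               "WHERE folder = ? AND valid_since <= ? AND valid_till > ? "
               "ORDER BY version DESC LIMIT 1")
        args = (folder, time, time)
    else:
        sql = ("SELECT o.payload FROM condition_object o "
               "JOIN tag t ON t.object_id = o.id "
               "WHERE t.tag_name = ? AND o.folder = ? "
               "AND o.valid_since <= ? AND o.valid_till > ? "
               "ORDER BY o.version DESC LIMIT 1")
        args = (tag, folder, time, time)
    row = conn.execute(sql, args).fetchone()
    return row[0] if row else None
```

In this sketch the composite index on folder and validity bounds is meant to serve the "object valid at time t" lookup, which is assumed here to be the most frequent query.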
To remain aware of the fast-evolving cyber threat landscape, open-source Cyber Threat Intelligence (OSCTI) has received growing attention from the community. Commonly, knowledge about threats is presented in a vast number of OSCTI reports. Despite the pressing need for high-quality OSCTI, existing OSCTI gathering and management platforms have primarily focused on isolated, low-level Indicators of Compromise. Higher-level concepts (e.g., adversary tactics, techniques, and procedures) and their relationships have been overlooked, even though they contain essential knowledge about threat behaviors that is critical to uncovering the complete threat scenario. To bridge the gap, we propose SecurityKG, a system for automated OSCTI gathering and management. SecurityKG collects OSCTI reports from various sources, uses a combination of AI and NLP techniques to extract high-fidelity knowledge about threat behaviors, and constructs a security knowledge graph. SecurityKG also provides a UI that supports various types of interactivity to facilitate knowledge graph exploration.
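The idea of linking low-level indicators to higher-level concepts in a graph can be sketched as follows. This is not SecurityKG's pipeline: regular expressions and a fixed keyword list stand in for the AI and NLP extraction the abstract mentions, and the function name, relation labels, and keyword list are all hypothetical.

```python
import re

# Stand-in extractors (assumptions): an IPv4 pattern for Indicators of
# Compromise and a small keyword list for technique mentions.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
TECHNIQUE_KEYWORDS = ("spearphishing", "credential dumping", "lateral movement")


def extract_triples(reports):
    """Turn {report_id: text} into a set of (subject, relation, object) edges."""
    triples = set()
    for report_id, text in reports.items():
        iocs = set(IPV4.findall(text))
        lowered = text.lower()
        techniques = {kw for kw in TECHNIQUE_KEYWORDS if kw in lowered}
        for ioc in iocs:
            triples.add((report_id, "MENTIONS_IOC", ioc))
        for technique in techniques:
            triples.add((report_id, "DESCRIBES_TECHNIQUE", technique))
            # Relate concepts that appear in the same report, so the graph
            # is not just a bag of isolated indicators.
            for ioc in iocs:
                triples.add((technique, "CO_OCCURS_WITH", ioc))
    return triples
```

A set of triples is the simplest graph representation; a real system would likely load such edges into a graph database for querying and visualization.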
Purpose: A Monte Carlo (MC) beam model and its implementation in a clinical treatment planning system (TPS, Varian Eclipse) are presented for a modified ultra-high dose-rate electron FLASH radiotherapy (eFLASH-RT) LINAC. Methods: The gantry head without scattering foils or targets, representative of the LINAC modifications, was modelled in Geant4. The energy spectrum (σ_E) and beam source emittance cone angle (θ_cone) were varied to match the calculated and Gafchromic film measured central-axis percent depth dose (PDD) and lateral profiles. Its Eclipse configuration was validated with measured profiles of the open field and nominal fields for clinical applicators. eFLASH-RT plans were MC forward calculated in Geant4 for a mouse brain treatment and compared to a conventional (Conv-RT) plan in Eclipse for a human patient with metastatic renal cell carcinoma. Results: The beam model and its Eclipse configuration agreed best with measurements at σ_E = 0.5 MeV and θ_cone = 3.9 ± 0.2 degrees to clinically acceptable accuracy (the absolute average error was within 1.5% for in-water lateral, 3% for in-air lateral, and 2% for PDD). The forward dose calculation showed dose was delivered to the entire mouse brain with adequate conformality. The human patient case demonstrated the planning capability with routine accessories in relatively complex geometry to achieve an acceptable plan (90% of the tumor volume receiving 95% and 90% of the prescribed dose for eFLASH and Conv-RT, respectively). Conclusion: To the best of our knowledge, this is the first functional beam model commissioned in a clinical TPS for eFLASH-RT, enabling planning and evaluation with minimal deviation from Conv-RT workflow. It facilitates the clinical translation as eFLASH-RT and Conv-RT plan quality were comparable for a human patient. The methods can be expanded to model other eFLASH irradiators.
Operational Neural Networks (ONNs) have recently been proposed as a special class of artificial neural networks for grid-structured data. They enable heterogeneous non-linear operations to generalize the widely adopted convolution-based neuron model. This work introduces a fast GPU-enabled library for training operational neural networks, FastONN, which is based on a novel vectorized formulation of the operational neurons. Leveraging automatic reverse-mode differentiation for backpropagation, FastONN enables increased flexibility with the incorporation of new operator sets and customized gradient flows. Additionally, bundled auxiliary modules offer interfaces for performance tracking and checkpointing across different data partitions and customized metrics.
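How an operational neuron generalizes the convolutional neuron can be sketched in a few lines of NumPy. This is not FastONN's API or its vectorized formulation: the function name, the one-dimensional setting, and the example operators are assumptions. The neuron applies a nodal operator to each input window and kernel, reduces with a pool operator, then applies an activation. Multiplication as the nodal operator and summation as the pool recover the ordinary convolutional neuron.

```python
import numpy as np


def operational_neuron_1d(x, w, nodal, pool, activation=np.tanh, bias=0.0):
    """Forward pass of a single 1D operational neuron (illustrative only).

    nodal(windows, w) is applied elementwise to every window of x,
    pool(z, axis=1) reduces each window to a scalar.
    """
    k = len(w)
    # All length-k windows at once, shape (len(x) - k + 1, k): a vectorized
    # alternative to looping over kernel positions.
    windows = np.lib.stride_tricks.sliding_window_view(x, k)
    z = nodal(windows, w)
    return activation(pool(z, axis=1) + bias)


def identity(v):
    return v


x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
w = np.array([0.5, -1.0, 0.25])

# Convolutional special case: multiply, then sum.
conv_like = operational_neuron_1d(x, w, np.multiply, np.sum, activation=identity)

# A non-linear operator set: sinusoidal nodal operator with a median pool.
sin_median = operational_neuron_1d(x, w, lambda a, b: np.sin(a * b), np.median)
```

Swapping `nodal` and `pool` changes the neuron's behavior without touching the windowing logic, which is the flexibility the abstract attributes to new operator sets.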
A Data Lake (DL) is a Big Data analysis solution that ingests raw data in their native format and allows users to process these data upon usage. Data ingestion is not a simple copy and paste of data; it is a complicated and important phase that must ensure ingested data are findable, accessible, interoperable and reusable at all times. Our solution is threefold. Firstly, we propose a metadata model that includes information about external data sources, data ingestion processes, ingested data, dataset veracity and dataset security. Secondly, we present the algorithms that carry out the ingestion phase (data storage and metadata instantiation). Thirdly, we introduce a metadata management system we developed, through which users can easily consult the different elements stored in the DL.
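An ingestion step that stores raw data and instantiates metadata at the same time can be sketched as follows. This is not the paper's metadata model or algorithm: every class, field, and function name is hypothetical, and in-memory dictionaries stand in for the lake's storage and metadata catalog. The fields loosely mirror the five kinds of information listed above (source, ingestion process, ingested data, veracity, security).

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional
import hashlib


@dataclass
class DataSource:
    name: str
    uri: str


@dataclass
class IngestionProcess:
    source: DataSource
    started_at: datetime
    mode: str  # assumed values: "batch" or "stream"


@dataclass
class DatasetMetadata:
    dataset_id: str
    fmt: str
    size_bytes: int
    checksum: str
    process: IngestionProcess
    veracity_score: Optional[float] = None  # unknown until assessed
    access_level: str = "restricted"        # security information


def ingest(raw, fmt, source, store, catalog, access_level="restricted"):
    """Store raw bytes unchanged and record metadata alongside them."""
    checksum = hashlib.sha256(raw).hexdigest()
    dataset_id = checksum[:12]  # assumption: content-derived identifier
    process = IngestionProcess(source, datetime.now(timezone.utc), "batch")
    meta = DatasetMetadata(dataset_id, fmt, len(raw), checksum, process,
                           access_level=access_level)
    store[dataset_id] = raw     # data storage, native format preserved
    catalog[dataset_id] = meta  # metadata instantiation
    return meta
```

Writing the metadata in the same call that writes the data is one way to keep ingested datasets findable from the moment they enter the lake.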
With emerging technologies such as satellites and drones, archaeologists collect data over large areas. However, it becomes difficult to process such data in a timely manner. Archaeological data also come in many different formats (images, texts, sensor data) and can be structured, semi-structured or unstructured. Such variety makes data difficult to collect, store, manage, search and analyze effectively. A few approaches have been proposed, but none of them covers the full data lifecycle or provides an efficient data management system. Hence, we propose the use of a data lake to provide centralized data stores to host heterogeneous data, as well as tools for data quality checking, cleaning, transformation, and analysis. In this paper, we propose a generic, flexible and complete data lake architecture. Our metadata management system exploits goldMEDAL, which is the most complete metadata model currently available. Finally, we detail the concrete implementation of this architecture dedicated to an archaeological project.