
Irreducible Frequent Patterns in Transactional Databases

Published by Vyacheslav Gorshkov Mr
Publication date: 2005
Research field: Informatics Engineering
Language: English





Irreducible frequent patterns (IFPs) are introduced for transactional databases. An IFP is a frequent pattern (FP), (x1, x2, ..., xn), whose probability, P(x1, x2, ..., xn), cannot be represented as a product of the probabilities of two (or more) other FPs of smaller length. We have developed an algorithm for searching for IFPs in transactional databases. We argue that IFPs are useful tools for characterizing transactional databases and may have important applications to bio-systems, including the immune system, and for improving vaccination strategies. The effectiveness of the IFP approach is illustrated in application to a classification problem.
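To make the definition concrete, here is a minimal Python sketch of an irreducibility check; it is not the authors' algorithm. It treats a pattern as reducible if its probability is approximately the product of the probabilities of two shorter frequent patterns that partition it. The tolerance tol, the restriction to two-way splits, and the toy probabilities are assumptions made for illustration.

from itertools import combinations

def is_irreducible(pattern, prob, tol=0.05):
    # prob maps sorted item tuples of frequent patterns to their probabilities.
    # A pattern counts as reducible if P(pattern) is approximately the product
    # of the probabilities of two shorter frequent patterns that partition it.
    items = tuple(sorted(pattern))
    p_full = prob[items]
    for k in range(1, len(items)):
        for left in combinations(items, k):
            right = tuple(x for x in items if x not in left)
            if left in prob and right in prob and \
               abs(prob[left] * prob[right] - p_full) <= tol * p_full:
                return False  # the probability factors, so the pattern is reducible
    return True

# Toy probabilities: (a, b) factors as P(a) * P(b); (c, d) does not.
probs = {('a',): 0.5, ('b',): 0.4, ('a', 'b'): 0.2,
         ('c',): 0.5, ('d',): 0.4, ('c', 'd'): 0.35}
print(is_irreducible(('a', 'b'), probs))  # False
print(is_irreducible(('c', 'd'), probs))  # True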


Read also

Frequent Item-set Mining (FIM), sometimes called Market Basket Analysis (MBA) or Association Rule Learning (ARL), refers to Machine Learning (ML) methods for creating rules from datasets of transactions of items. Most methods identify items likely to appear together in a transaction based on the support (i.e., a minimum relative co-occurrence of the items) for that hypothesis. Although this is a good indicator for measuring the relevance of the assumption that these items are likely to appear together, the phenomenon of very frequent items, referred to as ubiquitous items, is not addressed in most algorithms. Ubiquitous items have the same entropy as infrequent items and do not contribute significantly to the knowledge. On the other hand, they have a strong effect on the performance of the algorithms and sometimes prevent the convergence of the FIM algorithms and thus the provision of meaningful results. This paper discusses the phenomenon of ubiquitous items and demonstrates how ignoring them dramatically improves computational performance while having only a small and controlled effect on the significance of the results.
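As a rough illustration of this pruning idea, and not the paper's actual method, the Python sketch below drops items whose relative support exceeds a max_support cutoff before counting pair co-occurrences. The thresholds, the pair-only scope, and the toy transactions are assumptions made for the example.

from collections import Counter
from itertools import combinations

def mine_pairs_without_ubiquitous(transactions, min_support=0.1, max_support=0.8):
    # Prune items that are too rare or too frequent (ubiquitous) before mining.
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in set(t))
    keep = {i for i, c in item_counts.items() if min_support <= c / n <= max_support}
    # Count co-occurrences only among the surviving items.
    pair_counts = Counter()
    for t in transactions:
        for pair in combinations(sorted(set(t) & keep), 2):
            pair_counts[pair] += 1
    return {p: c / n for p, c in pair_counts.items() if c / n >= min_support}

transactions = [{'bag', 'milk', 'bread'}, {'bag', 'milk'}, {'bag', 'beer', 'bread'},
                {'bag', 'milk', 'beer'}, {'bag', 'bread'}]
# 'bag' appears in every transaction (support 1.0 > max_support) and is ignored.
print(mine_pairs_without_ubiquitous(transactions))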
An exponential growth in data volume, combined with increasing demand for real-time analysis (i.e., using the most recent data), has resulted in the emergence of database systems that concurrently support transactions and data analytics. These hybrid transactional and analytical processing (HTAP) database systems can support real-time data analysis without the high costs of synchronizing across separate single-purpose databases. Unfortunately, for many applications that perform a high rate of data updates, state-of-the-art HTAP systems incur significant drops in transactional (up to 74.6%) and/or analytical (up to 49.8%) throughput compared to performing only transactions or only analytics in isolation, due to (1) data movement between the CPU and memory, (2) data update propagation, and (3) consistency costs. We propose Polynesia, a hardware-software co-designed system for in-memory HTAP databases. Polynesia (1) divides the HTAP system into transactional and analytical processing islands, (2) implements custom algorithms and hardware to reduce the costs of update propagation and consistency, and (3) exploits processing-in-memory for the analytical islands to alleviate data movement. Our evaluation shows that Polynesia outperforms three state-of-the-art HTAP systems, with average transactional/analytical throughput improvements of 1.70X/3.74X, and reduces energy consumption by 48% over the prior lowest-energy system.
Databases can leak confidential information when users combine query results with probabilistic data dependencies and prior knowledge. Current research offers mechanisms that either handle a limited class of dependencies or lack tractable enforcement algorithms. We propose a foundation for Database Inference Control based on ProbLog, a probabilistic logic programming language. We leverage this foundation to develop Angerona, a provably secure enforcement mechanism that prevents information leakage in the presence of probabilistic dependencies. We then provide a tractable inference algorithm for a practically relevant fragment of ProbLog. We empirically evaluate Angerona's performance, showing that it scales to relevant security-critical problems.
Many IoT systems are data intensive and are used to monitor critical systems for fault detection and diagnosis. A large volume of data steadily comes out of a large number of sensors in the monitoring system. Thus, we need to consider how to store and manage these data. Existing time series databases (TSDBs) can be used for monitoring data storage, but they do not have good models for describing the data streams stored in the database. In this paper, we develop a semantic model for the specification of the monitoring data streams (time series data) in terms of which sensor generated the data stream, which metric of which entity the sensor is monitoring, what the relation of the entity to other entities in the system is, which measurement unit is used for the data stream, etc. We have also developed a tool suite, SE-TSDB, that can run on top of existing TSDBs to help establish semantic specifications for data streams and enable semantic-based data retrievals. With our semantic model for monitoring data and our SE-TSDB tool suite, users can retrieve non-existing data streams that can be automatically derived from the semantics. Users can also retrieve data streams without knowing where they are. Semantic-based retrieval is especially important in a large-scale integrated IoT-Edge-Cloud system, because of its sheer quantity of data, its huge number of computing and IoT devices that may store the data, and the dynamics in data migration and evolution. With better data semantics, data streams can be more effectively tracked and flexibly retrieved to help with timely data analysis and control decision making anywhere and anytime.
A new stream of research was born in the last decade with the goal of mining itemsets of interest using Constraint Programming (CP). This has promoted a natural way to combine complex constraints in a highly flexible manner. Although CP state-of-the-art solutions formulate the task using Boolean variables, the few attempts to adopt propositional Satisfiability (SAT) provided an unsatisfactory performance. This work deepens the study on when and how to use SAT for the frequent itemset mining (FIM) problem by defining different encodings with multiple task-driven enumeration options and search strategies. Although for the majority of the scenarios SAT-based solutions appear to be non-competitive with CP peers, results show a variety of interesting cases where SAT encodings are the best option.