On When and How to use SAT to Mine Frequent Itemsets


Abstract in English

A new stream of research was born in the last decade with the goal of mining itemsets of interest using Constraint Programming (CP). This has promoted a natural way to combine complex constraints in a highly flexible manner. Although CP state-of-the-art solutions formulate the task using Boolean variables, the few attempts to adopt propositional Satisfiability (SAT) provided an unsatisfactory performance. This work deepens the study on when and how to use SAT for the frequent itemset mining (FIM) problem by defining different encodings with multiple task-driven enumeration options and search strategies. Although for the majority of the scenarios SAT-based solutions appear to be non-competitive with CP peers, results show a variety of interesting cases where SAT encodings are the best option.

Download