ترغب بنشر مسار تعليمي؟ اضغط هنا

Identifying important components or factors in large amounts of noisy data is a key problem in machine learning and data mining. Motivated by a pattern decomposition problem in materials discovery, aimed at discovering new materials for renewable ene rgy, e.g. for fuel and solar cells, we introduce CombiFD, a framework for factor based pattern decomposition that allows the incorporation of a-priori knowledge as constraints, including complex combinatorial constraints. In addition, we propose a new pattern decomposition algorithm, called AMIQO, based on solving a sequence of (mixed-integer) quadratic programs. Our approach considerably outperforms the state of the art on the materials discovery problem, scaling to larger datasets and recovering more precise and physically meaningful decompositions. We also show the effectiveness of our approach for enforcing background knowledge on other application domains.
According to the ErdH{o}s discrepancy conjecture, for any infinite $pm 1$ sequence, there exists a homogeneous arithmetic progression of unbounded discrepancy. In other words, for any $pm 1$ sequence $(x_1,x_2,...)$ and a discrepancy $C$, there exist integers $m$ and $d$ such that $|sum_{i=1}^m x_{i cdot d}| > C$. This is an $80$-year-old open problem and recent development proved that this conjecture is true for discrepancies up to $2$. Paul ErdH{o}s also conjectured that this property of unbounded discrepancy even holds for the restricted case of completely multiplicative sequences (CMSs), namely sequences $(x_1,x_2,...)$ where $x_{a cdot b} = x_{a} cdot x_{b}$ for any $a,b geq 1$. The longest CMS with discrepancy $2$ has been proven to be of size $246$. In this paper, we prove that any completely multiplicative sequence of size $127,646$ or more has discrepancy at least $4$, proving the ErdH{o}s discrepancy conjecture for CMSs of discrepancies up to $3$. In addition, we prove that this bound is tight and increases the size of the longest known sequence of discrepancy $3$ from $17,000$ to $127,645$. Finally, we provide inductive construction rules as well as streamlining methods to improve the lower bounds for sequences of higher discrepancies.
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا