ترغب بنشر مسار تعليمي؟ اضغط هنا

When studying the expressive power of neural networks, a main challenge is to understand how the size and depth of the network affect its ability to approximate real functions. However, not all functions are interesting from a practical viewpoint: fu nctions of interest usually have a polynomially-bounded Lipschitz constant, and can be computed efficiently. We call functions that satisfy these conditions benign, and explore the benefits of size and depth for approximation of benign functions with ReLU networks. As we show, this problem is more challenging than the corresponding problem for non-benign functions. We give barriers to showing depth-lower-bounds: Proving existence of a benign function that cannot be approximated by polynomial-size networks of depth $4$ would settle longstanding open problems in computational complexity. It implies that beyond depth $4$ there is a barrier to showing depth-separation for benign functions, even between networks of constant depth and networks of nonconstant depth. We also study size-separation, namely, whether there are benign functions that can be approximated with networks of size $O(s(d))$, but not with networks of size $O(s(d))$. We show a complexity-theoretic barrier to proving such results beyond size $O(dlog^2(d))$, but also show an explicit benign function, that can be approximated with networks of size $O(d)$ and not with networks of size $o(d/log d)$. For approximation in $L_infty$ we achieve such separation already between size $O(d)$ and size $o(d)$. Moreover, we show superpolynomial size lower bounds and barriers to such lower bounds, depending on the assumptions on the function. Our size-separation results rely on an analysis of size lower bounds for Boolean functions, which is of independent interest: We show linear size lower bounds for computing explicit Boolean functions with neural networks and threshold circuits.
In many application areas---lending, education, and online recommenders, for example---fairness and equity concerns emerge when a machine learning system interacts with a dynamically changing environment to produce both immediate and long-term effect s for individuals and demographic groups. We discuss causal directed acyclic graphs (DAGs) as a unifying framework for the recent literature on fairness in such dynamical systems. We show that this formulation affords several new directions of inquiry to the modeler, where causal assumptions can be expressed and manipulated. We emphasize the importance of computing interventional quantities in the dynamical fairness setting, and show how causal assumptions enable simulation (when environment dynamics are known) and off-policy estimation (when dynamics are unknown) of intervention on short- and long-term outcomes, at both the group and individual levels.
How do we learn from biased data? Historical datasets often reflect historical prejudices; sensitive or protected attributes may affect the observed treatments and outcomes. Classification algorithms tasked with predicting outcomes accurately from th ese datasets tend to replicate these biases. We advocate a causal modeling approach to learning from biased data, exploring the relationship between fair classification and intervention. We propose a causal model in which the sensitive attribute confounds both the treatment and the outcome. Building on prior work in deep learning and generative modeling, we describe how to learn the parameters of this causal model from observational data alone, even in the presence of unobserved confounders. We show experimentally that fairness-aware causal modeling provides better estimates of the causal effects between the sensitive attribute, the treatment, and the outcome. We further present evidence that estimating these causal effects can help learn policies that are both more accurate and fair, when presented with a historically biased dataset.
In this paper, we advocate for representation learning as the key to mitigating unfair prediction outcomes downstream. Motivated by a scenario where learned representations are used by third parties with unknown objectives, we propose and explore adv ersarial representation learning as a natural method of ensuring those parties act fairly. We connect group fairness (demographic parity, equalized odds, and equal opportunity) to different adversarial objectives. Through worst-case theoretical guarantees and experimental validation, we show that the choice of this objective is crucial to fair prediction. Furthermore, we present the first in-depth experimental demonstration of fair transfer learning and demonstrate empirically that our learned representations admit fair predictions on new tasks while maintaining utility, an essential goal of fair representation learning.
In many machine learning applications, there are multiple decision-makers involved, both automated and human. The interaction between these agents often goes unaddressed in algorithmic development. In this work, we explore a simple version of this in teraction with a two-stage framework containing an automated model and an external decision-maker. The model can choose to say Pass, and pass the decision downstream, as explored in rejection learning. We extend this concept by proposing learning to defer, which generalizes rejection learning by considering the effect of other agents in the decision-making process. We propose a learning algorithm which accounts for potential biases held by external decision-makers in a system. Experiments demonstrate that learning to defer can make systems not only more accurate but also less biased. Even when working with inconsistent or biased users, we show that deferring models still greatly improve the accuracy and/or fairness of the entire system.
We study the multiparty communication complexity of high dimensional permutations, in the Number On the Forehead (NOF) model. This model is due to Chandra, Furst and Lipton (CFL) who also gave a nontrivial protocol for the Exactly-n problem where thr ee players receive integer inputs and need to decide if their inputs sum to a given integer $n$. There is a considerable body of literature dealing with the same problem, where $(mathbb{N},+)$ is replaced by some other abelian group. Our work can be viewed as a far-reaching extension of this line of work. We show that the known lower bounds for that group-theoretic problem apply to all high dimensional permutations. We introduce new proof techniques that appeal to recent advances in Additive Combinatorics and Ramsey theory. We reveal new and unexpected connections between the NOF communication complexity of high dimensional permutations and a variety of well known and thoroughly studied problems in combinatorics. Previous protocols for Exactly-n all rely on the construction of large sets of integers without a 3-term arithmetic progression. No direct algorithmic protocol was previously known for the problem, and we provide the first such algorithm. This suggests new ways to significantly improve the CFL protocol. Many new open questions are presented throughout.
The random k-SAT model is the most important and well-studied distribution over k-SAT instances. It is closely connected to statistical physics; it is used as a testbench for satisfiability algorithms, and average-case hardness over this distribution has also been linked to hardness of approximation via Feiges hypothesis. We prove that any Cutting Planes refutation for random k-SAT requires exponential size, for k that is logarithmic in the number of variables, in the (interesting) regime where the number of clauses guarantees that the formula is unsatisfiable with high probability.
We introduce a new algebraic proof system, which has tight connections to (algebraic) circuit complexity. In particular, we show that any super-polynomial lower bound on any Boolean tautology in our proof system implies that the permanent does not ha ve polynomial-size algebraic circuits (VNP is not equal to VP). As a corollary to the proof, we also show that super-polynomial lower bounds on the number of lines in Polynomial Calculus proofs (as opposed to the usual measure of number of monomials) imply the Permanent versus Determinant Conjecture. Note that, prior to our work, there was no proof system for which lower bounds on an arbitrary tautology implied any computational lower bound. Our proof system helps clarify the relationships between previous algebraic proof systems, and begins to shed light on why proof complexity lower bounds for various proof systems have been so much harder than lower bounds on the corresponding circuit classes. In doing so, we highlight the importance of polynomial identity testing (PIT) for understanding proof complexity. More specifically, we introduce certain propositional axioms satisfied by any Boolean circuit computing PIT. We use these PIT axioms to shed light on AC^0[p]-Frege lower bounds, which have been open for nearly 30 years, with no satisfactory explanation as to their apparent difficulty. We show that either: a) Proving super-polynomial lower bounds on AC^0[p]-Frege implies VNP does not have polynomial-size circuits of depth d - a notoriously open question for d at least 4 - thus explaining the difficulty of lower bounds on AC^0[p]-Frege, or b) AC^0[p]-Frege cannot efficiently prove the depth d PIT axioms, and hence we have a lower bound on AC^0[p]-Frege. Using the algebraic structure of our proof system, we propose a novel way to extend techniques from algebraic circuit complexity to prove lower bounds in proof complexity.
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا