A major challenge in applying machine learning to automated theorem proving is the scarcity of training data, which is a key ingredient in training successful deep learning models. To tackle this problem, we propose an approach that relies on training with synthetic theorems, generated from a set of axioms. We show that such theorems can be used to train an automated prover and that the learned prover transfers successfully to human-generated theorems. We demonstrate that a prover trained exclusively on synthetic theorems can solve a substantial fraction of problems in TPTP, a benchmark dataset that is used to compare state-of-the-art heuristic provers. Our approach outperforms a model trained on human-generated problems in most axiom sets, thereby showing the promise of using synthetic data for this task.
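To make the generation idea concrete, here is a minimal sketch of producing synthetic theorems by forward chaining from a set of axioms. This is a simplification under stated assumptions: the function name, the toy propositional Horn-clause rule base, and the random-walk procedure are illustrative stand-ins, not the paper's generator, which operates in full first-order logic.

```python
import random

def generate_synthetic_theorems(axioms, rules, n_steps=100, seed=0):
    """Forward-chain from a set of axioms, recording each newly
    derived fact as a (premises, conclusion) training example."""
    rng = random.Random(seed)
    known = set(axioms)
    theorems = []
    for _ in range(n_steps):
        premises, conclusion = rng.choice(rules)  # pick a candidate inference
        if all(p in known for p in premises) and conclusion not in known:
            known.add(conclusion)                 # new derivable fact
            theorems.append((tuple(premises), conclusion))
    return theorems

# Toy Horn-clause rule base: (premises, conclusion)
axioms = {"a", "b"}
rules = [(["a"], "c"), (["b", "c"], "d"), (["d"], "e")]
print(generate_synthetic_theorems(axioms, rules))
```

Each derived pair serves as a (problem, proof-step) training example; a learned prover can then be evaluated on human-written problems.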
In this paper, we consider the problem of learning a first-order theorem prover that uses a representation of beliefs in mathematical claims to construct proofs. The inspiration for doing so comes from the practices of human mathematicians, where plausible reasoning is applied in addition to deductive reasoning to find proofs. Towards this end, we introduce a representation of beliefs that assigns probabilities to the exhaustive and mutually exclusive first-order possibilities found in Hintikka's theory of distributive normal forms. The representation supports Bayesian update, induces a distribution on statements that does not enforce that logically equivalent statements are assigned the same probability, and suggests an embedding of statements into an associated Hilbert space. We then examine conjecturing as model selection and an alternating-turn game of determining consistency. The game is amenable (in principle) to self-play training for learning beliefs, and it yields a prover that is complete when logical omniscience is attained and sound when beliefs are reasonable. The representation has super-exponential space requirements as a function of quantifier depth, so the ideas in this paper should be taken as theoretical. We comment on how abstractions can be used to control the space requirements at the cost of completeness.
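Since the possibilities are exhaustive and mutually exclusive, the Bayesian update takes the standard form over a partition. The notation below (constituents $\delta_1, \dots, \delta_n$, evidence $E$) is illustrative rather than the paper's own:

```latex
% Bayesian update over an exhaustive, mutually exclusive set of
% possibilities (constituents) \delta_1, ..., \delta_n, given evidence E:
\[
  \Pr(\delta_i \mid E)
    = \frac{\Pr(E \mid \delta_i)\,\Pr(\delta_i)}
           {\sum_{j=1}^{n} \Pr(E \mid \delta_j)\,\Pr(\delta_j)},
  \qquad \sum_{i=1}^{n} \Pr(\delta_i) = 1 .
\]
```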
Despite great success in human parsing, progress on parsing other deformable articulated objects, such as animals, is still limited by the lack of labeled data. In this paper, we use synthetic images and ground truth generated from CAD animal models to address this challenge. To bridge the domain gap between real and synthetic images, we propose a novel consistency-constrained semi-supervised learning method (CC-SSL). Our method leverages both spatial and temporal consistencies to bootstrap weak models trained on synthetic data with unlabeled real images. We demonstrate the effectiveness of our method on highly deformable animals, such as horses and tigers. Without using any real-image labels, our method allows for accurate keypoint prediction on real images. Moreover, we quantitatively show that models using synthetic data achieve better generalization performance than models trained on real images across different domains in the Visual Domain Adaptation Challenge dataset. Our synthetic dataset contains 10+ animals with diverse poses and rich ground truth, which enables us to use a multi-task learning strategy to further boost model performance.
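A minimal sketch of the consistency-constrained pseudo-labeling idea follows, assuming a keypoint `model`, a `flip_fn` spatial transform, and a `warp_fn` that warps the next frame's prediction back (all hypothetical stand-ins; the actual CC-SSL criteria differ in detail):

```python
import numpy as np

def consistent_pseudo_labels(model, frames, flip_fn, warp_fn, tol=3.0):
    """Keep a keypoint pseudo-label only when the prediction agrees with
    itself under a spatial transform (horizontal flip) and with the
    prediction on the temporally adjacent frame warped back into place.
    `model`, `flip_fn`, and `warp_fn` are assumed user-supplied callables."""
    labels = []
    for t in range(len(frames) - 1):
        kp = model(frames[t])                        # (K, 2) keypoints
        kp_flip = flip_fn(model(flip_fn(frames[t])))  # spatial consistency
        kp_next = warp_fn(model(frames[t + 1]))       # temporal consistency
        spatial_ok = np.linalg.norm(kp - kp_flip, axis=1) < tol
        temporal_ok = np.linalg.norm(kp - kp_next, axis=1) < tol
        mask = spatial_ok & temporal_ok
        labels.append((kp, mask))  # train only where both checks hold
    return labels
```

The weak synthetic-data model thus labels only the real-image keypoints on which it is self-consistent, and retraining on those labels bootstraps it toward the real domain.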
These are lecture notes from a course I gave at the University of Wisconsin during the Spring semester of 1993. Part 1 is concerned with Borel hierarchies. Section 13 contains an unpublished theorem of Fremlin concerning Borel hierarchies and MA. Sections 14 and 15 contain new results concerning the lengths of Borel hierarchies in the Cohen and random real models. Part 2 contains standard results on the theory of analytic sets. Section 25 contains Harrington's theorem that it is consistent to have $\Pi^1_2$ sets of arbitrary cardinality. Part 3 has the usual separation theorems. Part 4 gives some applications of Gandy forcing. We reverse the usual trend and use forcing arguments instead of Baire category. In particular, Louveau's theorem on $\Pi^0_\alpha$ hyp-sets has a simpler proof using forcing.
We introduce Prove-It, a Python-based general-purpose interactive theorem-proving assistant designed with the goal of making formal theorem proving as easy and natural as informal theorem proving (with moderate training). Prove-It uses a highly flexible Jupyter-notebook-based user interface that documents interactions and proof steps using LaTeX. We review Prove-It's highly expressive representation of expressions, judgments, theorems, and proofs; demonstrate the system by constructing a traditional proof by contradiction that $\sqrt{2} \notin \mathbb{Q}$; and discuss how the system avoids inconsistencies such as Russell's and Curry's paradoxes. Extensive documentation is provided in the appendices about core elements of the system. Current development and future work include promising applications to quantum circuit manipulation and quantum algorithm verification.
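For reference, the classical contradiction argument being formalized is the following (this is the standard textbook proof, not Prove-It syntax):

```latex
\begin{proof}
Suppose $\sqrt{2} = a/b$ with $a, b \in \mathbb{Z}$, $b \neq 0$, and
$\gcd(a, b) = 1$. Then $a^2 = 2b^2$, so $a^2$ is even, hence $a$ is even;
write $a = 2c$. Then $4c^2 = 2b^2$, so $b^2 = 2c^2$ and $b$ is also even,
contradicting $\gcd(a, b) = 1$. Hence $\sqrt{2} \notin \mathbb{Q}$.
\end{proof}
```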
Synthesizing realistic medical images provides a feasible solution to the shortage of training data in deep-learning-based medical image recognition systems. However, the quality control of synthetic images for data augmentation purposes is under-investigated, and some of the generated images are not realistic and may contain misleading features that distort the data distribution when mixed with real images. Thus, the effectiveness of those synthetic images in medical image recognition systems cannot be guaranteed when they are added randomly without quality assurance. In this work, we propose a reinforcement learning (RL) based synthetic sample selection method that learns to choose synthetic images containing reliable and informative features. A transformer-based controller is trained via proximal policy optimization (PPO), using the validation classification accuracy as the reward. The selected images are mixed with the original training data for improved training of image recognition systems. To validate our method, we take pathology image recognition as an example and conduct extensive experiments on two histopathology image datasets. In experiments on a cervical dataset and a lymph node dataset, image classification performance improves by 8.1% and 2.3%, respectively, when utilizing the high-quality synthetic images selected by our RL framework. Our proposed synthetic sample selection method is general and has great potential to boost the performance of various medical image recognition systems given limited annotation.
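The selection loop can be sketched as a policy-gradient problem over keep/drop decisions. The paper trains a transformer controller with PPO; the sketch below substitutes a linear scorer and a plain REINFORCE update to stay short, and `features` (per-image embeddings) and `train_and_validate` (retrains the classifier on the selected mix and returns validation accuracy) are hypothetical stand-ins:

```python
import torch

def select_and_train(features, train_and_validate, epochs=50, lr=1e-2):
    """Learn keep-probabilities for synthetic images, rewarded by the
    validation accuracy of a classifier trained on the selected subset."""
    scorer = torch.nn.Linear(features.shape[1], 1)   # stand-in controller
    opt = torch.optim.Adam(scorer.parameters(), lr=lr)
    baseline = 0.0                                   # running reward baseline
    for _ in range(epochs):
        probs = torch.sigmoid(scorer(features)).squeeze(1)
        dist = torch.distributions.Bernoulli(probs)
        mask = dist.sample()                  # which synthetic images to keep
        reward = train_and_validate(mask)     # validation accuracy as reward
        advantage = reward - baseline
        baseline = 0.9 * baseline + 0.1 * reward
        loss = -(dist.log_prob(mask).sum() * advantage)  # REINFORCE step
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.sigmoid(scorer(features)).squeeze(1) > 0.5
```

The design point carried over from the paper is that the reward is downstream task performance, so the controller learns to discard synthetic images whose features mislead the classifier rather than filtering on image realism alone.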