This chapter considers the computational and statistical aspects of learning linear thresholds in presence of noise. When there is no noise, several algorithms exist that efficiently learn near-optimal linear thresholds using a small amount of data. However, even a small amount of adversarial noise makes this problem notoriously hard in the worst-case. We discuss approaches for dealing with these negative results by exploiting natural assumptions on the data-generating process.