Learning to increase matching efficiency in identifying additional b-jets in the $\text{t}\bar{\text{t}}\text{b}\bar{\text{b}}$ process


Abstract in English

The $\text{t}\bar{\text{t}}\text{H}(\text{b}\bar{\text{b}})$ process is an essential channel for revealing the properties of the Higgs boson, but it has an irreducible background from the $\text{t}\bar{\text{t}}\text{b}\bar{\text{b}}$ process, which produces a top quark pair in association with a b quark pair. Understanding the $\text{t}\bar{\text{t}}\text{b}\bar{\text{b}}$ process is therefore crucial for improving the sensitivity of searches for the $\text{t}\bar{\text{t}}\text{H}(\text{b}\bar{\text{b}})$ process. To this end, when measuring the differential cross-section of the $\text{t}\bar{\text{t}}\text{b}\bar{\text{b}}$ process, we need to distinguish between the b-jets originating from top quark decays and the additional b-jets originating from gluon splitting. Since there are no simple identification rules, we adopt deep learning methods that learn from data to identify the additional b-jets in $\text{t}\bar{\text{t}}\text{b}\bar{\text{b}}$ events. Specifically, by exploiting the special structure of the $\text{t}\bar{\text{t}}\text{b}\bar{\text{b}}$ event data, we propose several loss functions that can be minimized to directly increase the matching efficiency, i.e., the accuracy of identifying additional b-jets. Using synthetic data, we discuss how our method differs from a previous deep learning approach based on binary classification (arXiv:1910.14535). We then verify, using simulated $\text{t}\bar{\text{t}}\text{b}\bar{\text{b}}$ event data in the lepton+jets channel from pp collisions at $\sqrt{s} = 13$ TeV, that additional b-jets can be identified more accurately by directly increasing the matching efficiency rather than the binary classification accuracy.
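
To make the distinction between matching efficiency and binary classification accuracy concrete, the following is a minimal sketch rather than the losses proposed in the paper, which the abstract does not spell out. It assumes that each event contains exactly one true pair of additional b-jets among all candidate jet pairs, and that a model assigns a real-valued score to every candidate pair. The function names (`candidate_pairs`, `matching_efficiency`, `pairwise_softmax_loss`) and the softmax surrogate are hypothetical choices made for illustration only.

```python
import math
from itertools import combinations


def candidate_pairs(n_jets):
    """All unordered jet-index pairs (i, j) with i < j in one event."""
    return list(combinations(range(n_jets), 2))


def matching_efficiency(event_scores, true_pairs):
    """Fraction of events whose highest-scoring pair is the true pair.

    event_scores: one dict per event, mapping a candidate pair to its score.
    true_pairs:   one pair per event (assumed: exactly one true pair each).
    """
    hits = 0
    for scores, truth in zip(event_scores, true_pairs):
        predicted = max(scores, key=scores.get)  # per-event argmax
        hits += predicted == truth
    return hits / len(true_pairs)


def pairwise_softmax_loss(scores, truth):
    """Illustrative per-event surrogate: negative log softmax probability
    of the true pair, normalized over the candidate pairs of that event."""
    m = max(scores.values())  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores.values()))
    return log_z - scores[truth]


# Two toy events with four jets each (scores are made up for illustration).
pairs = candidate_pairs(4)
event_a = dict(zip(pairs, [0.1, 0.2, 2.0, 0.0, -1.0, 0.5]))
event_b = dict(zip(pairs, [1.5, 0.2, 0.3, 0.0, -1.0, 0.5]))
efficiency = matching_efficiency([event_a, event_b], [(0, 3), (2, 3)])
print(efficiency)
```

In this sketch the normalization runs over the candidate pairs within a single event, so lowering the loss raises the rank of the true pair relative to the other pairs in the same event. A binary classifier that labels each pair independently can reach high per-pair accuracy while still ranking the wrong pair first in a given event.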
