Forest Representation Learning Guided by Margin Distribution


Abstract in English

In this paper, we reformulate the forest representation learning approach as an additive model which boosts the augmented feature instead of the prediction. We substantially improve the upper bound of generalization gap from $mathcal{O}(sqrtfrac{ln m}{m})$ to $mathcal{O}(frac{ln m}{m})$, while $lambda$ - the margin ratio between the margin standard deviation and the margin mean is small enough. This tighter upper bound inspires us to optimize the margin distribution ratio $lambda$. Therefore, we design the margin distribution reweighting approach (mdDF) to achieve small ratio $lambda$ by boosting the augmented feature. Experiments and visualizations confirm the effectiveness of the approach in terms of performance and representation learning ability. This study offers a novel understanding of the cascaded deep forest from the margin-theory perspective and further uses the mdDF approach to guide the layer-by-layer forest representation learning.

Download