To mitigate radiologists' workload, computer-aided diagnosis systems capable of reviewing and analyzing medical images are gradually being deployed. Deep learning-based region-of-interest segmentation is among the most promising use cases. However, this paradigm remains restricted in real-world clinical applications due to poor robustness and generalization, and the issue is exacerbated when training data are scarce. In this paper, we address the challenge from the representation learning point of view. We show that representation collapse, one of the main causes of poor robustness and generalization, can be avoided through transfer learning. We therefore propose a novel two-stage framework for robust generalized segmentation. In particular, an unsupervised Tile-wise AutoEncoder (T-AE) pretraining architecture is designed to learn meaningful representations that improve the generalization and robustness of downstream tasks. The learned knowledge is then transferred to the segmentation network. Coupled with an image reconstruction network, the representation continues to be decoded during fine-tuning, encouraging the model to capture more semantic features. Experiments on lung segmentation across multiple chest X-ray datasets are conducted. Empirically, the results demonstrate the superior generalization of the proposed framework on unseen domains, in terms of both high performance and robustness to corruption, especially when training data are limited.
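To make the two-stage design concrete, the following is a minimal PyTorch-style sketch under stated assumptions, not the paper's exact configuration: the tile size, layer widths, the 0.1 weight on the auxiliary reconstruction loss, and the `unlabeled_loader`/`labeled_loader` data loaders are all illustrative placeholders. Stage one pretrains a tile-wise autoencoder on unlabeled images; stage two reuses its encoder for segmentation while keeping a reconstruction decoder alongside the segmentation head.

```python
# Illustrative sketch of the two-stage framework (assumed sizes and loss weights).
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    def __init__(self, in_ch=1, feat=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, feat, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat, feat * 2, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    def __init__(self, out_ch=1, feat=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(feat * 2, feat, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(feat, out_ch, 4, stride=2, padding=1),
        )
    def forward(self, z):
        return self.net(z)

def extract_tiles(images, tile=64):
    # Split each image into non-overlapping tiles for tile-wise pretraining.
    b, c, h, w = images.shape
    t = images.unfold(2, tile, tile).unfold(3, tile, tile)  # (b, c, nh, nw, tile, tile)
    return t.permute(0, 2, 3, 1, 4, 5).reshape(-1, c, tile, tile)

# Stage 1: unsupervised tile-wise autoencoder (T-AE) pretraining.
encoder, recon_decoder = Encoder(), Decoder()
opt = torch.optim.Adam(list(encoder.parameters()) + list(recon_decoder.parameters()), lr=1e-3)
for images in unlabeled_loader:          # assumed DataLoader of unlabeled chest X-rays
    tiles = extract_tiles(images)
    recon = recon_decoder(encoder(tiles))
    loss = F.mse_loss(recon, tiles)
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: transfer the pretrained encoder to segmentation, keeping an auxiliary
# reconstruction decoder so the representation continues to be decoded.
seg_decoder = Decoder(out_ch=1)          # segmentation head producing lung-mask logits
opt = torch.optim.Adam(
    list(encoder.parameters()) + list(seg_decoder.parameters()) + list(recon_decoder.parameters()),
    lr=1e-4,
)
for images, masks in labeled_loader:     # assumed DataLoader of (image, lung mask) pairs
    z = encoder(images)
    seg_loss = F.binary_cross_entropy_with_logits(seg_decoder(z), masks)
    recon_loss = F.mse_loss(recon_decoder(z), images)
    loss = seg_loss + 0.1 * recon_loss   # assumed weighting of the auxiliary loss
    opt.zero_grad(); loss.backward(); opt.step()
```

The key design point illustrated here is that the encoder is shared across both stages, and the reconstruction objective is retained during segmentation fine-tuning rather than discarded after pretraining.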