Federated learning is an appealing framework for analyzing sensitive data from distributed health data networks. Under this framework, data partners at local sites collaboratively build an analytical model under the orchestration of a coordinating site, while keeping the data decentralized. While integrating information from multiple sources may boost statistical efficiency, existing federated learning methods mainly assume data across sites are homogeneous samples of the global population, failing to properly account for the extra variability across sites in estimation and inference. Drawing on a multi-hospital electronic health records network, we develop an efficient and interpretable tree-based ensemble of personalized treatment effect estimators to join results across hospital sites, while actively modeling for the heterogeneity in data sources through site partitioning. The efficiency of this approach is demonstrated by a study of causal effects of oxygen saturation on hospital mortality and backed up by comprehensive numerical results.