Sensing the surgical scene is essential for ensuring safety during operations; a monitoring platform that can obtain accurate spatial information about the operating room is therefore urgently needed. Compared with 2D camera images, 3D data additionally encodes distance and direction, which makes 3D sensors better suited to surgical scene monitoring. However, each 3D sensor has its own limitations. Lidar (Light Detection and Ranging), for example, can capture large-scale environments with high precision, but the resulting point clouds or depth maps are very sparse. Commodity RGB-D sensors such as the Kinect capture much denser data, but only within a limited range of roughly 0.5 to 4.5 m. A method that can fuse data from these different modalities while addressing these limitations is therefore important. In this paper, we propose a method that fuses 3D data of different modalities to obtain a large-scale, dense point cloud. The key contributions of our work are as follows. First, we propose a 3D data collection system for reconstructing medical scenes: by fusing Lidar and Kinect data, a large-scale medical scene can be reconstructed with more detail. Second, we propose a location-based fast point cloud registration algorithm to handle datasets of different modalities.
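As a rough illustration of how a dense Kinect cloud and a sparse Lidar cloud can be brought into a common frame before fusion, the sketch below uses Open3D's point-to-plane ICP. It is not the location-based registration algorithm proposed in this paper; the file names, voxel size, and identity initialization are placeholder assumptions.

```python
# Illustrative sketch only: generic coarse-to-fine alignment of a dense Kinect
# cloud to a sparse Lidar cloud with Open3D ICP, not the paper's
# location-based registration algorithm. Paths and parameters are assumptions.
import numpy as np
import open3d as o3d

# Hypothetical inputs: one sparse Lidar scan and one dense Kinect scan of the
# same operating-room scene.
lidar = o3d.io.read_point_cloud("lidar_scan.pcd")
kinect = o3d.io.read_point_cloud("kinect_scan.pcd")

# Downsample to a common resolution and estimate normals for point-to-plane ICP.
voxel = 0.05  # 5 cm, assumed
lidar_d = lidar.voxel_down_sample(voxel)
kinect_d = kinect.voxel_down_sample(voxel)
for pcd in (lidar_d, kinect_d):
    pcd.estimate_normals(
        o3d.geometry.KDTreeSearchParamHybrid(radius=2 * voxel, max_nn=30))

# A rough initial transform (e.g. from known sensor mounting) would go here;
# identity is used as a placeholder.
init = np.eye(4)

# Refine the alignment of the Kinect cloud into the Lidar frame.
result = o3d.pipelines.registration.registration_icp(
    kinect_d, lidar_d, 2 * voxel, init,
    o3d.pipelines.registration.TransformationEstimationPointToPlane())

# Fuse: transform the dense Kinect data into the large-scale Lidar map.
kinect.transform(result.transformation)
fused = lidar + kinect
o3d.io.write_point_cloud("fused_scene.pcd", fused)
print("ICP fitness:", result.fitness)
```

In practice, ICP alone needs a reasonable initial guess; a coarse alignment stage (or, as proposed here, location information from the collecting system) would precede this refinement step.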