Traditional paradigms for imaging rely on the use of a spatial structure, either in the detector (pixel arrays) or in the illumination (patterned light). Removing the spatial structure from the detector or illumination, i.e., imaging with just a single-point sensor, requires solving a strongly ill-posed inverse retrieval problem that to date has not been solved. Here, we demonstrate a data-driven approach in which full 3D information is obtained with just a single-point, single-photon avalanche diode that records the arrival time of photons reflected from a scene illuminated with short pulses of light. Imaging with single-point time-of-flight (temporal) data opens new routes in terms of speed, size, and functionality. As an example, we show how training based on an optical time-of-flight camera enables a compact radio-frequency impulse radio detection and ranging transceiver to provide 3D images.
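The measurement in this scheme is a single temporal histogram of photon arrival times per frame, and the paper's contribution is the learned inverse of the map from scene to histogram. As a toy illustration of the forward direction only (the function name, bin settings, and simplifying assumptions of unit reflectivity and no multipath are ours, not the paper's), the following sketch bins round-trip flight times from a point-cloud scene to a co-located single-point sensor:

```python
import numpy as np

C = 3e8  # speed of light (m/s)

def toa_histogram(points, sensor, n_bins=64, bin_width=0.5e-9):
    """Toy forward model: time-of-arrival histogram seen by a single-point
    sensor co-located with a flash illuminator. `points` is an (N, 3) array
    of scene coordinates; each point contributes one count at its round-trip
    flight-time bin."""
    dists = np.linalg.norm(points - sensor, axis=1)
    times = 2.0 * dists / C                        # out-and-back flight time
    bins = np.round(times / bin_width).astype(int)  # nearest temporal bin
    return np.bincount(bins[bins < n_bins], minlength=n_bins)[:n_bins].astype(float)
```

A data-driven approach would then train a network to map such histograms back to depth images, using a conventional time-of-flight camera to supply ground truth.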
Data imbalance is a major problem that affects several machine learning (ML) algorithms. It is troublesome because most ML algorithms optimize a loss function that does not take the imbalance into account; the algorithm then simply produces a trivial model biased toward predicting the most frequent class in the training data. In the case of histopathologic images (HIs), both low-level and high-level data augmentation (DA) techniques still present performance issues in the presence of inter-patient variability, as the model tends to learn color representations related to the staining process. In this paper, we propose a novel approach capable not only of augmenting an HI dataset but also of distributing the inter-patient variability by means of image blending using the Gaussian-Laplacian pyramid. The proposed approach consists of computing the Gaussian pyramids of two images from different patients and deriving their Laplacian pyramids. The left half and the right half of the two HIs are then joined at each level of the Laplacian pyramid, and the blended image is reconstructed from the joint pyramid. This composition combines the stain variation of two patients, preventing color differences from misleading the learning process. Experimental results on the BreakHis dataset show promising gains compared with the majority of DA techniques presented in the literature.
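The half-blending procedure can be sketched with plain NumPy. The pyramid depth, the 1-2-1 blur kernel, and the function names below are our simplifications for illustration, not the paper's exact implementation:

```python
import numpy as np

def _blur(img):
    # separable 1-2-1 binomial kernel as a lightweight Gaussian approximation
    k = np.array([0.25, 0.5, 0.25])
    img = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 0, img)
    return np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)

def _upsample(img, shape):
    # double each pixel, crop to the target shape, and smooth
    up = np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)[:shape[0], :shape[1]]
    return _blur(up)

def gaussian_pyramid(img, levels):
    pyr = [img.astype(float)]
    for _ in range(levels - 1):
        pyr.append(_blur(pyr[-1])[::2, ::2])   # blur then decimate by 2
    return pyr

def laplacian_pyramid(img, levels):
    gp = gaussian_pyramid(img, levels)
    lp = [gp[i] - _upsample(gp[i + 1], gp[i].shape) for i in range(levels - 1)]
    lp.append(gp[-1])                          # coarsest Gaussian level caps the pyramid
    return lp

def blend_halves(img_a, img_b, levels=4):
    """Join the left half of img_a with the right half of img_b at every
    Laplacian level, then reconstruct; the seam is blended at all scales."""
    joined = []
    for a, b in zip(laplacian_pyramid(img_a, levels),
                    laplacian_pyramid(img_b, levels)):
        half = a.shape[1] // 2
        joined.append(np.concatenate([a[:, :half], b[:, half:]], axis=1))
    out = joined[-1]
    for lvl in reversed(joined[:-1]):
        out = _upsample(out, lvl.shape) + lvl  # coarse-to-fine reconstruction
    return out
```

For HIs the same operations would run per color channel, so the blended image mixes the stain statistics of the two patients smoothly across the seam.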
Traditional video compression technologies have been developed over decades in pursuit of higher coding efficiency. Efficient temporal information representation plays a key role in video coding. Thus, in this paper, we propose to exploit the temporal correlation using both first-order optical flow and second-order flow prediction. We suggest a one-stage learning approach that encapsulates flow as quantized features from consecutive frames, which are then entropy coded with adaptive contexts conditioned on joint spatial-temporal priors to exploit second-order correlations. Joint priors are embedded in autoregressive spatial neighbors, co-located hyper elements, and temporal neighbors using a ConvLSTM recurrently. We evaluate our approach in the low-delay scenario against High-Efficiency Video Coding (H.265/HEVC), H.264/AVC, and another learned video compression method, following the common test settings. Our work offers state-of-the-art performance, with consistent gains across all popular test sequences.
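As a minimal illustration of the second-order idea (our own toy formulation, not the paper's learned network), the next flow field can be extrapolated from the two previous ones under a constant-velocity assumption, so only a small residual needs coding; a frame is then motion-compensated by backward warping:

```python
import numpy as np

def predict_flow(f_prev, f_prev2):
    """Second-order (constant-velocity) extrapolation of the motion field:
    the flow increment between the two previous fields is carried forward."""
    return 2.0 * f_prev - f_prev2

def warp(frame, flow):
    """Nearest-neighbour backward warping of a grayscale `frame` by a per-pixel
    `flow` field with (dy, dx) components in the last axis."""
    h, w = frame.shape
    ys, xs = np.mgrid[0:h, 0:w]
    sy = np.clip(np.round(ys + flow[..., 0]).astype(int), 0, h - 1)
    sx = np.clip(np.round(xs + flow[..., 1]).astype(int), 0, w - 1)
    return frame[sy, sx]
```

In a learned codec, the residual between the predicted and actual flow (and between the warped and actual frame) is what gets quantized and entropy coded.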
Data and data sources have become increasingly essential in recent decades. Scientists and researchers require more data to deploy AI approaches as the field continues to improve. In recent years, rapid technological advancements have had a significant impact on human life. One major field for collecting data is satellite technology. With the fast development of various satellite sensors, synthetic aperture radar (SAR) images have become an important source of data for a variety of research subjects, including environmental studies, urban studies, coastline extraction, water sources, etc. Both change detection and coastline detection are performed using SAR images. However, speckle noise is a major problem in SAR imaging, and several solutions have been offered to address this issue. One solution is to apply spatial fuzzy clustering to the SAR images; another is to separate the speckle from the signal. This study utilises a spatial function to overcome speckle noise and cluster the SAR images with the highest achieved accuracy. The spatial function is proposed in this work because it expresses the likelihood that a data point falls into a given cluster; when it is employed to cluster data in fuzzy logic, the clustering outcomes improve. The proposed clustering technique is us
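A minimal sketch of fuzzy c-means with a 3x3 spatial membership function follows, assuming the common formulation in which neighbourhood memberships re-weight each pixel's memberships before the cluster centers are updated; the function names, quantile initialization, and parameters are our assumptions, not necessarily this study's exact algorithm:

```python
import numpy as np

def spatial_fcm(img, n_clusters=2, m=2.0, iters=20):
    """Toy spatial fuzzy c-means on a 2D intensity image."""
    h, w = img.shape
    x = img.ravel().astype(float)
    # deterministic init: spread centers over the intensity quantiles
    centers = np.quantile(x, (np.arange(n_clusters) + 0.5) / n_clusters)
    for _ in range(iters):
        # standard FCM membership update from distances to centers
        d = np.abs(x[None, :] - centers[:, None]) + 1e-12
        u = d ** (-2.0 / (m - 1.0))
        u /= u.sum(axis=0)
        # spatial function: sum of memberships over each 3x3 neighbourhood,
        # so isolated speckle pixels are pulled toward their neighbours
        h_fn = np.empty_like(u)
        for c in range(n_clusters):
            pad = np.pad(u[c].reshape(h, w), 1, mode="edge")
            h_fn[c] = sum(pad[i:i + h, j:j + w]
                          for i in range(3) for j in range(3)).ravel()
        u = u * h_fn
        u /= u.sum(axis=0)
        # center update with fuzzified memberships
        um = u ** m
        centers = um @ x / um.sum(axis=1)
    return u.argmax(axis=0).reshape(h, w), centers
```

Because the spatial term averages memberships over a neighbourhood, a speckled pixel surrounded by pixels of one cluster is reassigned to that cluster, which is the noise-robustness argument made above.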
X-ray examination is suitable for gastric cancer screening. Compared with endoscopy, which can only be performed by doctors, X-ray imaging can also be performed by radiographers and can therefore reach more patients. However, the diagnostic accuracy of gastric radiographs is as low as 85%. To address this problem, highly accurate and quantitative automated diagnosis using machine learning is needed. This paper proposes a diagnostic support method for detecting gastric cancer sites from X-ray images with high accuracy. The two new technical proposals of the method are (1) stochastic functional gastric image augmentation (sfGAIA) and (2) hard boundary box training (HBBT). The former is a probabilistic enhancement of gastric folds in X-ray images based on medical knowledge, whereas the latter is a recursive retraining technique to reduce false positives. We use 4,724 gastric radiographs of 145 patients in clinical practice and evaluate the cancer detection performance of the method in a patient-based five-group cross-validation. The proposed sfGAIA and HBBT significantly enhance the performance of the EfficientDet-D7 network by 5.9% in terms of the F1-score, and our screening method reaches a practical screening capability for gastric cancer (F1: 57.8%, recall: 90.2%, precision: 42.5%).
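HBBT is described only at a high level here; the sketch below shows one plausible reading of such a recursive retraining loop, in which each round's false positives become explicit negative boxes for the next round. `train_fn` and `predict_fn` are hypothetical stand-ins for a detector's training and inference API, and the IoU threshold is our assumption:

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def hard_boundary_box_training(train_fn, predict_fn, images, gt_boxes,
                               rounds=3, iou_thresh=0.1):
    """Recursive retraining sketch: detections that overlap no ground-truth
    box are collected as hard negatives and passed to the next round of
    training. `train_fn(images, gt_boxes, negatives)` returns a model;
    `predict_fn(model, image)` returns predicted boxes."""
    negatives = {i: [] for i in range(len(images))}
    model = None
    for _ in range(rounds):
        model = train_fn(images, gt_boxes, negatives)
        for i, img in enumerate(images):
            for box in predict_fn(model, img):
                if all(iou(box, g) < iou_thresh for g in gt_boxes[i]):
                    negatives[i].append(box)   # hard negative for next round
    return model
```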
Echo-location is a broad approach to imaging and sensing that includes man-made RADAR, LIDAR, and SONAR, as well as animal navigation. However, full 3D information based on echo-location requires some form of scanning of the scene in order to provide the spatial location of the echo origin points. Without this spatial information, imaging objects in 3D is very challenging, as the inverse retrieval problem is strongly ill-posed. Here, we show that the temporal information encoded in return echoes that are reflected multiple times within a scene is sufficient to faithfully render an image in 3D. Numerical modelling and an information-theoretic perspective prove the concept and provide insight into the role of the multipath information. We experimentally demonstrate the concept by using both radio-frequency and acoustic waves to image individuals moving in a closed environment.
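A toy image-source calculation illustrates why multipath echoes carry spatial information: each wall reflection contributes an arrival time that depends on the object's position, so even a single-point sensor records several position-dependent delays. The 2D room geometry, acoustic setting, and names below are our own illustration, not the paper's experimental configuration:

```python
import numpy as np

V = 343.0  # speed of sound in air (m/s)

def multipath_arrivals(obj, sensor, room=(5.0, 4.0)):
    """Round-trip arrival times of the direct echo plus the four first-order
    wall reflections, modelled with image sources in a 2D rectangular room
    with corners at (0, 0) and (w, h)."""
    w, h = room
    images = [obj,
              np.array([-obj[0], obj[1]]),        # mirror in left wall  x = 0
              np.array([2 * w - obj[0], obj[1]]),  # mirror in right wall x = w
              np.array([obj[0], -obj[1]]),        # mirror in bottom wall y = 0
              np.array([obj[0], 2 * h - obj[1]])]  # mirror in top wall    y = h
    # each image source yields one echo delay; together they constrain obj
    return sorted(2.0 * np.linalg.norm(p - sensor) / V for p in images)
```

Moving the object changes all five delays in a coupled way, which is the multipath information the data-driven reconstruction exploits.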