
Instance-wise or Class-wise? A Tale of Neighbor Shapley for Concept-based Explanation

Published by Jiahui Li
Publication date: 2021
Research field: Information Engineering
Paper language: English





Deep neural networks have demonstrated remarkable performance in many data-driven, prediction-oriented applications, and sometimes even outperform humans. However, their most significant drawback is their lack of interpretability, which makes them less attractive in many real-world applications. In morally sensitive or uncertain settings such as crime judgment, financial analysis, and medical diagnosis, it is essential to mine the evidence behind a model's prediction (i.e., to interpret model knowledge) in order to convince humans. Thus, investigating how to interpret model knowledge is of paramount importance for both academic research and real applications.




Read also

Multivariate time series forecasting has attracted more and more attention because of its vital role in different real-world fields such as finance, traffic, and weather. In recent years, many research efforts have been devoted to forecasting multivariate time series. Although some previous work considers the interdependencies among different variables at the same timestamp, existing work overlooks the inter-connections between different variables at different timestamps. In this paper, we propose a simple yet efficient instance-wise graph-based framework to utilize the interdependencies of different variables at different timestamps for multivariate time series forecasting. The key idea of our framework is to aggregate information from the historical time series of different variables into the current time series that we need to forecast. We conduct experiments on the Traffic, Electricity, and Exchange-Rate multivariate time series datasets. The results show that our proposed model outperforms the state-of-the-art baseline methods.
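To make the aggregation idea in the abstract above concrete, here is a minimal, hypothetical PyTorch sketch in which each variable's forecast attends over the encoded histories of all variables. Class and parameter names such as `InstanceWiseAggregator`, `window`, and `hidden` are illustrative assumptions, not taken from the paper.

```python
# Sketch: cross-variable aggregation for one-step-ahead multivariate forecasting.
import torch
import torch.nn as nn


class InstanceWiseAggregator(nn.Module):
    def __init__(self, num_vars: int, window: int, hidden: int = 64):
        super().__init__()
        self.encode = nn.Linear(window, hidden)   # encode each variable's history
        self.query = nn.Linear(hidden, hidden)
        self.key = nn.Linear(hidden, hidden)
        self.out = nn.Linear(hidden, 1)           # one-step forecast per variable

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_vars, window) -- history of every variable
        h = torch.relu(self.encode(x))            # (batch, num_vars, hidden)
        q, k = self.query(h), self.key(h)
        # attention over variables: each target attends to all histories
        attn = torch.softmax(q @ k.transpose(1, 2) / h.size(-1) ** 0.5, dim=-1)
        agg = attn @ h                            # aggregated cross-variable context
        return self.out(agg).squeeze(-1)          # (batch, num_vars) forecasts


if __name__ == "__main__":
    model = InstanceWiseAggregator(num_vars=8, window=24)
    history = torch.randn(4, 8, 24)               # 4 samples, 8 variables, 24 steps
    print(model(history).shape)                   # torch.Size([4, 8])
```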
Shapley values have become one of the most popular feature attribution explanation methods. However, most prior work has focused on post-hoc Shapley explanations, which can be computationally demanding due to their exponential time complexity and which preclude model regularization based on Shapley explanations during training. Thus, we propose to incorporate Shapley values themselves as latent representations in deep models, thereby making Shapley explanations first-class citizens in the modeling paradigm. This intrinsic explanation approach enables layer-wise explanations, explanation regularization of the model during training, and fast explanation computation at test time. We define the Shapley transform that transforms the input into a Shapley representation given a specific function. We operationalize the Shapley transform as a neural network module and construct both shallow and deep networks, called ShapNets, by composing Shapley modules. We prove that our Shallow ShapNets compute the exact Shapley values and our Deep ShapNets maintain the missingness and accuracy properties of Shapley values. We demonstrate on synthetic and real-world datasets that our ShapNets enable layer-wise Shapley explanations, novel Shapley regularizations during training, and fast computation while maintaining reasonable performance. Code is available at https://github.com/inouye-lab/ShapleyExplanationNetworks.
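As a generic illustration of the exact Shapley attribution that the abstract above builds on, the sketch below enumerates all feature coalitions and applies the textbook Shapley formula. It is not the ShapNet module from the linked repository; `value_fn`, `baseline`, and the toy model are assumed names for the example.

```python
# Exact Shapley values by coalition enumeration (exponential in feature count).
from itertools import combinations
from math import factorial

import numpy as np


def shapley_values(value_fn, x, baseline):
    """Shapley attribution of each feature of x; absent features take baseline values."""
    n = len(x)
    phi = np.zeros(n)

    def v(subset):
        z = baseline.copy()
        for j in subset:
            z[j] = x[j]
        return value_fn(z)

    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for S in combinations(others, k):
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[i] += w * (v(S + (i,)) - v(S))
    return phi


if __name__ == "__main__":
    def f(z):
        return 2.0 * z[0] + z[0] * z[1]            # toy model with an interaction term

    x = np.array([1.0, 3.0])
    base = np.zeros(2)
    print(shapley_values(f, x, base))               # attributions sum to f(x) - f(base)
```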
Jinhua Zhu, Lijun Wu, Yingce Xia (2021)
With sequentially stacked self-attention, (optional) encoder-decoder attention, and feed-forward layers, the Transformer has achieved great success in natural language processing (NLP), and many variants have been proposed. Currently, almost all of these models assume that the layer order is fixed and kept the same across data samples. We observe that different data samples actually favor different orders of the layers. Based on this observation, in this work we break the assumption of a fixed layer order in the Transformer and introduce instance-wise layer reordering into the model structure. Our Instance-wise Ordered Transformer (IOT) can model variant functions with reordered layers, which enables each sample to select the better one to improve model performance under the constraint of almost the same number of parameters. To achieve this, we introduce a light predictor with negligible parameter and inference cost to decide the most capable and favorable layer order for any input sequence. Experiments on 3 tasks (neural machine translation, abstractive summarization, and code generation) and 9 datasets demonstrate consistent improvements from our method. We further show that our method can also be applied to other architectures beyond the Transformer. Our code is released on GitHub.
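The per-instance reordering idea can be sketched roughly as follows, assuming a small fixed pool of candidate layer orders and a hard argmax selection per sample; this is not the IOT implementation, and names such as `ReorderedEncoder` and `candidate_orders` are hypothetical.

```python
# Sketch: a light predictor picks one layer order per input sample.
import itertools

import torch
import torch.nn as nn


class ReorderedEncoder(nn.Module):
    def __init__(self, d_model: int = 64, num_layers: int = 3):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
            for _ in range(num_layers)
        )
        self.candidate_orders = list(itertools.permutations(range(num_layers)))
        self.order_predictor = nn.Linear(d_model, len(self.candidate_orders))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); choose a layer order per sample
        scores = self.order_predictor(x.mean(dim=1))   # (batch, num_orders)
        choice = scores.argmax(dim=-1)                 # hard per-instance selection
        outputs = []
        for i in range(x.size(0)):
            h = x[i : i + 1]
            for layer_idx in self.candidate_orders[choice[i]]:
                h = self.layers[layer_idx](h)
            outputs.append(h)
        return torch.cat(outputs, dim=0)


if __name__ == "__main__":
    enc = ReorderedEncoder()
    tokens = torch.randn(2, 10, 64)
    print(enc(tokens).shape)                           # torch.Size([2, 10, 64])
```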
Training deep neural networks is known to require a large number of training samples. However, in many applications only a few training samples are available. In this work, we tackle the issue of training neural networks for classification tasks when few training samples are available. We attempt to solve this issue by proposing a new regularization term that constrains the hidden layers of a network to learn class-wise invariant representations. In our regularization framework, learning invariant representations is generalized to class membership, where samples of the same class should have the same representation. Numerical experiments on MNIST and its variants show that our proposal helps improve the generalization of neural networks, particularly when trained with few samples. We provide the source code of our framework at https://github.com/sbelharbi/learning-class-invariant-features.
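As a hedged illustration of such a class-wise invariance regularizer (not the authors' exact term), the following penalty pulls same-class hidden representations toward their class centroid and is added to the usual cross-entropy loss; the weight `lam` is an assumed hyperparameter.

```python
# Sketch: penalize within-class spread of hidden representations.
import torch
import torch.nn.functional as F


def class_invariance_penalty(hidden: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Mean squared distance of each hidden vector to its class centroid."""
    penalty = hidden.new_zeros(())
    classes = labels.unique()
    for c in classes:
        h_c = hidden[labels == c]
        penalty = penalty + ((h_c - h_c.mean(dim=0, keepdim=True)) ** 2).sum(dim=1).mean()
    return penalty / classes.numel()


def training_loss(logits, hidden, labels, lam: float = 0.1):
    # standard classification loss plus the invariance term
    return F.cross_entropy(logits, labels) + lam * class_invariance_penalty(hidden, labels)


if __name__ == "__main__":
    hidden = torch.randn(16, 32)                 # hidden-layer activations of a batch
    logits = torch.randn(16, 10)
    labels = torch.randint(0, 10, (16,))
    print(training_loss(logits, hidden, labels))
```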
The severe crowding towards the Galactic plane suggests that the census of nearby stars in that direction may be incomplete. Recently, Scholz reported a new M9 object at an estimated distance d~7 pc (WISE J072003.20-084651.2; hereafter WISE0720) at Galactic latitude b=2.3 deg. Our goals are to determine the physical characteristics of WISE0720 and its kinematic properties, and to address the question of whether it is a binary object, as suggested in the discovery paper. Optical and infrared spectroscopy from the Southern African Large Telescope and Magellan, respectively, and spectral energy distribution fitting were used to determine the spectral type of WISE0720. The measured radial velocity, proper motion, and parallax yielded its Galactic velocities. We also investigated whether WISE0720 may show X-ray activity based on archival data. Our spectra are consistent with spectral type L0+/-1. We find no evidence for binarity, apart from a minor 2-sigma level difference in the radial velocities taken at two different epochs. The spatial velocity of WISE0720 does not connect it to any known moving group; instead, it places the object with high probability in the old thin disk or in the thick disk. The spectral energy distribution fit hints at excess in the 12 and 22 micron WISE bands, which may be due to a redder companion, but the same excess is visible in other late-type objects, and it more likely implies a shortcoming of the models (e.g., issues with the effective wavelengths of the filters for these extremely cool objects) rather than a disk or redder companion. The optical spectrum shows some Halpha emission, indicative of stellar activity. Archival X-ray observations yield no detection.
