
Now You See Me (CME): Concept-based Model Extraction

Published by Dmitry Kazhdan
Publication date: 2020
Research field: Informatics Engineering
Research language: English





Deep Neural Networks (DNNs) have achieved remarkable performance on a range of tasks. A key step to further empowering DNN-based approaches is improving their explainability. In this work we present CME: a concept-based model extraction framework used for analysing DNN models via concept-based extracted models. Using two case studies (dSprites and Caltech-UCSD Birds), we demonstrate how CME can be used to (i) analyse the concept information learned by a DNN model, (ii) analyse how a DNN uses this concept information when predicting output labels, and (iii) identify key concept information that can further improve DNN predictive performance (for one of the case studies, we show that model accuracy can be improved by over 14% using only 30% of the available concepts).
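The abstract describes analysing a DNN through a concept-based extracted model. A minimal sketch of that general idea follows, assuming a two-stage surrogate: a concept extractor that reads concepts from hidden activations, followed by a label predictor that maps concepts to the DNN's outputs. Everything here is a hypothetical stand-in chosen for illustration (the toy `hidden` and `dnn_label` functions, the threshold stumps, the lookup table, the grid data); it is not the authors' implementation and does not use the dSprites or Caltech-UCSD Birds data.

```python
from collections import Counter, defaultdict

# Toy stand-in for a trained DNN (hypothetical, not from the paper).
def hidden(x):
    """Pretend hidden layer: a fixed, slightly mixed linear map of the input."""
    a, b = x
    return (a + 0.1 * b, b - 0.1 * a)

def dnn_label(x):
    """Pretend DNN output: XOR of two thresholded hidden units."""
    h0, h1 = hidden(x)
    return int((h0 > 0.55) != (h1 > 0.45))

def true_concepts(x):
    """Ground-truth binary concepts (assumed annotations)."""
    a, b = x
    return (int(a > 0.5), int(b > 0.5))

def fit_stump(values, targets):
    """Pick the threshold t, among observed values, minimising errors of `value > t`."""
    def errors(t):
        return sum(int(v > t) != y for v, y in zip(values, targets))
    return min(sorted(values), key=errors)

# Deterministic 20x20 grid of inputs; only every 8th point has concept labels.
grid = [((i + 0.5) / 20, (j + 0.5) / 20) for i in range(20) for j in range(20)]
labelled = grid[::8]

# Stage 1: concept extractor, one stump per concept on the matching hidden unit.
thresholds = [
    fit_stump([hidden(x)[k] for x in labelled],
              [true_concepts(x)[k] for x in labelled])
    for k in range(2)
]

def predict_concepts(x):
    h = hidden(x)
    return tuple(int(h[k] > thresholds[k]) for k in range(2))

# Stage 2: label predictor, a lookup table from predicted concepts to the
# majority DNN label (needs DNN outputs only, no extra annotations).
votes = defaultdict(Counter)
for x in grid:
    votes[predict_concepts(x)][dnn_label(x)] += 1
table = {c: counts.most_common(1)[0][0] for c, counts in votes.items()}

def extracted_model(x):
    return table[predict_concepts(x)]

# How well concepts are recovered, and how closely the surrogate mimics the DNN.
concept_accuracy = sum(predict_concepts(x) == true_concepts(x) for x in grid) / len(grid)
fidelity = sum(extracted_model(x) == dnn_label(x) for x in grid) / len(grid)
print(f"concept accuracy: {concept_accuracy:.3f}, fidelity: {fidelity:.3f}")
```

In this sketch the concept accuracy speaks to what concept information the hidden layer carries, while the lookup table shows how those concepts relate to the predicted labels. Fidelity is measured on the same grid used to fit the table, so it is an optimistic estimate.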




Read also

92 - Mariano Mendez 2006
I study the behaviour of the maximum rms fractional amplitude, $r_{\rm max}$, and the maximum coherence, $Q_{\rm max}$, of the kilohertz quasi-periodic oscillations (kHz QPOs) in a dozen low-mass X-ray binaries. I find that: (i) The maximum rms amplitudes of the lower and the upper kHz QPO, $r^{\ell}_{\rm max}$ and $r^{\rm u}_{\rm max}$, respectively, decrease more or less exponentially with increasing luminosity of the source; (ii) the maximum coherence of the lower kHz QPO, $Q^{\ell}_{\rm max}$, first increases and then decreases exponentially with luminosity; (iii) the maximum coherence of the upper kHz QPO, $Q^{\rm u}_{\rm max}$, is more or less independent of luminosity; and (iv) $r_{\rm max}$ and $Q_{\rm max}$ show the opposite behaviour with hardness of the source, consistent with the fact that there is a general anticorrelation between luminosity and spectral hardness in these sources. Both $r_{\rm max}$ and $Q_{\rm max}$ in the sample of sources, and the rms amplitude and coherence of the kHz QPOs in individual sources show a similar behaviour with hardness. This similarity argues against the interpretation that the drop of coherence and rms amplitude of the lower kHz QPO at high QPO frequencies in individual sources is a signature of the innermost stable circular orbit around a neutron star.
108 - M. J.Coe 2007
Multiwavelength observations are reported here of the Be/X-ray binary pulsar system GRO J1008-57. Over ten years' worth of data are gathered together to show that the periodic X-ray outbursts are dependent on both the binary motion and the size of the circumstellar disk. In the first instance an accurate orbital solution is determined from pulse periods, and in the second case the strength and shape of the H-alpha emission line are shown to be valuable indicators of disk size and its behaviour. Furthermore, the shape of the emission line permits a direct determination of the disk size which is in good agreement with theoretical estimates. A detailed study of the pulse period variations during outbursts determined the binary period to be 247.8, in good agreement with the period determined from the recurrence of the outbursts.
We report the discovery of a new changing-look quasar, SDSS J101152.98+544206.4, through repeat spectroscopy from the Time Domain Spectroscopic Survey. This is an addition to a small but growing set of quasars whose blue continua and broad optical emission lines have been observed to decline by a large factor on a time scale of approximately a decade. The 5100 Angstrom monochromatic continuum luminosity of this quasar drops by a factor of > 9.8 in a rest-frame time interval of < 9.7 years, while the broad H-alpha luminosity drops by a factor of 55 in the same amount of time. The width of the broad H-alpha line increases in the dim state such that the black hole mass derived from the appropriate single-epoch scaling relation agrees between the two epochs within a factor of 3. The fluxes of the narrow emission lines do not appear to change between epochs. The light curve obtained by the Catalina Sky Survey suggests that the transition occurs within a rest-frame time interval of approximately 500 days. We examine three possible mechanisms for this transition suggested in the recent literature. An abrupt change in the reddening towards the central engine is disfavored by the substantial difference between the timescale to obscure the central engine and the observed timescale of the transition. A decaying tidal disruption flare is consistent with the decay rate of the light curve but not with the prolonged bright state preceding the decay, nor can this scenario provide the power required by the luminosities of the emission lines. An abrupt drop in the accretion rate onto the supermassive black hole appears to be the most plausible explanation for the rapid dimming.
Concept-based explanations have emerged as a popular way of extracting human-interpretable representations from deep discriminative models. At the same time, the disentanglement learning literature has focused on extracting similar representations in an unsupervised or weakly-supervised way, using deep generative models. Despite the overlapping goals and potential synergies, to our knowledge, there has not yet been a systematic comparison of the limitations and trade-offs between concept-based explanations and disentanglement approaches. In this paper, we give an overview of these fields, comparing and contrasting their properties and behaviours on a diverse set of tasks, and highlighting their potential strengths and limitations. In particular, we demonstrate that state-of-the-art approaches from both classes can be data inefficient, sensitive to the specific nature of the classification/regression task, or sensitive to the employed concept representation.
Model extraction increasingly attracts research attention, as keeping commercial AI models private can retain a competitive advantage. In some scenarios, AI models are trained proprietarily, where neither pre-trained models nor sufficient in-distribution data is publicly available. Model extraction attacks against these models are typically more devastating. Therefore, in this paper, we empirically investigate the behaviors of model extraction under such scenarios. We find that the effectiveness of existing techniques is significantly affected by the absence of pre-trained models. In addition, the impacts of the attacker's hyperparameters, e.g. model architecture and optimizer, as well as the utilities of information retrieved from queries, are counterintuitive. We provide some insights on explaining the possible causes of these phenomena. With these observations, we formulate model extraction attacks into an adaptive framework that captures these factors with deep reinforcement learning. Experiments show that the proposed framework can be used to improve existing techniques, and that model extraction is still possible in such strict scenarios. Our research can help system designers to construct better defense strategies based on their scenarios.
