
Attention: to Better Stand on the Shoulders of Giants

Posted by Zhou Shou
Publication date: 2020
Research field: Informatics Engineering
Paper language: English





Science of science (SciSci) is an emerging discipline in which science itself is studied using scientific methods and large data sets, in order to understand the structure and evolution of science. The increasing availability of digital data on scholarly outcomes offers unprecedented opportunities to explore SciSci. In the progress of science, previously discovered knowledge principally inspires new scientific ideas, and citation is a reasonably good reflection of this cumulative nature of scientific research. Works that choose potentially influential references will have a lead over emerging publications. Although peer review is the most reliable way of predicting a paper's future impact, the ability to foresee lasting impact from citation records is increasingly essential for scientific impact analysis in the era of big data. This paper develops an attention mechanism for long-term scientific impact prediction and validates the method on a real, large-scale citation data set. The results challenge conventional thinking: instead of accurately simulating the original power-law distribution, emphasizing limited attention better stands on the shoulders of giants.
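The abstract does not include an implementation, but the core idea, attending over a paper's early citation record to predict its long-term impact, can be illustrated with a minimal, hypothetical sketch. The model shape, the five-year observation window, and all hyper-parameters below are assumptions made here, not the authors' design.

```python
# Hypothetical sketch, NOT the authors' model: a single attention head weighs a
# paper's early yearly citation counts and predicts its long-term (log) impact.
# Input shapes, the 5-year window, and all hyper-parameters are assumptions.
import torch
import torch.nn as nn

class CitationAttentionPredictor(nn.Module):
    def __init__(self, d_model: int = 32):
        super().__init__()
        self.embed = nn.Linear(1, d_model)                       # embed one citation count per year
        self.attn = nn.MultiheadAttention(d_model, num_heads=1, batch_first=True)
        self.readout = nn.Linear(d_model, 1)                     # map pooled context to predicted log impact

    def forward(self, yearly_counts: torch.Tensor):
        # yearly_counts: (batch, years_observed) early citation counts
        x = self.embed(yearly_counts.unsqueeze(-1))              # (B, T, d_model)
        ctx, weights = self.attn(x, x, x)                        # attention over the early years
        pred = self.readout(ctx.mean(dim=1)).squeeze(-1)         # (B,) predicted long-term log citations
        return pred, weights                                     # weights show which early years dominate

# Dummy usage: 3 papers observed for 5 years each
counts = torch.tensor([[1., 3., 5., 8., 10.],
                       [0., 1., 1., 2., 2.],
                       [2., 7., 15., 20., 30.]])
model = CitationAttentionPredictor()
pred, attn = model(counts)
print(pred.shape, attn.shape)   # torch.Size([3]) torch.Size([3, 5, 5])
```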




Read also

134 - Massimo Franceschet 2010
PageRank is a Web page ranking technique that has been a fundamental ingredient in the development and success of the Google search engine. The method is still one of the many signals that Google uses to determine which pages are most important. The main idea behind PageRank is to determine the importance of a Web page in terms of the importance assigned to the pages hyperlinking to it. In fact, this thesis is not new, and has been previously successfully exploited in different contexts. We review the PageRank method and link it to some renowned previous techniques that we have found in the fields of Web information retrieval, bibliometrics, sociometry, and econometrics.
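For readers unfamiliar with the method being reviewed, a compact power-iteration implementation of PageRank on a toy link graph looks roughly like the sketch below; the damping factor 0.85 and the example graph are illustrative choices, not taken from the paper.

```python
# Illustrative power-iteration PageRank on a toy link graph; the damping factor
# and the example pages are assumptions for demonstration only.
import numpy as np

def pagerank(adj: np.ndarray, damping: float = 0.85, tol: float = 1e-9) -> np.ndarray:
    """adj[i, j] = 1 if page i links to page j."""
    n = adj.shape[0]
    out_degree = adj.sum(axis=1)
    # Column-stochastic transition matrix; dangling pages link uniformly to everyone.
    M = np.where(out_degree[:, None] > 0,
                 adj / np.maximum(out_degree[:, None], 1),
                 1.0 / n).T
    rank = np.full(n, 1.0 / n)
    while True:
        new_rank = (1 - damping) / n + damping * M @ rank
        if np.abs(new_rank - rank).sum() < tol:
            return new_rank
        rank = new_rank

# Toy web of four pages: 0 -> 1,2 ; 1 -> 2 ; 2 -> 0 ; 3 -> 2
links = np.array([[0, 1, 1, 0],
                  [0, 0, 1, 0],
                  [1, 0, 0, 0],
                  [0, 0, 1, 0]], dtype=float)
print(pagerank(links).round(3))   # importance concentrates on the mutually linked pages 0 and 2
```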
Entity linking is a standard component in modern retrieval systems that is often performed by third-party toolkits. Despite the plethora of open source options, it is difficult to find a single system that has a modular architecture where certain components may be replaced, does not depend on external sources, and can easily be updated to newer Wikipedia …
Convolution exploits locality for efficiency at a cost of missing long range context. Self-attention has been adopted to augment CNNs with non-local interactions. Recent works prove it possible to stack self-attention layers to obtain a fully attentional network by restricting the attention to a local region. In this paper, we attempt to remove this constraint by factorizing 2D self-attention into two 1D self-attentions. This reduces computation complexity and allows performing attention within a larger or even global region. In addition, we also propose a position-sensitive self-attention design. Combining both yields our position-sensitive axial-attention layer, a novel building block that one could stack to form axial-attention models for image classification and dense prediction. We demonstrate the effectiveness of our model on four large-scale datasets. In particular, our model outperforms all existing stand-alone self-attention models on ImageNet. Our Axial-DeepLab improves 2.8% PQ over bottom-up state-of-the-art on COCO test-dev. This previous state-of-the-art is attained by our small variant that is 3.8x parameter-efficient and 27x computation-efficient. Axial-DeepLab also achieves state-of-the-art results on Mapillary Vistas and Cityscapes.
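The factorization of 2D self-attention into two 1D attentions can be sketched as follows. This is not the official Axial-DeepLab code: the channel count, head count, and the omission of the paper's position-sensitive terms are simplifications made here.

```python
# Rough sketch of the axial factorization idea: full 2D self-attention is replaced
# by 1D attention along the height axis followed by 1D attention along the width
# axis. Channel/head sizes are assumptions; position-sensitive terms are omitted.
import torch
import torch.nn as nn

class AxialAttention2d(nn.Module):
    def __init__(self, channels: int = 64, heads: int = 4):
        super().__init__()
        self.height_attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.width_attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) feature map
        b, c, h, w = x.shape
        # Attention along the height axis: each column is an independent sequence.
        cols = x.permute(0, 3, 2, 1).reshape(b * w, h, c)      # (B*W, H, C)
        cols, _ = self.height_attn(cols, cols, cols)
        x = cols.reshape(b, w, h, c).permute(0, 3, 2, 1)       # back to (B, C, H, W)
        # Attention along the width axis: each row is an independent sequence.
        rows = x.permute(0, 2, 3, 1).reshape(b * h, w, c)      # (B*H, W, C)
        rows, _ = self.width_attn(rows, rows, rows)
        return rows.reshape(b, h, w, c).permute(0, 3, 1, 2)    # (B, C, H, W)

# Cost drops from O((HW)^2) for full 2D attention to O(HW*(H+W)) here.
layer = AxialAttention2d()
feat = torch.randn(2, 64, 16, 16)
print(layer(feat).shape)   # torch.Size([2, 64, 16, 16])
```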
Halo stars orbit within the potential of the Milky Way and hence their kinematics can be used to understand the underlying mass distribution. However, the inferred mass distribution depends sensitively upon assumptions made on the density and the velocity anisotropy profiles of the tracers. Also, there is a degeneracy between the parameters of the halo and those of the disk or bulge. Here, we decompose the Galaxy into bulge, disk and dark matter halo and then model the kinematic data of the halo BHB and K-giants from the SEGUE. Additionally, we use the gas terminal velocity curve and the Sgr A$^*$ proper motion. With $R_\odot = 8.5$ kpc, our study reveals that the density of the stellar halo has a break at $17.2^{+1.1}_{-1.0}$ kpc, and an exponential cut-off in the outer parts starting at $97.7^{+15.6}_{-15.8}$ kpc. Also, we find the velocity anisotropy is radially biased with $\beta_s = 0.4 \pm 0.2$ in the outer halo. We measure halo virial mass $M_{\text{vir}} = 0.80^{+0.31}_{-0.16} \times 10^{12}\, M_\odot$, concentration $c = 21.1^{+14.8}_{-8.3}$, disk mass of $0.95^{+0.24}_{-0.30} \times 10^{11}\, M_\odot$, disk scale length of $4.9^{+0.4}_{-0.4}$ kpc and bulge mass of $0.91^{+0.31}_{-0.38} \times 10^{10}\, M_\odot$. The mass of the halo is found to be small and this has important consequences. The giant stars reveal that the outermost halo stars have low velocity dispersion, interestingly suggesting a truncation of the stellar halo density rather than a small overall mass of the Galaxy. Our estimates of the local escape velocity $v_{\rm esc} = 550.9^{+32.4}_{-22.1}$ km s$^{-1}$ and dark matter density $\rho^{\rm DM}_\odot = 0.0088^{+0.0024}_{-0.0018}\, M_\odot\, {\rm pc^{-3}}$ ($0.35^{+0.08}_{-0.07}$ GeV cm$^{-3}$) are in good agreement with recent estimates. Some of the above estimates depend on the adopted value of $R_\odot$ and on the outer power-law index of the tracer number density.
As an essential ingredient of modern deep learning, attention mechanism, especially self-attention, plays a vital role in the global correlation discovery. However, is hand-crafted attention irreplaceable when modeling the global context? Our intriguing finding is that self-attention is not better than the matrix decomposition (MD) model developed 20 years ago regarding the performance and computational cost for encoding the long-distance dependencies. We model the global context issue as a low-rank recovery problem and show that its optimization algorithms can help design global information blocks. This paper then proposes a series of Hamburgers, in which we employ the optimization algorithms for solving MDs to factorize the input representations into sub-matrices and reconstruct a low-rank embedding. Hamburgers with different MDs can perform favorably against the popular global context module self-attention when carefully coping with gradients back-propagated through MDs. Comprehensive experiments are conducted in the vision tasks where it is crucial to learn the global context, including semantic segmentation and image generation, demonstrating significant improvements over self-attention and its variants.
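The "global context as low-rank recovery" idea can be approximated with a short sketch that replaces the paper's optimization-based matrix decompositions with a plain truncated SVD; the rank, the token shapes, and the way the reconstruction would be fused back into a network are all assumptions made here.

```python
# Hedged sketch of low-rank global context: token features are factorized into
# r << N components and reconstructed, so every token mixes global information
# without pairwise attention. A truncated SVD stands in for the paper's
# optimization-based matrix decompositions; rank and shapes are assumptions.
import torch

def low_rank_context(x: torch.Tensor, rank: int = 8) -> torch.Tensor:
    # x: (B, N, C) token features, e.g. flattened image patches
    u, s, vh = torch.linalg.svd(x, full_matrices=False)        # batched SVD
    u, s, vh = u[:, :, :rank], s[:, :rank], vh[:, :rank, :]    # keep top-r components
    return u @ torch.diag_embed(s) @ vh                        # low-rank reconstruction

tokens = torch.randn(2, 196, 64)        # 14x14 patches, 64 channels
context = low_rank_context(tokens)
print(context.shape)                    # torch.Size([2, 196, 64])
# In a Hamburger-style block, this reconstruction would be added back to the
# input as the global-context branch, in place of self-attention.
```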
