Graph representation learning has achieved great success in many areas, including e-commerce, chemistry, and biology. However, the fundamental problem of choosing the appropriate node embedding dimension for a given graph remains unsolved. Common strategies for Node Embedding Dimension Selection (NEDS), based on grid search or empirical knowledge, suffer from heavy computation and poor model performance. In this paper, we revisit NEDS from the perspective of the minimum entropy principle and propose a novel Minimum Graph Entropy (MinGE) algorithm for NEDS on graph data. Specifically, MinGE considers both feature entropy and structure entropy on graphs, each carefully designed to capture the rich information the graph carries. The feature entropy, which assumes that the embeddings of adjacent nodes should be more similar, connects node features with the link topology of the graph. The structure entropy takes the normalized degree as its basic unit to further measure the higher-order structure of the graph. Based on these two terms, MinGE directly calculates the ideal node embedding dimension for any graph. Finally, comprehensive experiments with popular Graph Neural Networks (GNNs) on benchmark datasets demonstrate the effectiveness and generalizability of the proposed MinGE.
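To make the entropy idea concrete, here is a minimal Python sketch of one-dimensional structure entropy over normalized degrees, the basic unit the abstract describes. The `structure_entropy` helper is hypothetical and illustrative; MinGE's actual formulation additionally incorporates feature entropy and higher-order structure.

```python
import numpy as np
import networkx as nx

def structure_entropy(G: nx.Graph) -> float:
    """One-dimensional structure entropy over normalized degrees:
    H = -sum_i (d_i / 2m) * log2(d_i / 2m), where m = |E|.
    A generic formulation, not the paper's exact MinGE objective."""
    two_m = 2 * G.number_of_edges()
    probs = np.array([d for _, d in G.degree()]) / two_m
    probs = probs[probs > 0]  # skip isolated nodes
    return float(-np.sum(probs * np.log2(probs)))

# Usage: entropy of a small random graph
G = nx.erdos_renyi_graph(n=100, p=0.05, seed=0)
print(f"structure entropy: {structure_entropy(G):.3f}")
```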
Pre-trained models such as BERT have achieved great success in many natural language processing tasks. However, how to obtain better sentence representations from these pre-trained models is still worth exploring. Previous work has shown that the anisotropy problem is a critical bottleneck for BERT-based sentence representations, hindering the model from fully utilizing the underlying semantic features. Therefore, some attempts to boost the isotropy of the sentence distribution, such as flow-based models, have been applied to sentence representations and achieved some improvement. In this paper, we find that the whitening operation from traditional machine learning can similarly enhance the isotropy of sentence representations and achieve competitive results. Furthermore, the whitening technique is also capable of reducing the dimensionality of the sentence representation. Our experimental results show that it not only achieves promising performance but also significantly reduces the storage cost and accelerates retrieval speed.
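For intuition, the whitening operation itself is standard: map the embeddings to zero mean and identity covariance using the SVD of their covariance matrix, and optionally truncate the transform to reduce dimensionality. The sketch below follows that generic recipe; the function name and the choice of k = 256 are illustrative, not taken from the paper.

```python
import numpy as np

def whiten(embeddings, k=None):
    """Whiten sentence embeddings to zero mean and identity covariance.
    Keeping only the first k columns of the transform also reduces
    dimensionality, which is what enables faster retrieval."""
    mu = embeddings.mean(axis=0, keepdims=True)
    cov = np.cov((embeddings - mu).T)        # d x d covariance
    u, s, _ = np.linalg.svd(cov)
    W = u @ np.diag(1.0 / np.sqrt(s + 1e-9)) # whitening transform
    if k is not None:
        W = W[:, :k]                         # optional dimension reduction
    return (embeddings - mu) @ W, mu, W

# Usage on random stand-in vectors (real inputs would be BERT outputs)
X = np.random.randn(1000, 768)
Xw, mu, W = whiten(X, k=256)
print(Xw.shape)  # (1000, 256)
```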
137 - Rui Li, Jianlin Su, Chenxi Duan 2020
In this paper, to remedy the heavy memory and computational costs of dot-product attention, we propose a Linear Attention Mechanism that approximates dot-product attention at a fraction of those costs. The efficient design makes the incorporation of attention mechanisms into neural networks more flexible and versatile. Experiments on semantic segmentation demonstrate the effectiveness of the linear attention mechanism. Code is available at https://github.com/lironui/Linear-Attention-Mechanism.
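As a rough illustration of why linear attention is cheaper: in kernelized form, softmax(QK^T)V is approximated by phi(Q)(phi(K)^T V), so the small d x d summary phi(K)^T V is built once and memory grows linearly in sequence length rather than quadratically. The sketch below assumes the common elu + 1 feature map; the paper's exact approximation may differ (see its repository for the reference implementation).

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps=1e-6):
    """Kernelized attention with O(N) memory in sequence length N.
    Computes phi(Q) (phi(K)^T V) instead of softmax(QK^T) V."""
    q = F.elu(q) + 1                      # positive feature map phi
    k = F.elu(k) + 1
    kv = torch.einsum("nd,ne->de", k, v)  # d x d summary, never N x N
    z = 1.0 / (q @ k.sum(dim=0) + eps)    # per-query normalizer
    return (q @ kv) * z.unsqueeze(-1)

# Usage: 4096 tokens, 64-dim heads; memory stays linear in N
q, k, v = (torch.randn(4096, 64) for _ in range(3))
print(linear_attention(q, k, v).shape)  # torch.Size([4096, 64])
```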
64 - Rui Li, Yiping Shu, Jianlin Su 2018
More than one hundred galaxy-scale strong gravitational lens systems have been found by searching for emission lines from galaxies at redshifts higher than those of the lens galaxies. Building on this spectroscopic-selection method, we introduce deep Residual Networks (ResNet, a kind of deep Convolutional Neural Network) to search for galaxy-Ly$\alpha$ emitter (LAE) lens candidates by recognizing the Ly$\alpha$ emission lines of high-redshift galaxies ($2 < z < 3$) in the spectra of early-type galaxies (ETGs) at intermediate redshift ($z \sim 0.5$). The ETG spectra come from Data Release 12 (DR12) of the Baryon Oscillation Spectroscopic Survey (BOSS) of the Sloan Digital Sky Survey III (SDSS-III). In this paper, we first build a 28-layer ResNet model and then artificially synthesize 150,000 training spectra, 140,000 without Ly$\alpha$ lines and 10,000 with them, to train the network. After 20 training epochs, we obtain a near-perfect test accuracy of 0.9954; the corresponding loss is 0.0028 and the completeness is 93.6%. We finally apply our ResNet model to our predictive data containing 174 known lens candidates and obtain 1232 hits, including 161 of the 174 known candidates (a 92.5% discovery rate). Apart from hits reported in other works, our ResNet model also finds 536 new hits. We then perform several subsequent selections on these 536 hits and present the 5 most promising lens candidates.
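To give a sense of the architecture class, below is a hypothetical 1-D residual block of the kind a 28-layer spectroscopic ResNet could stack; the channel count, kernel size, and input length are illustrative and not taken from the paper.

```python
import torch
import torch.nn as nn

class ResBlock1d(nn.Module):
    """Residual block over 1-D spectra (flux vs. wavelength bins).
    A sketch of a generic building block, not the paper's exact model."""
    def __init__(self, channels, kernel_size=7):
        super().__init__()
        pad = kernel_size // 2
        self.body = nn.Sequential(
            nn.Conv1d(channels, channels, kernel_size, padding=pad),
            nn.BatchNorm1d(channels),
            nn.ReLU(inplace=True),
            nn.Conv1d(channels, channels, kernel_size, padding=pad),
            nn.BatchNorm1d(channels),
        )

    def forward(self, x):
        return torch.relu(self.body(x) + x)  # identity skip connection

# Usage: batch of 8 spectra, 32 channels, 4000 wavelength bins
x = torch.randn(8, 32, 4000)
print(ResBlock1d(32)(x).shape)  # torch.Size([8, 32, 4000])
```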