Fast IPv6 scanning is a challenge in network measurement: it would require exploring the whole IPv6 address space, which is infeasible with current computational power. Researchers therefore propose to algorithmically analyze active seed sets and derive candidate sets of likely active targets to probe. However, IPv6 addresses lack semantic information and follow numerous addressing schemes, making effective algorithms difficult to design. In this paper, we introduce 6VecLM, an approach to realizing such target generation algorithms. The architecture maps addresses into a vector space to interpret semantic relationships and uses a Transformer network to build IPv6 language models that predict address sequences. Experiments indicate that our approach can perform semantic classification of the address space. With a new generation approach added, our model possesses a controllable word-innovation capability that conventional language models lack. Our work outperforms state-of-the-art target generation algorithms on two active address datasets by producing higher-quality candidate sets.
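To make the address-modeling step concrete, here is a minimal sketch that tokenizes an IPv6 address into nybble-plus-position "words" suitable for embedding and language modeling; this vocabulary is an assumption for illustration, and 6VecLM's actual tokenization pipeline may differ.

```python
# Minimal sketch: turn an IPv6 address into a "word" sequence for a
# language model, assuming a nybble+position vocabulary (illustrative;
# 6VecLM's exact vocabulary may differ).
import ipaddress

def address_to_words(addr: str) -> list[str]:
    """Expand an IPv6 address to 32 nybbles and tag each with its index."""
    nybbles = ipaddress.IPv6Address(addr).exploded.replace(":", "")
    return [f"{nybble}{i}" for i, nybble in enumerate(nybbles)]

print(address_to_words("2001:db8::1")[:8])
# ['20', '01', '02', '13', '04', 'd5', 'b6', '87']  (hex digit + position)
```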
Neural architecture search (NAS) has advanced significantly in recent years, but most NAS systems restrict the search to learning the architecture of a recurrent or convolutional cell. In this paper, we extend the search space of NAS. In particular, we present a general approach to learning both intra-cell and inter-cell architectures (which we call ESS). For better search results, we design a joint learning method that performs intra-cell and inter-cell NAS simultaneously. We implement our model in a differentiable architecture search system. For recurrent neural language modeling, it significantly outperforms a strong baseline on the PTB and WikiText datasets, setting a new state of the art on PTB. Moreover, the learned architectures transfer well to other systems; e.g., they improve state-of-the-art systems on the CoNLL and WNUT named entity recognition (NER) tasks and the CoNLL chunking task, indicating a promising line of research on large-scale pre-learned architectures.
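For readers unfamiliar with differentiable NAS, the sketch below shows the softmax-relaxed mixed operation that such systems place on each edge of the search space, in the style of DARTS; it is an illustrative fragment with an assumed candidate-op set, not the ESS implementation, which extends this relaxation to inter-cell connections.

```python
# Sketch of a differentiable architecture-search edge (DARTS-style):
# candidate operations are mixed with softmax-weighted architecture
# parameters, so the architecture choice stays differentiable and can be
# learned jointly with the network weights.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        # A small illustrative set of candidate operations on one edge.
        self.ops = nn.ModuleList([
            nn.Identity(),
            nn.Linear(dim, dim),
            nn.Sequential(nn.Linear(dim, dim), nn.Tanh()),
        ])
        # One architecture weight per candidate operation.
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

edge = MixedOp(dim=16)
out = edge(torch.randn(4, 16))  # gradients flow to both weights and alpha
```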
Generative modeling for protein engineering is key to solving fundamental problems in synthetic biology, medicine, and material science. We pose protein engineering as an unsupervised sequence generation problem in order to leverage the exponentially growing set of proteins that lack costly structural annotations. We train a 1.2B-parameter language model, ProGen, on ~280M protein sequences conditioned on taxonomic and keyword tags such as molecular function and cellular component. This provides ProGen with an unprecedented range of evolutionary sequence diversity and allows it to generate with fine-grained control, as demonstrated by metrics based on primary sequence similarity, secondary structure accuracy, and conformational energy.
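The conditioning mechanism can be illustrated with a short sketch: control tags are prepended to the residue token stream so the model learns sequences jointly with their annotations. The tag strings and helper below are hypothetical illustrations, not ProGen's actual tag format or API.

```python
# Sketch of tag-conditioned sequence generation in the style of ProGen:
# control tags precede the amino-acid tokens, so the language model learns
# p(sequence | tags). Tag names here are hypothetical.
TAGS = ["<taxonomy:Homo sapiens>", "<keyword:kinase>"]
SEQ = list("MKTAYIAKQR")  # amino acids as single-character tokens

def build_training_example(tags: list[str], seq: list[str]) -> list[str]:
    """Concatenate control tags and residues into one token sequence."""
    return tags + seq

tokens = build_training_example(TAGS, SEQ)
print(tokens[:4])
# At generation time, the model is primed with the tags alone and sampled
# autoregressively to produce a sequence consistent with them.
```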
Services on the public Internet are frequently scanned, then subjected to brute-force and denial-of-service attacks. We would like to run such services stealthily, available to friends but hidden from adversaries. In this work, we propose a moving target defense named Chhoyhopper that utilizes the vast IPv6 address space to conceal publicly available services. The client and server hop to different IPv6 addresses in a pattern based on a shared, pre-distributed secret and the time of day. By hopping over a /64 prefix, services cannot be found by active scanners, and passively observed information is useless after two minutes. We demonstrate our system with SSH and show that it can be extended to other applications.
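The hopping scheme can be sketched as follows, assuming an HMAC-based derivation of the interface identifier and a hypothetical one-minute hop interval; Chhoyhopper's exact key schedule and timing may differ.

```python
# Sketch of secret- and time-based IPv6 address hopping in the spirit of
# Chhoyhopper. The HMAC construction, hop interval, and prefix are
# illustrative assumptions, not the system's exact specification.
import hmac, hashlib, time, ipaddress

PREFIX = ipaddress.IPv6Network("2001:db8:1:2::/64")  # server's /64
SECRET = b"pre-distributed shared secret"
HOP_INTERVAL = 60  # seconds per address (assumed)

def current_address(secret: bytes,
                    prefix: ipaddress.IPv6Network,
                    now=None) -> ipaddress.IPv6Address:
    """Derive the interface identifier from HMAC(secret, time slot)."""
    slot = int((now or time.time()) // HOP_INTERVAL)
    digest = hmac.new(secret, slot.to_bytes(8, "big"), hashlib.sha256).digest()
    iid = int.from_bytes(digest[:8], "big")  # low 64 bits of the address
    return prefix[iid]

# Client and server compute the same address from the shared secret, while
# an active scanner faces 2**64 possibilities inside the prefix.
print(current_address(SECRET, PREFIX))
```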
With the recent success of pre-training techniques for NLP and image-linguistic tasks, video-linguistic pre-training methods have gradually been developed to improve video-text downstream tasks. However, most existing multimodal models are pre-trained for understanding tasks, leading to a pretrain-finetune discrepancy for generation tasks. This paper proposes UniVL: a Unified Video and Language pre-training model for both multimodal understanding and generation. It comprises four components: two single-modal encoders, a cross encoder, and a decoder, all built on the Transformer backbone. Five objectives, namely video-text joint learning, conditioned masked language model (CMLM), conditioned masked frame model (CMFM), video-text alignment, and language reconstruction, are designed to train the components. We further develop two pre-training strategies, stage-by-stage pre-training (StagedP) and enhanced video representation (EnhancedV), to make the training of UniVL more effective. Pre-training is carried out on HowTo100M, a sizeable instructional video dataset. Experimental results demonstrate that UniVL can learn strong video-text representations and achieves state-of-the-art results on five downstream tasks.
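A minimal skeleton of the four-component layout might look like the sketch below; the dimensions, layer counts, and wiring are assumptions made for illustration, not the authors' implementation.

```python
# Skeleton of the four Transformer-based UniVL components named in the
# abstract: two single-modal encoders, a cross encoder, and a decoder.
# All hyperparameters are placeholder assumptions.
import torch
import torch.nn as nn

def encoder(dim=256, layers=2):
    return nn.TransformerEncoder(
        nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
        num_layers=layers)

class UniVLSkeleton(nn.Module):
    def __init__(self, dim=256, vocab=30522):
        super().__init__()
        self.text_encoder = encoder(dim)    # single-modal: text
        self.video_encoder = encoder(dim)   # single-modal: video frames
        self.cross_encoder = encoder(dim)   # joint video-text encoder
        self.decoder = nn.TransformerDecoder(  # generation head
            nn.TransformerDecoderLayer(d_model=dim, nhead=4, batch_first=True),
            num_layers=2)
        self.lm_head = nn.Linear(dim, vocab)

    def forward(self, text_emb, video_emb, target_emb):
        t = self.text_encoder(text_emb)
        v = self.video_encoder(video_emb)
        cross = self.cross_encoder(torch.cat([t, v], dim=1))
        return self.lm_head(self.decoder(target_emb, cross))

m = UniVLSkeleton()
logits = m(torch.randn(2, 8, 256), torch.randn(2, 16, 256),
           torch.randn(2, 8, 256))  # (batch, tgt_len, vocab)
```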
Since January 2011, the IPv4 address space has been exhausted, and IPv6 is taking its place as the successor. The coexistence of IPv4 and IPv6 raises a problem of incompatibility: the IPv6 and IPv4 headers differ, so the two protocols cannot interoperate directly. IPv6 transitioning techniques are still not mature, hindering the deployment of IPv6 and the development of the next-generation Internet. Until IPv6 completely takes over from IPv4, the two will coexist. For IPv4-IPv6 coexistence, three solutions are possible: a) making every device dual-stack, b) translation, and c) tunneling. Tunneling stands out as the best available solution. Among IPv6 tunneling techniques, this paper evaluates the impact of three recent ones, 6to4, Teredo, and ISATAP, in a cloud virtualization environment. Each protocol was implemented on virtual networks running Microsoft Windows (MS Windows 7 and MS Windows Server 2008) and Linux. UDP audio streaming, video streaming, and ICMP-ping traffic were routed over each setup in multiple runs, and the averaged data were used to generate graphs and final results. The performance of these tunneling techniques is evaluated on eight parameters: throughput, end-to-end delay (E2ED), jitter, round-trip time (RTT), tunneling overhead, tunnel setup delay, query delay, and auxiliary devices required. This evaluation shows the impact of IPv4-IPv6 coexistence in a virtualization environment for cloud computing.
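As a worked example of what these tunnels encode, the sketch below derives a host's 6to4 prefix, which per RFC 3056 embeds the public IPv4 address directly in the IPv6 prefix; the helper function itself is illustrative.

```python
# Worked example: a 6to4 prefix is 2002:V4HI:V4LO::/48, where the 32 bits
# after 2002: are the host's public IPv4 address (RFC 3056). Teredo
# similarly encodes the IPv4 address and UDP port, obfuscated, under the
# 2001:0::/32 prefix.
import ipaddress

def sixto4_prefix(v4: str) -> ipaddress.IPv6Network:
    """Derive a host's 6to4 /48 prefix from its public IPv4 address."""
    v4_int = int(ipaddress.IPv4Address(v4))
    prefix_int = (0x2002 << 112) | (v4_int << 80)  # 2002: then the v4 bits
    return ipaddress.IPv6Network((prefix_int, 48))

print(sixto4_prefix("192.0.2.1"))  # 2002:c000:201::/48
```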