Qun Hao, Wenli Wang, Yao Hu, 2021
Vortex beams carrying orbital angular momentum have attracted tremendous attention owing to applications ranging from optical tweezers to quantum information processing. Metalenses, which are ultra-compact and multifunctional devices, provide a desirable platform for designing vortex beams. A spin-dependent metalens adds a degree of freedom that helps meet practical requirements. By combining geometric phase and propagation phase, we propose and demonstrate an approach to designing a spin-dependent metalens that generates dual-focused vortex beams along the longitudinal or transverse direction, i.e., a metalens with predesigned spin-dependent phase profiles. Under illumination by an elliptically polarized incident beam, two spin-dependent focused vortex beams can be observed, and their relative focal intensity can be easily adjusted by modulating the ellipticity of the incident beam. Moreover, we demonstrate that the separation between the two focused beams and their topological charges can be tailored simultaneously at will, which may have a profound impact on optical trapping and manipulation in photonics.
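As a rough illustration of what a focused-vortex phase profile looks like, the sketch below evaluates the commonly used hyperbolic lens phase plus a spiral term proportional to the azimuthal angle. The formula, the function names `focused_vortex_phase` and `spin_dependent_phases`, and all parameters are assumptions made for illustration; the abstract does not state the authors' actual phase profiles.

```python
import math

def focused_vortex_phase(x, y, wavelength, focal_length, charge):
    """Phase (radians) of a lens that focuses a vortex beam.

    Assumed textbook form, not taken from the paper:
    phi = -(2*pi/wavelength) * (sqrt(x^2 + y^2 + f^2) - f) + charge * theta
    """
    r2 = x * x + y * y
    # Hyperbolic lens term: brings a plane wave to a focus at distance f.
    lens = -(2.0 * math.pi / wavelength) * (
        math.sqrt(r2 + focal_length ** 2) - focal_length
    )
    # Spiral term: imprints the topological charge on the beam.
    spiral = charge * math.atan2(y, x)
    return lens + spiral

def spin_dependent_phases(x, y, wavelength, f_lcp, f_rcp, l_lcp, l_rcp):
    """Hypothetical helper: one independent profile per circular polarization.

    Choosing different focal lengths is one way to picture foci that are
    separated along the propagation axis (an assumption of this sketch).
    """
    return (
        focused_vortex_phase(x, y, wavelength, f_lcp, l_lcp),
        focused_vortex_phase(x, y, wavelength, f_rcp, l_rcp),
    )
```

In a design that combines geometric and propagation phase, each pixel would have to realize both target phases at once, one for each handedness; how the paper maps these targets to nanostructure geometry is not described in the abstract.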
In this paper, we consider the design of a multiple-input multiple-output (MIMO) transmitter which simultaneously functions as a MIMO radar and a base station for downlink multiuser communications. In addition to a power constraint, we require the covariance of the transmit waveform to be equal to a given optimal covariance for MIMO radar, to guarantee the radar performance. With this constraint, we formulate and solve the signal-to-interference-plus-noise ratio (SINR) balancing problem for multiuser transmit beamforming via convex optimization. Considering that the interference cannot be completely eliminated with this constraint, we introduce dirty paper coding (DPC) to further cancel the interference, and formulate the SINR balancing and sum rate maximization problems in the DPC regime. Although both problems are non-convex, we show that they can be reformulated as convex optimizations via the Lagrange and downlink-uplink duality. In addition, we propose gradient projection based algorithms to solve the equivalent dual problem of SINR balancing, in both transmit beamforming and DPC regimes. The simulation results demonstrate significant performance improvement of DPC over transmit beamforming, and also indicate that the degrees of freedom for the communication transmitter are restricted by the rank of the covariance.
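To make the SINR balancing objective concrete, the sketch below evaluates per-user SINR for linear transmit beamforming and returns the worst-case value, which is the quantity a balancing problem would maximize. It assumes the standard single-antenna-user downlink model; the function names and the toy channel values are hypothetical, and the radar covariance constraint and the optimization itself are not implemented here.

```python
def inner(h, w):
    """Hermitian inner product h^H w for two equal-length complex vectors."""
    return sum(hc.conjugate() * wc for hc, wc in zip(h, w))

def sinrs(channels, beamformers, noise_power):
    """Per-user SINR, assuming user k receives h_k^H (sum_j w_j s_j) + noise."""
    result = []
    for k, h in enumerate(channels):
        # Power received by user k from every stream j.
        powers = [abs(inner(h, w)) ** 2 for w in beamformers]
        signal = powers[k]
        interference = sum(powers) - signal
        result.append(signal / (interference + noise_power))
    return result

def balanced_objective(channels, beamformers, noise_power):
    """Worst-case SINR across users (the value SINR balancing tries to raise)."""
    return min(sinrs(channels, beamformers, noise_power))

# Toy example with two users and two antennas (values are illustrative only).
channels = [[1 + 0j, 0j], [0j, 1 + 0j]]
beamformers = [[1 + 0j, 0j], [0j, 1 + 0j]]
worst = balanced_objective(channels, beamformers, noise_power=0.5)
```

With these orthogonal toy channels there is no inter-user interference, so both users see the same SINR; a fixed transmit covariance would generally prevent such a clean separation, which is the motivation the abstract gives for introducing DPC.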
Assume that $\mathbb F$ is an algebraically closed field with characteristic zero. The universal Racah algebra $\Re$ is a unital associative $\mathbb F$-algebra generated by $A,B,C,D$ and the relations state that $[A,B]=[B,C]=[C,A]=2D$ and each of $$ [A,D]+AC-BA, \qquad [B,D]+BA-CB, \qquad [C,D]+CB-AC $$ is central in $\Re$. The universal additive DAHA (double affine Hecke algebra) $\mathfrak H$ of type $(C_1^\vee,C_1)$ is a unital associative $\mathbb F$-algebra generated by $\{t_i\}_{i=0}^3$ and the relations state that \begin{gather*} t_0+t_1+t_2+t_3 = -1, \\ \hbox{$t_i^2$ is central for all $i=0,1,2,3$}. \end{gather*} Any $\mathfrak H$-module can be considered as a $\Re$-module via the $\mathbb F$-algebra homomorphism $\Re\to \mathfrak H$ given by \begin{eqnarray*} A &\mapsto & \frac{(t_0+t_1-1)(t_0+t_1+1)}{4}, \\ B &\mapsto & \frac{(t_0+t_2-1)(t_0+t_2+1)}{4}, \\ C &\mapsto & \frac{(t_0+t_3-1)(t_0+t_3+1)}{4}. \end{eqnarray*} Let $V$ denote a finite-dimensional irreducible $\mathfrak H$-module. In this paper we show that $A,B,C$ are diagonalizable on $V$ if and only if $A,B,C$ act as Leonard triples on all composition factors of the $\Re$-module $V$.
Wentian Zhao, Yao Hu, Heda Wang, 2021
Entity-aware image captioning aims to describe named entities and events related to the image by utilizing the background knowledge in the associated article. This task remains challenging as it is difficult to learn the association between named entities and visual cues due to the long-tail distribution of named entities. Furthermore, the complexity of the article makes it difficult to extract fine-grained relationships between entities to generate informative event descriptions about the image. To tackle these challenges, we propose a novel approach that constructs a multi-modal knowledge graph to associate the visual objects with named entities and capture the relationship between entities simultaneously with the help of external knowledge collected from the web. Specifically, we build a text sub-graph by extracting named entities and their relationships from the article, and build an image sub-graph by detecting the objects in the image. To connect these two sub-graphs, we propose a cross-modal entity matching module trained using a knowledge base that contains Wikipedia entries and the corresponding images. Finally, the multi-modal knowledge graph is integrated into the captioning model via a graph attention mechanism. Extensive experiments on both GoodNews and NYTimes800k datasets demonstrate the effectiveness of our method.
Electric fields can spontaneously decay via the Schwinger effect, the nucleation of a charged particle-antiparticle pair separated by a critical distance $d$. What happens if the available distance is smaller than $d$? Previous work on this question has produced contradictory results. Here, we study the quantum evolution of electric fields when the field points in a compact direction with circumference $L < d$ using the massive Schwinger model, quantum electrodynamics in one space dimension with massive charged fermions. We uncover a previously unknown set of instantons that result in novel physics that disagrees with all previous estimates. In parameter regimes where the field value can be well-defined in the quantum theory, generic initial fields $E$ are in fact stable and do not decay, while initial values that are quantized in half-integer units of the charge $E = (k/2) g$ with $k\in \mathbb Z$ oscillate in time from $+(k/2) g$ to $-(k/2) g$, with exponentially small probability of ever taking any other value. We verify our results with four distinct techniques: numerically by measuring the decay directly in Lorentzian time on the lattice, numerically using the spectrum of the Hamiltonian, numerically and semi-analytically using the bosonized description of the Schwinger model, and analytically via our instanton estimate.
Dual function radar communications (DFRC) systems are attractive technologies for autonomous vehicles, which utilize electromagnetic waves to constantly sense the environment while simultaneously communicating with neighbouring devices. An emerging approach to implement DFRC systems is to embed information in radar waveforms via index modulation (IM). Implementation of DFRC schemes in vehicular systems gives rise to strict constraints in terms of cost, power efficiency, and hardware complexity. In this paper, we extend IM-based DFRC systems to utilize sparse arrays and frequency modulated continuous waveforms (FMCWs), which are popular in automotive radar for their simplicity and low hardware complexity. The proposed FMCW-based radar-communications system (FRaC) operates at reduced cost and complexity by transmitting with a reduced number of radio frequency modules, combined with narrowband FMCW signalling. This is achieved via array sparsification in transmission, formulating a virtual multiple-input multiple-output array by combining the signals in one coherent processing interval, in which the narrowband waveforms are transmitted in a randomized manner. Performance analysis and numerical results show that the proposed radar scheme achieves similar resolution performance compared with a wideband radar system operating with a large receive aperture, while requiring less hardware overhead. For the communications subsystem, FRaC achieves higher rates and improved error rates compared to dual-function signalling based on conventional phase modulation.
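For context on the narrowband versus wideband comparison, the sketch below applies the standard radar relation between swept bandwidth and range resolution, c / (2B). The bandwidth values are hypothetical and are not taken from the paper; the randomized transmission scheme that lets FRaC approach wideband resolution is not modelled here.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second

def range_resolution(bandwidth_hz):
    """Classical range resolution of a chirp that sweeps bandwidth_hz."""
    return SPEED_OF_LIGHT / (2.0 * bandwidth_hz)

# Hypothetical figures: a single narrowband chirp versus a wideband sweep.
narrowband = range_resolution(100e6)  # about 1.5 m
wideband = range_resolution(1e9)      # about 0.15 m
```

A single narrowband chirp is therefore much coarser in range than a wideband one, which is why combining several narrowband transmissions within one coherent processing interval matters for the result the abstract reports.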
Temporal action detection (TAD) aims to determine the semantic label and the boundaries of every action instance in an untrimmed video. It is a fundamental and challenging task in video understanding and significant progress has been made. Previous methods involve multiple stages or networks and hand-designed rules or operations, which fall short in efficiency and flexibility. In this paper, we propose an end-to-end framework for TAD built upon the Transformer, termed \textit{TadTR}, which maps a set of learnable embeddings to action instances in parallel. TadTR is able to adaptively extract temporal context information required for making action predictions, by selectively attending to a sparse set of snippets in a video. As a result, it simplifies the pipeline of TAD and requires lower computation cost than previous detectors, while preserving remarkable detection performance. TadTR achieves state-of-the-art performance on HACS Segments (+3.35% average mAP). As a single-network detector, TadTR runs 10$\times$ faster than its comparable competitor. It outperforms existing single-network detectors by a large margin on THUMOS14 (+5.0% average mAP) and ActivityNet (+7.53% average mAP). When combined with other detectors, it reports 54.1% mAP at IoU=0.5 on THUMOS14, and 34.55% average mAP on ActivityNet-1.3. Our code will be released at \url{https://github.com/xlliu7/TadTR}.
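The mAP figures above are reported at temporal IoU thresholds such as 0.5. As a minimal sketch of that matching criterion (not the TadTR code, and with hypothetical function names), temporal IoU between a predicted and a ground-truth segment can be computed as follows.

```python
def temporal_iou(seg_a, seg_b):
    """IoU of two temporal segments given as (start, end) pairs in seconds."""
    (a_start, a_end), (b_start, b_end) = seg_a, seg_b
    # Overlap is zero when the segments do not intersect.
    inter = max(0.0, min(a_end, b_end) - max(a_start, b_start))
    union = (a_end - a_start) + (b_end - b_start) - inter
    return inter / union if union > 0 else 0.0

def matches(pred, gt, threshold=0.5):
    """A prediction counts as a hit when its IoU reaches the threshold."""
    return temporal_iou(pred, gt) >= threshold
```

For example, a prediction spanning 2 s to 8 s against a ground-truth action from 0 s to 10 s has an IoU of 0.6 and would count as correct at IoU=0.5, provided the class label also matches.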
Although galactic winds play a critical role in regulating galaxy formation, hydrodynamic cosmological simulations do not resolve the scales that govern the interaction between winds and the ambient circumgalactic medium (CGM). We implement the Physically Evolved Wind (PhEW) model of Huang et al. (2020) in the GIZMO hydrodynamics code and perform test cosmological simulations with different choices of model parameters and numerical resolution. PhEW adopts an explicit subgrid model that treats each wind particle as a collection of clouds that exchange mass, metals, and momentum with their surroundings and evaporate by conduction and hydrodynamic instabilities as calibrated on much higher resolution cloud scale simulations. In contrast to a conventional wind algorithm, we find that PhEW results are robust to numerical resolution and implementation details because the small scale interactions are defined by the model itself. Compared to conventional wind simulations with the same resolution, our PhEW simulations produce similar galaxy stellar mass functions at $z\geq 1$ but are in better agreement with low-redshift observations at $M_* < 10^{11}M_\odot$ because PhEW particles shed mass to the CGM before escaping low mass halos. PhEW radically alters the CGM metal distribution because PhEW particles disperse metals to the ambient medium as their clouds dissipate, producing a CGM metallicity distribution that is skewed but unimodal and is similar between cold and hot gas. While the temperature distributions and radial profiles of gaseous halos are similar in simulations with PhEW and conventional winds, these changes in metal distribution will affect their predicted UV/X-ray properties in absorption and emission.
Energy efficiency is of critical importance to trajectory planning for UAV swarms in obstacle avoidance. In this paper, we present $E^2Coop$, a new scheme designed to avoid collisions for UAV swarms by tightly coupling Artificial Potential Field (APF) with Particle Swarm Planning (PSO) based trajectory planning. In $E^2Coop$, swarm members perform trajectory planning cooperatively to avoid collisions in an energy-efficient manner. $E^2Coop$ exploits the advantages of the active contour model in image processing for trajectory planning. Each swarm member plans its trajectories on the contours of the environment field to save energy and avoid collisions with obstacles. Swarm members that fall within the safeguard distance of each other plan their trajectories on different contours to avoid collisions with each other. Simulation results demonstrate that $E^2Coop$ can save up to 51% energy compared with two state-of-the-art schemes.
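To illustrate the kind of environment field whose contours a planner could follow, the sketch below evaluates a classical artificial potential field: a quadratic attractive term toward the goal plus a repulsive term that is active only within an influence radius of each obstacle. The gains `k_att` and `k_rep`, the radius `d0`, and the function names are assumptions; this is the textbook APF form, not the $E^2Coop$ implementation.

```python
import math

def attractive(p, goal, k_att=1.0):
    """Quadratic well centred on the goal."""
    return 0.5 * k_att * math.dist(p, goal) ** 2

def repulsive(p, obstacle, k_rep=1.0, d0=2.0):
    """Barrier that vanishes beyond the influence radius d0."""
    d = math.dist(p, obstacle)
    if d >= d0:
        return 0.0
    d = max(d, 1e-9)  # guard against division by zero at the obstacle itself
    return 0.5 * k_rep * (1.0 / d - 1.0 / d0) ** 2

def potential(p, goal, obstacles, k_att=1.0, k_rep=1.0, d0=2.0):
    """Total field value at point p; level sets of this function are contours."""
    total = attractive(p, goal, k_att)
    for obstacle in obstacles:
        total += repulsive(p, obstacle, k_rep, d0)
    return total
```

Points that share the same `potential` value lie on one contour, so assigning nearby swarm members to different field values is one way to picture the "different contours" idea in the abstract.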
In this paper, we propose a subspace representation learning (SRL) framework to tackle few-shot image classification tasks. It exploits a subspace in local CNN feature space to represent an image, and measures the similarity between two images according to a weighted subspace distance (WSD). When K images are available for each class, we develop two types of template subspaces to aggregate K-shot information: the prototypical subspace (PS) and the discriminative subspace (DS). Based on the SRL framework, we extend metric learning based techniques from vector to subspace representation. While most previous works adopted global vector representation, using subspace representation can effectively preserve the spatial structure and diversity within an image. We demonstrate the effectiveness of the SRL framework on three public benchmark datasets: MiniImageNet, TieredImageNet and Caltech-UCSD Birds-200-2011 (CUB), and the experimental results illustrate competitive/superior performance of our method compared to the previous state-of-the-art.