Efficient scheduling of transmissions is a key problem in wireless networks. The main challenge stems from the fact that optimal link scheduling involves solving a maximum weighted independent set (MWIS) problem, which is known to be NP-hard. For practical link scheduling schemes, centralized and distributed greedy heuristics are commonly used to approximate the solution to the MWIS problem. However, these greedy schemes mostly ignore important topological information of the wireless network. To overcome this limitation, we propose fast heuristics based on graph convolutional networks (GCNs) that can be implemented in centralized and distributed manners. Our centralized MWIS solver is based on tree search guided by a trainable GCN module and 1-step rollout. In our distributed MWIS solver, a trainable GCN module learns topology-aware node embeddings that are combined with the network weights before calling a distributed greedy solver. Test results on medium-sized wireless networks show that a GCN-based centralized MWIS solver can reach a near-optimal solution quickly. Moreover, we demonstrate that a shallow GCN-based distributed MWIS scheduler can reduce by nearly half the suboptimality gap of the distributed greedy solver with minimal increase in complexity. The proposed scheduling solutions also exhibit good generalizability across graph and weight distributions.
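As a hedged illustration of the distributed scheme's core loop — a GCN produces topology-aware per-node scores that reweight the link utilities before a plain greedy MWIS pass — here is a minimal NumPy sketch. The one-layer architecture, the degree-normalized propagation, and the names `gcn_scores`, `W0`, and `W1` are illustrative assumptions, not the authors' exact design; training and the centralized tree-search/rollout machinery are omitted.

```python
import numpy as np

def gcn_scores(adj, weights, W0, W1):
    """One-layer GCN sketch: degree-normalized propagation of node weights,
    producing a positive per-node score that rescales the original weights.
    adj: (n, n) 0/1 symmetric conflict-graph adjacency; weights: (n,) utilities.
    W0 (1, k) and W1 (k, 1) stand in for trained parameters (assumption)."""
    n = adj.shape[0]
    a_hat = adj + np.eye(n)                        # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    h = np.tanh(a_norm @ weights[:, None] @ W0)    # topology-aware embedding
    return 1.0 + np.tanh(h @ W1).ravel()           # multiplicative score per node

def greedy_mwis(adj, weights):
    """Plain greedy MWIS baseline: repeatedly pick the heaviest remaining
    node and delete it together with its neighbors."""
    remaining = set(range(len(weights)))
    chosen = []
    while remaining:
        v = max(remaining, key=lambda i: weights[i])
        chosen.append(v)
        remaining -= {v} | {u for u in remaining if adj[v, u]}
    return chosen

# GCN-assisted scheduling: greedy runs on the reweighted utilities.
# schedule = greedy_mwis(adj, weights * gcn_scores(adj, weights, W0, W1))
```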
Sampling algorithms based on discretizations of Stochastic Differential Equations (SDEs) compose a rich and popular subset of MCMC methods. This work provides a general framework for the non-asymptotic analysis of sampling error in the 2-Wasserstein distance, which also leads to a bound on the mixing time. The method applies to any consistent discretization of contractive SDEs. When applied to the Langevin Monte Carlo algorithm, it establishes $\tilde{\mathcal{O}}\left( \frac{\sqrt{d}}{\epsilon} \right)$ mixing time, without a warm start, under the common log-smooth and log-strongly-convex conditions, plus a growth condition on the third-order derivative of the potential of the target measure at infinity. This bound improves on the best previously known $\tilde{\mathcal{O}}\left( \frac{d}{\epsilon} \right)$ result and is optimal (in terms of order) in both dimension $d$ and accuracy tolerance $\epsilon$ for target measures satisfying the aforementioned assumptions. Our theoretical analysis is further validated by numerical experiments.
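The Langevin Monte Carlo algorithm the analysis applies to is the standard Euler-Maruyama discretization of the overdamped Langevin SDE. A minimal sketch, with a standard Gaussian as the target and an illustrative step size:

```python
import numpy as np

def lmc_sample(grad_potential, x0, step, n_iters, rng):
    """Unadjusted Langevin Monte Carlo: Euler-Maruyama discretization of
    dX_t = -grad f(X_t) dt + sqrt(2) dB_t, whose stationary law is
    proportional to exp(-f)."""
    x = np.array(x0, dtype=float)
    for _ in range(n_iters):
        noise = rng.standard_normal(x.shape)
        x = x - step * grad_potential(x) + np.sqrt(2.0 * step) * noise
    return x

# Example target: standard Gaussian, f(x) = ||x||^2 / 2, so grad f(x) = x.
rng = np.random.default_rng(0)
samples = np.array([lmc_sample(lambda x: x, np.zeros(5), step=0.05,
                               n_iters=500, rng=rng) for _ in range(200)])
print(samples.mean(0), samples.var(0))  # near 0 and 1 per coordinate
```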
A deep neural network (DNN) generally takes thousands of gradient-descent iterations to optimize and thus converges slowly. In addition, softmax, as a decision layer, may ignore the distribution information of the data during classification. To tackle these problems, we propose a novel manifold neural network based on non-gradient optimization, i.e., closed-form solutions. Considering that the activation function is generally invertible, we reconstruct the network via forward ridge regression and low-rank backward approximation, which achieves rapid convergence. Moreover, by unifying the flexible Stiefel manifold and an adaptive support vector machine, we devise a novel decision layer that efficiently fits the manifold structure of the data and the label information. Consequently, a joint non-gradient optimization method is designed to generate the network with closed-form results. Finally, extensive experiments validate the superior performance of the model.
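The "forward ridge regression" step suggests layer weights obtained in closed form rather than by gradient descent. A hedged sketch of that idea for a single layer, assuming an invertible activation such as tanh so that desired outputs can be mapped back to target pre-activations; the variable names and the toy data are illustrative:

```python
import numpy as np

def ridge_layer_weights(X, T, lam=1e-2):
    """Closed-form ridge regression for one layer: find W minimizing
    ||X W - T||_F^2 + lam ||W||_F^2, i.e. W = (X^T X + lam I)^{-1} X^T T.
    X: (n, d_in) layer inputs; T: (n, d_out) target pre-activations
    (recoverable from desired outputs via the inverse activation, e.g. arctanh)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ T)

# One non-gradient "training" step: a single linear solve, no iterations.
X = np.random.default_rng(1).standard_normal((100, 8))
T = np.random.default_rng(2).standard_normal((100, 4))
W = ridge_layer_weights(X, T)
H = np.tanh(X @ W)   # layer output from the closed-form weights
```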
We propose a novel learning framework using neural mean-field (NMF) dynamics for inference and estimation problems on heterogeneous diffusion networks. Our new framework leverages the Mori-Zwanzig formalism to obtain an exact evolution equation of the individual node infection probabilities, which renders a delay differential equation with a memory integral approximated by learnable time convolution operators. Directly using information diffusion cascade data, our framework can simultaneously learn the structure of the diffusion network and the evolution of node infection probabilities. Connections between parameter learning and optimal control are also established, leading to a rigorous and implementable algorithm for training NMF. Moreover, we show that the projected gradient descent method can be employed to solve the challenging influence maximization problem, where the gradient is computed extremely fast by integrating NMF forward in time just once in each iteration. Extensive empirical studies show that our approach is versatile and robust to variations of the underlying diffusion network models, and significantly outperforms existing approaches in accuracy and efficiency on both synthetic and real-world data.
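To make the "delay differential equation with a learnable memory integral" concrete, here is a discretized forward-pass sketch. The specific right-hand side, the Euler stepping, and the kernel parameterization are assumptions for illustration, not the paper's exact construction; in the paper's framework both the influence matrix and the convolution kernel would be learned from cascade data.

```python
import numpy as np

def nmf_forward(A, x0, kernel, T, dt=0.1):
    """Discretized mean-field dynamics (sketch):
    dx/dt = sigmoid(A @ x(t) + integral_0^t K(t - s) x(s) ds) - x(t),
    with the memory integral replaced by a learnable discrete convolution.
    A: (n, n) influence matrix; kernel: (m,) learnable convolution taps."""
    xs = [np.array(x0, dtype=float)]
    for _ in range(T):
        hist = np.stack(xs[-len(kernel):])              # recent history, newest last
        taps = kernel[-hist.shape[0]:]
        memory = (taps[::-1, None] * hist).sum(0) * dt  # discrete memory integral
        drive = 1.0 / (1.0 + np.exp(-(A @ xs[-1] + memory)))
        xs.append(np.clip(xs[-1] + dt * (drive - xs[-1]), 0.0, 1.0))
    return np.stack(xs)  # (T+1, n) infection-probability trajectories
```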
Query-based black-box attacks, which do not require any knowledge of the attacked models or datasets, pose serious threats to machine learning models in many real applications. In this work, we study a simple but promising defense technique, dubbed Random Noise Defense (RND), against query-based black-box attacks, which adds proper Gaussian noise to each query. It is lightweight and can be directly combined with any off-the-shelf model and other defense strategies. However, a theoretical guarantee for random noise defense has been missing, and the actual effectiveness of this defense is not yet fully understood. We present solid theoretical analyses demonstrating that the defense effect of RND against query-based black-box attacks and the corresponding adaptive attacks depends heavily on the magnitude ratio between the random noise added by the defender (i.e., RND) and the random noise added by the attacker for gradient estimation. Extensive experiments on CIFAR-10 and ImageNet verify our theoretical studies. Based on RND, we also propose a stronger defense method that combines RND with Gaussian augmentation training (RND-GT) and achieves better defense performance.
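RND's mechanism is simple to state: perturb each incoming query before it reaches the model. A minimal sketch, where `model` is any off-the-shelf predictor and the noise scale `nu` is an illustrative placeholder:

```python
import numpy as np

def rnd_predict(model, x, nu=0.02, rng=None):
    """Random Noise Defense (sketch): add fresh Gaussian noise to every
    query so that an attacker's finite-difference gradient estimates are
    corrupted. `nu` sets the defender's noise magnitude; the abstract's
    analysis ties the defense strength to the ratio between `nu` and the
    noise scale the attacker uses for gradient estimation."""
    rng = rng or np.random.default_rng()
    noisy_x = x + nu * rng.standard_normal(x.shape)
    return model(noisy_x)

# Usage: wrap any trained model without retraining it.
# defended = lambda x: rnd_predict(trained_model, x, nu=0.02)
```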
We propose a novel framework for modeling multiple multivariate point processes, each with heterogeneous event types that share an underlying space and obey the same generative mechanism. Focusing on Hawkes processes and their variants that are associated with Granger causality graphs, our model leverages an uncountable event type space and samples graphs of different sizes from a nonparametric model called a graphon. Given those graphs, we can generate the corresponding Hawkes processes and simulate event sequences. Learning this graphon-based Hawkes process model helps to 1) infer the underlying relations shared by different Hawkes processes; and 2) simulate event sequences with different event types but similar dynamics. We learn the proposed model by minimizing the hierarchical optimal transport distance between the generated event sequences and the observed ones, leading to a novel reward-augmented maximum likelihood estimation method. We analyze the properties of our model in depth and demonstrate its rationality and effectiveness in both theory and experiments.
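A sketch of the generative side described above: sample latent node locations, draw a Granger-causality graph from a graphon, then simulate a Hawkes process on it. The particular graphon, the exponential kernel, and Ogata-style thinning are illustrative assumptions, not the paper's exact specification:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_graph_from_graphon(n, graphon):
    """Draw n latent event types u_i ~ Uniform(0, 1) and an influence graph
    whose edge (i, j) appears with probability graphon(u_i, u_j)."""
    u = rng.uniform(size=n)
    probs = graphon(u[:, None], u[None, :])
    return u, (rng.uniform(size=(n, n)) < probs).astype(float)

def simulate_hawkes(mu, A, beta, T):
    """Ogata thinning for a multivariate Hawkes process with exponential
    kernels: intensity_i(t) = mu_i + sum over past events (j, s) of
    A[i, j] * beta * exp(-beta * (t - s))."""
    t, events = 0.0, []
    while True:
        lam_bar = mu.sum() + sum(A[:, j].sum() * beta * np.exp(-beta * (t - s))
                                 for j, s in events)   # decaying upper bound
        t += rng.exponential(1.0 / lam_bar)
        if t > T:
            return events
        lam = mu + np.array([sum(A[i, j] * beta * np.exp(-beta * (t - s))
                                 for j, s in events) for i in range(len(mu))])
        if rng.uniform() < lam.sum() / lam_bar:        # accept, then pick type
            events.append((rng.choice(len(mu), p=lam / lam.sum()), t))

# Example: a smooth graphon, 5 event types, short horizon.
u, A = sample_graph_from_graphon(5, lambda x, y: 0.3 * (1 - x * y))
seq = simulate_hawkes(mu=np.full(5, 0.2), A=0.4 * A, beta=1.0, T=10.0)
```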
A fundamental problem in the design of wireless networks is to efficiently schedule transmissions in a distributed manner. The main challenge stems from the fact that optimal link scheduling involves solving a maximum weighted independent set (MWIS) problem, which is NP-hard. For practical link scheduling schemes, distributed greedy approaches are commonly used to approximate the solution to the MWIS problem. However, these greedy schemes mostly ignore important topological information of the wireless networks. To overcome this limitation, we propose a distributed MWIS solver based on graph convolutional networks (GCNs). In a nutshell, a trainable GCN module learns topology-aware node embeddings that are combined with the network weights before calling a greedy solver. In small- to medium-sized wireless networks with tens of links, even a shallow GCN-based MWIS scheduler can leverage the topological information of the graph to reduce by half the suboptimality gap of the distributed greedy solver, with good generalizability across graphs and minimal increase in complexity.
We propose a novel learning framework based on neural mean-field dynamics for inference and estimation problems of diffusion on networks. Our new framework is derived from the Mori-Zwanzig formalism to obtain an exact evolution equation of the node infection probabilities, which renders a delay differential equation with a memory integral approximated by learnable time convolution operators, resulting in a highly structured and interpretable RNN. Directly using cascade data, our framework can jointly learn the structure of the diffusion network and the evolution of infection probabilities, which are the cornerstone of important downstream applications such as influence maximization. Connections between parameter learning and optimal control are also established. Empirical studies show that our approach is versatile and robust to variations of the underlying diffusion network models, and significantly outperforms existing approaches in accuracy and efficiency on both synthetic and real-world data.
Generative Adversarial Networks (GANs) have achieved great success in unsupervised learning. Despite the remarkable empirical performance, there is limited theoretical understanding of the statistical properties of GANs. This paper provides statistical guarantees of GANs for the estimation of data distributions which have densities in a Hölder space. Our main result shows that, if the generator and discriminator network architectures are properly chosen (universally for all distributions with Hölder densities), GANs are consistent estimators of the data distributions under strong discrepancy metrics, such as the Wasserstein distance. To the best of our knowledge, this is the first statistical theory of GANs for Hölder densities. In comparison with existing works, our theory requires minimal assumptions on the data distributions. Our generator and discriminator networks utilize general weight matrices and the non-invertible ReLU activation function, while many existing works only apply to invertible weight matrices and invertible activation functions. In our analysis, we decompose the error into a statistical error and an approximation error by a new oracle inequality, which may be of independent interest.
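The oracle-inequality decomposition mentioned at the end can be sketched in a standard generic form; the notation below ($\hat{g}$ for the learned generator, $\mathcal{G}$ and $\mathcal{F}$ for the generator and discriminator classes, $\rho$ for the latent distribution) is ours and not the paper's exact statement:

```latex
% Generic oracle-type decomposition (sketch, notation ours): the estimation
% error of the learned generator splits into an approximation term for the
% generator class and a statistical term driven by the n observed samples.
\[
  d_{\mathcal{F}}\big(\hat{g}_{\#}\rho,\ \mu\big)
  \;\lesssim\;
  \underbrace{\inf_{g \in \mathcal{G}} d_{\mathcal{F}}\big(g_{\#}\rho,\ \mu\big)}_{\text{approximation error}}
  \;+\;
  \underbrace{\sup_{f \in \mathcal{F}}\Big|\tfrac{1}{n}\textstyle\sum_{i=1}^{n} f(x_i) - \mathbb{E}_{\mu}[f]\Big|}_{\text{statistical error}}
\]
```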
Computational efficiency is an important consideration for deploying machine learning models for time series prediction in an online setting. Machine learning algorithms adjust model parameters automatically based on the data, but often require users to set additional parameters, known as hyperparameters, which can significantly impact prediction accuracy. Traffic measurements, typically collected online by sensors, are serially correlated. Moreover, the data distribution may change gradually. A typical adaptation strategy is to periodically re-tune the model hyperparameters, at the cost of additional computational burden. In this work, we present an efficient and principled online hyperparameter optimization algorithm for kernel ridge regression applied to traffic prediction problems. In tests with real traffic measurement data, our approach requires as little as one-seventh of the computation time of other tuning methods, while achieving better or similar prediction accuracy.
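For context, here is a sketch of kernel ridge regression with the two hyperparameters most commonly re-tuned in this setting, a regularization weight `lam` and an RBF bandwidth `gamma`; the online tuning rule itself is the paper's contribution and is not reproduced here, and the toy series is illustrative:

```python
import numpy as np

def rbf_kernel(X, Z, gamma):
    """RBF kernel matrix k(x, z) = exp(-gamma * ||x - z||^2)."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def krr_fit_predict(X_train, y_train, X_test, lam, gamma):
    """Kernel ridge regression closed form:
    alpha = (K + lam * I)^{-1} y,   y_hat = K_test @ alpha.
    `lam` and `gamma` are the hyperparameters whose periodic re-tuning
    dominates the online computational cost."""
    K = rbf_kernel(X_train, X_train, gamma)
    alpha = np.linalg.solve(K + lam * np.eye(len(X_train)), y_train)
    return rbf_kernel(X_test, X_train, gamma) @ alpha

# Toy traffic-like series: predict the next value from the last 4 readings.
rng = np.random.default_rng(0)
series = np.sin(np.arange(200) / 8.0) + 0.1 * rng.standard_normal(200)
X = np.stack([series[i:i + 4] for i in range(195)])
y = series[4:199]
pred = krr_fit_predict(X[:150], y[:150], X[150:], lam=1e-2, gamma=0.5)
print(np.mean((pred - y[150:]) ** 2))  # held-out mean squared error
```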