Energy-Harvesting Distributed Machine Learning

Added by Basak Guler

Publication date 2021

fields Informatics Engineering

Research language English

Authors Basak Guler - Aylin Yener

This paper provides a first study of utilizing energy harvesting for sustainable machine learning in distributed networks. We consider a distributed learning setup in which a machine learning model is trained over a large number of devices that can harvest energy from the ambient environment, and develop a practical learning framework with theoretical convergence guarantees. We demonstrate through numerical experiments that the proposed framework can significantly outperform energy-agnostic benchmarks. Our framework is scalable, requires only local estimation of the energy statistics, and can be applied to a wide range of distributed training settings, including machine learning in wireless networks, edge computing, and mobile internet of things.

On Tilted Losses in Machine Learning: Theory and Applications

76 - Tian Li , Ahmad Beirami , Maziar Sanjabi 2021

Exponential tilting is a technique commonly used in fields such as statistics, probability, information theory, and optimization to create parametric distribution shifts. Despite its prevalence in related fields, tilting has not seen widespread use in machine learning. In this work, we aim to bridge this gap by exploring the use of tilting in risk minimization. We study a simple extension to ERM -- tilted empirical risk minimization (TERM) -- which uses exponential tilting to flexibly tune the impact of individual losses. The resulting framework has several useful properties: We show that TERM can increase or decrease the influence of outliers, respectively, to enable fairness or robustness; has variance-reduction properties that can benefit generalization; and can be viewed as a smooth approximation to a superquantile method. Our work makes rigorous connections between TERM and related objectives, such as Value-at-Risk, Conditional Value-at-Risk, and distributionally robust optimization (DRO). We develop batch and stochastic first-order optimization methods for solving TERM, provide convergence guarantees for the solvers, and show that the framework can be efficiently solved relative to common alternatives. Finally, we demonstrate that TERM can be used for a multitude of applications in machine learning, such as enforcing fairness between subgroups, mitigating the effect of outliers, and handling class imbalance. Despite the straightforward modification TERM makes to traditional ERM objectives, we find that the framework can consistently outperform ERM and deliver competitive performance with state-of-the-art, problem-specific approaches.
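
As a rough illustration of the tilting idea, the sketch below computes a tilted empirical risk of the form $\frac{1}{t}\log\left(\frac{1}{N}\sum_i e^{t\,\ell_i}\right)$ over a list of per-sample losses. The abstract does not state the objective explicitly, so this form, the function name `tilted_risk`, and the sample losses are assumptions made for illustration. A negative tilt `t` down-weights the outlier loss, a positive tilt emphasizes it, and `t = 0` is treated as the plain average (ERM).

```python
import math


def tilted_risk(losses, t):
    """Tilted empirical risk: (1/t) * log(mean(exp(t * loss))).

    Assumed form of the objective; t == 0 falls back to the plain mean.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if t == 0:
        return sum(losses) / len(losses)
    # Log-sum-exp with a max shift so large |t * loss| does not overflow.
    scaled = [t * loss for loss in losses]
    shift = max(scaled)
    log_sum = shift + math.log(sum(math.exp(s - shift) for s in scaled))
    return (log_sum - math.log(len(losses))) / t


# Hypothetical per-sample losses with one outlier (5.0).
losses = [0.1, 0.2, 0.3, 5.0]
mean_loss = tilted_risk(losses, 0.0)   # ordinary average loss
robust = tilted_risk(losses, -2.0)     # negative tilt: outlier counts less
fair = tilted_risk(losses, 2.0)        # positive tilt: outlier counts more
```

For these losses the three values are ordered as `robust < mean_loss < fair`, which matches the robustness versus fairness trade-off described in the abstract.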

Machine Learning Information Theory

Feature selection in machine learning: Renyi min-entropy vs Shannon entropy

131 - Catuscia Palamidessi , Marco Romanelli 2020

Feature selection, in the context of machine learning, is the process of separating the highly predictive features from those that might be irrelevant or redundant. Information theory has been recognized as a useful concept for this task, as the prediction power stems from the correlation, i.e., the mutual information, between features and labels. Many algorithms for feature selection in the literature have adopted the Shannon-entropy-based mutual information. In this paper, we explore the possibility of using Renyi min-entropy instead. In particular, we propose an algorithm based on a notion of conditional Renyi min-entropy that has been recently adopted in the field of security and privacy, and which is strictly related to the Bayes error. We prove that in general the two approaches are incomparable, in the sense that we can construct datasets on which the Renyi-based algorithm performs better than the corresponding Shannon-based one, and datasets on which the situation is reversed. In practice, however, when considering datasets of real data, it seems that the Renyi-based algorithm tends to outperform the other one. We have carried out several experiments on the BASEHOCK, SEMEION, and GISETTE datasets, and in all of them we have indeed observed that the Renyi-based algorithm gives better results.
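
To make the two scoring notions concrete, the sketch below estimates a conditional Shannon entropy and a conditional min-entropy of labels given one discrete feature, using empirical frequencies. It assumes the min-entropy definition $H_\infty(Y \mid X) = -\log_2 \sum_x \max_y p(x, y)$, which is tied to the Bayes error $1 - \sum_x \max_y p(x, y)$; the abstract does not spell out its definition, so this choice, the function names, and the toy data are illustrative assumptions and not the paper's algorithm.

```python
import math
from collections import Counter


def cond_shannon_entropy(xs, ys):
    """Empirical H(Y | X) in bits for paired discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    return -sum(c / n * math.log2(c / px[x]) for (x, _), c in joint.items())


def cond_min_entropy(xs, ys):
    """Empirical -log2(sum_x max_y p(x, y)) in bits (assumed definition)."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    best = {}  # for each feature value, the count of its most frequent label
    for (x, _), c in joint.items():
        best[x] = max(best.get(x, 0), c)
    return -math.log2(sum(best.values()) / n)


# Toy data: one feature determines the label, the other carries no information.
labels = ["a", "a", "b", "b"]
informative = [0, 0, 1, 1]
uninformative = [0, 1, 0, 1]

scores = {
    "informative": (cond_shannon_entropy(informative, labels),
                    cond_min_entropy(informative, labels)),
    "uninformative": (cond_shannon_entropy(uninformative, labels),
                      cond_min_entropy(uninformative, labels)),
}
```

A lower conditional entropy means the feature leaves less uncertainty about the label, so a greedy selector under either measure would pick `informative` first on this toy data. The two measures can rank features differently on other datasets, which is the incomparability the abstract refers to.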

Machine Learning Information Theory

Speeding Up Distributed Machine Learning Using Codes

95 - Kangwook Lee , Maximilian Lam , Ramtin Pedarsani 2015

Codes are widely used in many engineering applications to offer robustness against noise. In large-scale systems there are several types of noise that can affect the performance of distributed machine learning algorithms -- straggler nodes, system failures, or communication bottlenecks -- but there has been little interaction cutting across codes, machine learning, and distributed systems. In this work, we provide theoretical insights on how coded solutions can achieve significant gains compared to uncoded ones. We focus on two of the most basic building blocks of distributed learning algorithms: matrix multiplication and data shuffling. For matrix multiplication, we use codes to alleviate the effect of stragglers, and show that if the number of homogeneous workers is $n$, and the runtime of each subtask has an exponential tail, coded computation can speed up distributed matrix multiplication by a factor of $\log n$. For data shuffling, we use codes to reduce communication bottlenecks, exploiting the excess in storage. We show that when a constant fraction $\alpha$ of the data matrix can be cached at each worker, and $n$ is the number of workers, coded shuffling reduces the communication cost by a factor of $(\alpha + \frac{1}{n})\gamma(n)$ compared to uncoded shuffling, where $\gamma(n)$ is the ratio of the cost of unicasting $n$ messages to $n$ users to multicasting a common message (of the same size) to $n$ users. For instance, $\gamma(n) \simeq n$ if multicasting a message to $n$ users is as cheap as unicasting a message to one user. We also provide experiment results, corroborating our theoretical gains of the coded algorithms.
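
The straggler idea for matrix multiplication can be sketched with a very small code. In the example below, a matrix `A` is split into two row blocks `A1` and `A2`, and a third worker is given the parity block `A1 + A2`, so the product `A x` can be recovered from any two of the three worker outputs. The three-worker setup, the function names, and the sample data are assumptions chosen for illustration; the abstract does not specify the code construction used in the paper.

```python
def matvec(rows, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in rows]


def encode(A):
    """Split A into two row blocks and add a parity block A1 + A2."""
    if len(A) % 2 != 0:
        raise ValueError("this sketch needs an even number of rows")
    half = len(A) // 2
    A1, A2 = A[:half], A[half:]
    parity = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(A1, A2)]
    return [A1, A2, parity]


def decode(results):
    """Recover A x from the outputs of any two of the three workers.

    results maps worker index (0: A1 x, 1: A2 x, 2: parity x) to its output.
    """
    if len(results) < 2:
        raise ValueError("need results from at least two workers")
    if 0 in results and 1 in results:
        y1, y2 = results[0], results[1]
    elif 0 in results:
        y1 = results[0]
        y2 = [p - a for p, a in zip(results[2], y1)]  # A2 x = parity - A1 x
    else:
        y2 = results[1]
        y1 = [p - b for p, b in zip(results[2], y2)]  # A1 x = parity - A2 x
    return y1 + y2


A = [[1, 2], [3, 4], [5, 6], [7, 8]]
x = [1, -1]
tasks = encode(A)

# Pretend worker 1 is a straggler: only workers 0 and 2 report back.
finished = {0: matvec(tasks[0], x), 2: matvec(tasks[2], x)}
recovered = decode(finished)
```

Here `recovered` equals the uncoded product `matvec(A, x)` even though one worker never responded, at the price of one extra worker's worth of computation.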

Distributed Parallel and Cluster Computing Information Theory Machine Learning

Cooperative Spectrum Sharing Relaying Protocols With Energy Harvesting Cognitive User

99 - Tarun Kalluri , Mansi Peer , Vivek Ashok Bohara 2015

The theory of wireless information and power transfer in energy constrained wireless networks has caught the interest of researchers due to its potential for increasing the lifetime of sensor nodes and mitigating the environmental hazards caused by conventional cell batteries. Similarly, advancements in cooperative spectrum sharing protocols have enabled efficient use of the frequency spectrum between a licensed primary user and a secondary user. In this paper, we consider an energy constrained secondary user which harvests energy from the primary signal and relays the primary signal in exchange for spectrum access. We consider the Nakagami-m fading model and propose two key protocols, namely time-splitting cooperative spectrum sharing (TS-CSS) and power-sharing cooperative spectrum sharing (PS-CSS), and derive expressions for the outage probabilities of the primary and secondary users in decode-forward and amplify-forward relaying modes. The obtained results show that the secondary user can carry its own transmission without adversely affecting the performance of the primary user, and that the PS-CSS protocol outperforms the TS-CSS protocol in terms of outage probability over a wide range of signal-to-noise ratios (SNRs). The effect of various system parameters on the outage performance of these protocols has also been studied.

Networking and Internet Architecture Information Theory

Secrecy Limits of Energy Harvesting IoT Networks under Channel Imperfections

75 - Furqan Jameel , Zheng Chang , Riku Jantti 2020

Simultaneous wireless information and power transfer (SWIPT) has recently gathered much research interest from both academia and industry as a key enabler of energy harvesting Internet-of-things (IoT) networks. Due to a number of growing use cases of such networks, it is important to study their performance limits from the perspective of physical layer security (PLS). With this intent, this work provides a novel analysis of the ergodic secrecy capacity of a SWIPT system for Rician and Nakagami-m faded communication links. For a realistic evaluation of the system, the imperfections of channel estimations for different receiver designs of the SWIPT-based IoT systems have been taken into account. Subsequently, the closed-form expressions of the ergodic secrecy capacities for the considered scenario are provided and then validated through extensive simulations. The results indicate that an error ceiling appears due to imperfect channel estimation at high values of signal-to-noise ratio (SNR). More importantly, the secrecy capacity under different channel conditions stops increasing beyond a certain limit, despite an increase of the main link SNR. An in-depth analysis of the secrecy-energy trade-off has also been performed, and a comparison has been provided for imperfect and perfect channel estimation cases. As part of the continuous evolution of IoT networks, the results provided in this work can help in identifying the secrecy limits of IoT networks in the presence of multiple eavesdroppers.

Signal Processing Information Theory
