DCAF: A Dynamic Computation Allocation Framework for Online Serving System

291 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Guorui Zhou

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Biye Jiang - Pengye Zhang - Rihan Chen

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Modern large-scale systems such as recommender system and online advertising system are built upon computation-intensive infrastructure. The typical objective in these applications is to maximize the total revenue, e.g. GMV~(Gross Merchandise Volume), under a limited computation resource. Usually, the online serving system follows a multi-stage cascade architecture, which consists of several stages including retrieval, pre-ranking, ranking, etc. These stages usually allocate resource manually with specific computing power budgets, which requires the serving configuration to adapt accordingly. As a result, the existing system easily falls into suboptimal solutions with respect to maximizing the total revenue. The limitation is due to the face that, although the value of traffic requests vary greatly, online serving system still spends equal computing power among them. In this paper, we introduce a novel idea that online serving system could treat each traffic request differently and allocate personalized computation resource based on its value. We formulate this resource allocation problem as a knapsack problem and propose a Dynamic Computation Allocation Framework~(DCAF). Under some general assumptions, DCAF can theoretically guarantee that the system can maximize the total revenue within given computation budget. DCAF brings significant improvement and has been deployed in the display advertising system of Taobao for serving the main traffic. With DCAF, we are able to maintain the same business performance with 20% computation resource reduction.

قيم البحث

78 - Lin Cheng , Bernardo A. Huberman 2019

We propose and experimentally demonstrate a bandwidth allocation method based on the comparative advantage of spectral efficiency among users in a multi-tone small-cell radio access system with frequency-selective fading channels. The method allocate s frequency resources by ranking the comparative advantage of the spectrum measured at the receivers ends. It improves the overall spectral efficiency of the access system with low implementation complexity and independently of power loading. In a two-user wireless transmission experiment, we observe up to 23.1% average capacity improvement by using the proposed method.

بنية الشبكات والإنترنت نظرية المعلومات معالجة الإشارات

Optimal and Low-Complexity Dynamic Spectrum Access for RF-Powered Ambient Backscatter System with Online Reinforcement Learning

75 - Nguyen Van Huynh , Dinh Thai Hoang , Diep N. Nguyen 2018

Ambient backscatter has been introduced with a wide range of applications for low power wireless communications. In this article, we propose an optimal and low-complexity dynamic spectrum access framework for RF-powered ambient backscatter system. In this system, the secondary transmitter not only harvests energy from ambient signals (from incumbent users), but also backscatters these signals to its receiver for data transmission. Under the dynamics of the ambient signals, we first adopt the Markov decision process (MDP) framework to obtain the optimal policy for the secondary transmitter, aiming to maximize the system throughput. However, the MDP-based optimization requires complete knowledge of environment parameters, e.g., the probability of a channel to be idle and the probability of a successful packet transmission, that may not be practical to obtain. To cope with such incomplete knowledge of the environment, we develop a low-complexity online reinforcement learning algorithm that allows the secondary transmitter to learn from its decisions and then attain the optimal policy. Simulation results show that the proposed learning algorithm not only efficiently deals with the dynamics of the environment, but also improves the average throughput up to 50% and reduces the blocking probability and delay up to 80% compared with conventional methods.

بنية الشبكات والإنترنت الذكاء الاصطناعي

Optimal Posted Prices for Online Cloud Resource Allocation

332 - Zijun Zhang , Zongpeng Li , Chuan Wu 2017

We study online resource allocation in a cloud computing platform, through a posted pricing mechanism: The cloud provider publishes a unit price for each resource type, which may vary over time; upon arrival at the cloud system, a cloud user either t akes the current prices, renting resources to execute its job, or refuses the prices without running its job there. We design pricing functions based on the current resource utilization ratios, in a wide array of demand-supply relationships and resource occupation durations, and prove worst-case competitive ratios of the pricing functions in terms of social welfare. In the basic case of a single-type, non-recycled resource (i.e., allocated resources are not later released for reuse), we prove that our pricing function design is optimal, in that any other pricing function can only lead to a worse competitive ratio. Insights obtained from the basic cases are then used to generalize the pricing functions to more realistic cloud systems with multiple types of resources, where a job occupies allocated resources for a number of time slots till completion, upon which time the resources are returned back to the cloud resource pool.

بنية الشبكات والإنترنت علوم الكمبيوتر ونظرية الألعاب

A Unified Framework for Marketing Budget Allocation

63 - Kui Zhao , Junhao Hua , Ling Yan 2019

While marketing budget allocation has been studied for decades in traditional business, nowadays online business brings much more challenges due to the dynamic environment and complex decision-making process. In this paper, we present a novel unified framework for marketing budget allocation. By leveraging abundant data, the proposed data-driven approach can help us to overcome the challenges and make more informed decisions. In our approach, a semi-black-box model is built to forecast the dynamic market response and an efficient optimization method is proposed to solve the complex allocation task. First, the response in each market-segment is forecasted by exploring historical data through a semi-black-box model, where the capability of logit demand curve is enhanced by neural networks. The response model reveals relationship between sales and marketing cost. Based on the learned model, budget allocation is then formulated as an optimization problem, and we design efficient algorithms to solve it in both continuous and discrete settings. Several kinds of business constraints are supported in one unified optimization paradigm, including cost upper bound, profit lower bound, or ROI lower bound. The proposed framework is easy to implement and readily to handle large-scale problems. It has been successfully applied to many scenarios in Alibaba Group. The results of both offline experiments and online A/B testing demonstrate its effectiveness.

بنى وهياكل البيانات والخوارزميات الذكاء الاصطناعي

Uni-FedRec: A Unified Privacy-Preserving News Recommendation Framework for Model Training and Online Serving

99 - Tao Qi , Fangzhao Wu , Chuhan Wu 2021

News recommendation is important for personalized online news services. Most existing news recommendation methods rely on centrally stored user behavior data to both train models offline and provide online recommendation services. However, user data is usually highly privacy-sensitive, and centrally storing them may raise privacy concerns and risks. In this paper, we propose a unified news recommendation framework, which can utilize user data locally stored in user clients to train models and serve users in a privacy-preserving way. Following a widely used paradigm in real-world recommender systems, our framework contains two stages. The first one is for candidate news generation (i.e., recall) and the second one is for candidate news ranking (i.e., ranking). At the recall stage, each client locally learns multiple interest representations from clicked news to comprehensively model user interests. These representations are uploaded to the server to recall candidate news from a large news pool, which are further distributed to the user client at the ranking stage for personalized news display. In addition, we propose an interest decomposer-aggregator method with perturbation noise to better protect private user information encoded in user interest representations. Besides, we collaboratively train both recall and ranking models on the data decentralized in a large number of user clients in a privacy-preserving way. Experiments on two real-world news datasets show that our method can outperform baseline methods and effectively protect user privacy.

استرجاع المعلومات