No Arabic abstract
Modern networks run middleboxes that offer services ranging from network address translation and server load balancing to firewalls, encryption, and compression. In an industry trend known as Network Functions Virtualization (NFV), these middleboxes run as virtual machines on any commodity server, and the switches steer traffic through the relevant chain of services. Network administrators must decide how many middleboxes to run, where to place them, and how to direct traffic through them, based on the traffic load and the server and network capacity. Rather than placing specific kinds of middleboxes on each processing node, we argue that server virtualization allows each server node to host all middlebox functions, and simply vary the fraction of resources devoted to each one. This extra flexibility fundamentally changes the optimization problem the network administrators must solve to a new kind of multi-commodity flow problem, where the traffic flows consume bandwidth on the links as well as processing resources on the nodes. We show that allocating resources to maximize the processed flow can be optimized exactly via a linear programming formulation, and to arbitrary accuracy via an efficient combinatorial algorithm. Our experiments with real traffic and topologies show that a joint optimization of node and link resources leads to an efficient use of bandwidth and processing capacity. We also study a class of design problems that decide where to provide node capacity to best process and route a given set of demands, and demonstrate both approximation algorithms and hardness results for these problems.
In wet-lab experiments, the slime mold Physarum polycephalum has demonstrated its ability to solve shortest path problems and to design efficient networks. For the shortest path problem, a mathematical model for the evolution of the slime is available and it has been shown in computer experiments and through mathematical analysis that the dynamics solves the shortest path problem. In this paper, we introduce a dynamics for the network design problem. We formulate network design as the problem of constructing a network that efficiently supports a multi-commodity flow problem. We investigate the dynamics in computer simulations and analytically. The simulations show that the dynamics is able to construct efficient and elegant networks. In the theoretical part we show that the dynamics minimizes an objective combining the cost of the network and the cost of routing the demands through the network. We also give alternative characterization of the optimum solution.
Motivated by scheduling in Geo-distributed data analysis, we propose a target location problem for multi-commodity flow (LoMuF for short). Given commodities to be sent from their resources, LoMuF aims at locating their targets so that the multi-commodity flow is optimized in some sense. LoMuF is a combination of two fundamental problems, namely, the facility location problem and the network flow problem. We study the hardness and algorithmic issues of the problem in various settings. The findings lie in three aspects. First, a series of NP-hardness and APX-hardness results are obtained, uncovering the inherent difficulty in solving this problem. Second, we propose an approximation algorithm for general undirected networks and an exact algorithm for undirected trees, which naturally induce efficient approximation algorithms on directed networks. Third, we observe separations between directed networks and undirected ones, indicating that imposing direction on edges makes the problem strictly harder. These results show the richness of the problem and pave the way to further studies.
We consider the problem of online scheduling on a single machine in order to minimize weighted flow time. The existing algorithms for this problem (STOC 01, SODA 03, FOCS 18) all require exact knowledge of the processing time of each job. This assumption is crucial, as even a slight perturbation of the processing time would lead to polynomial competitive ratio. However, this assumption very rarely holds in real-life scenarios. In this paper, we present the first algorithm for weighted flow time which do not require exact knowledge of the processing times of jobs. Specifically, we introduce the Scheduling with Predicted Processing Time (SPPT) problem, where the algorithm is given a prediction for the processing time of each job, instead of its real processing time. For the case of a constant factor distortion between the predictions and the real processing time, our algorithms match all the best known competitiveness bounds for weighted flow time -- namely $O(log P), O(log D)$ and $O(log W)$, where $P,D,W$ are the maximum ratios of processing times, densities, and weights, respectively. For larger errors, the competitiveness of our algorithms degrades gracefully.
New optical technologies offer the ability to reconfigure network topologies dynamically, rather than setting them once and for all. This is true in both optical wide area networks (optical WANs) and in datacenters, despite the many differences between these two settings. Because of these new technologies, there has been a surge of both practical and theoretical research on algorithms to take advantage of them. In particular, Jia et al. [INFOCOM 17] designed online scheduling algorithms for dynamically reconfigurable topologies for both the makespan and sum of completion times objectives. In this paper, we work in the same setting but study an objective that is more meaningful in an online setting: the sum of flow times. The flow time of a job is the total amount of time that it spends in the system, which may be considerably smaller than its completion time if it is released late. We provide competitive algorithms for the online setting with speed augmentation, and also give a lower bound proving that speed augmentation is in fact necessary. As a side effect of our techniques, we also improve and generalize the results of Jia et al. on completion times by giving an $O(1)$-competitive algorithm for arbitrary sizes and release times even when nodes have different degree bounds, and moreover allow for the weighted sum of completion times (or flow times).
The performance of distributed and data-centric applications often critically depends on the interconnecting network. Applications are hence modeled as virtual networks, also accounting for resource demands on links. At the heart of provisioning such virtual networks lies the NP-hard Virtual Network Embedding Problem (VNEP): how to jointly map the virtual nodes and links onto a physical substrate network at minimum cost while obeying capacities. This paper studies the VNEP in the light of parameterized complexity. We focus on tree topology substrates, a case often encountered in practice and for which the VNEP remains NP-hard. We provide the first fixed-parameter algorithm for the VNEP with running time $O(3^r (s+r^2))$ for requests and substrates of $r$ and $s$ nodes, respectively. In a computational study our algorithm yields running time improvements in excess of 200x compared to state-of-the-art integer programming approaches. This makes it comparable in speed to the well-established ViNE heuristic while providing optimal solutions. We complement our algorithmic study with hardness results for the VNEP and related problems.