Validating Clustering Frameworks for Electric Load Demand Profiles

Added by Soumyabrata Dev

Publication date 2021

Field: Electronic Engineering

Language: English

Authors Mayank Jain - Tarek AlSkaif - Soumyabrata Dev

Signal Processing

Large-scale deployment of smart meters has made it possible to collect high-resolution data on residential electric demand profiles in sufficient quantity. Clustering these profiles is an important step toward analyzing and interpreting electricity consumption patterns. Although many clustering techniques have been proposed in the literature over the years, different techniques are often found to fit different datasets best. To identify the most suitable technique, standard clustering validity indices are often used. These indices focus primarily on the intrinsic characteristics of the clustering results. Moreover, different indices often give conflicting recommendations, which can only be resolved with heuristics about the dataset and/or the expected cluster structures, information that is rarely available in practical situations. This paper presents a novel scheme to validate and compare clustering results objectively. Additionally, the proposed scheme considers all the steps prior to the clustering algorithm, including the pre-processing and dimensionality reduction steps, in order to provide recommendations over the complete framework. Accordingly, the proposed strategy is shown to provide better, unbiased, and uniform recommendations compared to the standard clustering validity indices.

A Clustering Framework for Residential Electric Demand Profiles

Mayank Jain, Tarek AlSkaif, 2021

The availability of residential electric demand profile data, enabled by the large-scale deployment of smart metering infrastructure, has made it possible to analyze electricity consumption patterns more accurately. This paper analyses the electric demand profiles of individual households located in the city of Amsterdam, the Netherlands. A comprehensive clustering framework is defined to classify households based on their electricity consumption patterns. The framework consists of two main steps: a dimensionality reduction step applied to the input electricity consumption data, followed by an unsupervised clustering algorithm applied to the reduced subspace. While any algorithm used in the literature for this clustering task can be plugged into the corresponding step, the more important question is which particular combination of algorithms is best for a given dataset and clustering task. This paper addresses that question by proposing a novel objective validation strategy, whose recommendations are then cross-verified by subjective validation.
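The two-step framework described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the choice of PCA and k-means, the synthetic 24-point daily profiles, and the names `make_profiles` and `cluster_profiles` are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def make_profiles(n_per_group=50, seed=0):
    """Synthetic daily profiles (24 hourly values); hypothetical data."""
    rng = np.random.default_rng(seed)
    hours = np.arange(24)
    # Two made-up household types: morning-peaked and evening-peaked demand.
    morning = np.exp(-0.5 * ((hours - 8) / 2.0) ** 2)
    evening = np.exp(-0.5 * ((hours - 19) / 2.0) ** 2)
    groups = [base + 0.1 * rng.standard_normal((n_per_group, 24))
              for base in (morning, evening)]
    return np.vstack(groups)


def cluster_profiles(profiles, n_components=3, n_clusters=2, seed=0):
    """Step 1: reduce dimensionality; step 2: cluster the reduced subspace."""
    scaled = StandardScaler().fit_transform(profiles)
    reduced = PCA(n_components=n_components, random_state=seed).fit_transform(scaled)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(reduced)
    return reduced, labels


profiles = make_profiles()
reduced, labels = cluster_profiles(profiles)
print(reduced.shape, np.bincount(labels))
```

Either step could be swapped for another algorithm (for example a different reducer or a hierarchical clusterer) without changing the overall shape of the pipeline, which is what makes comparing combinations a meaningful question.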

Machine Learning, Computers and Society

Impact of Load Demand Dataset Characteristics on Clustering Validation Indices

Mayank Jain, Mukta Jain, Tarek AlSkaif, 2021

With the introduction of smart meters, electricity load consumption data can be collected for individual consumer buildings at high temporal resolutions. The availability of such data has made it possible to study the daily load demand profiles of households. Clustering households based on their demand profiles is a primary and key component of such analysis. While many clustering algorithms/frameworks can be deployed to perform clustering, they usually generate very different clusters. In order to identify the best clustering results, various cluster validation indices (CVIs) have been proposed in the literature. However, different CVIs often recommend different algorithms. This leads to the problem of identifying the most suitable CVI for a given dataset. In response, this paper shows how the recommendations of validation indices are influenced by different data characteristics that might be present in a typical residential load demand dataset. Furthermore, the paper identifies the features of the data that favor or rule out the use of a particular cluster validation index.
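To illustrate how validation indices may disagree, the sketch below scores several candidate cluster counts with three standard indices available in scikit-learn and reports which count each index prefers. The synthetic data, the use of k-means, and the candidate range are assumptions for illustration; the indices shown are not necessarily those studied in the paper.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (calinski_harabasz_score, davies_bouldin_score,
                             silhouette_score)

# Hypothetical stand-in for (reduced) load demand profiles.
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=2.0, random_state=0)

scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = {
        "silhouette": silhouette_score(X, labels),                # higher is better
        "calinski_harabasz": calinski_harabasz_score(X, labels),  # higher is better
        "davies_bouldin": davies_bouldin_score(X, labels),        # lower is better
    }

# Cluster count preferred by each index; these need not agree.
best = {
    "silhouette": max(scores, key=lambda k: scores[k]["silhouette"]),
    "calinski_harabasz": max(scores, key=lambda k: scores[k]["calinski_harabasz"]),
    "davies_bouldin": min(scores, key=lambda k: scores[k]["davies_bouldin"]),
}
print(best)
```

When the three entries of `best` differ, there is no purely intrinsic way to pick among them, which is the situation the abstract describes.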

Computers and Society

Aggregated functional data model applied on clustering and disaggregation of UK electrical load profiles

Gabriel Franco, Camila P. E. de Souza, Nancy L. Garcia, 2021

Understanding electrical energy demand at the consumer level plays an important role in planning the distribution of electrical networks and in offering off-peak tariffs, but observing individual consumption patterns is still expensive. On the other hand, aggregated load curves are normally available at the substation level. The proposed methodology separates substation aggregated loads into estimated mean consumption curves, called typical curves, including information given by explanatory variables. In addition, a model-based clustering approach for substations is proposed based on the similarity of their consumers' typical curves and covariance structures. The methodology is applied to a real substation load monitoring dataset from the United Kingdom and tested in eight simulated scenarios.

Applications

FPT Approximation for Fair Minimum-Load Clustering

Sayan Bandyapadhyay, Fedor V. Fomin, Petr A. Golovach, 2021

In this paper, we consider the Minimum-Load $k$-Clustering/Facility Location (MLkC) problem where we are given a set $P$ of $n$ points in a metric space that we have to cluster and an integer $k$ that denotes the number of clusters. Additionally, we are given a set $F$ of cluster centers in the same metric space. The goal is to select a set $C \subseteq F$ of $k$ centers and assign each point in $P$ to a center in $C$, such that the maximum load over all centers is minimized. Here the load of a center is the sum of the distances between it and the points assigned to it. Although clustering/facility location problems have a rich literature, the minimum-load objective is not studied substantially, and hence MLkC has remained a poorly understood problem. More interestingly, the problem is notoriously hard even in some special cases including the one in line metrics as shown by Ahmadian et al. [ACM Trans. Algo. 2018]. They also show APX-hardness of the problem in the plane. On the other hand, the best-known approximation factor for MLkC is $O(k)$, even in the plane. In this work, we study a fair version of MLkC inspired by the work of Chierichetti et al. [NeurIPS, 2017], which generalizes MLkC. Here the input points are colored by one of the $\ell$ colors denoting the group they belong to. MLkC is the special case with $\ell = 1$. Considering this problem, we are able to obtain a $3$-approximation in $f(k,\ell) \cdot n^{O(1)}$ time. Also, our scheme leads to an improved $(1 + \epsilon)$-approximation in case of Euclidean norm, and in this case, the running time depends only polynomially on the dimension $d$. Our results imply the same approximations for MLkC with running time $f(k) \cdot n^{O(1)}$, achieving the first constant approximations for this problem in general and Euclidean metric spaces.
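The minimum-load objective defined in this abstract can be made concrete with a small sketch. The code below is an exhaustive search for tiny instances only; it is not the FPT approximation scheme of the paper. The function names `max_load` and `brute_force_mlkc` and the four-point instance on a line are hypothetical, chosen for illustration.

```python
from itertools import combinations, product


def max_load(points, centers, assignment, dist):
    """Maximum over centers of the summed distances of the points assigned to each."""
    loads = [0.0] * len(centers)
    for point, idx in zip(points, assignment):
        loads[idx] += dist(point, centers[idx])
    return max(loads)


def brute_force_mlkc(points, facilities, k, dist):
    """Try every k-subset of facilities and every assignment; tiny inputs only."""
    best = (float("inf"), None, None)
    for centers in combinations(facilities, k):
        # assignment[i] is the index (within `centers`) serving points[i].
        for assignment in product(range(k), repeat=len(points)):
            value = max_load(points, centers, assignment, dist)
            if value < best[0]:
                best = (value, centers, assignment)
    return best


points = [0, 1, 2, 10]        # P: points on a line
facilities = [1, 5, 10]       # F: candidate centers
value, centers, assignment = brute_force_mlkc(
    points, facilities, 2, lambda a, b: abs(a - b))
print(value, centers, assignment)  # 2.0 (1, 10) (0, 0, 0, 1)
```

The search space grows as $\binom{|F|}{k} \cdot k^{n}$, which is why exhaustive search is only usable as a reference on toy inputs.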

Computational Geometry, Discrete Mathematics, Data Structures and Algorithms

Load Balanced Dynamic Resource Allocation for MTC Relay

Yifu Yang, Gang Wu, Weidang Lu, 2020

A Load Balancing Relay Algorithm (LBRA) is proposed to address unfair spectrum resource allocation in traditional mobile MTC relaying. To use spectrum resources reasonably and balance the distribution of MTC devices (MTCDs), spectrum resources are dynamically allocated by regrouping MTCDs on the MTCD-to-MTC-gateway link. Moreover, the system outage probability and transmission capacity under LBRA are derived. Numerical results show that the proposed algorithm outperforms the traditional method in both transmission capacity and outage probability. At high MTCD density, LBRA improves transmission capacity by about 0.7 dB and outage probability by about 0.8 dB.

Signal Processing, Information Theory