WI Fast Stats is the first and only dedicated tool tailored to the WI Fast Plants educational objectives. It is an integrated, animated web page with a collection of R-developed web apps that provide data visualization and data analysis tools for WI Fast Plants data. Its user-friendly interface renders data science accessible to K-16 teachers and students currently using WI Fast Plants lesson plans. Users do not need a strong programming or mathematical background, as the web apps are simple to use, well documented, and freely available.
Information about the spatiotemporal flow of people within an urban context has a wide range of applications. Although there are currently many approaches to collecting such data, a standardized framework for analyzing it is lacking. This paper focuses on the analysis of data collected through passive Wi-Fi sensing, since such passively collected data can provide wide coverage at low cost. We propose a systematic approach that uses unsupervised machine learning methods, namely k-means clustering and hierarchical agglomerative clustering (HAC), to analyze data collected through such passive Wi-Fi sniffing. We examine three aspects of clustering the data, namely by time, by person, and by location, and we present the results obtained by applying the proposed approach to a real-world dataset collected over five months.
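As a minimal illustration of the clustering step named above (a sketch, not the authors' implementation), the following Python snippet applies k-means and HAC to a toy feature matrix; the feature construction, hourly detection counts per sensed device, is an assumption made for demonstration only.

```python
# Minimal sketch: k-means and hierarchical agglomerative clustering (HAC)
# on a toy matrix of Wi-Fi detection features. The feature definition
# (hourly detection counts per device) is an illustrative assumption.
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Rows: sensed devices; columns: detection counts in 24 hourly bins (toy data).
X = rng.poisson(lam=3, size=(200, 24)).astype(float)
X_scaled = StandardScaler().fit_transform(X)

# Cluster "by person": group devices with similar temporal detection patterns.
kmeans_labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X_scaled)

# HAC with Ward linkage gives an alternative view of the same structure.
hac_labels = AgglomerativeClustering(n_clusters=4, linkage="ward").fit_predict(X_scaled)

print("k-means cluster sizes:", np.bincount(kmeans_labels))
print("HAC cluster sizes:   ", np.bincount(hac_labels))
```

The same pipeline could be re-used for clustering by time or by location by swapping the rows and the feature definition.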
Data traffic over cellular networks continues to grow exponentially, increasing by an order of magnitude every year, and has already surpassed voice traffic. This growing demand has created a need for solutions that enhance capacity provision, and offloading traffic to Wi-Fi is one means of increasing realised capacity. Although offloading to Wi-Fi networks has matured over the years, operators still face a number of challenges in realising it. In this article, we survey the practical challenges operators face in offloading data traffic to Wi-Fi networks, and we provide recommendations for successfully addressing them.
Biological data mainly comprises deoxyribonucleic acid (DNA) and protein sequences, biomolecules that are present in all human cells. Because of its self-replicating property, DNA is a key constituent of the genetic material of all living creatures, and it carries the genetic information required for the functioning and development of all organisms. Storing the DNA data of a single person requires about 10 CD-ROMs. Moreover, this volume is growing constantly as more and more sequences are added to public databases. This rapid increase in sequence data poses challenges for extracting precise information from it, since many data analysis and visualization tools cannot process such large volumes. To reduce the size of DNA and protein sequences, researchers have introduced various sequence compression algorithms, such as compress and gzip, Context Tree Weighting (CTW), Lempel-Ziv-Welch (LZW), arithmetic coding, run-length encoding, and substitution methods. These techniques have contributed substantially to reducing the volume of biological datasets; however, general-purpose compression techniques are not well suited to this type of sequential data. In this paper, we explore diverse techniques for compressing large volumes of DNA sequence data. Our analysis reveals that efficient techniques not only reduce the size of a sequence but also avoid information loss. The review of existing studies also shows that DNA sequence compression is important for understanding the critical characteristics of DNA data, in addition to improving storage efficiency and data transmission. Protein sequence compression, in turn, remains a challenge for the research community. The main parameters for evaluating these compression algorithms include the compression ratio and running time complexity.
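To make the idea of sequence-specific compression concrete, the sketch below shows a simple fixed-code substitution scheme, one of the technique families named above, that packs each DNA base into 2 bits; the sequence and helper function are toy examples, not taken from any of the surveyed tools.

```python
# Minimal sketch (not from the paper): a substitution-style encoding that
# packs each DNA base into 2 bits, giving roughly a 4:1 ratio over ASCII.
CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}

def pack_dna(seq: str) -> bytes:
    """Pack a DNA string (A/C/G/T only) into 2 bits per base."""
    bits = 0
    for base in seq:
        bits = (bits << 2) | CODE[base]
    # Pad to whole bytes; the sequence length must be stored separately to decode.
    nbytes = (2 * len(seq) + 7) // 8
    return bits.to_bytes(nbytes, "big")

seq = "ACGTACGTGGTTAACC" * 4  # toy sequence
packed = pack_dna(seq)
ratio = len(seq.encode("ascii")) / len(packed)
print(f"original: {len(seq)} bytes, packed: {len(packed)} bytes, ratio: {ratio:.1f}:1")
```

Specialised DNA compressors go further by exploiting repeats and statistical structure (as CTW and arithmetic coding do), which is why they outperform general-purpose tools such as gzip on genomic data.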
We show experimentally that workload-based AP-STA associations can significantly improve system throughput. We present a predictive model that guides optimal resource allocation in dense Wi-Fi networks and achieves 72-77% of the optimal throughput with varying training dataset sizes, using a 3-day trace of real cable modem traffic.
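As a toy illustration of what a workload-based association policy might look like (the paper's actual predictive model and trace-driven setup are not reproduced here), the sketch below greedily assigns stations to the least-loaded access point in range; all station loads and candidate sets are hypothetical.

```python
# Toy sketch of a workload-based AP-STA association heuristic (illustrative only).
# Each station is assigned to the candidate AP with the least offered load so far.
from collections import defaultdict

# Hypothetical inputs: per-station offered load (Mbps) and candidate APs in range.
station_load = {"sta1": 8.0, "sta2": 3.5, "sta3": 12.0, "sta4": 1.0}
candidates = {"sta1": ["ap1", "ap2"], "sta2": ["ap1"],
              "sta3": ["ap2", "ap3"], "sta4": ["ap1", "ap3"]}

ap_load = defaultdict(float)
assignment = {}
# Greedy: place heavier stations first on the currently least-loaded candidate AP.
for sta in sorted(station_load, key=station_load.get, reverse=True):
    ap = min(candidates[sta], key=lambda a: ap_load[a])
    assignment[sta] = ap
    ap_load[ap] += station_load[sta]

print(assignment)
print(dict(ap_load))
```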
A large volume of genomics data is produced daily due to advances in sequencing technology. This data is of no value if it is not properly analysed, and different kinds of analytics are required to extract useful information from the raw data. Classification, prediction, clustering, and pattern extraction are useful data mining techniques, and they require appropriate selection of data attributes to obtain accurate results. However, bioinformatics data is high dimensional, often with hundreds of attributes, and such a large number of attributes degrades the performance of the machine learning algorithms used for classification and prediction. Dimensionality reduction techniques are therefore required to reduce the number of attributes used in further analysis. In this paper, Principal Component Analysis and Factor Analysis are used for dimensionality reduction of bioinformatics data. These techniques were applied to a leukaemia dataset and the number of attributes was reduced from to.
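A minimal sketch of the two reduction techniques named above is given below; it uses a synthetic samples-by-attributes matrix as a stand-in, since the leukaemia dataset itself is not included here, and the component counts are illustrative choices rather than the paper's settings.

```python
# Minimal sketch (not the authors' code): PCA and Factor Analysis for
# dimensionality reduction of a high-dimensional expression-style matrix.
# The data below is synthetic; shapes and component counts are assumptions.
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
X = rng.normal(size=(100, 500))  # toy stand-in: 100 samples x 500 attributes
X_scaled = StandardScaler().fit_transform(X)

pca = PCA(n_components=20)            # keep the 20 leading principal components
X_pca = pca.fit_transform(X_scaled)

fa = FactorAnalysis(n_components=20)  # latent-factor alternative to PCA
X_fa = fa.fit_transform(X_scaled)

print("PCA output shape:", X_pca.shape,
      "explained variance:", round(pca.explained_variance_ratio_.sum(), 3))
print("Factor Analysis output shape:", X_fa.shape)
```

The reduced matrices can then be fed to a downstream classifier or clustering algorithm in place of the original hundreds of attributes.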