
Orchestrating the Development Lifecycle of Machine Learning-Based IoT Applications: A Taxonomy and Survey

Published by: Jie Su
Publication date: 2019
Research field: Informatics Engineering
Paper language: English

Machine Learning (ML) and the Internet of Things (IoT) are complementary advances: ML techniques unlock the full potential of IoT with intelligence, and IoT applications increasingly feed data collected by sensors into ML models, employing the results to improve their business processes and services. Hence, orchestrating ML pipelines that encompass model training and inference within the holistic development lifecycle of an IoT application often leads to complex system integration. This paper provides a comprehensive and systematic survey of the development lifecycle of ML-based IoT applications. We outline a core roadmap and taxonomy, and subsequently assess and compare the existing standard techniques used at each individual stage.
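As a rough illustration of the kind of staged orchestration the survey's taxonomy covers (data acquisition, training, and deployment for edge inference), the sketch below chains hypothetical lifecycle stages in Python. None of these names come from the paper, and a real ML-IoT pipeline would delegate each stage to dedicated tooling.

```python
# Minimal sketch of a staged ML-IoT pipeline. All class and function names
# here are hypothetical illustrations, not taken from the surveyed paper.
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Pipeline:
    """Chains lifecycle stages, e.g. data collection -> training -> deployment."""
    stages: List[Callable] = field(default_factory=list)

    def add(self, stage: Callable) -> "Pipeline":
        self.stages.append(stage)
        return self

    def run(self, payload):
        # Each stage consumes the previous stage's output, mirroring how
        # orchestration frameworks pass artifacts between lifecycle steps.
        for stage in self.stages:
            payload = stage(payload)
        return payload

def collect(sensors):   # data acquisition from IoT sensors
    return [s["value"] for s in sensors]

def train(readings):    # model training (here: a trivial mean "model")
    return {"threshold": sum(readings) / len(readings)}

def deploy(model):      # package the model as a callable for edge inference
    return lambda x: x > model["threshold"]

pipeline = Pipeline().add(collect).add(train).add(deploy)
predict = pipeline.run([{"value": 20.5}, {"value": 22.1}, {"value": 19.8}])
print(predict(23.0))    # True: reading above the learned threshold
```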


Read also

Our experience of the world is multimodal - we see objects, hear sounds, feel texture, smell odors, and taste flavors. Modality refers to the way in which something happens or is experienced and a research problem is characterized as multimodal when it includes multiple such modalities. In order for Artificial Intelligence to make progress in understanding the world around us, it needs to be able to interpret such multimodal signals together. Multimodal machine learning aims to build models that can process and relate information from multiple modalities. It is a vibrant multi-disciplinary field of increasing importance and with extraordinary potential. Instead of focusing on specific multimodal applications, this paper surveys the recent advances in multimodal machine learning itself and presents them in a common taxonomy. We go beyond the typical early and late fusion categorization and identify broader challenges that are faced by multimodal machine learning, namely: representation, translation, alignment, fusion, and co-learning. This new taxonomy will enable researchers to better understand the state of the field and identify directions for future research.
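To make the fusion distinction concrete (the categorization this survey goes beyond), here is a toy Python sketch contrasting early fusion, which concatenates modality features before one joint model, with late fusion, which combines per-modality decisions. The feature sizes and linear scorers are invented for illustration.

```python
# Toy illustration of early vs. late fusion of two modalities (e.g. an image
# embedding and an audio embedding). Purely illustrative values.
import numpy as np

rng = np.random.default_rng(0)
image_feat = rng.normal(size=128)   # stand-in for a visual embedding
audio_feat = rng.normal(size=64)    # stand-in for an acoustic embedding

# Early fusion: concatenate modality features, then apply one joint model.
w_joint = rng.normal(size=128 + 64)
early_score = float(np.concatenate([image_feat, audio_feat]) @ w_joint)

# Late fusion: score each modality with its own model, then combine decisions.
w_img, w_aud = rng.normal(size=128), rng.normal(size=64)
late_score = 0.5 * float(image_feat @ w_img) + 0.5 * float(audio_feat @ w_aud)

print(early_score, late_score)
```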
The application of machine learning (ML) algorithms is massively scaling up due to rapid digitization and the emergence of new technologies like the Internet of Things (IoT). In today's digital era, we can find ML algorithms being applied in areas such as healthcare, IoT, engineering, finance and so on. However, all these algorithms need to be trained in order to predict/solve a particular problem, and there is a high possibility of the training datasets being tampered with to produce biased results. Hence, in this article, we propose a blockchain-based solution to secure the datasets generated from IoT devices for e-health applications. The proposed solution uses a private cloud to tackle the aforementioned issue. For evaluation, we have developed a system that can be used by dataset owners to secure their data.
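A minimal sketch of the tamper-evidence idea behind such a blockchain-backed dataset registry is shown below: each block records the SHA-256 digest of a dataset snapshot plus the hash of the previous block, so altering recorded data breaks the chain. This is a generic illustration, not the authors' implementation.

```python
# Hash-chained registry of dataset snapshots; a generic illustration of the
# tamper-evidence property, not the paper's actual blockchain design.
import hashlib, json, time

def block_hash(block: dict) -> str:
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

def add_block(chain: list, dataset_bytes: bytes) -> None:
    chain.append({
        "index": len(chain),
        "timestamp": time.time(),
        "dataset_digest": hashlib.sha256(dataset_bytes).hexdigest(),
        "prev_hash": block_hash(chain[-1]) if chain else "0" * 64,
    })

def verify(chain: list) -> bool:
    # Chain linkage check; a dataset itself is re-hashed and compared with
    # its recorded digest to detect tampering with the training data.
    return all(chain[i]["prev_hash"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))

chain: list = []
add_block(chain, b"patient_readings_v1.csv contents")
add_block(chain, b"patient_readings_v2.csv contents")
print(verify(chain))   # True; editing any recorded block afterwards breaks verification
```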
Machine learning (ML) tasks are becoming ubiquitous in today's network applications. Federated learning has emerged recently as a technique for training ML models at the network edge by leveraging processing capabilities across the nodes that collect the data. There are several challenges with employing conventional federated learning in contemporary networks, due to the significant heterogeneity in compute and communication capabilities that exists across devices. To address this, we advocate a new learning paradigm called fog learning, which intelligently distributes ML model training across the continuum of nodes from edge devices to cloud servers. Fog learning enhances federated learning along three major dimensions: network, heterogeneity, and proximity. It considers a multi-layer hybrid learning framework consisting of heterogeneous devices with various proximities. It accounts for the topology structures of the local networks among the heterogeneous nodes at each network layer, orchestrating them for collaborative/cooperative learning through device-to-device (D2D) communications. This migrates from the star network topologies used for parameter transfers in federated learning to more distributed topologies at scale. We discuss several open research directions toward realizing fog learning.
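The following sketch illustrates, under simplifying assumptions, the hierarchical averaging a fog-style framework could perform: parameters trained locally on devices are averaged at their edge node, and the cloud then averages the edge-level aggregates, weighted by sample counts as in FedAvg. The model is reduced to a plain parameter vector; a real framework also handles topology and D2D scheduling, which this sketch omits.

```python
# Hierarchical (fog-style) model averaging: device -> edge -> cloud.
# Weighted by local sample counts, as in FedAvg; illustrative numbers only.
import numpy as np

def weighted_average(params, counts):
    return np.average(np.stack(params), axis=0, weights=np.asarray(counts, dtype=float))

# Layer 1: devices grouped under two edge nodes (parameters after local training).
edge_a_devices = [(np.array([1.0, 2.0]), 100), (np.array([1.2, 1.8]), 50)]
edge_b_devices = [(np.array([0.8, 2.4]), 200)]

# Layer 2: each edge node aggregates its own devices over local/D2D links.
edge_a = (weighted_average(*zip(*edge_a_devices)), sum(n for _, n in edge_a_devices))
edge_b = (weighted_average(*zip(*edge_b_devices)), sum(n for _, n in edge_b_devices))

# Layer 3: the cloud aggregates the edge-level models into a global model.
global_model = weighted_average([edge_a[0], edge_b[0]], [edge_a[1], edge_b[1]])
print(global_model)
```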
Bioinformatics pipelines depend on shared POSIX filesystems for their input, output and intermediate data storage. Containerization makes it more difficult for such workloads to access shared file systems. In our previous study, we were able to run both ML and non-ML pipelines on Kubeflow successfully; however, the storage solutions were complex and suboptimal, because Kubernetes has no established resource type to represent the concept of a data source. More and more applications are running on Kubernetes for batch processing, and end users are burdened with configuring and optimising data access, which is what we experienced ourselves. In this article, we introduce the new concept of a Dataset and its corresponding resource as a native Kubernetes object. We have leveraged the Dataset Lifecycle Framework (DLF), which takes care of all the low-level details of data access in Kubernetes pods. Its pluggable architecture is designed for the development of caching, scheduling and governance plugins; together, they manage the entire lifecycle of the custom resource Dataset. We use the Dataset Lifecycle Framework to serve data from object stores to both ML and non-ML pipelines running on Kubeflow. With DLF, training data is fed into ML models directly without being downloaded to local disks, which makes the input scalable. We have enhanced the durability of training metadata by storing it in a dataset, which also simplifies the setup of TensorBoard, separated from the notebook server. For the non-ML pipeline, we have simplified the 1000 Genome Project pipeline with datasets injected into the pipeline dynamically. In addition, our preliminary results indicate that the pluggable caching mechanism can improve performance significantly.
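For readers unfamiliar with custom resources, the sketch below shows how a Dataset object could be registered from Python via the official kubernetes client. The group/version and spec fields are placeholders rather than the actual DLF schema, which should be taken from the DLF documentation; only the client call itself is standard.

```python
# Sketch of creating a Dataset custom resource with the official kubernetes
# client. The apiVersion/group/plural and spec fields below are placeholders
# standing in for the real DLF Dataset CRD; consult the DLF docs for the schema.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod

dataset = {
    "apiVersion": "datasets.example.io/v1alpha1",   # placeholder group/version
    "kind": "Dataset",
    "metadata": {"name": "training-data"},
    "spec": {                                        # placeholder object-store details
        "type": "S3",
        "endpoint": "https://s3.example.com",
        "bucket": "genomes",
        "secretRef": "s3-credentials",
    },
}

client.CustomObjectsApi().create_namespaced_custom_object(
    group="datasets.example.io", version="v1alpha1",
    namespace="default", plural="datasets", body=dataset,
)
```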
The emerging Internet of Things (IoT) is facing significant scalability and security challenges. On the one hand, IoT devices are weak and need external assistance; edge computing provides a promising direction for addressing the deficiency of centralized cloud computing in scaling to a massive number of devices. On the other hand, IoT devices are also relatively vulnerable to malicious hackers due to resource constraints. The emerging blockchain and smart contract technologies bring a series of new security features for IoT and edge computing. In this paper, to address these challenges, we design and prototype an edge-IoT framework named EdgeChain based on blockchain and smart contracts. The core idea is to integrate a permissioned blockchain and its internal currency or coin system to link the edge cloud resource pool with each IoT device's account, resource usage, and hence behavior. EdgeChain uses a credit-based resource management system to control how much resource IoT devices can obtain from edge servers, based on pre-defined rules on priority, application types and past behavior. Smart contracts are used to enforce the rules and policies that regulate IoT device behavior in a non-deniable and automated manner. All IoT activities and transactions are recorded into the blockchain for secure data logging and auditing. We implement an EdgeChain prototype and conduct extensive experiments to evaluate these ideas. The results show that while gaining the security benefits of blockchain and smart contracts, the cost of integrating them into EdgeChain is within a reasonable and acceptable range.
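The toy sketch below illustrates the credit-based admission idea in isolation: a device's request is granted only while it fits the device's remaining credit, and flagged misbehavior reduces its future allowance. The rule values are invented for illustration; in EdgeChain such policies are enforced by smart contracts and the transactions are logged on the blockchain.

```python
# Toy credit-based resource admission, illustrating the policy idea only;
# EdgeChain encodes such rules in smart contracts on a permissioned blockchain.
from dataclasses import dataclass

@dataclass
class DeviceAccount:
    device_id: str
    credit: float          # spendable resource credit (e.g. CPU-seconds)
    priority: float = 1.0  # pre-defined per application type

def request_resources(acct: DeviceAccount, amount: float) -> bool:
    granted = amount <= acct.credit * acct.priority
    if granted:
        acct.credit -= amount   # in EdgeChain, spending credit is recorded on-chain
    return granted

def penalize(acct: DeviceAccount, factor: float = 0.5) -> None:
    acct.credit *= factor       # past misbehavior lowers the future allowance

cam = DeviceAccount("camera-17", credit=10.0)
print(request_resources(cam, 4.0))   # True, 6.0 credit remains
penalize(cam)                        # flagged behavior halves remaining credit
print(request_resources(cam, 5.0))   # False, only 3.0 credit left
```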
