Machine Learning at the Edge

4.1 Overview of ML at the Edge

4.2 ML Training at the Edge

4.3 ML Model Optimization at the Edge

The Need for Model Optimization at the Edge

Given their constrained resources, along with the inherently dynamic environments in which edge devices must operate, model optimization is a crucial part of machine learning in edge computing. The most widely used methodology today consists of simply specifying an exceptionally large set of parameters and letting the model train on it. This is feasible when hardware is powerful, and is necessary for systems such as Large Language Models (LLMs), but it is no longer viable for the devices and environments at the edge. It is crucial to identify the best parameters and training methodology so as to minimize the work done by these devices while compromising as little as possible on model accuracy. There are multiple ways to do this, including optimization or augmentation of the dataset itself, or optimization of how work is partitioned among the edge devices.
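As one concrete illustration of shrinking a model's footprint for edge hardware, the sketch below applies post-training dynamic quantization in PyTorch, which stores linear-layer weights in int8 instead of float32. This is a generic, widely used technique rather than a method taken from the papers cited here, and the tiny model is purely hypothetical.

  import torch
  import torch.nn as nn

  # Toy network standing in for a model to be deployed on an edge device.
  model = nn.Sequential(
      nn.Linear(128, 64),
      nn.ReLU(),
      nn.Linear(64, 10),
  )
  model.eval()

  # Post-training dynamic quantization: Linear weights are stored as int8
  # and dequantized on the fly, shrinking the model and typically speeding
  # up CPU inference with only a small accuracy cost.
  quantized = torch.quantization.quantize_dynamic(
      model, {nn.Linear}, dtype=torch.qint8
  )

  # The quantized model is a drop-in replacement for the original.
  x = torch.randn(1, 128)
  print(quantized(x).shape)  # torch.Size([1, 10])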

Edge and Cloud Collaboration

One methodology that is often used involves collaboration between edge and cloud devices. The cloud can process workloads that require far more resources than edge devices can provide. Edge devices, on the other hand, can store and process data locally, offering lower latency and better privacy. Given the respective advantages of each, many have proposed that the best way to handle machine learning is through a combination of edge and cloud computing.

The primary issue facing this computing paradigm, however, is optimally selecting which workloads should run on the cloud and which on the edge. This is a crucial problem to solve, as a correct partition of workloads is the best way to ensure that the respective benefits of the devices are leveraged. A common approach is to run representative computing tasks on each device and measure the time and resources they consume; an example of this is the profiling step in EdgeShard. Other frameworks implement similar steps, testing the capabilities of different devices in order to allocate workloads and determine the limit up to which each device can operate efficiently. If a workload is beyond the limits of the edge devices, it can be sent to the cloud for processing.
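To make the idea concrete, the following simplified sketch profiles a task on each edge device and keeps it at the edge only if some device meets a latency budget, otherwise offloading it to the cloud. The Device class, the speed numbers, and the budget are illustrative assumptions, not EdgeShard's actual interfaces; a real profiler would also account for memory and network bandwidth.

  import time

  class Device:
      # Toy stand-in for an edge node or the cloud; `speed` is how many
      # units of work it completes per second (hypothetical numbers).
      def __init__(self, name, speed):
          self.name, self.speed = name, speed

      def run(self, work_units):
          time.sleep(work_units / self.speed)  # simulate the computation

  def profile(device, work_units, runs=3):
      # Profiling step: average latency of the task on this device.
      start = time.perf_counter()
      for _ in range(runs):
          device.run(work_units)
      return (time.perf_counter() - start) / runs

  def place(work_units, edge_devices, cloud, latency_budget):
      # Keep the task on the fastest edge device that meets the latency
      # budget; otherwise send it to the cloud.
      timings = {d: profile(d, work_units) for d in edge_devices}
      best = min(timings, key=timings.get)
      return best if timings[best] <= latency_budget else cloud

  edge = [Device("phone", speed=50.0), Device("raspberry-pi", speed=10.0)]
  cloud = Device("cloud", speed=500.0)
  chosen = place(work_units=1.0, edge_devices=edge, cloud=cloud, latency_budget=0.05)
  print("run on:", chosen.name)  # phone: 0.02 s per run fits the budget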

The key advantage of this approach is that it utilizes the resources of the edge devices wherever possible, allowing increased data privacy and lower latency. Since workloads are only processed in the cloud as needed, data is not constantly sent back and forth, reducing overall processing time. It also greatly reduces network congestion, which is crucial for many applications.

Optimizing Workload Partitioning

The key idea behind much of the optimization done in machine learning on edge systems is fully utilizing the heterogeneous devices these systems often contain. As such, it is important to understand the capabilities of each device so as to fully exploit its advantages. Devices can vary greatly, from smartphones with relatively powerful computational abilities, to Raspberry Pis, to simple sensors. More difficult tasks are offloaded to the powerful devices, while simpler tasks, or models that have already been partially pretrained, can be sent to the smaller devices. In some cases, as in Mobile-Edge, a task may be dropped altogether if the available resources are deemed insufficient. In this way, exceptionally difficult tasks do not block tasks that can be executed, and the system can continue working.
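A minimal sketch of such capability-aware placement with task dropping is shown below. Each task goes to the candidate device with the most remaining headroom, and a task that no device can accommodate is dropped, in the spirit of Mobile-Edge's admission control; the device names, capacity units, and demands are illustrative assumptions, not values from the cited paper.

  # Hypothetical capacities (abstract compute units) for a heterogeneous system.
  devices = {"smartphone": 8.0, "raspberry-pi": 2.0, "sensor-hub": 0.5}
  load = {name: 0.0 for name in devices}

  def assign(task_name, demand):
      # Candidates are devices whose remaining capacity covers the demand.
      candidates = [d for d in devices if devices[d] - load[d] >= demand]
      if not candidates:
          # No device is sufficient: drop the task so feasible work
          # is not blocked behind it.
          print(f"{task_name}: dropped (insufficient resources)")
          return None
      # Prefer the device with the most remaining headroom.
      chosen = max(candidates, key=lambda d: devices[d] - load[d])
      load[chosen] += demand
      print(f"{task_name}: -> {chosen}")
      return chosen

  assign("object-detection", demand=7.0)    # heavy task -> smartphone
  assign("keyword-spotting", demand=1.5)    # lighter task -> raspberry-pi
  assign("video-transcoding", demand=12.0)  # exceeds every device -> dropped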

References

M. Zhang, X. Shen, J. Cao, Z. Cui and S. Jiang, "EdgeShard: Efficient LLM Inference via Collaborative Edge Computing," in IEEE Internet of Things Journal, doi: 10.1109/JIOT.2024.3524255. Available: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10818760

Mobile-Edge: https://ieeexplore.ieee.org/abstract/document/8690980