= 5.2 Federated Learning Architectures =

Federated Learning (FL) can be implemented through various architectural configurations, each defining how clients interact, how updates are aggregated, and how trust and responsibility are distributed. These architectures play a central role in determining the scalability, fault tolerance, communication overhead, and privacy guarantees of a federated system. In edge computing environments, where client devices are heterogeneous and network reliability varies, the choice of architecture significantly affects the efficiency and robustness of learning.

The three dominant paradigms are centralized, decentralized, and hierarchical architectures. Each of these approaches balances different trade-offs in terms of coordination complexity, system resilience, and resource allocation.

[[File:Architecture.png|thumb|center|800px|Visual comparison of Cloud-Based, Edge-Based, and Hierarchical Federated Learning architectures. Source: <sup>[1]</sup>]]

=== 5.2.1 Centralized Architecture ===

In the centralized FL architecture, a central server or cloud orchestrator is responsible for all coordination, aggregation, and distribution activities. The server begins each round by broadcasting a global model to a selected subset of client devices, which then perform local training using their private data. After completing local updates, clients send their modified model parameters—usually in the form of weight vectors or gradients—back to the server. The server performs aggregation, typically using algorithms such as Federated Averaging (FedAvg), and sends the updated global model to the clients for the next round of training.

The centralized model is appealing for its simplicity and compatibility with existing cloud-to-client infrastructures. It is relatively easy to deploy, manage, and scale in environments with stable connectivity and limited client churn. However, its reliance on a single server introduces critical vulnerabilities. The server becomes a bottleneck under high communication loads and a single point of failure if it experiences downtime or compromise. Furthermore, this architecture requires clients to trust the central aggregator with metadata, model parameters, and access scheduling. In privacy-sensitive or high-availability contexts, these limitations can restrict centralized FL’s applicability.<sup>[1]</sup>
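The server-side aggregation step can be made concrete with a short sketch. The following is a minimal, framework-agnostic FedAvg implementation in Python with NumPy; the function name, the per-layer list-of-arrays layout, and the use of local sample counts as aggregation weights are illustrative assumptions rather than any particular FL library's API.

<syntaxhighlight lang="python">
import numpy as np

def fedavg(client_weights, client_sizes):
    """Sample-count-weighted average of client models (FedAvg).

    client_weights: one entry per client, each a list of np.ndarray
                    holding that client's layer parameters.
    client_sizes:   number of local training samples per client,
                    used as the aggregation weights.
    """
    total = sum(client_sizes)
    num_layers = len(client_weights[0])
    global_weights = []
    for layer in range(num_layers):
        # Sum each client's layer, scaled by its share of the data.
        layer_avg = sum(
            (n / total) * w[layer]
            for w, n in zip(client_weights, client_sizes)
        )
        global_weights.append(layer_avg)
    return global_weights
</syntaxhighlight>

In a full round, the server would broadcast <code>global_weights</code> back to the next selected subset of clients; weighting by sample count keeps clients with more data proportionally more influential on the global model.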
=== 5.2.2 Decentralized Architecture ===

Decentralized FL removes the need for a central server altogether. Instead, client devices interact directly with each other to share and aggregate model updates. These peer-to-peer (P2P) networks may operate using structured overlays, such as ring topologies or blockchain systems, or employ gossip-based protocols for stochastic update dissemination. In some implementations, clients collaboratively compute weighted averages or perform federated consensus to update the global model in a distributed fashion.

This architecture significantly enhances system robustness, resilience, and trust decentralization. There is no single point of failure, and the absence of a central coordinator eliminates risks of aggregator bias or compromise. Moreover, decentralized FL supports federated learning in contexts where participants belong to different organizations or jurisdictions and cannot rely on a neutral third party.

However, these benefits come at the cost of increased communication overhead, complex synchronization requirements, and difficulties in managing convergence—especially under non-identical data distributions and asynchronous updates. Protocols for secure communication, update verification, and identity authentication are necessary to prevent malicious behavior and ensure model integrity. Due to these complexities, decentralized FL is an active area of research and is best suited for scenarios requiring strong autonomy and fault tolerance.<sup>[2]</sup>

=== 5.2.3 Hierarchical Architecture ===

Hierarchical FL is a hybrid architecture that introduces one or more intermediary layers—often called edge servers or aggregators—between clients and the global coordinator. In this model, clients are organized into logical or geographical groups, with each group connected to an edge server. Clients send their local model updates to their respective edge aggregator, which performs preliminary aggregation. The edge servers then send their aggregated results to the cloud server, where final aggregation occurs to produce the updated global model.

This multi-tiered architecture is designed to address the scalability and efficiency challenges inherent in centralized systems while avoiding the coordination overhead of full decentralization. Hierarchical FL is especially well-suited for edge computing environments where data, clients, and compute resources are distributed across structured clusters, such as hospitals within a healthcare network or base stations in a telecommunications infrastructure.

One of the key advantages of hierarchical FL is communication optimization. By aggregating locally at edge nodes, the amount of data transmitted over wide-area networks is significantly reduced. Additionally, this model supports region-specific model personalization by allowing edge servers to maintain specialized sub-models adapted to local client behavior. Hierarchical FL also enables asynchronous and fault-tolerant training by isolating disruptions within specific clusters. However, this architecture still depends on reliable edge aggregators and introduces new challenges in cross-layer consistency, scheduling, and privacy preservation across multiple tiers.<sup>[1]</sup><sup>[3]</sup>
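The two-tier aggregation described above can be sketched in a few lines. The following Python example is a simplified illustration, not a production design: the data layout (each cluster as a list of <code>(update, num_samples)</code> pairs from the clients attached to one edge server) and the function names are assumptions made for clarity.

<syntaxhighlight lang="python">
import numpy as np

def weighted_average(updates, sizes):
    """Sample-count-weighted average of parameter vectors."""
    total = sum(sizes)
    return sum((n / total) * u for u, n in zip(updates, sizes))

def hierarchical_round(clusters):
    """One round of two-tier (edge + cloud) aggregation.

    clusters: list of clusters, each a list of (update, num_samples)
              pairs from the clients attached to one edge server.
    """
    edge_results = []
    for cluster in clusters:
        updates = [u for u, _ in cluster]
        sizes = [n for _, n in cluster]
        # Tier 1: each edge server aggregates its own clients, so only
        # one aggregate per cluster crosses the wide-area network.
        edge_results.append((weighted_average(updates, sizes), sum(sizes)))
    # Tier 2: the cloud server combines the edge-level aggregates,
    # again weighted by the number of samples each one represents.
    cloud_updates = [u for u, _ in edge_results]
    cloud_sizes = [n for _, n in edge_results]
    return weighted_average(cloud_updates, cloud_sizes)

# Example with two clusters of clients holding 1-D parameter vectors.
c1 = [(np.array([1.0, 2.0]), 100), (np.array([3.0, 4.0]), 300)]
c2 = [(np.array([5.0, 6.0]), 600)]
print(hierarchical_round([c1, c2]))  # -> [4.0, 5.0]
</syntaxhighlight>

Note that with sample-count weighting at both tiers, the result is identical to a single flat FedAvg over all clients; the communication benefit is that only one aggregate per cluster, rather than one update per client, traverses the wide-area link to the cloud.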