Originated at Intel, OpenFL is an open source framework for training ML algorithms using the data-private collaborative learning paradigm of FL. OpenFL is designed to be a flexible, extensible and easily learnable tool for data scientists. OpenFL works with training pipelines built with both TensorFlow and PyTorch, and can be easily extended to other ML and deep learning frameworks.
Distributed Mean Estimation (DME)
DME is a central building block in FL, where clients send local gradients to a parameter server for averaging and updating the model. Due to communication constraints, clients often use lossy compression techniques to compress the gradients, resulting in estimation inaccuracies. DME is more challenging when clients have diverse network conditions, such as constrained communication budgets and packet losses. In such settings, DME techniques often incur a significant increase in the estimation error leading to degraded learning performance.
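The DME setup described above can be illustrated in a few lines: each client lossily compresses its local vector, and the server averages the decompressed results and measures the estimation error. The sketch below uses simple unbiased stochastic quantization as the stand-in compressor; it is not OpenFL's or EDEN's actual pipeline, and all names are illustrative.

```python
import numpy as np

def stochastic_quantize(v, bits, rng):
    """Lossy compression: unbiased stochastic quantization to 2**bits levels."""
    levels = 2**bits - 1
    lo, hi = v.min(), v.max()
    scale = (hi - lo) / levels
    normalized = (v - lo) / scale            # each coordinate in [0, levels]
    floor = np.floor(normalized)
    round_up = rng.random(v.shape) < (normalized - floor)
    return lo + (floor + round_up) * scale   # dequantized; E[estimate] = v

rng = np.random.default_rng(0)
client_grads = [rng.standard_normal(1024) for _ in range(10)]

# Server side: average the lossy reconstructions instead of the true vectors.
true_mean = np.mean(client_grads, axis=0)
estimate = np.mean(
    [stochastic_quantize(g, bits=1, rng=rng) for g in client_grads], axis=0
)

# Normalized mean squared error: a standard DME quality metric.
nmse = np.sum((estimate - true_mean) ** 2) / np.sum(true_mean**2)
print(f"NMSE at 1 bit/coordinate: {nmse:.4f}")
```

The gap between `estimate` and `true_mean` is exactly the estimation error that stronger DME techniques aim to shrink for a given bit budget.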
An Overview of EDEN
Motivated by DRIVE – “Deterministically Rounding Randomly Rotated Vectors” – a breakthrough compression algorithm introduced in 2021, VMware’s Research Group, in collaboration with Ran Ben Basat (UCL), Amit Portnoy (BGU), Gal Mendelson (Stanford), and Michael Mitzenmacher (Harvard), designed EDEN – “Efficient DME for Diverse Networks” – an algorithm that generalizes to arbitrary bandwidth constraints, heterogeneous clients, and lossy networks.
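In rough terms, DRIVE's rotate-then-round idea works like this: apply a shared, seeded random rotation, keep only the sign of each rotated coordinate plus a single scale, and invert the rotation on the server. The sketch below is an illustrative simplification, not the authors' implementation; the randomized Hadamard rotation, the least-squares scale choice, and the function names are all assumptions made for the example.

```python
import numpy as np

def fwht(x):
    """Orthonormal fast Walsh-Hadamard transform (length must be a power of 2)."""
    n = len(x)
    h = 1
    while h < n:
        x = x.reshape(-1, 2, h)
        x = np.concatenate([x[:, 0] + x[:, 1], x[:, 0] - x[:, 1]], axis=1).reshape(n)
        h *= 2
    return x / np.sqrt(n)

def compress(v, seed):
    """Rotate with a shared seeded rotation, then send 1 sign bit/coordinate + 1 scale."""
    signs = np.random.default_rng(seed).choice([-1.0, 1.0], size=len(v))
    r = fwht(v * signs)           # randomized Hadamard rotation
    scale = np.abs(r).mean()      # least-squares scale so that scale * sign(r) ~ r
    return np.sign(r), scale

def decompress(bits, scale, seed):
    """Server side: undo the rotation (the orthonormal Hadamard is its own inverse)."""
    signs = np.random.default_rng(seed).choice([-1.0, 1.0], size=len(bits))
    return fwht(bits * scale) * signs

v = np.random.default_rng(1).standard_normal(256)
bits, scale = compress(v, seed=7)
v_hat = decompress(bits, scale, seed=7)
print(f"relative L2 error: {np.linalg.norm(v - v_hat) / np.linalg.norm(v):.3f}")
```

Because the client and server derive the rotation from a shared seed, only the sign bits and one float cross the network. EDEN generalizes this scheme beyond one bit per coordinate to arbitrary bandwidth budgets, heterogeneous clients, and lossy links.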
“We are pleased to contribute to OpenFL as part of the Intel-VMware collaboration. Thanks to OpenFL’s well-designed and extensible architecture, we seamlessly integrated EDEN, our new compression pipeline, into OpenFL. We are confident that the OpenFL community will adopt EDEN and benefit from its unique compression capabilities and system efficiency, and we look forward to further collaboration.”
Senior Researcher Yaniv Ben-Itzhak and Researcher Shay Vargaftik, OCTO, VMware
“Scalability is a key pillar in the design of OpenFL. VMware’s contribution of EDEN to OpenFL will have an immediate impact on these goals by broadening the environments in which the framework can operate while simultaneously reducing network traffic across nodes. Intel is committed to its collaboration with the OCTO Research Group and VMware on EDEN to drive innovation in the increasingly important field of federated learning.”
Prashant Shah – Senior Principal AI Engineer, Intel
Yaniv Ben-Itzhak and Shay Vargaftik from VMware’s OCTO have extensively evaluated EDEN across different scenarios, bandwidth constraints, and network conditions, and observed consistent improvement over state-of-the-art DME techniques. In one experiment, they demonstrated that EDEN achieves accuracy competitive with the uncompressed baseline using only 0.5 bits per coordinate (a 64x compression ratio relative to 32-bit floats) in a federated scenario with 10 participants training a ResNet18 model on the CIFAR100 dataset. No other tested DME technique converged under such a low-bandwidth constraint.
Both DRIVE and EDEN are promising steps toward more network-efficient federated machine learning (FML). Yaniv and Shay look forward to continuing to study the compression schemes used in FML systems and devising ways to improve them.
To learn more about how the Research Group achieved this critical milestone in OCTO’s FML work, visit their recent OCTO blog post, Pushing the Limits of Network Efficiency for Federated Machine Learning.