Federated learning is a collaborative machine learning method that trains a model on data where it already resides, without moving or copying it. Unlike traditional machine learning systems, which need the training data to be centralised on a single machine or in a data centre, federated learning trains algorithms across several decentralised edge devices or servers.
This technique allows devices such as mobile phones to build a shared prediction model together while keeping the training data on the device instead of storing it in the cloud.
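To make the idea concrete, here is a minimal sketch of federated averaging, one common way to combine locally trained models. It is a toy, pure-Python illustration under stated assumptions rather than the method of any framework listed below: the model is a one-dimensional linear regression, and the function names (`local_update`, `federated_average`, `make_client_data`) and the synthetic data are hypothetical.

```python
import random

def local_update(weights, data, lr=0.05, epochs=5):
    """One client's training pass for a 1-D linear model y = w*x + b.

    Only the updated weights are returned; `data` never leaves the client.
    """
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y   # prediction error on one example
            w -= lr * err * x       # SGD step on the squared error
            b -= lr * err
    return (w, b)

def federated_average(client_weights, client_sizes):
    """Average client weights, weighting each client by its dataset size."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)

def make_client_data(n, rng):
    """Hypothetical private dataset drawn from y = 2x + 1 plus small noise."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, 2 * x + 1 + rng.gauss(0, 0.01)) for x in xs]

rng = random.Random(0)
clients = [make_client_data(n, rng) for n in (20, 50, 30)]

global_weights = (0.0, 0.0)
for _ in range(30):  # communication rounds
    # Each client starts from the current global model and trains locally.
    updates = [local_update(global_weights, data) for data in clients]
    # The server sees only the weights, never the raw examples.
    global_weights = federated_average(updates, [len(d) for d in clients])

print(global_weights)  # should end up close to (2, 1)
```

The server only ever handles weight tuples, which is the property that lets the raw training data stay on each device.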
Here are some interesting resources that will help you learn about federated learning.
Substra is a federated learning software framework developed by a multi-partner research collaboration centred on Owkin, a French company founded in 2016. Substra targets the medical sector, with a focus on data ownership and confidentiality. Today, it is used for drug discovery in the pharmaceutical sector as part of the MELLODDY project.
Substra supports several interfaces for various user types. It has a Python library for data scientists, command-line interfaces for administrators, and graphical user interfaces for project managers and other advanced users. Substra’s deployment requires a sophisticated Kubernetes configuration for each node.
PySyft is an open-source Python 3 library that combines federated learning, differential privacy, and encrypted computation to support privacy-preserving machine learning for research purposes. It was created by the OpenMined community and primarily works with deep learning frameworks like PyTorch and TensorFlow.
PySyft can perform two kinds of computations:
- Dynamic computations, which run directly on data that cannot be observed
- Static computations, which are graphs of calculations that can be executed later in a different computing environment
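The difference between the two modes can be illustrated without PySyft itself. The sketch below is a toy example that does not use the PySyft API: the `DeferredGraph` class is hypothetical and only shows the general idea of recording a computation now and running it later, next to the data.

```python
class DeferredGraph:
    """Records operations now so they can be replayed later on other data."""

    def __init__(self):
        self.steps = []  # ordered list of (name, function) pairs

    def add(self, name, fn):
        self.steps.append((name, fn))
        return self  # allow chaining

    def run(self, value):
        """Execute the recorded steps, e.g. on the machine holding the data."""
        for _, fn in self.steps:
            value = fn(value)
        return value

# Dynamic (eager) style: each operation executes as soon as it is written.
local_data = [1.0, 2.0, 3.0]
eager_result = sum(x * 2 for x in local_data)

# Static style: describe the computation first, ship it, run it elsewhere.
graph = (DeferredGraph()
         .add("double", lambda xs: [x * 2 for x in xs])
         .add("total", sum))

remote_data = [10.0, 20.0]  # stands in for data the graph's author never sees
static_result = graph.run(remote_data)

print(eager_result, static_result)
```

In the static case the author of the graph only needs the final result back, which is why this style suits data that must stay in another environment.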
PySyft is a library that defines objects, machine learning methods, and abstractions. On its own, it cannot handle data science tasks that require network communication; those need a companion package, PyGrid. PyGrid is the API used to manage and scale PySyft, and it supports federated learning on the web, mobile, edge devices, and many terminal types. PyGrid itself can be managed through PyGrid Admin.
Intel Open Federated Learning (OpenFL) is an open-source Python 3 project developed by Intel to apply federated learning to sensitive data. OpenFL contains bash deployment scripts and uses certificates to secure communication, but users must manage most of this themselves.
The library comprises two parts: the collaborator, which trains the global model on its local dataset, and the aggregator, which receives model updates and aggregates them to produce the global model. OpenFL includes a Python API as well as a command-line interface. Because nodes communicate over mutual TLS (mTLS), every node in the federation needs its own certificate. To reduce communication costs, OpenFL supports lossy and lossless data compression. Developers can adjust logging, data split mechanisms, and aggregation logic in OpenFL.
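To show why compression matters for the updates a collaborator sends to the aggregator, here is a rough standard-library sketch that chains a lossy step (casting weights to 16-bit floats) with a lossless step (zlib). It does not use the OpenFL API or reproduce OpenFL's actual compression pipeline; the function names and the example update are assumptions made for illustration.

```python
import struct
import zlib

def compress_update(weights):
    """Lossy step: cast to float16. Lossless step: zlib over the raw bytes."""
    raw = struct.pack(f"<{len(weights)}e", *weights)  # 2 bytes per weight
    return zlib.compress(raw, 9)

def decompress_update(blob):
    """Invert the lossless step, then read the float16 values back."""
    raw = zlib.decompress(blob)
    count = len(raw) // 2
    return list(struct.unpack(f"<{count}e", raw))

# Hypothetical model update; updates with many zeros compress especially well.
update = [0.0] * 900 + [0.001 * i for i in range(100)]

blob = compress_update(update)
restored = decompress_update(blob)

full_size = len(struct.pack(f"<{len(update)}d", *update))  # float64 baseline
max_error = max(abs(a - b) for a, b in zip(update, restored))

print(full_size, len(blob), max_error)
```

The lossy cast trades a small rounding error for half (or a quarter) of the bytes per weight, while the lossless pass shrinks whatever redundancy remains without changing the values further.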
The Federated Learning (FL) Plan is the foundation for the OpenFL design philosophy. It’s a YAML file that defines the necessary collaborators, aggregators, connections, models, data, and any basic setup. To isolate federation contexts, OpenFL operates in Docker containers.
IBM Federated Learning provides a foundation for federated learning on which we can build advanced capabilities. It is not tied to any particular machine learning framework and supports a range of learning topologies, such as one built around a standard aggregator, as well as different protocols.
It is intended to provide a robust foundation for federated learning in enterprise and hybrid-cloud scenarios, accommodating many learning models and topologies. IBM Federated Learning is compatible with a variety of machine learning models, including:
- Models written in Keras, PyTorch, and TensorFlow
- Logistic regression, linear SVM, ridge regression, and other linear classifiers/regressions (with regularizers)
- ID3 Decision Tree
- DQN, DDPG, PPO, and other deep reinforcement learning algorithms
- Naive Bayes
NVIDIA CLARA is an application framework built for use cases in healthcare. It contains full-stack GPU-accelerated frameworks, SDKs, and reference applications to help developers, data scientists, and researchers build real-time, secure, and scalable federated learning systems. For example, CLARA is currently being used by the French startup Therapixel, which employs NVIDIA technology to improve the accuracy of breast cancer diagnosis.
NVIDIA CLARA covers the following use cases:
- Clara AGX for medical devices
- Clara Discovery for drug development
- Clara Guardian for smart hospitals
- Clara Imaging for medical imaging
- Clara Parabricks for genomic research