Vertical Federated Learning concept

Marc Deveaux
Jun 26, 2022



Notes on Vertical Federated Learning


1. Vertical federated learning overview

Introduction

Today’s AI still faces two major challenges:

  • One is that in most industries, data exists in the form of isolated islands
  • The other is the strengthening of data privacy and security regulations (for example, GDPR)

One answer to those challenges is Federated Learning

  • Federated learning is a machine learning technique that trains an algorithm across multiple decentralized servers holding local data samples, without exchanging them
  • This approach stands in contrast to traditional centralized machine learning techniques where all the local datasets are uploaded to one server
  • Federated learning enables multiple actors to build a common machine learning model without sharing data

Vertical Federated Learning Applicability

There are different types of federated learning techniques; in this post, we will talk about the “Vertical Federated Learning” technique. Vertical federated learning is applicable to cases where two data sets share the same sample ID space but differ in feature space, as the two lists and the sketch below illustrate.

Same sample ID (here, easy_id) across the two data sets

  • a user named Hector is in both dataset A and dataset B -> good
  • a user named Martha is only in dataset B -> bad

Different features

  • dataset A has columns “Age” and “Gender” while dataset B has “Personal income” -> good
  • dataset A and B have the same columns “Age” and “Gender” -> bad
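A minimal sketch of this setup, assuming two pandas data frames with a shared easy_id column and made-up feature values; only the users in the intersection can be used for vertical training:

```python
import pandas as pd

# Party A: demographic features (illustrative data)
df_a = pd.DataFrame({
    "easy_id": [1, 2, 3],          # Hector = 1; Martha is absent here
    "age": [34, 51, 28],
    "gender": ["M", "F", "F"],
})

# Party B: an income feature over a partially overlapping user base
df_b = pd.DataFrame({
    "easy_id": [1, 3, 4],          # Martha = 4 exists only in B
    "personal_income": [52000, 61000, 48000],
})

# Vertical FL applies to the overlap: same IDs, disjoint feature columns
common_ids = set(df_a["easy_id"]) & set(df_b["easy_id"])
print(sorted(common_ids))          # [1, 3] -> only shared users can be trained on
```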

Example

  • Consider two different companies in the same city: one is a bank, and the other is an e-commerce company. Their user lists are likely to contain most of the residents of the area, so the intersection of their user spaces is large (they have many users in common)
  • The bank records the user’s revenue and expenditure behavior, while the e-commerce company retains the user’s browsing and purchase history, so their feature spaces are very different
  • We want both parties to have a prediction model for product purchase based on user and product information. By exploiting the two different datasets, we have more features to build a better learning model
  • Vertical federated learning is the process of aggregating these different features and building a model in a privacy-preserving manner, using data from both parties collaboratively

Architecture

  • Part 1, encrypted entity alignment: since the user groups of the two companies are not identical, the system uses encryption-based user ID alignment techniques to confirm the common users of both parties without A or B exposing their respective data (a simplified sketch follows this list)
  • Part 2, encrypted model training: train the machine learning model on the common user list. In this process, only model parameters and intermediate results are shared, never the raw data
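As a rough illustration of part 1, the sketch below intersects salted hashes of the IDs instead of the raw IDs. This is a deliberate simplification: production systems use proper private set intersection (PSI) protocols, and the shared salt here is an assumption for the toy example:

```python
import hashlib

def blind(ids, salt):
    """Hash each ID with a shared salt so raw IDs are never exchanged."""
    return {hashlib.sha256((salt + str(i)).encode()).hexdigest(): i for i in ids}

salt = "shared-secret-salt"          # agreed upon out of band (assumption)
party_a = blind([1, 2, 3], salt)     # bank's user IDs
party_b = blind([1, 3, 4], salt)     # e-commerce company's user IDs

# Each party only sees the other's hashes; the overlap reveals common users
common_hashes = party_a.keys() & party_b.keys()
common_users = sorted(party_a[h] for h in common_hashes)
print(common_users)                  # [1, 3]
```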

2. Reminder on Fully Connected Layer Neural Network

To understand the step-by-step idea behind encrypted model training, we need to recall the basics of fully connected neural networks.

Layers and weights

Input features: the information that the network will attempt to learn about

Hidden units: where the information gets processed. The system becomes more “knowledgeable” as it goes along, filtering information through multiple hidden layers

Output layer: self-explanatory

Weights

  • Every unit is connected to the units in the neighboring layers
  • The connections between one unit and another are represented by a number called a weight
  • It can be either positive or negative
  • The higher the weight, the more influence one unit has on another

Information Flow

Basically (a toy version in code follows these steps):

  1. calculate output
  2. compare the calculated output against the real output
  3. update the weights
  4. repeat
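Here is that loop on a single-weight model; the training pair, initial weight, and learning rate are all made up for illustration:

```python
# Toy training loop: fit y = w * x on one example
x, y_true = 2.0, 10.0    # made-up training pair
w, lr = 0.5, 0.1         # initial weight and learning rate (assumptions)

for step in range(20):
    y_pred = w * x                 # 1. calculate output
    error = y_pred - y_true        # 2. compare against the real output
    w -= lr * error * x            # 3. update the weight (gradient step)
print(round(w, 3))                 # 4. repeated updates drive w toward 5.0
```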

Forward propagation

Goal: calculate the output probability (sketched in code after this list)

  • When a NN is being trained, information is fed into the network via the input units, which triggers the layers of hidden units and finally arrives at the output units
  • Each unit receives inputs from the units to its left, and the inputs are multiplied by the weights of the connections they travel along
  • Every unit adds up all the inputs it receives in this way, and if the sum is more than a certain threshold value, the unit “fires” and triggers the units it is connected to (those on its right)
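A minimal NumPy forward pass through one hidden layer. The layer sizes and random weights are assumptions, and the sigmoid is used as a smooth stand-in for the “fire above a threshold” rule:

```python
import numpy as np

def sigmoid(z):
    # Smooth version of "fire if the weighted sum exceeds a threshold"
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = np.array([0.2, 0.7])          # input features (made up)
W1 = rng.normal(size=(2, 3))      # input -> hidden weights
W2 = rng.normal(size=(3, 1))      # hidden -> output weights

h = sigmoid(x @ W1)               # each hidden unit sums its weighted inputs
y = sigmoid(h @ W2)               # output layer produces a probability
print(y)                          # a single value between 0 and 1
```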

Backward propagation

Goal: update the weights (sketched in code after this list)

  • The NN learns through feedback (being told whether what it is doing is right or wrong)
  • The NN compares the output it produced with the output it was meant to produce, and uses the difference between them to modify the weights of the connections between the units in the network
  • The NN changes the weights from the output units through the hidden units to the input units by going backward
  • Over time, backpropagation causes the network to learn, reducing the difference between actual and intended output until the two closely match
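Continuing the forward-pass sketch, here is a hand-written backward pass for the same two-layer network, using a squared-error loss and the chain rule (the target value and learning rate are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = np.array([0.2, 0.7])
W1 = rng.normal(size=(2, 3))
W2 = rng.normal(size=(3, 1))
target, lr = np.array([1.0]), 0.5   # intended output and step size (assumptions)

for _ in range(100):
    # forward pass
    h = sigmoid(x @ W1)
    y = sigmoid(h @ W2)
    # backward pass: propagate the error from the output back to the input weights
    d_y = (y - target) * y * (1 - y)     # output-layer error signal
    d_h = (d_y @ W2.T) * h * (1 - h)     # hidden-layer error signal
    W2 -= lr * np.outer(h, d_y)          # update hidden -> output weights
    W1 -= lr * np.outer(x, d_h)          # update input -> hidden weights

print(sigmoid(sigmoid(x @ W1) @ W2))     # the output has moved toward the target
```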

3. Vertical Federated Learning Step by Step

See for details: https://arxiv.org/pdf/2202.04309.pdf

  1. Select the common users across all the participating organizations
  2. Each local model does a forward propagation using its local data. No data or weights are shared across organizations
  3. Each local model transmits its forward output to the label owner. Forward outputs contain intermediate results of the local NN
  4. The top model does forward propagation. It connects all the local intermediate NN results and creates the final output
  5. The top model does backward propagation. Its parameters are updated
  6. Backward output transmission: gradients are sent back to each local model
  7. Local model backward propagation: each local model’s parameters are updated (a toy end-to-end sketch follows this list)

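Below is a toy end-to-end sketch of these seven steps with two parties and a label owner: linear “bottom” models, a logistic “top” model, and plain NumPy. All shapes, data, and hyperparameters are assumptions, and the encryption layer described in the paper is omitted for readability:

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Step 1: aligned common users (4 samples); each party holds different features
Xa = rng.normal(size=(4, 2))               # party A's local features
Xb = rng.normal(size=(4, 3))               # party B's local features
y = np.array([[0.], [1.], [1.], [0.]])     # labels held by the label owner

Wa = rng.normal(size=(2, 2)) * 0.1         # party A's bottom model
Wb = rng.normal(size=(3, 2)) * 0.1         # party B's bottom model
Wt = rng.normal(size=(4, 1)) * 0.1         # label owner's top model
lr = 0.5

for _ in range(200):
    # Steps 2-3: local forward passes; only intermediate outputs are transmitted
    Ha, Hb = Xa @ Wa, Xb @ Wb
    # Step 4: the top model concatenates the intermediates and predicts
    H = np.concatenate([Ha, Hb], axis=1)
    p = sigmoid(H @ Wt)
    # Step 5: top-model backward pass (logistic-loss gradient) and update
    d_out = p - y
    d_H = d_out @ Wt.T
    Wt -= lr * H.T @ d_out / len(y)
    # Step 6: gradients w.r.t. each party's intermediate are sent back
    d_Ha, d_Hb = d_H[:, :2], d_H[:, 2:]    # party A's intermediate is 2 columns wide
    # Step 7: each party updates its local bottom model
    Wa -= lr * Xa.T @ d_Ha / len(y)
    Wb -= lr * Xb.T @ d_Hb / len(y)

# Predictions for the 4 aligned samples should move toward y
print(np.round(sigmoid(np.concatenate([Xa @ Wa, Xb @ Wb], axis=1) @ Wt), 2))
```

Note how the label owner never sees Xa or Xb, only the intermediate matrices Ha and Hb, and each party only receives the gradient of the loss with respect to its own intermediate output.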