Machine learning models often retain learned information even after the data used to train them is deleted. This can lead to privacy concerns and regulatory compliance issues, especially when handling sensitive information. Machine unlearning addresses this challenge by enabling models to "forget" specific data without retraining from scratch, ensuring privacy and compliance while maintaining model performance.
The aim of this project is to develop a machine unlearning framework that improves both the efficiency and the accuracy of removing specific data from trained models. By combining a noise-induced impairment model with a student-teacher framework and curriculum learning, this project demonstrates measurable gains in both forget accuracy and retain accuracy.
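As a rough illustration of the idea (not the exact routines in unlearn.py), the sketch below first impairs the model on the forget set using noisy targets and then repairs it by distilling the original model's behaviour on the retain set. The function names, the form of the noise, and the loss weighting are all assumptions for the sake of the example, and the curriculum-learning schedule (ordering batches from easier to harder) is omitted for brevity.

```python
import torch
import torch.nn.functional as F

def impair_on_forget(model, forget_loader, optimizer, device, noise_std=0.1):
    """Noise-induced impairment (illustrative): push the model's outputs on the
    forget set toward random noisy targets to disrupt the learned associations."""
    model.train()
    for inputs, _ in forget_loader:
        inputs = inputs.to(device)
        outputs = model(inputs)
        # Random targets break the link between forget-set inputs and their labels.
        noisy_targets = torch.randn_like(outputs) * noise_std
        loss = F.mse_loss(outputs, noisy_targets)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

def repair_on_retain(student, teacher, retain_loader, optimizer, device, T=2.0):
    """Student-teacher repair (illustrative): distill the original (teacher)
    model's behaviour on the retain set back into the impaired student."""
    teacher.eval()
    student.train()
    for inputs, labels in retain_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        with torch.no_grad():
            teacher_logits = teacher(inputs)
        student_logits = student(inputs)
        # Temperature-scaled KL distillation plus a standard supervised term.
        distill = F.kl_div(
            F.log_softmax(student_logits / T, dim=1),
            F.softmax(teacher_logits / T, dim=1),
            reduction="batchmean",
        ) * (T * T)
        ce = F.cross_entropy(student_logits, labels)
        loss = distill + ce
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```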
- Download the repository and upload it to a folder on your drive.
- Navigate to the root directory containing the Main.ipynb file.
- In the second cell of Main.ipynb, change the folder path to the path of your folder on your drive (an illustrative version of this cell is shown after this list).
- Run the cells one by one; results will be produced accordingly. Results are saved to CSV files as specified in the scripts (look for filenames ending with -out).
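Assuming the notebook is run in Google Colab with the repository uploaded to Google Drive, the second cell might look roughly like this; the variable name and folder layout are illustrative, and the actual cell in Main.ipynb may differ.

```python
# Hypothetical second cell of Main.ipynb: mount Google Drive and point the
# working directory at the uploaded repository folder.
import os
from google.colab import drive

drive.mount('/content/drive')

folder_path = '/content/drive/MyDrive/machine-unlearning'  # change to your folder
os.chdir(folder_path)
```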
- dataset.py: Contains the code for downloading and arranging the data in the required format.
- metrics.py: Contains the code for evaluation metrics.
- model.py: Contains the code for the model architecture.
- unlearn.py: Contains the code for unlearning algorithms.
- utils.py: Contains the code for training utility functions.
- Download the data.
- Split the data into forget and retain sets as required.
- Train the original model on all the data.
- Evaluate the performance of the original model.
- Apply unlearning algorithms to the model.
- Evaluate the performance of the unlearned model (a sketch of this end-to-end workflow follows the list).
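The sketch below ties the steps above together. The imported function names are assumptions made for readability; the real entry points live in dataset.py, model.py, utils.py, unlearn.py, and metrics.py, and may be named differently.

```python
# Illustrative orchestration of the workflow; all imported names are hypothetical.
from dataset import load_data, split_forget_retain   # hypothetical names
from model import build_model                         # hypothetical name
from utils import train                               # hypothetical name
from unlearn import impair_and_repair                 # hypothetical name
from metrics import evaluate                          # hypothetical name

# 1. Download and prepare the data.
train_set, test_set = load_data()

# 2. Split the training data into forget and retain sets.
forget_set, retain_set = split_forget_retain(train_set, forget_classes=[0])

# 3. Train the original model on all the data.
model = build_model()
model = train(model, train_set)

# 4. Evaluate the original model.
print("Before unlearning:", evaluate(model, forget_set, retain_set, test_set))

# 5. Apply the unlearning algorithm.
unlearned_model = impair_and_repair(model, forget_set, retain_set)

# 6. Evaluate the unlearned model.
print("After unlearning:", evaluate(unlearned_model, forget_set, retain_set, test_set))
```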
- Forget Accuracy: Improved by 3%
- Retain Accuracy: Improved by 5%