Exploration of Convolutional Neural Networks for deepfake detection, culminating in a custom CNN achieving 75% validation accuracy on a challenging combined dataset of real and AI-generated images.
The proliferation of highly realistic deepfakes poses a significant challenge to digital media integrity. This project aimed to develop an effective deepfake detection model. Initially, the goal was to improve upon the ~70% accuracy of a CNN-based model by Karandikar et al. (2020), which utilized the VGG16 architecture.
The project used datasets such as FaceForensics++ (first-generation deepfakes) and Celeb-DF (v2) (more advanced deepfakes). The VGG16-based approach showed promise, reaching 90% validation accuracy on FaceForensics++ alone and 75.69% on a combined dataset, but it struggled with overfitting and model complexity on the harder, combined dataset.
This led to the development of a lighter, custom Convolutional Neural Network (CNN) built from scratch. This custom CNN, with approximately 6.5 million parameters, ultimately achieved a 75% validation accuracy on the challenging combined dataset, outperforming the VGG16-based approach on the same data. Before settling on CNNs, the project briefly explored DefakeHop++ and then moved away from it because of its limited use of deep learning techniques.
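To make the parameter figure concrete, the sketch below counts the trainable parameters of a small CNN layer by layer. The layer sizes (a 128x128 RGB input, three 3x3 convolutions each followed by 2x2 pooling, and a 200-unit dense layer) are hypothetical and are not the project's actual architecture. They are chosen only to show how a compact network lands in the same range as the roughly 6.5 million parameters quoted above.

```python
# Hypothetical layer sizes for illustration only; not the project's real model.

def conv2d_params(kernel: int, in_ch: int, out_ch: int) -> int:
    """Weights plus biases of a square-kernel 2D convolution."""
    return kernel * kernel * in_ch * out_ch + out_ch


def dense_params(in_units: int, out_units: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return in_units * out_units + out_units


def count_cnn_params(input_size: int = 128, channels=(3, 32, 64, 128),
                     hidden_units: int = 200) -> int:
    total = 0
    size = input_size
    # Each conv block: 3x3 'same' convolution, then 2x2 max pooling,
    # which halves the spatial size and adds no parameters.
    for in_ch, out_ch in zip(channels, channels[1:]):
        total += conv2d_params(3, in_ch, out_ch)
        size //= 2
    flat = size * size * channels[-1]          # flattened feature vector
    total += dense_params(flat, hidden_units)  # hidden dense layer
    total += dense_params(hidden_units, 1)     # single sigmoid output: real vs. fake
    return total


if __name__ == "__main__":
    print(f"{count_cnn_params():,} trainable parameters")
```

In this sketch almost all of the parameters sit in the first dense layer after flattening, so the input resolution and the width of that layer dominate the model size far more than the convolutional filters do.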
The project explored two main CNN-based approaches for deepfake detection after an initial pivot:
Validation Accuracy (Custom CNN on Combined FF++ & Celeb-DF v2): 75%
Validation Accuracy (VGG16-based on Combined FF++ & Celeb-DF v2): 75.69%
Validation Accuracy (VGG16-based on FaceForensics++ only): 90%
The original Karandikar et al. (2020) paper reported ~70% accuracy with their VGG16-based model. Both models developed in this project surpassed that on comparable tasks.
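For reference, validation accuracy for a binary real-vs-fake classifier is the fraction of held-out images whose predicted label matches the true label. The following is a minimal sketch, assuming the model outputs a probability that an image is fake and that a 0.5 decision threshold is used. The function name and the threshold are illustrative assumptions, not details taken from the project.

```python
def validation_accuracy(probs, labels, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label.

    probs  -- predicted probability that each image is fake (assumed model output)
    labels -- ground-truth labels, 1 for fake and 0 for real
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(int(p >= threshold) == y for p, y in zip(probs, labels))
    return correct / len(labels)


if __name__ == "__main__":
    # Toy values, not project data: two of the four predictions are correct.
    print(validation_accuracy([0.9, 0.2, 0.6, 0.4], [1, 0, 0, 1]))
```

Accuracy alone can be misleading when the real and fake classes are imbalanced, so comparisons across datasets are easier to interpret when the class ratio of each validation set is also reported.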
Several challenges were encountered and overcome during development:
Potential improvements and future directions for the model include: