While Facebook, Microsoft, and many others are banding together to make machine learning capable of detecting deepfakes in videos, we at Signzy are trying to solve a similar problem: detecting fakes in documents. On the journey of building a global digital trust system, we had to solve the major challenge of detecting image manipulations in identity documents.
Fig 1.0 Example of our forgery detection in action
In this blog, I will try to explain how we built an image manipulation detection system using deep learning.
The above images show how far image manipulation techniques have advanced. It takes considerable effort for a human to work out that an image is forged. The features that distinguish real from fake are few, which makes them difficult to detect with the human eye.
Our objective was to build a system that could detect manipulated document images.
Our first step was to create a dataset of forged documents to test the algorithm. Using our expertise and domain knowledge in this field, we came up with various scenarios for how an intruder would forge a document. The corresponding data for these scenarios was prepared by Photoshop experts.
The forged documents fell mostly into two categories.
- Copy paste: A region of the image copied from a particular document and pasted into a different document.
- Copy move: A region of the image copied from a particular document and pasted into the same document.
Copy paste is the type of forgery where a fraudster copies a face from one document into another document. Our goal was to detect these forged regions and to classify the document as fake or real.
The dataset that our Photoshop experts created manually was not large enough to train a deep learning solution. So we developed image processing algorithms that could generate synthetic forged data. With that, we were all set for experimentation.
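To make the idea of synthetic forged data concrete, here is a minimal sketch of how a copy paste or copy move sample could be generated along with its ground-truth mask. This is not our production generator; the function names `paste_region` and `random_forgery`, the box format, and the random stand-in images are all assumptions made for illustration.

```python
import numpy as np

def paste_region(source, target, src_box, dst_xy):
    """Copy src_box = (top, left, height, width) from `source` into `target`
    at dst_xy = (top, left). Returns the forged image and a binary mask."""
    top, left, h, w = src_box
    dst_top, dst_left = dst_xy
    forged = target.copy()
    forged[dst_top:dst_top + h, dst_left:dst_left + w] = source[top:top + h, left:left + w]
    mask = np.zeros(target.shape[:2], dtype=np.uint8)
    mask[dst_top:dst_top + h, dst_left:dst_left + w] = 1  # 1 marks forged pixels
    return forged, mask

def random_forgery(source, target, size, rng):
    """Pick a random source region and a random destination of the same size."""
    h, w = size
    top = rng.integers(0, source.shape[0] - h + 1)
    left = rng.integers(0, source.shape[1] - w + 1)
    dst_top = rng.integers(0, target.shape[0] - h + 1)
    dst_left = rng.integers(0, target.shape[1] - w + 1)
    return paste_region(source, target, (top, left, h, w), (dst_top, dst_left))

# Hypothetical stand-ins for two scanned documents (grayscale).
rng = np.random.default_rng(0)
doc_a = rng.integers(0, 256, size=(64, 96), dtype=np.uint8)
doc_b = rng.integers(0, 256, size=(64, 96), dtype=np.uint8)

# Copy paste: the region comes from a different document.
copy_paste, paste_mask = random_forgery(doc_a, doc_b, (16, 24), rng)
# Copy move: the region comes from the same document.
copy_move, move_mask = random_forgery(doc_a, doc_a, (16, 24), rng)
```

The mask gives pixel-level labels for free, which is what makes synthetic data attractive for training. A realistic generator would also need blending, resizing, and recompression so the pasted region is not trivially detectable.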
For forged region detection, our first approach was to start with state-of-the-art object detection methods. We tried Faster R-CNN (FRCNN) to predict the bounding boxes of the forged regions along with the class information. FRCNN uses convolutional networks to extract feature maps from the input image. These maps are passed to a Region Proposal Network, which proposes candidate bounding boxes. The proposals are passed to the ROI pooling layer, which converts them all to the same size. Finally, they are passed to fully connected layers that predict bounding boxes and classes. This method did not give us good results because the forged regions were very small.
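The ROI pooling step is the least obvious part of that pipeline, so here is a simplified NumPy sketch of it. It works on a single-channel feature map with integer box coordinates; real implementations handle many channels and sub-pixel boxes. The function name `roi_max_pool` and the box format are assumptions for illustration, not a library API.

```python
import numpy as np

def roi_max_pool(feature_map, box, output_size=(2, 2)):
    """Max-pool the region box = (top, left, bottom, right), with bottom and
    right exclusive, down to a fixed output_size grid."""
    top, left, bottom, right = box
    region = feature_map[top:bottom, left:right]
    out_h, out_w = output_size
    # Split the region into out_h x out_w roughly equal bins.
    row_edges = np.linspace(0, region.shape[0], out_h + 1).astype(int)
    col_edges = np.linspace(0, region.shape[1], out_w + 1).astype(int)
    pooled = np.empty(output_size, dtype=feature_map.dtype)
    for i in range(out_h):
        for j in range(out_w):
            r0, c0 = row_edges[i], col_edges[j]
            # Guarantee each bin covers at least one cell.
            r1 = max(row_edges[i + 1], r0 + 1)
            c1 = max(col_edges[j + 1], c0 + 1)
            pooled[i, j] = region[r0:r1, c0:c1].max()
    return pooled

feature_map = np.arange(36).reshape(6, 6)
small = roi_max_pool(feature_map, (0, 0, 4, 4))  # 4x4 proposal
large = roi_max_pool(feature_map, (1, 2, 6, 5))  # 5x3 proposal
print(small.shape, large.shape)  # both proposals end up the same size
```

Because every proposal is squeezed into the same small grid, a forged region that covers only a few feature-map cells leaves very little signal after pooling, which is consistent with the poor results we saw on small regions.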
Our second approach was to train a patch-based classifier to distinguish real patches from forged ones. The idea rests on the assumption that if the copied region has a different compression footprint from the region it is pasted into, there will be a strong shift in the way the pixels are grouped. This method proved to be very effective.
It gave us around 97% accuracy. We ran many ablation studies to find the right configurations, which I can’t reveal due to IP issues.
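The overall flow of a patch-based detector can be sketched as follows: slide a window over the document, score each patch, and flag the document if any patch looks forged. The patch size, stride, threshold, and the `toy_score` function are placeholders chosen for illustration; in practice the score would come from the trained classifier, whose configuration is not shown here.

```python
import numpy as np

def iter_patches(image, patch_size, stride):
    """Yield (top, left, patch) for every window that fits inside the image."""
    h, w = image.shape[:2]
    for top in range(0, h - patch_size + 1, stride):
        for left in range(0, w - patch_size + 1, stride):
            yield top, left, image[top:top + patch_size, left:left + patch_size]

def classify_document(image, score_patch, patch_size=32, stride=16, threshold=0.5):
    """Return ("fake" or "real", list of flagged patch positions)."""
    flagged = [
        (top, left)
        for top, left, patch in iter_patches(image, patch_size, stride)
        if score_patch(patch) > threshold
    ]
    return ("fake" if flagged else "real"), flagged

# Hypothetical stand-in for a trained model: flags very bright patches.
def toy_score(patch):
    return float(patch.mean() > 200)

clean = np.zeros((64, 64), dtype=np.uint8)
tampered = clean.copy()
tampered[16:48, 16:48] = 255  # pretend this block is a pasted region

print(classify_document(clean, toy_score))
print(classify_document(tampered, toy_score))
```

Working at patch level means even a small forged region fills a large share of at least one input, which avoids the small-object problem we hit with bounding-box detection.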
Copy move is the type of forgery where a fraudster changes text in an image by copying similar text from the same image, for example to change a date. Our goal was to detect these forged regions and to classify the document as fake or real.
There is a lot of literature on detecting this type of forgery. A popular approach is DCT-based feature matching. In this method, a DCT followed by quantization is performed on a 16×16 patch extracted from the image. The same operation is repeated across the entire image, and the resulting coefficient vectors are stacked as rows of a matrix and sorted. Then, for each pair of matching rows, the corresponding shift vector is calculated. If a region has been copied, the shift vectors of its blocks will match. It is a very powerful algorithm that works well in most scenarios. But in our use case, a document has many regions with the same DCT values, so this method couldn’t be applied.
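Here is a compact NumPy sketch of the DCT-based matching described above, written from the general description in the literature rather than from any specific implementation. The quantization step `q`, the minimum shift, the minimum count, and the function names are assumptions for illustration.

```python
import numpy as np
from collections import Counter

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def detect_copy_move(gray, block=16, q=16.0, min_count=10):
    """Return [((dy, dx), count), ...] for shift vectors shared by many blocks."""
    d = dct_matrix(block)
    h, w = gray.shape
    rows, coords = [], []
    for top in range(h - block + 1):
        for left in range(w - block + 1):
            patch = gray[top:top + block, left:left + block].astype(float)
            coeffs = d @ patch @ d.T  # 2D DCT of the block
            rows.append(np.round(coeffs / q).astype(int).ravel())  # quantize
            coords.append((top, left))
    rows, coords = np.array(rows), np.array(coords)
    # Lexicographic sort puts identical quantized blocks next to each other.
    order = np.lexsort(rows.T[::-1])
    shifts = Counter()
    for a, b in zip(order[:-1], order[1:]):
        if np.array_equal(rows[a], rows[b]):
            dy, dx = coords[b] - coords[a]
            if dy < 0 or (dy == 0 and dx < 0):
                dy, dx = -dy, -dx  # normalize direction of the shift
            if max(abs(dy), abs(dx)) >= block:  # ignore overlapping neighbors
                shifts[(int(dy), int(dx))] += 1
    return [(s, c) for s, c in shifts.most_common() if c >= min_count]

# Synthetic test page: random texture with one region copied elsewhere.
rng = np.random.default_rng(0)
page = rng.integers(0, 256, size=(64, 64)).astype(np.uint8)
page[36:60, 30:54] = page[4:28, 4:28]  # copy move with shift (32, 26)
matches = detect_copy_move(page)
print(matches)
```

On a textured image like this one, only the copied blocks collide after sorting. On a document, large uniform areas such as blank background produce many identical rows with arbitrary shifts, which is the failure mode that ruled this method out for us.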
Our method involved two parallel networks. First, an encoder-decoder network predicts forged regions pixel by pixel. A second network runs in parallel and finds feature maps that correlate with the forged region predicted by the first network. Both networks are trained together with a cumulative loss function. I regret that I can’t reveal the full solution due to IP issues.
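As a generic illustration of what training two networks with a cumulative loss can look like, the sketch below sums a pixel-wise loss for the first network's mask with a loss for the second network's output. It is not our actual loss: the use of binary cross-entropy for both terms, the weight `second_weight`, and all names are assumptions.

```python
import numpy as np

def pixelwise_bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and a 0/1 map."""
    pred = np.clip(pred, eps, 1 - eps)
    return float(-np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred)))

def cumulative_loss(mask_pred, mask_true, second_pred, second_true, second_weight=1.0):
    """Sum of the two networks' losses, so one backward pass updates both."""
    return pixelwise_bce(mask_pred, mask_true) + second_weight * pixelwise_bce(second_pred, second_true)

# Toy ground truth: a small forged block inside an 8x8 map.
mask_true = np.zeros((8, 8))
mask_true[2:5, 3:6] = 1.0
uncertain = np.full((8, 8), 0.5)  # a network that knows nothing yet

loss_untrained = cumulative_loss(uncertain, mask_true, uncertain, mask_true)
loss_perfect = cumulative_loss(mask_true, mask_true, mask_true, mask_true)
print(loss_untrained, loss_perfect)
```

Summing the terms lets gradients from both objectives flow in the same training step, which is the usual reason for training parallel networks jointly.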
To summarize, I have explained the two major types of forgery that can be carried out on documents, and the approaches we took to solve this challenging problem. I hope you had a nice read.
Signzy is an AI-powered RPA platform for financial services. No matter how complex your workflow or operations, Signzy is able to completely automate your back-operations decision-making process into a real-time API. This is possible due to a combination of Nebula, our no-code AI model builder, and our Fintech API Marketplace of over 200 APIs. Today we work with over 90 FIs globally, including the 4 largest banks in India and a Top 3 acquiring bank in the US. Globally we have a strong partnership with MasterCard and offices in New York and Dubai to serve our customers in the 2 geographies. Our product team of 120+ people is building a global AI product out of Bangalore.
Visit www.signzy.com for more information about us.
You can reach out to our team at email@example.com
For sales queries: Swati Saxena
Email : email@example.com
Author: A B Sarvanan
Tech Lead — AI Team (Signzy)