India’s fastest-growing identity verification company Signzy partners with Apriori for data aggregation services.

About Signzy

Signzy is a market-leading platform redefining the speed, accuracy, and experience of how financial institutions are onboarding customers and businesses – using the digital medium. The company’s award-winning no-code GO platform delivers seamless, end-to-end, and multi-channel onboarding journeys while offering customizable workflows. In addition, it gives these players access to an aggregated marketplace of 240+ bespoke APIs that can be easily added to any workflow with simple widgets.

Signzy enables ten million+ end-customer and business onboardings every month at a success rate of 99% while reducing speed to market from 6 months to 3-4 weeks. It works with 240+ FIs globally, including the 4 largest banks in India and a Top 3 acquiring bank in the US, and has robust global partnerships with Mastercard and Microsoft. The company’s product team is based out of Bengaluru, with a strong presence in Mumbai, New York, and Dubai.

Visit www.signzy.com for more information about us.

You can reach out to our team at reachout@signzy.com

Written By:

Paritosh Vatsal Tripathi

 

Removing blur from images

Everyone misses a perfect shot once in a while. Yeah, that’s a real shame (we all do it all the time!).

There are special moments we want to capture and make memorable for a lifetime, but a camera shake or sensor noise can really hamper them, resulting in blurred images. (Maybe your subject was on the move; the reason is not always a bad camera but bad timing as well!)

So, if you are also one of us who has missed out on a special moment, this post is just for you. In it, you will learn how you can restore blurred images. All the thanks and applause go to neural networks.

What are you going to learn?

From this blog post, you will learn how to deblur images with the help of Scale-Recurrent Networks. For more info on the technique, you can access this link. The network takes a sequence of blurry images at different scales as input and produces a set of sharp images; the final output image is at full resolution.

Figure 1: SRN architecture from the original paper

The method uses an end-to-end trainable network built on a multi-scale convolutional neural network, following the state-of-the-art coarse-to-fine approach.

These methods start from a coarse estimate of the sharp image computed from the blurry input and gradually recover it at higher resolutions.

This Scale-Recurrent Network, aka SRN, shares one network across scales for multi-scale deblurring. In a well-established multi-scale method, the solver and the corresponding parameters at each scale are always the same. This is a natural choice, as each scale simply aims to solve the very same problem. Using different parameters at different scales may cause instability and introduces the extra issue of an unrestricted solution space. Another concern to address here is that input images may have different motion scales and resolutions.
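To make the coarse-to-fine, shared-weights idea concrete, here is a toy sketch in NumPy. The `restore_step` function is a hypothetical stand-in for the shared network (the real SRN applies the same learned weights at every scale); the scale factors and blend are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

def downsample(img, factor):
    """Average-pool the image by an integer factor (coarser scale)."""
    h, w = img.shape
    h2, w2 = h // factor, w // factor
    return img[:h2 * factor, :w2 * factor].reshape(h2, factor, w2, factor).mean(axis=(1, 3))

def upsample(img, factor):
    """Nearest-neighbour upsampling back toward full resolution."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

def restore_step(blurry, estimate):
    """Stand-in for the shared network: blend the blurred input with the
    current estimate. The real SRN runs learned, shared weights here."""
    return 0.5 * blurry + 0.5 * estimate

def coarse_to_fine(blurry, scales=(4, 2, 1)):
    """Apply the SAME restore_step at every scale, coarsest first,
    feeding each scale's result into the next finer one."""
    estimate = downsample(blurry, scales[0])
    for i, s in enumerate(scales):
        scaled_input = downsample(blurry, s) if s > 1 else blurry
        estimate = restore_step(scaled_input, estimate)
        if i + 1 < len(scales):  # move to the next finer scale
            estimate = upsample(estimate, s // scales[i + 1])
    return estimate
```

The key point the paper makes is visible in the loop: `restore_step` is one function reused at every scale, rather than a separate set of parameters per scale.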

If you allow too much parameter tweaking at each scale, the resulting solution may be overfitted to a specific motion scale. Some believe this reasoning also applies to CNN-based methods; still, some recent cascaded networks prefer independent parameters for every single scale. They justify this with a pointer that seems quite plausible: sharing network weights across different scales, they argue, can significantly increase training difficulty and introduce instability.

Their experiments show that, with the help of the recurrent structure and the combination of the above advantages, the end-to-end deep image-deblurring framework can greatly improve training efficiency. It uses less than one-third of the trainable parameters and has faster testing time. Apart from this, their method is shown to produce high-quality results both qualitatively and quantitatively. Let’s not dive deeper into the research paper for now. Allow me to present our use case for this deblurring technology.

We are a well-established global digital trust company that works primarily on verification processes. For verification, our customers have to photograph their documents and submit them. These photographs may be blurred, whether from camera shake or motion, which makes the document text difficult to read.

To solve the blurred-image problem, we fed these images into the aforementioned deblurring model. The results were exhilarating. Below are some of the samples.
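As a practical aside, a common and simple way to decide whether an image even needs deblurring is the variance-of-the-Laplacian measure: sharp images have strong edge responses, blurred ones do not. This is a generic sketch, not Signzy's production check, and the threshold is an assumed placeholder:

```python
import numpy as np

def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian response over the interior
    of a grayscale image; low values suggest a blurred image."""
    lap = (-4 * gray[1:-1, 1:-1]
           + gray[:-2, 1:-1] + gray[2:, 1:-1]
           + gray[1:-1, :-2] + gray[1:-1, 2:])
    return lap.var()

def needs_deblurring(gray, threshold=100.0):
    """Gate images into the deblurring model only when they look blurry.
    The threshold is a hypothetical value to be tuned on real data."""
    return laplacian_variance(gray) < threshold
```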

Concluding Remarks

What did you learn from this blog? How to use a scale-recurrent network for deblurring images. With this technology, you can easily extract data from blurred identity-card images, and you don’t have to poke your customers again and again to resubmit documents because of bad-quality or blurred images. Thanks for reading, and do leave a comment to let me know what you think of this technology. Adios for now, fellas!


Written By:

Signzy

Written by an insightful Signzian intent on learning and sharing knowledge.

 

How we built a modern, state of the art OCR pipeline — PreciousDory

Finally, I am very happy to be writing this blog after a long wait. As the title suggests, PreciousDory is a modern optical character recognition (OCR) engine that performs better than the engines from tech giants like Google, Microsoft, and ABBYY in KYC use cases. We feel it is now time to tell the world how we built this strong OCR pipeline over the last couple of years.

We at Signzy are trying to build a global digital trust system, and we solve various fascinating problems in AI and computer vision. Among them, text extraction from document images was one of the critical problems we had to solve. In the initial phase of our journey we used a traditional rule-based OCR pipeline to extract text data from document images, but those OCR engines were not efficient enough to compete with global competitors. So, in an effort to stay competitive with the global market, we took the ambitious decision to build an in-house modern OCR pipeline. We wanted to build an OCR engine that would surpass the global leaders in that segment.

 

The herculean challenge was out, and our AI team accepted it gladly. We knew that building a production-ready OCR engine and achieving best-in-class results is not an easy task, but we are a bunch of gallant people in our AI team. When we started researching the problem, we found very few resources to help us out. We also stumbled upon a fitting meme.

 

If You Can’t Measure It, You Can’t Improve It

The first task our team took up was to create a test dataset representing all the real-world scenarios we could encounter: varying viewpoints, illumination, deformation, occlusion, background clutter, etc. Below are some samples from our test dataset.

Sample test data

When you have a big problem to solve, break it down into smaller ones

We spent quite a lot of time on a literature study, breaking the problem into sub-problems so that individual team members could start working on them. We ended up with the macro-level architecture below.

Macro level architecture

After coming up with the basic architecture, our team started exploring the individual components. Our core OCR engine comprises four key components:

  1. CropNET
  2. RotationNET
  3. Text localizer
  4. Word classifier

CropNET

This is the first step in the OCR pipeline. The input documents for our engine arrive with a lot of background noise, so we needed an algorithm to crop out exactly the region of interest, making the job easier in the subsequent steps. In the initial phase we tried a lot of traditional image-processing techniques: edge detection, color matching, Hough lines, etc. None of them could withstand our test data. Then we took a deep-learning approach. The idea was to build a regression model that predicts the four corners of the document to be processed; the training data for this model was ground truth containing the four coordinates of each document. We implemented a custom shallow architecture for predicting the outputs and achieved good performance from the model.
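A minimal sketch of what happens once the four corners are predicted (the regression model itself is omitted). The crop here is axis-aligned for simplicity; a real pipeline would follow with a perspective warp for tilted documents:

```python
import numpy as np

def crop_from_corners(image, corners):
    """Crop the axis-aligned region spanned by four predicted (x, y)
    corners, clamped to the image bounds. `corners` is what a CropNET-style
    regressor would output; this helper is illustrative only."""
    xs = [int(round(x)) for x, y in corners]
    ys = [int(round(y)) for x, y in corners]
    x0, x1 = max(min(xs), 0), min(max(xs), image.shape[1])
    y0, y1 = max(min(ys), 0), min(max(ys), image.shape[0])
    return image[y0:y1, x0:x1]
```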

RotationNET

This is the second stage in the pipeline. After cropping, the next problem to solve is rotation. We estimated that 5% of production documents would be rotated at arbitrary angles, but for the OCR pipeline to work properly the document should be at zero degrees. To tackle the problem we built a classification model that predicts the angle of the document, with 360 classes corresponding to each degree of rotation. The challenge was in creating the training data: as we had only a few real-world samples per class, we had to build an exhaustive custom pipeline for preparing synthetic training data that closely matches real-world data. Upon training, we achieved impressive results from the model.
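For illustration, here is one way to collapse a 360-way softmax into a single angle. Averaging on the unit circle avoids the 359°/0° wrap-around that a naive average of class indices would suffer; this decoding step is an assumption for the sketch, not necessarily what RotationNET does:

```python
import numpy as np

def predicted_angle(probs):
    """Turn a 360-class probability vector (one class per degree) into a
    single angle by taking the circular mean of the distribution."""
    angles = np.deg2rad(np.arange(360))
    x = (probs * np.cos(angles)).sum()
    y = (probs * np.sin(angles)).sum()
    return np.rad2deg(np.arctan2(y, x)) % 360
```

Once the angle is known, the document is rotated back to zero degrees before the next stage.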

Text localizer

The third stage localizes the text areas, the most challenging problem to solve. Given a document, the algorithm must be able to localize the text regions for further processing. We knew building this algorithm from scratch was a mammoth task, so we benchmarked various open-source text detection models on our test datasets.

Text localization — Benchmark

After rigorous testing we decided to go with CTPN. The Connectionist Text Proposal Network (CTPN) accurately localizes text lines in natural images: it detects a text line as a sequence of fine-scale text proposals directly in convolutional feature maps. It was developed with a vertical anchor mechanism that jointly predicts the location and text/non-text score of each fixed-width proposal, considerably improving localization accuracy. The sequential proposals are naturally connected by a recurrent neural network, which is seamlessly incorporated into the convolutional network, resulting in an end-to-end trainable model. This allows CTPN to explore rich context information in the image, making it powerful enough to detect extremely ambiguous text.
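A simplified sketch of the proposal-linking idea (without the RNN): chain fixed-width boxes whose horizontal gap is small and whose vertical overlap is high, then fuse each chain into one text-line box. The gap and overlap parameters are illustrative assumptions:

```python
def link_proposals(boxes, max_gap=16, min_v_overlap=0.7):
    """Greedily chain fixed-width proposals (x0, y0, x1, y1) into text
    lines, in the spirit of CTPN's sequential-connection step (simplified)."""
    boxes = sorted(boxes)
    lines, current = [], [boxes[0]]
    for box in boxes[1:]:
        prev = current[-1]
        gap = box[0] - prev[2]                                 # horizontal gap
        overlap = min(prev[3], box[3]) - max(prev[1], box[1])  # vertical overlap
        height = min(prev[3] - prev[1], box[3] - box[1])
        if gap <= max_gap and height > 0 and overlap / height >= min_v_overlap:
            current.append(box)
        else:
            lines.append(current)
            current = [box]
    lines.append(current)
    # fuse each chain into a single bounding box per text line
    return [(min(b[0] for b in c), min(b[1] for b in c),
             max(b[2] for b in c), max(b[3] for b in c)) for c in lines]
```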

 

Word classifier

This is the final stage and the most critical step in the OCR engine, the one where most of our effort and time went. After localizing the text regions in the document, each region of interest is cropped out. The final challenge is to predict the text from it. After a rigorous literature study we arrived at two approaches for solving this problem:

  1. Character level classification
  2. Word level classification

Character level

This is the traditional approach: estimate the bounding box of each individual character, crop the characters out, and present them for classification. What we then have in hand is an MNIST-like dataset, and building a classifier for this type of task is a tried-and-tested method. The real challenge was building the character-level bounding-box predictor. Normal segmentation methods failed on our test dataset. We thought of developing an FRCNN-like object-detection pipeline for localizing the individual characters, but creating the training data for it was tedious and involved a lot of manual work. So we ended up dropping this method.

Word level classifier

This method is based on deep learning. Here we pass the full localized text region into an end-to-end pipeline and directly get the predicted text: the cropped region is passed into a CNN for spatial feature extraction and then into an RNN for sequential features, and we train the architecture with the CTC loss. CTC solves two problems: (1) you can train the network from (image, text) pairs without having to specify the position at which each character occurs; (2) you don’t have to post-process the output, as a CTC decoder transforms the network output into the final text.
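The second CTC property fits in a few lines: a greedy (best-path) decoder collapses repeated labels and drops blanks. This is the standard textbook decoding, shown as a sketch:

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Greedy CTC decoding: take the per-frame argmax labels, collapse
    consecutive repeats, then remove blank symbols."""
    out, prev = [], None
    for label in frame_labels:
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return out
```

Note how the blank between the two trailing 2s (in the test below) is what allows a genuinely doubled character to survive the repeat-collapsing step.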

The training data for this pipeline is cropped word-image regions and their corresponding ground-truth text. Since a large amount of training data was required to make the model converge, we built a separate data-creation pipeline: first we get the cropped word regions from the document, then we feed them into a third-party OCR engine to get the corresponding text. We benchmarked this data against manually created human data; the manual data was itself verified by a two-stage human process to make sure the labels were right.

We achieved impressive results with the model. A sample output from the model.

 

Time for results

At last we combined all four key components into a single end-to-end pipeline. The algorithm now takes an input image of a document and gives the corresponding OCR text as output. Below is a sample input and output of a document.

 

Now the engine was ready to face our quality-analysis team for validation. They benchmarked the pipeline against popular global third-party OCR engines on our custom validation set. Below are the test results for certain important documents we were handling.

 

We tested our OCR engine against other top engines in different scenarios, including cases with no background, different backgrounds, high brightness, and low brightness. The results show that we perform better than the popular OCR engines in most scenarios.

Productionization

The pipeline was now built and tested, but it was still not ready to face the real world. Some of the challenges in productionizing the system are listed below.

  1. Our OCR engine used a GPU for inference, but since we wanted the solution to be usable by our clients without any change to their infrastructure, we removed all GPU dependencies and rewrote the code to run on CPU.
  2. To serve a large number of requests more efficiently, we built a queueing mechanism.
  3. For easier integration with existing client infrastructure, we provided the solution as a REST API.
  4. Finally, the whole pipeline was containerized to ease deployment at enterprises.
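The queueing idea in point 2 can be sketched with the Python standard library alone: requests are buffered in a shared queue and a fixed pool of workers drains them. This is an illustrative pattern, not our production code:

```python
import queue
import threading

def serve(requests, handler, workers=4):
    """Buffer requests in a queue and process them with a bounded worker
    pool, so bursts of traffic don't overwhelm the OCR engine."""
    tasks, results = queue.Queue(), []
    lock = threading.Lock()

    def worker():
        while True:
            item = tasks.get()
            if item is None:          # sentinel: shut this worker down
                break
            out = handler(item)
            with lock:                # results list is shared across workers
                results.append(out)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for r in requests:
        tasks.put(r)
    for _ in threads:                 # one sentinel per worker
        tasks.put(None)
    for t in threads:
        t.join()
    return results
```

In a real deployment the `handler` would invoke the OCR pipeline and the queue would sit behind the REST API mentioned in point 3.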

Summary

Thus the mammoth task of building a modern OCR pipeline was accomplished. A special thanks to my team members Nishant and Harshit for making this project successful. One key takeaway from the project: if you have an exciting problem and a passionate team in hand, you can make the impossible possible. I could not explain a lot of steps in detail since I had to keep the blog short, so do write to me if you have any queries.


 

Democratizing AI using Live Face Detection


Since the dawn of AI, facial recognition systems have been evolving rapidly, exceeding our expectations at every turn. In a few years, you’ll be able to go through the airport using basically just your face: if you have bags to drop off, you’ll use the self-service system and simply have your face captured and matched; at security, the same thing happens with just your biometric. The big tech giants have proved this can be done at massive scale. What the world needs now is higher adoption through the democratization of this technology, so that even small organizations can use this advanced technology as a plug-and-play solution.

The answer to this is Deep Auth, Signzy’s in-house facial recognition system. This allows large-scale face authentication in real-time, using your everyday mobile device cameras in the real world.


Deep Auth, Facial Recognition System from Signzy

While one-to-one face matching is now very popular (thanks to the Apple iPhone X), it is still not easy to authenticate people against larger datasets, identifying you among thousands of other images. Doing this in real time is even more challenging, and, just to add some realism, sending images and videos over mobile internet slows things down even further.

The system can detect and recognize faces in real time at any event, organization, or office space without any special device. This makes Deep Auth an ideal candidate for real-world scenarios where it might not be possible to deploy a large human workforce or spend millions of dollars to monitor people and events: workplaces, educational institutes, bank branches, and even large residential buildings are all valid areas of use.

Digital journeys can benefit from face-based authentication, eliminating the friction of usernames and passwords while adding the security of biometrics. There can also be hundreds of other use cases, which we hope our customers will come up with, helping us improve our tech.


 

Deep Auth doing door access authorization.

Deep Auth is robust to appearance variations like sporting a beard or wearing eyeglasses. This is made possible by having Deep Auth learn facial features dynamically (online training).


 

Deep Auth working across different timelines

Technology

The technology behind face recognition is powered by a series of Convolutional Neural Networks (CNNs). Let’s divide the tech into two parts:

  • Face Detection
  • Face Recognition

Face Detection:

This part involves a three-stage cascaded CNN network, to ensure faces are robustly detected. In the first stage, we propose regions (with an objectness score) and their regression boxes. In the second stage, we take these proposed regression boxes as input and re-propose them to reduce the number of false positives. Non-maximal suppression is applied after each stage to reduce false positives further.
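Non-maximal suppression itself is compact enough to show in full. This is the standard greedy IoU-based variant; the threshold value is illustrative:

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximal suppression: repeatedly keep the highest-scoring
    box and discard remaining boxes that overlap it beyond the IoU threshold.
    Boxes are (x0, y0, x1, y1); returns indices of kept boxes."""
    boxes, scores = np.asarray(boxes, float), np.asarray(scores, float)
    area = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    order = scores.argsort()[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # intersection of box i with every remaining box
        x0 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y0 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x1 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y1 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x1 - x0, 0, None) * np.clip(y1 - y0, 0, None)
        iou = inter / (area[i] + area[rest] - inter)
        order = rest[iou <= iou_threshold]
    return keep
```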


3 stage cascaded CNN for face detection.

In the final stage, we compute facial landmarks with five-point localization: both eyes, the nose, and the two corners of the mouth. This stage is essential to ensure the face is aligned before we pass it to the face recognizer. The loss function is an ensemble of the center loss and the IoU (Intersection over Union) loss. We trained the network for 150k iterations on the WIDER FACE dataset.

Face Recognition:

The extracted faces are then passed to a siamese network, where we use a contrastive loss to converge the network. The siamese network is a 152-layer ResNet whose output is a 512-D vector encoding the given face.

 


ResNet acts as the backbone for the siamese network.

We then use K-Nearest Neighbours (KNN) to classify each encoding against the nearest face encodings injected into KNN during the training phase. The 512-D vectorization used here, compared with the 128-D vectorization of other face recognition systems, helps distinguish fine details across faces. This gives the system high accuracy even with a large number of non-discriminative faces. We are also working on extending the siamese network to extract 1024-D face encodings.
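A k=1 nearest-neighbour lookup over face encodings reduces to very little code. The sketch below uses 2-D vectors for brevity where production would use the 512-D encodings, and the distance threshold for rejecting unknown faces is an assumed placeholder:

```python
import numpy as np

def identify(encoding, gallery, names, threshold=1.0):
    """Match a face encoding against enrolled encodings by L2 distance
    (k=1 KNN); return 'unknown' when no neighbour is close enough.
    `gallery` rows and `names` are aligned; production encodings are 512-D."""
    gallery = np.asarray(gallery, float)
    dists = np.linalg.norm(gallery - encoding, axis=1)
    best = int(dists.argmin())
    return names[best] if dists[best] <= threshold else "unknown"
```

Online training here is just appending a new (encoding, name) pair to the gallery, which is one reason a KNN index suits dynamic enrolment.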

Benchmarks

Deep Auth posts impressive metrics on the FDDB database. We use two images to train each of 1678 distinct faces and then evaluate against the remaining test images. Precision and recall come out at 99.5629% and 91.2835% respectively, with an F1 score of 95.2436.
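The reported F1 score is just the harmonic mean of precision and recall, which is easy to verify from the two figures above:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall: 2PR / (P + R)."""
    return 2 * precision * recall / (precision + recall)
```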


 

Deep Auth’s Impressive scores!

We also showcase Deep Auth working in real time by matching faces in a video.

Deep Auth in Action!

We tried something a little more cheeky: we got our hands on a picture of our twin co-founders posing together, a rare sight indeed, and checked how good Deep Auth really is. Was it able to distinguish between identical twins?


 

And Voila! It worked

Deep Auth is accessed through a REST API interface, making it suitable for online training and real-time recognition. Because it is robust to aging and changes in appearance, it is largely self-servicing, which makes it an ideal solution to deploy in remote areas.

Conclusion

Hopefully, this blog has explained more about Deep Auth and the technology behind it. Ever since UIDAI made face recognition mandatory for Aadhaar authentication, face recognition has been set to prevail in every nook and corner of the nation for biometric authentication. The democratization of face authentication allows even small companies to access this technology within their organizations. Hopefully, this will allow more fair play and give everyone a chance to use advanced technology to improve their lives and businesses.

In the next blog, we will explain how we have paired face recognition with spoof detection to make Deep Auth robust to spoof attacks. Please keep reading more on our AI section to understand how this is done.


Oracle Financial Services Hackathon – Ankit Ratan, Co-Founder, Signzy

https://youtu.be/ElxYD-8h7m4

Oracle Financial Services is powering innovation with industry-leading platforms built on modern, open, and intelligent technology. We are reimagining banking by bringing together 13 FinTech startups at the Demo Pitch Day at Oracle Industry Connect 2018.


 

August: Recent News & Updates from Signzy

FinTech Vendor of the Year 2018 by Frost and Sullivan

 

Signzy was recognised as “FinTech Vendor of the Year in Banking” under the Emerging Services category. Frost & Sullivan’s 2018 India ICT Awards aim to honour and recognise companies that have pushed the boundaries of excellence, rising above the competition and achieving landmarks to deliver business outcomes using digital and disruptive technologies in the Indian market. We will continue to strive hard to take this faith global and beyond the financial industry. Read here.

Signzy was part of a select few to attend MasterCard’s Start Path Program

 

Signzy was chosen from among startups globally for the MasterCard Start Path Global Programme. The program began on June 25th with a three-day in-person kick-off event held in New York City; Signzy’s Arpit was in New York for the same. We feel humbled to have been chosen for this prestigious program, which has enabled us to showcase our work in simplifying complex regulatory processes in financial institutions and in enabling them to transition towards a fully digital experience.

1st Runner Up at the Innovation Challenge as part of Mumbai FinTech Festival 2018

We are honoured to be the 1st Runner-Up among 800 participating companies at the Innovation Challenge organised by the Government of Maharashtra as part of the Mumbai FinTech Festival 2018, held in June this year. Signzy’s co-founder Arpit Ratan presented at the occasion, which was graced by the Honourable Chief Minister of Maharashtra, Shri Devendra Fadnavis.

Signzy listed amongst the top 7 RegTech Companies in APAC region by MEDICI

We are glad to announce that Signzy has been included in MEDICI’s list of 21 RegTech companies building the future of regulatory compliance, and is listed amongst the top 7 in the APAC region. The MEDICI TOP 21 list is an industry-recognised FinTech distinction and an initiative to find the best 21 companies within a FinTech industry segment. It feels great to be listed along with companies like AIDA Technologies, Identitii, Jocata, Jewel Paymentech, and other global players. Read here.

Signzy was selected into JioGenNext startup program by Reliance

Signzy was one of the 9 companies selected into the JioGenNext startup program by Reliance this year. We are excited to work with Reliance to bring truly transformative financial products to the Jio ecosystem. Signzy’s Ankit explained how we are helping banks solve customer authentication and onboarding problems with a blockchain-based solution.

Signzy listed amongst 13 Indian Blockchain Startups To Watch Out For In 2018 by inc42

This article is part of Inc42’s Startup Watchlist annual series where they list the top startups to watch out for 2018 from industries like AI, Logistics, Blockchain etc. Read here.

Survey of facial feature descriptors

From our blog:

As we grow our AI capabilities, we believe we should contribute actively back to the ecosystem. The following blog might be useful to AI practitioners looking to work on facial technology. Read here.

Product Update: How we embraced the power of Artificial Intelligence to replace Legacy Banking processes

Last month we went live with one of India’s major public-sector banks, a project involving high-scale data processing for CKYC compliance across all its pan-India bank accounts. During the pilot run, the branches were very excited to see their problems of image sorting and CKYC compliance solved through artificial intelligence. Implementing this project at the pan-India level is expected to reduce the processing time per application by 87% and to cut operational costs by about $17 million for the bank. Read here.

Events we attended


 

Survey of facial feature descriptors

Face recognition technology has long been a concept of fictional worlds, whether as a tool to solve a crime or to open doors. Today the technology has developed significantly, and we are seeing it become more common in our everyday lives. On our mission to build a truly digital trust system, we at Signzy use facial recognition technology to identify and authenticate individuals. The technology performs this task in three steps: detecting the face, extracting features from the target, and finally matching and verifying. As a visual search engine tool, it identifies key factors within the given image of the face.

To pioneer our facial recognition technology, we wanted an edge over current deep-learning-based facial recognition models. Our idea was to embed human-crafted knowledge into state-of-the-art CNN architectures to improve their accuracy. For that, we needed to do an extensive survey of the best facial feature descriptors. In this blog, we share a part of our research, describing some of these features.

Local binary patterns

LBP looks at the points surrounding a central point and tests whether each of them is greater than or less than the central point (i.e., it gives a binary result). It is one of the simplest feature descriptors.
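As a rough illustration, here is a minimal NumPy sketch of the basic 3×3 LBP operator; the function name and toy patch are ours for illustration, not a reference implementation.

```python
import numpy as np

def lbp_3x3(image):
    """Basic 3x3 local binary pattern: threshold each pixel's 8
    neighbours against the centre pixel and pack the bits into a code."""
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    # clockwise neighbour offsets starting at the top-left
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    centre = img[1:-1, 1:-1]
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy : h - 1 + dy, 1 + dx : w - 1 + dx]
        codes |= ((neighbour >= centre).astype(np.uint8) << bit)
    return codes

# A tiny example: a bright centre pixel surrounded by darker ones
patch = np.array([[1, 1, 1],
                  [1, 9, 1],
                  [1, 1, 1]])
print(lbp_3x3(patch))  # all neighbours < centre -> code 0
```

In practice the per-pixel codes are histogrammed over regions of the face, and the concatenated histograms form the descriptor.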

Gabor wavelets

They are linear filters used for texture analysis: a Gabor wavelet analyses whether the image contains specific frequency content, in specific directions, in a localized region around the point of analysis.
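A hedged sketch of building the real (cosine) part of a Gabor kernel in NumPy; the parameter names and defaults are illustrative choices, not a reference implementation.

```python
import numpy as np

def gabor_kernel(size=15, sigma=3.0, theta=0.0, lambd=6.0, psi=0.0):
    """Real part of a Gabor filter: a sinusoid of wavelength `lambd`
    at orientation `theta`, windowed by a Gaussian of width `sigma`."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_t = x * np.cos(theta) + y * np.sin(theta)    # rotate coordinates
    y_t = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_t**2 + y_t**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * x_t / lambd + psi)
    return envelope * carrier

# Correlating this kernel with each local image region gives a strong
# response where the region contains that frequency and orientation.
kernel = gabor_kernel(theta=np.pi / 2)
print(kernel.shape)  # (15, 15)
```

A filter bank typically sweeps several orientations and wavelengths and stacks the responses.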

 

 

Gabor jet similarities

A Gabor jet is the collection of the (complex-valued) responses of all Gabor wavelets of a family at a certain point in the image. It is a local texture descriptor that can be used for various applications, one of which is locating a texture in a given image. For example, one might locate the position of an eye by scanning over the whole image: at each position, the similarity between a reference Gabor jet and the Gabor jet at that location is computed using a jet-similarity function such as bob.ip.gabor.Similarity.
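One simple jet similarity is the normalised scalar product of the response magnitudes (one of several variants offered by libraries like bob.ip.gabor). A minimal NumPy sketch, with a made-up jet for illustration:

```python
import numpy as np

def jet_similarity(jet_a, jet_b):
    """Cosine similarity of the absolute Gabor responses:
    1.0 for identical jets, near 0 for unrelated textures."""
    a = np.abs(np.asarray(jet_a))
    b = np.abs(np.asarray(jet_b))
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

jet = np.array([1 + 2j, 0.5 - 1j, 3 + 0j])   # hypothetical 3-wavelet jet
print(jet_similarity(jet, jet))  # 1.0
```

Because only magnitudes are compared, this variant ignores the Gabor phase; phase-aware similarities exist and are more precise for localisation.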

Local phase quantisation

The local phase quantization (LPQ) method is based on the blur-invariance property of the Fourier phase spectrum. It uses local phase information extracted using the 2-D DFT or, more precisely, a short-term Fourier transform (STFT) computed over a rectangular M-by-M neighborhood N_x at each pixel position x of the image f(x), defined by:

F(u, x) = Σ_{y ∈ N_x} f(x − y) e^{−j2π uᵀ y} = w_uᵀ f_x

where w_u is the basis vector of the 2-D Discrete Fourier Transform (DFT) at frequency u, and f_x is a vector containing all M² image samples from N_x.

Difference of Gaussians

It is a feature enhancement algorithm that subtracts one blurred version of an original image from another, less blurred version. In the simple case of grayscale images, the blurred images are obtained by convolving the original with Gaussian kernels of differing standard deviations. Blurring with a Gaussian kernel suppresses only high-frequency spatial information, so subtracting one image from the other preserves the spatial information lying between the frequency ranges preserved in the two blurred images. The difference of Gaussians is thus a band-pass filter that discards all but a handful of the spatial frequencies present in the original grayscale image. Below are a few examples with varying sigma (standard deviation) of the Gaussian kernel, with detected blobs.
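The subtraction described above can be sketched in a few lines of NumPy, using a separable 1-D Gaussian for the blurs (simple zero padding at the borders; helper names are ours):

```python
import numpy as np

def gaussian_kernel1d(sigma):
    """Normalised 1-D Gaussian; separable, so we blur rows then columns."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def gaussian_blur(image, sigma):
    k = gaussian_kernel1d(sigma)
    # convolve along each axis; 'same' keeps the image size
    blurred = np.apply_along_axis(lambda r: np.convolve(r, k, mode='same'),
                                  1, image.astype(float))
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode='same'),
                               0, blurred)

def difference_of_gaussians(image, sigma_small, sigma_large):
    """Band-pass response: less-blurred minus more-blurred image."""
    return gaussian_blur(image, sigma_small) - gaussian_blur(image, sigma_large)

img = np.zeros((21, 21))
img[10, 10] = 1.0                          # a single bright "blob"
dog = difference_of_gaussians(img, 1.0, 2.0)
print(dog[10, 10] > 0)                     # strongest response at the blob centre
```

Scanning the DoG response for local extrema across several sigma pairs is the classic way blobs are detected at multiple scales.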

 

Histogram of gradients

The technique counts occurrences of gradient orientation in localized portions of an image. The idea behind HOG is that local object appearance and shape within an image can be described by the distribution of intensity gradients or edge directions. The image is divided into small connected regions called cells, and for the pixels within each cell, a histogram of gradient directions is compiled. The descriptor is the concatenation of these histograms.
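The cell-and-histogram procedure above can be sketched directly in NumPy; this is a simplified illustration (block normalisation, common in full HOG, is omitted, and the names are ours):

```python
import numpy as np

def cell_histogram(cell, n_bins=9):
    """Histogram of gradient orientations in one cell, weighted by magnitude."""
    gy, gx = np.gradient(cell.astype(float))
    magnitude = np.hypot(gx, gy)
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180      # unsigned orientation
    hist, _ = np.histogram(angle, bins=n_bins, range=(0, 180),
                           weights=magnitude)
    return hist

def hog_descriptor(image, cell_size=8, n_bins=9):
    """Concatenate the per-cell histograms into one descriptor vector."""
    h, w = image.shape
    hists = []
    for y in range(0, h - cell_size + 1, cell_size):
        for x in range(0, w - cell_size + 1, cell_size):
            hists.append(cell_histogram(image[y:y + cell_size,
                                              x:x + cell_size], n_bins))
    return np.concatenate(hists)

img = np.tile(np.arange(16, dtype=float), (16, 1))  # horizontal ramp
desc = hog_descriptor(img, cell_size=8)
print(desc.shape)  # (36,) = 4 cells x 9 bins
```

On the ramp image every gradient points in the same direction, so all the weight lands in a single orientation bin per cell.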

 

FFT

Fourier Transform is used to analyze the frequency characteristics of various filters. For images, the 2-D Discrete Fourier Transform (DFT) is used to find the frequency domain. For a sinusoidal signal x(t) = A sin(2πft), f is the frequency of the signal, and if its frequency domain is taken, we can see a spike at f. If the signal is sampled to form a discrete signal, we get the same frequency domain, but it is now periodic in the range [−π, π] or [0, 2π] ([0, N] for an N-point DFT).

You can consider an image as a signal which is sampled in two directions. So taking Fourier transforms in both X and Y directions gives you the frequency representation of the image.
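A quick NumPy demonstration of the 2-D DFT on a synthetic image of vertical stripes; the frequency of the stripes appears as symmetric peaks in the magnitude spectrum (the toy image is ours):

```python
import numpy as np

img = np.zeros((32, 32))
img[:, ::4] = 1.0                      # vertical stripes with period 4 px

spectrum = np.fft.fft2(img)
# fftshift moves the zero-frequency (DC) term to the centre for viewing
magnitude = np.abs(np.fft.fftshift(spectrum))

cy, cx = 16, 16                        # centre after the shift
print(magnitude[cy, cx])               # DC term = sum of all pixels
```

The stripe pattern (period 4 in a 32-pixel row) produces peaks 8 frequency bins to either side of the centre along the horizontal axis, which is how the frequency content of the image can be read off the spectrum.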

Blob features

These methods are aimed at detecting regions in a digital image that differ in properties, such as brightness or color, compared to surrounding regions. Informally, a blob is a region of an image in which some properties are constant or approximately constant; all the points in a blob can be considered in some sense to be similar to each other.

CenSurE features

This feature detector is a scale-invariant center-surround detector (CenSurE) that claims to outperform other detectors and gives results in real time.

ORB features

This is a very fast binary descriptor based on BRIEF, which is rotation invariant and resistant to noise.

Dlib — 68 facial key points

This is one of the most widely used facial feature descriptors. The facial landmark detector included in the dlib library is an implementation of the One Millisecond Face Alignment with an Ensemble of Regression Trees paper by Kazemi and Sullivan (2014). This method starts by using:

  1. A training set of labeled facial landmarks on an image. These images are manually labeled, specifying specific (x, y)-coordinates of regions surrounding each facial structure.
  2. Priors, or more specifically, the probability of the distance between pairs of input pixels.

Given this training data, an ensemble of regression trees is trained to estimate the facial landmark positions directly from the pixel intensities themselves (i.e., no “feature extraction” is taking place). The end result is a facial landmark detector that can be used to detect facial landmarks in real-time with high-quality predictions.

Code: https://www.pyimagesearch.com/2017/04/17/real-time-facial-landmark-detection-opencv-python-dlib/

Conclusion

Thus, in this blog we compiled different facial feature descriptors. Different algorithms capture different facial characteristics, and selecting the descriptor that gives high performance truly depends on the dataset at hand: its size, diversity, sparsity, and complexity play a critical role in the choice of algorithm. These human-engineered features, when fed into convolutional networks, improve their accuracy.



 

Updates from Signzy

Updates from Signzy and a few useful reads!

Keep up-to-date with Signzy’s newest events, blog posts, and industry initiatives. Plus, enrich your fintech understanding with our curated selection of must-read articles and reports from experts worldwide. Your comprehensive guide to staying ahead in the fintech landscape.

Updates from Signzy: Signzy’s Netra Team became runner-up at the IDRBT IBTIC

Signzy’s Netra Team became runner-up at the IDRBT Banking Application Contest, 2018. We competed with the world’s largest IT organisations and banks for technology implementations and were judged by CIOs of India’s largest banks. We’re so glad to have received this recognition, and we’ll continue to work even harder towards our vision of transforming traditional banking processes into a fully digital experience. Read here.

Signzy listed amongst the 10 RegTech Companies Making Waves in the Industry

 

We’ve been included in the list of the 10 RegTech Companies Making Waves in the Industry by Disruptor Daily — a publication that reports on groundbreaking and innovative technologies, trends, and companies. It feels great to be listed alongside companies like AYASDI, Feedzai, Forcepoint, Provenir, and others. We’ll strive to build innovative AI solutions that transform the current semi-manual processes in financial institutions into real-time digital systems, thereby making regulatory processes simple, secure, and compliant for these institutions. Read here.

Events we attended

 


Innovation And Startup Connect Event For Global Capability Centers (GCCs): We attended NASSCOM Product Conclave’s Innovation and Startup Connect Event for Global Capability Centers at Bengaluru. The event brought together the best in the product ecosystem, connecting GCCs and startups to accelerate innovation and digital transformation. Signzy’s Ankit explained to top institutions like Sony India, Target, and Samsung how Signzy is using the power of AI to transform traditional banking into a fully digital experience. (16th March)

Tech in Asia Blockchain meetup: We were at the Tech in Asia Blockchain meetup at Bengaluru. Tech enthusiasts, experts, and founders explored the key verticals of blockchain technology at the meetup. Signzy’s Ashish was part of the panel and discussed the potential, use cases, and controversies surrounding blockchain. (28th March)

Oracle Industry Connect: We attended the Oracle Industry Connect at New York, Midtown Hilton. The event brought together thought leaders and top execs and offered thought-provoking ideas and insights to address industry-specific challenges. Signzy’s Ankit discussed the implications of user privacy and data ownership — the key themes that will drive digital customer onboarding journeys at the event. (10th-11th April)

Asian Development Bank Event: We presented at the Asian Development Bank’s event in Vietnam and were honoured to help bring about the digital revolution there. Signzy’s Arpit talked about digitising customer onboarding, doing e-KYC, and making banking more efficient. (11th-12th April)

IDRBT Hyderabad Meeting: We were at the IDRBT meet at Hyderabad. RBI’s IDRBT initiative was about fast-tracking the development of innovative fintech solutions solving complex regulatory, compliance, and other industry challenges. Signzy was among the few select fintech companies RBI/IDRBT sought inputs from. Arpit from Signzy shared his insights towards building a fast-moving fintech ecosystem. (16th April)

Future of Business Conclave: We were part of the panel at the Future of Business Conclave by Cisco and YourStory on innovation-driven digitisation at Mumbai. Signzy’s Arpit explained that although digitisation is the new norm, it can’t encompass everything — operations like customer support must remain humanised. (28th April)

Tech Updates from Signzy: How we replaced legacy banking processes with AI-driven technology

From our blog:

 


How we replaced legacy banking processes with AI-driven technology — A detailed article on power of deep learning explaining how AI transforms banking operations. Read here.

An approach to data privacy for Indian banks and financial institutions

 

An approach to data privacy for Indian banks and financial institutions — A must read explaining how Indian banks and financial institutions can approach data privacy despite the lack of regulations. Read here.

Industry News Updates from Signzy: Store data locally, RBI directs payment facilitators

The RBI released a notification asking financial technology companies to store all the data related to payments and transactions within India alone. Read on to know the impact the current directive has on the fintech companies storing/processing data. Check out the full story here.


 

Data privacy

Data privacy for Banks & Financial Institutions

About 85 countries in the world have data privacy policies in place. Sadly, India isn’t one of them. While the Information Technology Act, 2000 does touch upon privacy, it’s hardly sufficient. The countries that do have data privacy regimes are also evolving their models to suit the big-data wave. In the US, for example, user data privacy is protected under a patchwork of legislations like the Children’s Online Privacy Protection Act, the Gramm-Leach-Bliley Act for financial information, and the California Online Privacy Protection Act in California, yet the country is still looking for a better way to regulate.

Comparing the US framework with the one from the EU, Michelle De Mooy, the director for privacy and data at the Center for Democracy & Technology, explains that Europe has a “people-first mentality”, “more than we do here in our capitalist society, where innovation is sort of equated with letting businesses do whatever they need to grow. That has translated into pretty weak data protection.”

The EU is tightening its laws further with the upcoming GDPR, which has already got companies hustling to make their privacy policies compliant. As the world gears up for the more stringent GDPR, let’s look at how Indian banks and financial institutions can approach data privacy despite the lack of regulations.

Failing on the data privacy score

Most banks and financial companies are committed to maintaining their data integrity and protecting it against breaches. However, the same isn’t true when it comes to ensuring both security and privacy; you could say there’s some degree of laxity. Blame it on the “largely self-regulated” privacy guidelines or the “depends-on-the-context” grounds, but banks and financial institutions offering both data security and privacy are few.

In a global survey of more than 180 senior data privacy and security professionals, Capgemini found that fewer than 29% of them “offered both strong data privacy practices and a sound security strategy.”

 

What makes the situation more serious is that today’s banks use a giant tech ecosystem with partners sharing data to build better digital experiences for the end users. As data exchanges hands and lives in multiple places, the risk of data privacy breaches increases. This calls for an even more robust and thorough data privacy regime applying to the entire banking and fintech ecosystem.

But without much legal guidance on approaching data privacy, banks and financial institutions too are forced to take the self-regulation route just like the cryptocurrency businesses. Here’s how banks can handle data privacy until the regime gets regulated.

Self-regulation

While data privacy laws are ever-evolving, some best-practice data privacy measures can prepare banks and financial institutions for the time when the laws and policies are actually formulated. PwC offers six action points for financial institutions to use when handling data privacy:

  • Define privacy as primarily a legal and compliance regulatory matter.
  • Create a privacy office that develops privacy guidelines and interfaces with other stakeholders. If the financial institution does not currently have a separate privacy office, we recommend for the institution to hold an internal “privacy summit” that convenes key stakeholders from the lines of business, technology, compliance, and legal.
  • Identify and understand what the data is, where it resides, how it is classified, and how it flows through various systems. For example, financial, medical, and PII are subject to different restrictions in different jurisdictions.
  • Develop appropriate global data-transfer agreements for PII and other data that falls under privacy requirements.
  • Recognize and adhere to requirements when developing core business processes and cross-border data flows.
  • Preserve customer trust as the primary goal.

McKinsey & Company recommends another tactic that companies can adopt to become data stewards: creating a “golden record” of every personal-data processing activity in a company to ensure compliance and traceability. This goes “beyond documenting the system inventory and involves maintaining a full record of where all personal data comes from, what is done with them, what the lawful grounds for processing are, and whom the data are shared with.”

This tactic applies seamlessly to banks and financial institutions. They can start off by building records of what data they collect from their users and how the sharing with their tech partners happens — all of this while ensuring users’ consent for all their operations using the data.

In fact, in addition to self-regulating the data collection, usage, and sharing regime, banks must also build a data privacy taskforce that’s committed to ensuring compliance with the internal data privacy framework.

With the right records and resources, banks and financial institutions must also see how they can build data privacy into their services and offerings by design and by default.

At Signzy, we don’t just view user data privacy proactiveness as a risk management strategy, but we see it as a core building block of a digital trust system. It’s a competitive advantage. We believe that data privacy inspires trust. And when we build digital solutions to tackle challenging legacy financial processes, we make sure that our solutions are structured in a way that user data privacy isn’t compromised while balancing both user expectations and regulatory compliance.

Wrapping it up

Although privacy is largely a law-regulated matter (and we currently lack the laws), it’s still not optional, and it goes way beyond just seeking users’ consent for collecting and storing their information. While banks and financial institutions probably can’t go so far as to give their users the “right to erasure” or the “right to be forgotten,” they can surely embrace data privacy as the norm. With stringent self-regulation, Indian banks and financial companies can help build trust and transparency in Indian digital banking until the laws are formulated.



 

Global digital trust system

How we replaced legacy banking processes with AI-driven technology

Signzy — Building Global digital trust system using AI & Blockchain

An interesting use case we encountered recently was an id verification software. Given an image of an identity card, the algorithm has to classify it into one of the following classes:

  1. Aadhaar
  2. PAN
  3. Driving License
  4. Passport
  5. Voter Id

In this blog post we will take you behind the scenes of our state-of-the-art system and show how we tackled the problem, ultimately surpassing the accuracy required for real-world use.

Knowing the beast we are to fight

As soon as we began to dive deeper into understanding the problem and identifying techniques to attack it, we realised the most important constraints of the id verification software we had to work within, and the aim we were striving to achieve.

The idea is to deploy the pipeline in financial institutions with all possible input variations, and yet it should surpass, or at least match, the accuracy of a human being. The solution has to work on data that arrives from the most rural parts of the country, with pictures taken on cameras as low-end as 0.3 megapixels and travelling over dramatically slow connections. We knew the toughest challenge was to cater to the variations that could arrive in the inputs.

Humans have evolved intelligence over thousands of years and have created systems that are easy for themselves to process. Take, for instance, an identity card: it is designed in dimensions that fit a pocket wallet, in colour formats soothing to human eyes, and in data formats easily read by humans. If identity cards were designed to be consumed by computer vision software, this would have been an easier game, but since that’s not the case, it becomes especially challenging.

We talked with different on-ground stakeholders to identify variations in the inputs to the id verification software. Collecting initial samples wasn’t that hard, since many of these variations were reported by our end users, but we knew creating training data was not going to be easy. We realised this quickly and started creating exhaustive training data in heavily curated and precisely controlled laboratory settings. We got the desired training sets, which was half the problem solved.

The world is not a cozy laboratory, we know that!

Our target was to create an id verification software that could be more than 99% accurate and yet fast enough to make an impact. This isn’t easy when you know your input is coming from the rural end of India and you won’t have high-end GPUs to process it on (as a matter of fact, our largest implementation of this solution runs without GPUs).

 

A gist of the environment where our input is created

The id verification app is expected to perform well in different sorts of real world scenarios like varying viewpoints, illumination, deformation, occlusion, background clutter, less inter-class variation, high intra-class variation (eg. Driving License).

You can’t reject the application of an old rural lady who has brought you a photocopy of a printout, which in turn was obtained from a scanned copy of a long-faded PAN card. We took it as a challenge to create a system that can help even the rural Indian masses.

A few samples that we expect as input into our system are here:

 

Fig(1): A few samples of our expected input data

The number of samples we have for training is a huge constraint; you only have so much time and so many resources to prepare your training data.

Creating the id verification software

Baby steps ahead

We tried out various online identity verification methods for solving the problem. First, we extracted features using the Histogram of Oriented Gradients (HOG) feature extractor from OpenCV and then trained a Support Vector Machine (SVM) classifier on top of the extracted features. The results improved further with an XGBoost classifier, reaching about 72% accuracy. We used the scikit-learn machine learning framework for this.
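The production pipeline used OpenCV’s HOG with scikit-learn classifiers; as a self-contained illustration of the SVM stage only, here is a toy linear SVM trained by sub-gradient descent on the hinge loss (the 2-D points stand in for HOG feature vectors and are purely hypothetical):

```python
import numpy as np

def train_linear_svm(X, y, lr=0.01, reg=0.01, epochs=200):
    """Linear SVM via sub-gradient descent on the hinge loss.
    X: (n, d) feature matrix (e.g. HOG vectors), y: labels in {-1, +1}."""
    rng = np.random.default_rng(0)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            margin = y[i] * (X[i] @ w + b)
            if margin < 1:                       # point violates the margin
                w += lr * (y[i] * X[i] - reg * w)
                b += lr * y[i]
            else:
                w -= lr * reg * w                # regularisation only
    return w, b

# Toy 2-D "features": two linearly separable clusters
X = np.array([[2.0, 2.0], [3.0, 2.5], [-2.0, -2.0], [-3.0, -1.5]])
y = np.array([1, 1, -1, -1])
w, b = train_linear_svm(X, y)
pred = np.sign(X @ w + b)
print(pred)  # [ 1.  1. -1. -1.]
```

In a real run, `X` would be the HOG descriptors of the card images and a one-vs-rest scheme would handle the five card classes.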

 

Not enough, let’s try something else

In our second approach, we tried a ‘bag of words’ model, where we built a corpus containing the unique words from each identity card. We then fed the test identity cards to an in-house OCR pipeline to extract their text, and finally input the extracted text to a Naive Bayes classifier for the predictions. This method boosted the accuracy to 96%, but its drawback was that it can be easily fooled by handwritten text.
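The bag-of-words-plus-Naive-Bayes idea can be sketched with a tiny multinomial classifier over OCR tokens; the token lists below are hypothetical stand-ins for real OCR output, not the actual corpus:

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(docs):
    """docs: list of (tokens, label). Multinomial NB with Laplace smoothing."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict(model, tokens):
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)       # class prior
        for tok in tokens:
            # Laplace smoothing so unseen words don't zero out a class
            p = (word_counts[label][tok] + 1) / (total_words + len(vocab))
            score += math.log(p)
        if score > best_score:
            best, best_score = label, score
    return best

# Hypothetical tokens OCR'd off each card type
docs = [
    (["government", "india", "aadhaar", "uidai"], "aadhaar"),
    (["income", "tax", "department", "permanent", "account"], "pan"),
    (["transport", "driving", "licence", "motor"], "dl"),
]
model = train_naive_bayes(docs)
print(predict(model, ["uidai", "aadhaar", "india"]))  # aadhaar
```

This also makes the weakness visible: the classifier only sees tokens, so handwritten or spoofed text that mimics a card’s vocabulary fools it.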

 

 

Taking the deep learning leap

“The electric light did not come from the continuous improvement of candles.” — Oren Harari

In the next approach we trained a classical Convolutional Neural Network for this image classification task. We benchmarked various existing state-of-the-art architectures to find out which worked best for our dataset, e.g., Inception-v4, VGG-16, ResNet, and GoogLeNet. We also tried the RMSProp and stochastic gradient descent optimizers, which did not turn out well. We settled on ResNet-50 with the Adam optimizer, a learning rate of 0.001, and a decay of 1e-5. But since we had little data, our model could not converge, so we used transfer learning from ImageNet, starting from existing weights originally trained on a million images. We replaced the last layer with our identity labels, froze the remaining layers, and trained. The validation error was still high, so we ran 5 more epochs with all layers unfrozen. We finally reached an accuracy of around 91%, still 9% short of our target.

 

Hit the right nail kid, treat them as objects

The final approach is where the novelty of our algorithm lies. The idea is to use an ensemble of image object detectors for the classification task. For example, the Aadhaar identity contains the Indian Emblem and a QR code; we train an object detector to detect these objects in the card, and if they are present with a certain level of confidence, we classify it as an Aadhaar. In this way we found 8 objects that were unique to each identity. We trained the state-of-the-art Faster Region Proposal CNN (FRCNN) architecture: feature maps are extracted by a CNN model and fed into an ROI proposal network and a classifier. The ROI network predicts the object bounding boxes, the classifier (softmax) predicts the class labels, and the errors are back-propagated through a combined classification and regression loss. We got good results on both precision and recall, but the network still performed badly on rotated images, so we rotated those 8 objects to various angles and trained again. We finally reached an accuracy of about 99.46%. We used TensorFlow as the tool.

Fig(7): FRCNN architecture from original paper
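The detections-to-class decision rule described above can be sketched as follows. The object names, card signatures, and threshold here are hypothetical placeholders; in the real system the detections come from the FRCNN/SSD models:

```python
# Hypothetical signature objects per card type (illustrative, not exhaustive)
CARD_SIGNATURES = {
    "aadhaar": {"indian_emblem", "qr_code"},
    "pan": {"income_tax_logo", "pan_header"},
}

def classify_from_detections(detections, threshold=0.8):
    """Classify the card as the type whose signature objects were all
    detected above the confidence threshold; 'unknown' otherwise."""
    found = {name for name, conf in detections if conf >= threshold}
    for card, signature in CARD_SIGNATURES.items():
        if signature <= found:          # all signature objects present
            return card
    return "unknown"

# Example detector output: (object_name, confidence) pairs
detections = [("indian_emblem", 0.93), ("qr_code", 0.88), ("face", 0.99)]
print(classify_from_detections(detections))  # aadhaar
```

Requiring every signature object to clear the threshold is what makes the ensemble robust: a partial match (say, only a QR code) falls through to “unknown” instead of a wrong class.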

 

 

But we were yet to solve one final problem: the execution time. FRCNN took approximately 10 seconds to classify on a 4-core CPU, while the target was 3 seconds; the ROI pooling made the model slow. We explored and found that the Single Shot MultiBox Detector (SSD) architecture is much faster than FRCNN, as it is an end-to-end pipeline with no ROI layer. We re-trained the model on this architecture and reached an accuracy of about 99.15%, while the execution time came down to 2.8s.

Fig(12): SSD architecture from original paper

Good work lad! What next?

While the pipeline we had arrived at by this point had very high accuracy and an efficient processing time, it was still far from a productionised software. We conducted multiple rounds of quality checks and real-world simulation on the entire pipeline. By fine-tuning the most impactful parameters and refining the stages, we have recently been able to develop a production-ready, world-class classifier with an error rate lower than a human’s, at a much lower cost.

We are clearly seeing the impact deep learning can have on problems we once couldn’t address through technology. We were able to gauge the huge margin of improvement that deep learning provides over traditional image processing algorithms. It’s truly the new technological wave. And that’s for good.

In the upcoming posts, we will share our story of how we tackled another very difficult problem: Optical Character Recognition (OCR). We are competing with global giants in this space, including Google, Microsoft, IBM, and ABBYY, and clearly surpassing them in our use cases. We have an interesting story to tell about “How we became the global best in enterprise-grade OCR”. Stay tuned.

Thank you.

Signzy AI team

Be part of our awesome journey

Do you believe that the modern world is driven by bits and bytes? And think you can take it on? We are looking for you. Drop us a note at careers@signzy.com.

Summary view

  1. The real world is not your laboratory; training data needs to be diverse, with better outlier handling
  2. Deep learning requires patience, but once it starts getting effective it gives you exponential returns
  3. In a narrow use case, you can beat a global giant with all the computing power in the world.

So the future of deep learning is not commoditized products but the adoption of deep learning as a tool to bring intelligence across the board. Deep learning has to be a company culture, not just a ‘tool’.



 
