Proof of Identity Document Capture and Verification

Datafinity ID Vision

AI for proof of ID capture and verification

What is ID Vision?

  • ID Vision is an AI-driven software component that enables intelligent capture of proof of ID document types.
  • Datafinity have incorporated Machine/Deep Learning and Computer Vision tools to build a Neural Network for RSA proof of ID document types.
  • The ID Vision Neural Network contains trained models that can classify and read data from proof of ID documents such as RSA ID books, smart cards, driver’s licenses, and passports.

How it Works – Steps

Input Images

ML MODEL

  • Input images go in “as is”, without any preprocessing or clean-up.
  • Sample images should vary in quality, orientation, skew, and size.

Learn image category & boundaries

NORMALIZE

  • Annotate each image by drawing bounding boxes to identify each element.
  • Correct boundary detection is crucial for correct image normalization.
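
As an illustration of what one annotation might look like in practice, here is a minimal Python sketch. The annotation format used by ID Vision is not stated, so the `BoundingBox` record, its field names, and the example label `id_number` are all assumptions made for this example. The normalized (x_center, y_center, width, height) form shown is the convention used by YOLO-style training tools, not necessarily the one used here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BoundingBox:
    """Hypothetical annotation record: an axis-aligned box in pixel coordinates."""
    label: str
    x_min: int
    y_min: int
    x_max: int
    y_max: int

    def validate(self, image_width: int, image_height: int) -> None:
        """Raise ValueError if the box is empty or falls outside the image."""
        if not (0 <= self.x_min < self.x_max <= image_width):
            raise ValueError(f"{self.label}: x range outside image or empty")
        if not (0 <= self.y_min < self.y_max <= image_height):
            raise ValueError(f"{self.label}: y range outside image or empty")

    def to_normalized(
        self, image_width: int, image_height: int
    ) -> Tuple[float, float, float, float]:
        """Return (x_center, y_center, width, height) as fractions of the image size."""
        x_center = (self.x_min + self.x_max) / 2 / image_width
        y_center = (self.y_min + self.y_max) / 2 / image_height
        width = (self.x_max - self.x_min) / image_width
        height = (self.y_max - self.y_min) / image_height
        return (x_center, y_center, width, height)


# Example: one annotated element on an 800 x 400 pixel image.
box = BoundingBox("id_number", 100, 50, 300, 100)
box.validate(800, 400)
print(box.to_normalized(800, 400))
```

Validating each box against the image size before training is a cheap way to catch annotation mistakes that would otherwise degrade boundary detection.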

Normalized image

READ

  • With the document boundaries known, we can crop the document and normalize it to a fixed size and position.
  • Such normalization improves the results of the built-in Captiva OCR or any other OCR engine in a workflow. DF-Vision can also run as a stand-alone component using Tesseract OCR.
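
The crop-and-normalize step can be sketched in a few lines of plain Python. This is a simplified illustration, not the product's implementation: it assumes the detected boundary is an axis-aligned box, represents the image as a list of pixel rows, and resizes with nearest-neighbour sampling. The function name `normalize_document` is hypothetical, and a real pipeline would probably also need to correct rotation and skew.

```python
def normalize_document(image, box, out_width, out_height):
    """Crop `box` from `image` and resize the crop to a fixed size.

    image: list of rows, each row a list of pixel values
    box:   (x_min, y_min, x_max, y_max) in pixels, max values exclusive
    Uses nearest-neighbour sampling, so no new pixel values are invented.
    """
    x_min, y_min, x_max, y_max = box
    crop_width = x_max - x_min
    crop_height = y_max - y_min
    if crop_width <= 0 or crop_height <= 0:
        raise ValueError("box must have positive width and height")

    normalized = []
    for out_y in range(out_height):
        # Map the output row back to a source row inside the crop.
        src_y = y_min + (out_y * crop_height) // out_height
        row = []
        for out_x in range(out_width):
            src_x = x_min + (out_x * crop_width) // out_width
            row.append(image[src_y][src_x])
        normalized.append(row)
    return normalized


# Example: a 4 x 6 test image whose pixel value encodes its position.
image = [[r * 10 + c for c in range(6)] for r in range(4)]
normalized = normalize_document(image, (1, 1, 3, 3), 4, 4)
for row in normalized:
    print(row)
```

Because every document ends up at the same size and position, anything downstream (a static template or an OCR engine) can rely on fixed coordinates.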

Learn fields and read text

  • If ML model #1 detects document borders with high quality, all required fields will have the same position across all normalized images, so we may be able to use a static template to extract data. The consistent image position yields better OCR results.
  • We use a second ML model to detect each required field on the normalized image.
  • With field bounding boxes, we can work with each value independently. That provides more freedom in pre-processing and makes the OCR engine’s role simpler and more accurate.
  • Any OCR engine can then be used to extract data. In the current version of DF-Vision, integration with Captiva OCR is available; if a stand-alone system is required (i.e., no Captiva), Google Tesseract is available natively.
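
The idea of reading each field independently with a pluggable OCR engine can be sketched as follows. Everything here is illustrative: the field names, the box coordinates, and the `read_fields` and `stub_ocr` helpers are assumptions, and the stub stands in for a real engine because no Captiva or Tesseract API is shown in this document. The field boxes could come either from a static template or from the second ML model.

```python
from typing import Callable, Dict, List, Tuple

Image = List[List[int]]
Box = Tuple[int, int, int, int]  # x_min, y_min, x_max, y_max (max exclusive)


def crop(image: Image, box: Box) -> Image:
    """Return the rectangular region of `image` covered by `box`."""
    x_min, y_min, x_max, y_max = box
    return [row[x_min:x_max] for row in image[y_min:y_max]]


def read_fields(
    image: Image,
    field_boxes: Dict[str, Box],
    ocr: Callable[[Image], str],
) -> Dict[str, str]:
    """Crop every field and pass each crop to the supplied OCR callable."""
    return {
        name: ocr(crop(image, box)).strip()
        for name, box in field_boxes.items()
    }


def stub_ocr(region: Image) -> str:
    """Placeholder engine: reports the crop size instead of reading text."""
    return f"{len(region[0])}x{len(region)}"


# Hypothetical field layout on a 100 x 60 normalized image.
page = [[0] * 100 for _ in range(60)]
field_boxes = {
    "surname": (10, 5, 60, 15),
    "id_number": (10, 20, 90, 30),
}
result = read_fields(page, field_boxes, stub_ocr)
print(result)
```

Passing the engine in as a callable keeps the field logic unchanged when the OCR back end is swapped, which matches the point above that any OCR engine can be used.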
