FICA Proof of ID Processing

Datafinity ID Vision (DFVision)

AI for intelligent proof of ID capture.

What is DFVision?

  • DFVision is an AI-driven software component that enables intelligent capture of proof of ID document types.
  • Datafinity has incorporated Machine/Deep Learning and Computer Vision tools to build a Neural Network for RSA proof of ID document types.
  • The DFVision Neural Network contains trained models that can classify and read data from proof of ID documents such as RSA ID books, smart cards, driver’s licenses, and passports.

How it Works - Overview

Step 1

Gather document samples for the machine learning phase.
For best results, a minimum of 1,000 images is required, with varying quality, size, and orientation.
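As a rough illustration of this step (the record fields and helper name are hypothetical, not part of DFVision), a gathered sample set can be sanity-checked against the minimum-size guideline and summarised for variety before learning starts:

```python
from collections import Counter

# Hypothetical metadata for gathered samples: each record notes the
# document type, orientation, and a coarse quality label.
samples = (
    [{"doc": "id_book", "orientation": 0, "quality": "low"}] * 400
    + [{"doc": "id_book", "orientation": 90, "quality": "high"}] * 350
    + [{"doc": "smart_card", "orientation": 0, "quality": "medium"}] * 300
)

def training_set_report(samples, minimum=1000):
    """Check the sample set size and summarise how varied it is."""
    return {
        "count": len(samples),
        "meets_minimum": len(samples) >= minimum,
        "orientations": Counter(s["orientation"] for s in samples),
        "qualities": Counter(s["quality"] for s in samples),
    }

report = training_set_report(samples)
```

Here the set of 1,050 records passes the minimum, and the counters expose whether any orientation or quality level is under-represented.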

Step 2

Use machine learning and computer vision tools to learn document formats.
Generate data sets per document and indexing type.

Step 3

Build a Neural Network based on the machine-learnt models for document classification and automated reading.

Step 4

Generate a container with all models in the Neural Network (DFVision).
The service is then ready for integration with Captiva or any other document capture system.
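The four steps above can be sketched as a minimal pipeline. The function names (`classify_document`, `normalize`, `read_fields`) and their return values are illustrative placeholders, not the actual DFVision API:

```python
def classify_document(image):
    # Model #1 (sketch): return the document type and its boundary box.
    return "rsa_id_book", (40, 25, 600, 400)

def normalize(image, boundary):
    # Crop to the detected boundary and resize to a fixed canvas (sketch).
    return {"size": (800, 500), "source_box": boundary}

def read_fields(normalized):
    # Model #2 + OCR (sketch): return extracted field values.
    return {"id_number": "<sample>", "surname": "<sample>"}

def process(image):
    """Classify, normalize, then read - the container's end-to-end flow."""
    doc_type, boundary = classify_document(image)
    normalized = normalize(image, boundary)
    fields = read_fields(normalized)
    return {"type": doc_type, "fields": fields}

result = process(image=None)  # stand-in for a real scanned image
```

The point of the sketch is the shape of the flow: classification feeds normalization, and normalization feeds reading, which is what makes the container a single integratable service.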

How it Works - Steps

Input images

ML MODEL

  • Input images go in “as is”, without any preprocessing or clean-up.
  • Sample images should vary in quality, orientation, skew, and size.

Learn image category & boundaries

NORMALIZE

  • Annotate each image by drawing bounding boxes to identify each element.
  • Correct boundary detection is crucial for correct image normalization.
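As a sketch of what an annotation might look like (the record layout and file name are hypothetical, not DFVision's actual format), each element gets a labelled axis-aligned bounding box, and a simple check can confirm every box fits inside the image:

```python
# Hypothetical annotation record for one training image:
# each element is labelled with a bounding box (x, y, width, height).
annotation = {
    "image": "id_book_0001.jpg",
    "image_size": (1024, 768),
    "boxes": [
        {"label": "document",  "box": (60, 40, 900, 640)},
        {"label": "photo",     "box": (80, 120, 180, 220)},
        {"label": "id_number", "box": (320, 500, 400, 48)},
    ],
}

def boxes_within_image(annotation):
    """Sanity check: every box must lie inside the image, since correct
    boundaries are crucial for correct normalization."""
    w, h = annotation["image_size"]
    for item in annotation["boxes"]:
        x, y, bw, bh = item["box"]
        if x < 0 or y < 0 or x + bw > w or y + bh > h:
            return False
    return True
```

Catching out-of-bounds boxes at annotation time is cheaper than discovering them as mis-normalized images later in the pipeline.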

Normalized image

READ

  • Having the document boundaries, we can crop the document and normalize it to a fixed size and position.
  • Such normalization improves results from the built-in Captiva OCR or any other OCR engine in a workflow. DFVision can also run as a stand-alone component using Tesseract OCR.
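The coordinate arithmetic behind this normalization can be sketched in a few lines (assuming an axis-aligned crop with no rotation or deskew, and a hypothetical 800x500 target canvas). It maps a box from original-image coordinates onto the fixed canvas:

```python
def normalize_box(doc_box, field_box, target=(800, 500)):
    """Map a field's bounding box from original-image coordinates into
    the fixed-size normalized canvas (axis-aligned crop only)."""
    dx, dy, dw, dh = doc_box    # document boundary in the original image
    fx, fy, fw, fh = field_box  # field box in the original image
    sx, sy = target[0] / dw, target[1] / dh
    return (
        round((fx - dx) * sx),
        round((fy - dy) * sy),
        round(fw * sx),
        round(fh * sy),
    )

# A field at (140, 140) inside a document cropped at (100, 100, 400, 250)
# lands at a predictable position on the 800x500 canvas.
normalized = normalize_box((100, 100, 400, 250), (140, 140, 200, 50))
```

Because every document is mapped onto the same canvas, the same field always lands at the same coordinates, which is exactly what the reading step relies on.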

Learn fields and read text

READ

  • If ML model #1 detects document borders with high accuracy, all required fields will have the same position across the normalized images, so a static template can be used to extract data. The consistent image position yields higher OCR accuracy.
  • We use a second ML model to detect each required field on the normalized image.
  • Having field bounding boxes, we can work with each value independently. That allows more freedom in pre-processing and makes the OCR’s role simpler and more accurate.
  • Any OCR engine can then be used to extract the data. In the current version of DFVision, integration with Captiva OCR is available; if a stand-alone system is required (i.e., no Captiva), Google Tesseract is available natively.
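A static template over the normalized canvas is just a mapping from field name to a fixed box, with the OCR engine applied per crop. The template values and the OCR stand-in below are hypothetical, not DFVision's actual template format; in a real deployment the `ocr` callable would wrap Captiva OCR or Tesseract:

```python
# Hypothetical static template for a normalized 800x500 canvas:
# field name -> (x, y, width, height) at a fixed position.
TEMPLATE = {
    "surname":   (80, 120, 400, 40),
    "id_number": (80, 300, 400, 40),
}

def extract_fields(normalized_image, template, ocr):
    """Crop each templated region and OCR it independently, so every
    value can be pre-processed and read on its own."""
    results = {}
    for name, (x, y, w, h) in template.items():
        crop = (normalized_image, x, y, w, h)  # stand-in for a real crop
        results[name] = ocr(crop)
    return results

# Stand-in OCR engine that just echoes the crop position.
fake_ocr = lambda crop: f"text@{crop[1]},{crop[2]}"
fields = extract_fields("normalized.png", TEMPLATE, fake_ocr)
```

Passing the OCR engine in as a callable mirrors the document's point that any engine can be swapped in behind the same extraction logic.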