04 Text Recognition

AI-powered Text Recognition OCR for fast and accurate extraction.

Extract text, verify identity documents, and automate onboarding workflows using intelligent Text Recognition OCR designed for financial and regulated industries.

99% field-level accuracy
<1s response time
15+ Indian document types
1 week to integrate
What is Text Recognition OCR

Pre-Fill Forms Instantly with OCR

Our AI-powered Text Recognition OCR (Optical Character Recognition) automatically extracts information from documents such as PAN cards, Aadhaar cards, and cheques. Customers simply upload a photo, and the required details, including name, date of birth, document number, address, and more, are instantly captured and used to pre-fill forms.

Customers only need to review and confirm the information. Automated OCR eliminates manual typing, which speeds up onboarding and reduces the customer drop-offs that occur during form filling.

Indian documents supported

Supports all OVDs (Officially Valid Documents)

One API endpoint. Pass the document type and the image. Get structured JSON back. Built for the documents Indian regulators ask for.

PAN card

name | father_name | dob | pan_number

Aadhaar card

name | dob | gender | address | uid_masked

GST certificate

gstin | legal_name | trade_name | address

Cheque

account_number | ifsc | bank | account_holder

Passport

passport_no | name | dob | issue | expiry

Driving licence

dl_number | name | dob | validity | vehicle_class

Voter ID

epic_no | name | father_name | dob

Vehicle RC

registration_no | owner | model | engine_no
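
Because each document type returns a fixed set of fields, a client can check a response for completeness before using it. The sketch below is only an illustration: the field names come from the lists above, while the document-type keys (`pan`, `aadhaar`, and so on) and the helper name `missingFields` are assumptions, not part of the documented API.

```javascript
// Field lists copied from the supported-document lists above.
// The keys (pan, aadhaar, ...) are illustrative; only "pan" appears in the sample URL.
const EXPECTED_FIELDS = {
  pan: ['name', 'father_name', 'dob', 'pan_number'],
  aadhaar: ['name', 'dob', 'gender', 'address', 'uid_masked'],
  gst_certificate: ['gstin', 'legal_name', 'trade_name', 'address'],
  cheque: ['account_number', 'ifsc', 'bank', 'account_holder'],
  passport: ['passport_no', 'name', 'dob', 'issue', 'expiry'],
  driving_licence: ['dl_number', 'name', 'dob', 'validity', 'vehicle_class'],
  voter_id: ['epic_no', 'name', 'father_name', 'dob'],
  vehicle_rc: ['registration_no', 'owner', 'model', 'engine_no'],
};

// Returns the expected fields that are absent or empty in an OCR result.
function missingFields(docType, data) {
  const expected = EXPECTED_FIELDS[docType];
  if (!expected) throw new Error(`Unknown document type: ${docType}`);
  return expected.filter(
    (field) => data[field] == null || String(data[field]).trim() === ''
  );
}
```

A non-empty result is a reasonable signal to ask the customer for a retake or to leave those form fields blank for manual entry.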

How OCR works

Upload. Extract. Autofill.

Send a document image or PDF and receive structured data in seconds. OCR automatically captures key details and returns them in a format your system can use instantly.

1

Upload the document

Upload a photo, scanned image, or PDF to the OCR API. The system automatically improves image quality, corrects orientation, and prepares the document for extraction.

2

Extract key information

OCR identifies and extracts important fields such as name, document number, date of birth, address, and more. Each field is returned separately with a confidence score.

3

Receive structured data

Get clean JSON output that can be used to pre-fill forms, verify information, or send data directly into your internal systems. No manual data entry or document parsing required.
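
As an illustration of the pre-fill step, the sketch below maps a PAN result onto form values. It assumes the response shape of the sample PAN response shown on this page; the form field names (`fullName`, `dateOfBirth`, and so on) and both helper names are hypothetical and would be replaced by whatever your form uses.

```javascript
// Maps a PAN OCR result onto hypothetical form field names.
// Missing values become empty strings so the customer can type them in.
function toPanFormValues(data) {
  return {
    fullName: data.name ?? '',
    fatherName: data.father_name ?? '',
    dateOfBirth: isoToDisplayDate(data.dob),
    panNumber: (data.pan_number ?? '').toUpperCase(),
  };
}

// Converts "YYYY-MM-DD" to "DD/MM/YYYY"; returns '' for anything else
// rather than guessing at an unrecognised date format.
function isoToDisplayDate(iso) {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(iso ?? '');
  return match ? `${match[3]}/${match[2]}/${match[1]}` : '';
}
```

Keeping the mapping in one small function makes it easy to show the pre-filled values to the customer for review before anything is submitted.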

surepass-ocr.js
// Send the PAN image, get structured JSON back
const response = await fetch(
  'https://api.surepass.io/v1/ocr/pan',
  {
    method: 'POST',
    headers: { 'Authorization': `Bearer ${API_KEY}` },
    body: formData // image attached
  }
);

const { data } = await response.json();
// {
//   name: "Priya Sharma",
//   father_name: "Ramesh Sharma",
//   dob: "1992-08-14",
//   pan_number: "ABCPS1234K",
//   confidence: 0.994
// }
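
The sample above leaves `formData` implicit. Below is a minimal sketch of how the request might be assembled in Node 18 or later, where `FormData` and `Blob` are globals. The form field name `file`, the helper name `buildOcrRequest`, and the idea that other document types follow the same URL pattern as `/v1/ocr/pan` are all assumptions to check against the API reference.

```javascript
// Base of the sample URL shown on this page.
const OCR_BASE_URL = 'https://api.surepass.io/v1/ocr';

// Builds the arguments for fetch(); it does not send anything itself.
function buildOcrRequest(docType, imageBytes, fileName, apiKey) {
  const form = new FormData();
  // 'file' is an assumed field name, not confirmed by this page.
  form.append('file', new Blob([imageBytes]), fileName);
  return {
    url: `${OCR_BASE_URL}/${encodeURIComponent(docType)}`,
    options: {
      method: 'POST',
      // Content-Type is left unset so fetch can add the multipart boundary.
      headers: { Authorization: `Bearer ${apiKey}` },
      body: form,
    },
  };
}
```

The returned `url` and `options` can be passed straight to `fetch(url, options)`, which keeps the request construction testable without a network call.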
Where to use Text Recognition OCR

Where do you want to eliminate manual data entry?

Customer Onboarding & KYC

When a user uploads a PAN or Aadhaar card, the system automatically pre-fills the KYC form. This reduces manual typing and makes onboarding faster and smoother.

Loan Origination & Underwriting

Extracts data from PAN, Aadhaar, bank statements, GST certificates, and cheques in a single flow. Your credit file is structured from the start, so analysts do not need to read PDFs manually.
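
Extracted identifiers can be sanity-checked against their published formats before they enter a credit file. This is a minimal sketch using the field names listed earlier on this page. The patterns check shape only (they do not verify checksums or that the identifier actually exists), and the helper name `formatIssues` is hypothetical.

```javascript
// Shape-only patterns for three identifiers returned by the OCR fields above.
const ID_FORMATS = {
  // PAN: 5 letters, 4 digits, 1 letter
  pan_number: /^[A-Z]{5}[0-9]{4}[A-Z]$/,
  // IFSC: 4 letters, a zero, 6 alphanumerics
  ifsc: /^[A-Z]{4}0[A-Z0-9]{6}$/,
  // GSTIN: 2-digit state code, 10-character PAN, entity code, 'Z', check character
  gstin: /^[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z][1-9A-Z]Z[0-9A-Z]$/,
};

// Returns the names of fields that are present but do not match their format.
function formatIssues(data) {
  return Object.keys(ID_FORMATS).filter(
    (field) =>
      field in data &&
      !ID_FORMATS[field].test(String(data[field]).trim().toUpperCase())
  );
}
```

A field flagged here is a candidate for manual review rather than proof of a bad document, since the mismatch may come from a misread character.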

Back-Office Digitisation

Converts old loan files into structured data at scale. Paper KYC, scanned forms, and legacy records become searchable and easy to use.

Features

Everything you get with OCR.

All in the API today.

99%+ field accuracy: On standard Indian docs
Sub-second response: p99 under 1.2s
Structured JSON: Field by field, not blob
Confidence scores: Per field, not per doc
Auto-rotation & de-skew: Tilted phone photos OK
Image quality check: Blur, glare, crop detection
PDF & image support: JPG, PNG, PDF up to 10MB
Bulk endpoint: For back-office migrations
Indian doc layouts: Trained on real samples
DPDP & ISO 27001: Indian-server processing
Auto-purge: Configurable retention
REST API & SDKs: Node · Python · Java
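
The stated limits (JPG, PNG, or PDF, up to 10MB) can be checked on the client before an upload is attempted. The sketch below identifies the file type from its leading bytes. The helper names are hypothetical, and reading "10MB" as 10 × 1024 × 1024 bytes is an assumption.

```javascript
// "up to 10MB" from the feature list; the exact byte count is an assumption.
const MAX_UPLOAD_BYTES = 10 * 1024 * 1024;

// Identifies JPG, PNG, or PDF from the leading signature bytes; null otherwise.
function detectFileType(bytes) {
  const startsWith = (signature) => signature.every((b, i) => bytes[i] === b);
  if (startsWith([0xff, 0xd8, 0xff])) return 'jpg';
  if (startsWith([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return 'png';
  if (startsWith([0x25, 0x50, 0x44, 0x46, 0x2d])) return 'pdf'; // "%PDF-"
  return null;
}

// bytes: a Uint8Array or Buffer holding the whole file.
function checkUpload(bytes) {
  if (bytes.length === 0) return { ok: false, reason: 'empty file' };
  if (bytes.length > MAX_UPLOAD_BYTES) return { ok: false, reason: 'file larger than 10MB' };
  const type = detectFileType(bytes);
  if (!type) return { ok: false, reason: 'not a JPG, PNG, or PDF' };
  return { ok: true, type };
}
```

Checking the signature rather than the file extension catches renamed files before they reach the API.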
Frequently asked questions

OCR API - common questions.

What product, engineering, and compliance teams ask before integrating.

OCR (Optical Character Recognition, https://en.wikipedia.org/wiki/Optical_character_recognition) is software that reads text from an image or PDF and converts it into structured digital data. For Indian KYC, it extracts fields like name, date of birth, document number, and address from documents such as PAN, Aadhaar, GST certificate, cheque, passport, driving licence, and voter ID, and returns them as JSON ready to use in your application.

Supported documents include PAN card, Aadhaar card, GST certificate, cancelled cheque, passport, driving licence, voter ID, vehicle RC, electricity bill, and more. The same API endpoint handles all of them: you pass the document type and image and get structured data in return.

Field-level accuracy is above 99 percent on standard Indian KYC documents captured in good lighting. The API also returns a confidence score for each field, so your system can decide whether to auto-accept the data, request a retake, or send it for manual review.
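
The three outcomes described above can be sketched as a simple threshold rule. The thresholds below are illustrative rather than recommended values, and the per-field shape `{ value, confidence }` is an assumption: the sample response on this page shows only a single `confidence` value, so the real payload may differ.

```javascript
// Illustrative thresholds; tune them against your own review data.
const AUTO_ACCEPT_MIN = 0.95;
const MANUAL_REVIEW_MIN = 0.8;

// fields: { name: { value, confidence }, ... } (assumed response shape).
// The least confident field decides the outcome for the whole document.
function routeByConfidence(fields) {
  const scores = Object.values(fields).map((field) => field.confidence);
  if (scores.length === 0) return 'retake';
  const lowest = Math.min(...scores);
  if (lowest >= AUTO_ACCEPT_MIN) return 'auto_accept';
  if (lowest >= MANUAL_REVIEW_MIN) return 'manual_review';
  return 'retake';
}
```

Routing on the lowest score is a conservative choice: one doubtful field is enough to keep a document out of the auto-accept path.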

The API works with phone photos, document scans, and PDFs. Built-in auto-rotation, de-skew, and image quality checks keep results reliable even without perfect scans. It is designed for real-world mobile capture conditions in India.

Response time is sub-second for a single document under normal load. Bulk processing is also supported for high-volume use cases such as loan portfolio digitisation and back-office document migration.

The OCR API meets DPDP 2023 requirements and is part of a platform that is RBI V-CIP and ISO 27001 certified. Document data is processed on Indian servers and auto-purged per your retention policy.

Book a live demo

Upload any OVD image and see instant JSON output.

Test it with real-world documents, including blurry photos, low-quality scans, and difficult images that usually fail in other OCR systems. Get structured data in real time with high accuracy and zero manual effort.
