
10 Document Types AI Can Automatically Process for BFSI

A detailed exploration of 10 document types that AI can automatically classify, extract, and verify for Indian banking and financial services — covering what is extracted, accuracy levels, unique challenges, and specific use cases for each document type.

YuVerse Team

Published June 3, 2026 · Updated July 3, 2026 · 17 min read

Indian banking and financial services generate an extraordinary diversity of documents. Every loan application, account opening, insurance purchase, and trade transaction produces documents that must be read, verified, and converted into structured data. The manual approach costs INR 15-40 per document with 3-5% error rates.

AI-powered document processing now handles the full spectrum. Platforms like YuAccess process over 1 million documents monthly across 100+ types with 99.9% accuracy. This guide covers the 10 most important document types for Indian BFSI — what is extracted, accuracy levels, unique challenges, and specific use cases.

1. Aadhaar Card

What It Is

The Aadhaar card is India's universal identity document, issued by the Unique Identification Authority of India (UIDAI) to over 140 crore residents. It serves as the primary KYC document for virtually all financial service interactions — account opening, loan applications, insurance purchases, and mutual fund investments.

What AI Extracts

Field	Description	Extraction Challenge
Aadhaar Number	12-digit unique identifier	Must handle spacing variations (XXXX XXXX XXXX vs continuous)
Full Name (English)	Applicant name in English	Font and print quality variations across card generations
Full Name (Regional)	Name in state-specific language	Script-specific OCR for 10+ Indian scripts
Date of Birth	DD/MM/YYYY format	Distinguish from other dates on card (issue date)
Gender	Male/Female/Transgender	Standard field with consistent positioning
Address	Full residential address	Multi-line, variable length, mixed scripts
Photograph	Facial image	Extracted for face-match verification
QR Code	Digitally signed Aadhaar data	Decoded for verification against printed data

Accuracy Achieved

Field-level accuracy: 99.9%
End-to-end (all fields correct): 99.5%
Verhoeff checksum validation: 100% of extracted numbers verified
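
The Verhoeff checksum referred to above is a standard check-digit algorithm, and a minimal sketch of it fits in a few lines. The function names and the whitespace handling here are illustrative assumptions, not the platform's actual code. Passing the checksum only shows that the digits are internally consistent; it does not prove the number was ever issued.

```python
import re

# Standard Verhoeff tables: D is the dihedral-group multiplication table,
# P is the position-dependent permutation table.
D = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    [1, 2, 3, 4, 0, 6, 7, 8, 9, 5],
    [2, 3, 4, 0, 1, 7, 8, 9, 5, 6],
    [3, 4, 0, 1, 2, 8, 9, 5, 6, 7],
    [4, 0, 1, 2, 3, 9, 5, 6, 7, 8],
    [5, 9, 8, 7, 6, 0, 4, 3, 2, 1],
    [6, 5, 9, 8, 7, 1, 0, 4, 3, 2],
    [7, 6, 5, 9, 8, 2, 1, 0, 4, 3],
    [8, 7, 6, 5, 9, 3, 2, 1, 0, 4],
    [9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
]
P = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    [1, 5, 7, 6, 2, 8, 3, 0, 9, 4],
    [5, 8, 0, 3, 7, 9, 6, 1, 4, 2],
    [8, 9, 1, 6, 0, 4, 3, 5, 2, 7],
    [9, 4, 5, 3, 1, 2, 6, 8, 7, 0],
    [4, 2, 8, 6, 5, 7, 3, 9, 0, 1],
    [2, 7, 9, 3, 8, 0, 6, 4, 1, 5],
    [7, 0, 4, 6, 9, 1, 3, 2, 5, 8],
]


def verhoeff_valid(digits: str) -> bool:
    """Return True if a digit string (check digit last) passes Verhoeff."""
    c = 0
    for i, ch in enumerate(reversed(digits)):
        c = D[c][P[i % 8][int(ch)]]
    return c == 0


def is_plausible_aadhaar(raw: str) -> bool:
    """Strip spacing (XXXX XXXX XXXX vs continuous), then check length and checksum."""
    number = re.sub(r"\s+", "", raw)
    return bool(re.fullmatch(r"\d{12}", number)) and verhoeff_valid(number)
```

Running the checksum on every extracted number lets an OCR misread of a single digit be rejected before the value reaches downstream KYC systems.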

Unique Challenges

Multiple formats: Standard card, e-Aadhaar PDF, Aadhaar letter, masked Aadhaar, and m-Aadhaar, each with a different layout.
Bilingual content: English and regional-language text share the same card and require separate extraction.
Lamination/hologram interference: Physical cards produce glare that obscures text in photographs.
Address inconsistency: There is no standardised format; addresses mix scripts, vary in level of detail, and abbreviate locality names.

BFSI Use Cases

Account opening KYC: Primary identity verification for all financial products
Loan application: Name, address, and DOB extraction for origination
Aadhaar-linked services: DBT, AEPS, and re-KYC updates
Insurance onboarding: Identity verification for policy issuance

2. PAN Card

What It Is

The Permanent Account Number (PAN) card, issued by the Income Tax Department of India, is the universal financial identifier. It is required for specified high-value transactions (commonly those above INR 50,000), for income tax filings, and for most financial product purchases.

What AI Extracts

Field	Description	Extraction Challenge
PAN Number	10-character alphanumeric (AAAAA9999A)	Character confusion between similar letters/numbers (O/0, I/1, B/8)
Full Name	Applicant name	Multiple name format variations across card generations
Father's Name	Father's/parent's name	Distinguished from applicant name by position/label
Date of Birth	DD/MM/YYYY	Located consistently but varying print quality
Photograph	Facial image	Often small and low-resolution on older cards
Signature	Signature image	Extracted for signature verification workflows
Issue Date	Card issuance date	Present on newer format cards

Accuracy Achieved

Field-level accuracy: 99.9%
PAN format validation: 100% (algorithmic format check)
Cross-verification with IT database: Instant API verification available

Unique Challenges

Old vs new format cards: PAN cards issued before 2010 have different layouts and print quality.
Character ambiguity: The fixed format (5 letters + 4 digits + 1 letter) makes O/0 and I/1 confusion critical, so the AI uses format-aware decoding that resolves each character according to the class expected at its position.
Photocopy degradation: PAN cards are frequently submitted as multi-generation photocopies with severely degraded text.
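
Format-aware decoding can be illustrated with a small sketch. It assumes the OCR step returns a 10-character candidate string and handles only the look-alike pairs named above (O/0, I/1, B/8). The function name `normalise_pan` and the substitution tables are hypothetical; a production decoder would more likely work from per-character confidence scores.

```python
import re
from typing import Optional

PAN_PATTERN = re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]")

# Look-alike substitutions, applied only where the PAN format demands that class.
TO_LETTER = {"0": "O", "1": "I", "8": "B"}
TO_DIGIT = {"O": "0", "I": "1", "B": "8"}


def normalise_pan(ocr_text: str) -> Optional[str]:
    """Return a corrected PAN, or None if the text cannot match AAAAA9999A."""
    chars = list(ocr_text.strip().upper().replace(" ", ""))
    if len(chars) != 10:
        return None
    for i, ch in enumerate(chars):
        if 5 <= i <= 8:
            # Positions 6-9 (indexes 5-8) must be digits.
            chars[i] = TO_DIGIT.get(ch, ch)
        else:
            # Positions 1-5 and 10 must be letters.
            chars[i] = TO_LETTER.get(ch, ch)
    candidate = "".join(chars)
    return candidate if PAN_PATTERN.fullmatch(candidate) else None
```

For example, `normalise_pan("A8CDE12O4F")` returns `"ABCDE1204F"`, while a string with an unrecoverable character in a digit position returns `None` and can be routed to manual review.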

BFSI Use Cases

Income tax linkage: Connecting financial products to tax identity for TDS/TCS reporting
High-value transaction validation: Mandatory for transactions above INR 50,000
Loan application: Income verification cross-referencing (PAN to ITR to income)
Cross-referencing: Linking credit bureau data (PAN-based) with application data

3. Salary Slips

What It Is

Monthly salary slips (or payslips) detail an employee's earnings and deductions for a specific pay period. They are the primary income verification document for salaried loan applicants. Indian lending typically requires the last 3-6 months of salary slips.

What AI Extracts

Field	Description	Extraction Challenge
Employee Name	Payee name	Match against application name
Employee ID	Internal employee identifier	Variable positioning and format
Employer Name	Company name	Sometimes abbreviated or in logo form
Pay Period	Month and year of payment	Multiple date format representations
Basic Salary	Basic component	Label variations ("Basic", "Basic Pay", "Basic Salary")
HRA	House Rent Allowance	May be combined with other allowances
DA/Special Allowance	Various allowances	Highly variable across employers
Gross Salary	Total earnings before deductions	Critical for income computation
PF Deduction	Provident fund contribution	Both employee and employer portions
Professional Tax	State-level professional tax	Not present in all states
TDS	Tax deducted at source	Monthly tax deduction
Net Salary	Take-home pay	Critical for bank statement matching
Bank Account Number	Salary credit account	For bank statement cross-verification

Accuracy Achieved

Field-level accuracy: 99.5% (across 500+ employer formats)
Income computation accuracy: 99.7% (gross and net calculations)
Employer identification: 99.2%

Unique Challenges

Extreme format diversity: No mandated format exists; thousands of employers each design their own layouts, field names, and structures.
Computed field verification: The AI must verify the arithmetic (basic plus allowances = gross; gross minus deductions = net) to catch errors and fraud.
Level of detail: Some slips show 15-20 line items; others show only gross and net.
Password protection: Password-protected PDF salary slips require decryption before processing.
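
The arithmetic verification described above can be sketched as two checks on the extracted figures. The function name, the dictionary inputs, and the one-rupee rounding tolerance are assumptions made for illustration.

```python
from decimal import Decimal
from typing import Dict, List


def check_salary_arithmetic(
    earnings: Dict[str, Decimal],
    deductions: Dict[str, Decimal],
    gross: Decimal,
    net: Decimal,
    tolerance: Decimal = Decimal("1.00"),  # assumed allowance for rounding
) -> List[str]:
    """Return a list of arithmetic inconsistencies found on one salary slip."""
    issues = []
    earnings_total = sum(earnings.values(), Decimal("0"))
    deductions_total = sum(deductions.values(), Decimal("0"))
    # Check 1: earning components should add up to the printed gross.
    if abs(earnings_total - gross) > tolerance:
        issues.append(f"earnings sum {earnings_total} != gross {gross}")
    # Check 2: gross minus deductions should equal the printed net.
    if abs(gross - deductions_total - net) > tolerance:
        issues.append(f"gross - deductions {gross - deductions_total} != net {net}")
    return issues
```

An empty list means the slip is internally consistent. Slips that print only gross and net can skip the first check by passing the gross as the single earnings entry.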

BFSI Use Cases

Income assessment: Primary basis for determining loan eligibility (typically 3x net salary for personal loans)
FOIR calculation: Fixed Obligation to Income Ratio computation
Employer verification: Confirming employment status and employer identity
Income trend analysis: Comparing 3-6 months of slips for stability assessment
Obligation identification: Existing PF loans or salary advances visible in deductions
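
As a worked illustration of the FOIR calculation, the sketch below divides fixed monthly obligations by net monthly income. Whether the proposed EMI is included, and which income figure is used, varies by lender, so the `proposed_emi` parameter and the function name are assumptions.

```python
from typing import Iterable


def foir(
    monthly_obligations: Iterable[float],
    net_monthly_income: float,
    proposed_emi: float = 0.0,  # assumption: some lenders add the new EMI
) -> float:
    """Fixed Obligation to Income Ratio as a fraction (0.35 means 35%)."""
    if net_monthly_income <= 0:
        raise ValueError("net monthly income must be positive")
    return (sum(monthly_obligations) + proposed_emi) / net_monthly_income
```

With existing EMIs of INR 8,000 and INR 4,000 on a net salary of INR 60,000, the ratio is 20%; adding a proposed EMI of INR 6,000 raises it to 30%.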

4. Income Tax Returns (ITR)

What It Is

Income tax returns are filed annually with the Income Tax Department and detail total income, deductions claimed, tax paid, and tax liability. ITR forms range from ITR-1 (simple salaried) to ITR-7 (trusts and institutions). For lending, the ITR serves as the authoritative income document, especially for self-employed and business applicants.

What AI Extracts

Field	Description	Extraction Challenge
PAN of Assessee	Taxpayer PAN	Linkage field for cross-verification
Assessment Year	AY for the return	Distinguish from financial year
Filing Date	Date of filing with IT department	Confirms timely filing
Total Income	Gross total income before deductions	Critical lending metric
Income from Salary	Salary head income	For salaried applicants
Income from Business/Profession	Business income	For self-employed/business applicants
Income from House Property	Rental/property income	Additional income source
Income from Capital Gains	Investment gains	Supplementary income
Deductions (80C, 80D, etc.)	Tax deductions claimed	Indicates existing commitments
Tax Payable	Total tax liability	Cross-checks with Form 26AS
Verification Status	Whether verified (e-verified/ITR-V)	Confirms valid filing
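
The assessment year is the year immediately following the financial year in which the income was earned, so the two are easy to confuse during extraction. A minimal sketch of the conversion, assuming the "YYYY-YY" notation (the function name is hypothetical):

```python
import re


def financial_year_for(assessment_year: str) -> str:
    """Map an assessment year such as '2025-26' to its financial year '2024-25'."""
    match = re.fullmatch(r"(\d{4})-(\d{2})", assessment_year.strip())
    if not match:
        raise ValueError(f"unrecognised assessment year: {assessment_year!r}")
    start = int(match.group(1))
    # The two-digit suffix must be the year after the four-digit start.
    if int(match.group(2)) != (start + 1) % 100:
        raise ValueError(f"inconsistent assessment year: {assessment_year!r}")
    return f"{start - 1}-{start % 100:02d}"
```

Normalising every return to its financial year makes it straightforward to line up ITR income against salary slips and bank credits from the same period.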

Accuracy Achieved

Field-level accuracy: 99.8% (structured government form)
Income computation verification: 99.9% (arithmetic validation)
Form type identification: 99.7% (ITR-1 through ITR-7)

Unique Challenges

Form complexity variation: ITR-1 is 2 pages, ITR-3 can exceed 30 pages, and ITR-6 may run to 50+ pages, so the AI must navigate a form-specific structure for each type.
Annual format changes: The Income Tax Department revises forms yearly, changing field positions and labels.
Acknowledgment vs full return: Customers submit either the 1-page ITR-V or the complete return, and both must be handled.
Input variety: Documents arrive as digital PDFs, printed scans, or photographs, and extraction accuracy varies with input quality.

BFSI Use Cases

Self-employed income verification: Primary income document for non-salaried applicants
Income trend assessment: 2-3 year ITR comparison for income stability
Tax compliance check: Confirms applicant files taxes (compliance indicator)
Business viability assessment: Business income trends for SME lending
Cross-verification: ITR income matched against bank credits and Form 16 figures

5. Bank Statements

What It Is

Bank statements are monthly or periodic account statements showing all transactions (credits, debits, and balances) along with account details. They are arguably the most information-rich document in lending, revealing income patterns, spending behaviour, existing obligations, and financial discipline.

What AI Extracts

Field	Description	Extraction Challenge
Account Holder Name	Customer name on account	Match against application
Account Number	Bank account number	Variable length across banks
Bank Name and Branch	Issuing bank details	For cross-referencing
Statement Period	From-to dates	Confirm coverage period
Opening Balance	Balance at period start	Critical for continuity check
Closing Balance	Balance at period end	Cross-check with next month's opening
All Transactions	Date, description, amount, balance	Table extraction from diverse formats
Salary Credits	Regular income deposits	Identified by pattern/employer name
EMI Debits	Existing loan obligations	Identified by narration patterns
Cheque Bounces	Return entries	Critical negative indicator
Average Monthly Balance	Computed metric	Calculated from extracted data
Cash Deposits	Cash credit entries	Risk indicator for anti-money laundering

Accuracy Achieved

Transaction extraction accuracy: 99.7% (across 50+ bank formats)
Categorisation accuracy: 97-99% (salary, EMI, utility, cash, transfer)
Balance reconciliation: 99.9% (opening + credits - debits = closing)
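
The reconciliation rule (opening + credits - debits = closing) can be applied both row by row and to the statement as a whole. The sketch below assumes each extracted row carries a credit, a debit, and the printed running balance; the `Txn` type and function name are illustrative.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Tuple


@dataclass
class Txn:
    credit: Decimal
    debit: Decimal
    balance: Decimal  # running balance as printed on the statement


def reconcile(opening: Decimal, closing: Decimal, txns: List[Txn]) -> Tuple[List[int], bool]:
    """Return (indexes of rows whose balance does not follow, whether totals reconcile)."""
    bad_rows = []
    running = opening
    for i, txn in enumerate(txns):
        if running + txn.credit - txn.debit != txn.balance:
            bad_rows.append(i)
        # Resync to the printed balance so one bad row does not flag every later row.
        running = txn.balance
    credits = sum((t.credit for t in txns), Decimal("0"))
    debits = sum((t.debit for t in txns), Decimal("0"))
    return bad_rows, opening + credits - debits == closing
```

A flagged row usually means either an extraction error in that line or an edited statement, which is why the row index is returned rather than a single pass/fail result.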

Unique Challenges

Format diversity: 50+ Indian banks each have unique formats with different column orders, date formats, and narration styles.
Multi-page continuity: 6-12 months of statements can span 20-50 pages, and the table structure must be carried across page breaks.
Narration parsing: Abbreviated transaction descriptions ("NEFT-AXIS-TCSLTD-SALARY-MAR26") must be interpreted and categorised.
Period coverage: The AI verifies continuous coverage with no missing months.
Input variety: Statements arrive as digital PDFs, scanned printouts, or passbook photographs.
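
To make the narration-parsing step concrete, here is a deliberately simple keyword-based sketch. The keyword sets are hypothetical and far from complete; real systems typically combine learned models with bank-specific rules, so treat this only as an illustration of tokenising a narration and mapping it to a category.

```python
import re

# Hypothetical keyword sets, checked in this order.
CATEGORY_KEYWORDS = {
    "salary": {"SALARY", "SAL"},
    "emi": {"EMI", "NACH", "ECS"},
    "cash": {"CASH", "ATM"},
    "utility": {"ELECTRICITY", "BILLPAY"},
}
TRANSFER_RAILS = {"NEFT", "IMPS", "UPI", "RTGS"}


def categorise(narration: str) -> str:
    """Assign a coarse category to one transaction narration."""
    tokens = set(re.split(r"[^A-Z0-9]+", narration.upper()))
    for category, keywords in CATEGORY_KEYWORDS.items():
        if tokens & keywords:
            return category
    # A payment rail with no stronger signal is treated as a plain transfer.
    return "transfer" if tokens & TRANSFER_RAILS else "other"
```

With these rules, `categorise("NEFT-AXIS-TCSLTD-SALARY-MAR26")` returns `"salary"` because the salary keyword takes precedence over the NEFT rail.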

BFSI Use Cases

Income verification: Salary credit identification and averaging
Obligation mapping: EMI debit identification for FOIR calculation
Cash flow analysis: Monthly inflow/outflow patterns for business lending
Banking behaviour assessment: Bounce history, minimum balance maintenance, account activity
Fraud detection: Unusual patterns, circular transactions, sudden large deposits before application

6. Property Documents

What It Is

Property documents encompass sale deeds, title deeds, encumbrance certificates, property tax receipts, khata certificates, and mutation records. These are critical for home loans, loan against property (LAP), and any collateral-backed lending.

What AI Extracts

Field	Description	Extraction Challenge
Property Owner Name(s)	Current legal owner	Multiple owners, inherited properties
Property Address/Survey Number	Location identification	Non-standardised Indian addressing
Property Type	Residential/commercial/agricultural	Classification from description
Area/Measurement	Built-up, carpet, plot area	Multiple measurement systems (sq ft, sq m, cents, guntas)
Registration Number	Document registration reference	State-specific format
Registration Date	Date of deed registration	For ownership timeline
Sale Consideration	Transaction value	For collateral valuation
Stamp Duty Paid	Government stamp duty	Validates registration legitimacy
Encumbrance Status	Existing charges/mortgages	Critical for lending
Previous Owners	Chain of title	For title clarity assessment
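
Because areas appear in several measurement systems, extracted values are easier to compare once converted to a single unit. The sketch below normalises to square feet using standard conversion factors (1 cent = 1/100 acre, 1 gunta = 1/40 acre); the alias table and function name are assumptions.

```python
# Square feet per unit. 1 acre = 43,560 sq ft.
SQFT_PER_UNIT = {
    "sq ft": 1.0,
    "sq m": 10.7639,  # approximate
    "cent": 435.6,    # 1/100 acre
    "gunta": 1089.0,  # 1/40 acre
}
ALIASES = {"sqft": "sq ft", "sq.ft": "sq ft", "sqm": "sq m", "sq.m": "sq m",
           "cents": "cent", "guntas": "gunta", "guntha": "gunta"}


def to_sqft(value: float, unit: str) -> float:
    """Convert an extracted area to square feet."""
    key = unit.strip().lower()
    key = ALIASES.get(key, key)
    if key not in SQFT_PER_UNIT:
        raise ValueError(f"unknown area unit: {unit!r}")
    return value * SQFT_PER_UNIT[key]
```

As a sanity check on the factors, 5 cents and 2 guntas both come to 2,178 sq ft.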

Accuracy Achieved

Field-level accuracy: 98.5% (lower than identity documents due to complexity)
Owner name extraction: 99.0%
Area/measurement extraction: 97.5%
Registration details: 99.2%

Unique Challenges

Handwritten content: Older documents contain significant handwritten portions, including boundaries, amounts, and witness details.
State-specific formats: Documentation varies enormously across states (Marathi in Maharashtra, Tamil in Tamil Nadu, Malayalam in Kerala), each with different legal structures.
Legal terminology: Archaic and regional terms ("mesne profits", "patta", "khata") require domain-trained NER.
Multi-page complexity: A single file may span 10-20 pages.
Document age: Documents dating back 30-50 years may have faded ink and deteriorated paper.

BFSI Use Cases

Home loan origination: Property identification, valuation basis, and ownership verification
LAP (Loan Against Property): Collateral identification and existing encumbrance check
Title verification: Chain of title clarity for legal assessment
Stamp duty cross-check: Validates declared property value against market rates
Re-mortgage processing: Existing property charge information for refinancing

7. Insurance Policies

What It Is

Insurance policy documents — life, health, motor, and general — contain coverage details, premium information, nominee details, and terms. These serve multiple BFSI purposes from collateral assignment to risk assessment.

What AI Extracts

Key fields include: policy number, policyholder name, insurer name, policy type (term/endowment/ULIP/health), sum assured, premium amount and frequency, start and maturity dates, nominee details, surrender value, and rider details.

Accuracy Achieved

Field-level accuracy: 98.8%
Policy type classification: 99.3%
Financial figure extraction: 99.5%

Unique Challenges

Each of India's 50+ insurance companies uses proprietary formats that change over years. Policy documents contain dense legal text requiring NLP to identify key commercial terms. Multiple riders add complexity. Physical policy bonds printed on watermarked paper with security features complicate OCR.

BFSI Use Cases

Loan against policy: Determining surrender value and assignment eligibility
Premium obligation assessment: Existing premium commitments count toward FOIR
Insurance bundling in lending: Verifying credit life insurance assignment for home loans
Nominee/beneficiary verification: Cross-checking against loan documentation

8. Trade Finance Documents

What It Is

Trade finance documents include bills of lading, letters of credit, invoices, shipping manifests, certificates of origin, and packing lists — forming the documentary backbone of international and domestic trade finance for Indian banks.

What AI Extracts

Key fields include: LC number, beneficiary and applicant names, goods description, amount/currency, ports of loading and discharge, vessel details, shipping dates, document expiry, HS codes, and Incoterms (FOB, CIF, etc.).

Accuracy Achieved

Field-level accuracy: 98.0-99.2% (varies by document sub-type)
Amount and currency extraction: 99.5%
Date extraction: 99.3%

Unique Challenges

Trade documents originate from countries worldwide in multiple languages with different national formats. They frequently carry handwritten endorsements and stamps added during the trade lifecycle. Multi-party complexity (buyer, seller, shipping line, customs, multiple banks) requires precise attribution. UCP 600 compliance rules demand extraction accuracy where a misspelled beneficiary name can invalidate an entire letter of credit.

BFSI Use Cases

LC document examination: Automated checking of presented documents against LC terms
Compliance screening: Extracted party names checked against sanctions lists
Risk assessment: Trade value, route, and goods classification for risk pricing
Receivables financing: Invoice data extraction for supply chain finance

9. Corporate Financial Statements

What It Is

Corporate financial documents include audited balance sheets, profit and loss statements, cash flow statements, directors' reports, and auditor reports — essential for business lending, SME finance, and corporate banking.

What AI Extracts

Key fields include: company name and CIN, financial year, total revenue/turnover, net profit/loss, total assets and liabilities, net worth, current ratio, debt-equity ratio, cash flow from operations, contingent liabilities, and related party transactions.

Accuracy Achieved

Primary financial figure extraction: 99.0-99.5%
Ratio computation accuracy: 99.7% (computed from verified extracted figures)
Schedule and note extraction: 96-98%

Unique Challenges

Companies present financials in widely varying formats. Multi-year comparative data requires correct column-year association. Key lending information (contingent liabilities, related party transactions) often resides in notes with no standardised format. Auditor qualifications and going concern observations require NLP to extract and flag. The AI must also distinguish consolidated from standalone financials.

BFSI Use Cases

Business loan assessment: Revenue, profitability, and leverage analysis for SME lending
Corporate credit: Financial health evaluation for term loans and working capital
Covenant monitoring: Automated tracking of financial ratios against loan covenants
Annual review: Periodic credit review using latest financial statements
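
Ratio computation and covenant monitoring can be sketched together. The ratios below use their standard definitions; the covenant thresholds, field names, and function names are hypothetical and would come from the actual loan agreement.

```python
from typing import Dict, List, Optional


def safe_ratio(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def covenant_breaches(
    figures: Dict[str, float],
    min_current_ratio: float = 1.25,  # hypothetical covenant floor
    max_debt_equity: float = 2.0,     # hypothetical covenant ceiling
) -> List[str]:
    """List the covenants breached by one year's extracted figures."""
    current = safe_ratio(figures["current_assets"], figures["current_liabilities"])
    leverage = safe_ratio(figures["total_debt"], figures["net_worth"])
    breaches = []
    if current is not None and current < min_current_ratio:
        breaches.append("current_ratio")
    # Missing or non-positive net worth is treated as a leverage breach.
    if leverage is None or leverage < 0 or leverage > max_debt_equity:
        breaches.append("debt_equity")
    return breaches
```

Since the ratios are derived from extracted figures, any figure that failed validation upstream should block the covenant check rather than feed into it.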

10. Utility Bills

What It Is

Utility bills — electricity, gas, water, telephone/broadband, and mobile postpaid — serve primarily as address proof documents in Indian BFSI KYC, confirming residence at a particular address.

What AI Extracts

Key fields include: consumer name, service address, consumer/account number, bill date, bill amount, service provider, connection type (residential/commercial), and payment status.

Accuracy Achieved

Field-level accuracy: 99.0-99.5%
Address extraction: 98.5% (Indian address complexity)
Date and amount extraction: 99.7%

Unique Challenges

India has hundreds of electricity boards, gas distributors, and water authorities — each with unique formats. Utility bills are often primarily in regional languages. Address matching between utility bills and Aadhaar requires fuzzy matching for formatting variations. Recency validation (within 3 months for KYC) requires date extraction and comparison. Thermal-printed bills on thin paper create photography and fading challenges.
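
Fuzzy address matching and recency validation can be sketched with the Python standard library. The abbreviation table, the similarity measure (`difflib.SequenceMatcher`), and the 90-day approximation of "within 3 months" are all assumptions chosen for illustration.

```python
import re
from datetime import date
from difflib import SequenceMatcher

# Hypothetical abbreviation expansions applied before comparison.
ABBREVIATIONS = {"rd": "road", "st": "street", "apt": "apartment", "nr": "near"}


def normalise_address(address: str) -> str:
    """Lowercase, drop punctuation, and expand common abbreviations."""
    tokens = re.split(r"[^a-z0-9]+", address.lower())
    return " ".join(ABBREVIATIONS.get(t, t) for t in tokens if t)


def address_similarity(a: str, b: str) -> float:
    """Similarity between two addresses in the range 0.0 to 1.0."""
    return SequenceMatcher(None, normalise_address(a), normalise_address(b)).ratio()


def is_recent(bill_date: date, today: date, max_age_days: int = 90) -> bool:
    """True if the bill is dated within the allowed window and not in the future."""
    return 0 <= (today - bill_date).days <= max_age_days
```

In practice a match would be accepted above some tuned similarity threshold, with borderline scores sent for manual review.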

BFSI Use Cases

Address proof for KYC: Primary or secondary address verification document
Address matching: Cross-verification between Aadhaar address and utility bill address
Residence stability: Duration of utility connection indicates residence tenure
Alternative data: Utility payment history as creditworthiness signal for thin-file customers

Comparative Accuracy and Processing Summary

Document Type	Accuracy	Processing Time	Primary Challenge	Volume in Lending
Aadhaar	99.9%	2-3 seconds	Multiple formats, bilingual	Very High
PAN	99.9%	1-2 seconds	Character confusion, old formats	Very High
Salary Slips	99.5%	3-5 seconds	Format diversity (thousands)	High
ITR	99.8%	5-8 seconds	Annual format changes, form complexity	High
Bank Statements	99.7%	15-30 seconds	Multi-page tables, narration parsing	Very High
Property Documents	98.5%	8-15 seconds	Handwriting, state-specific, legal terms	Medium
Insurance Policies	98.8%	5-10 seconds	Insurer-specific, dense legal text	Medium
Trade Finance	98.0-99.2%	5-12 seconds	International diversity, endorsements	Medium (trade banks)
Corporate Financials	99.0-99.5%	20-45 seconds	Complex tables, notes extraction	Medium
Utility Bills	99.0-99.5%	2-4 seconds	Regional language, format diversity	High

How AI Handles the Full Document Stack

Unified Processing Pipeline

Modern document AI processes all 10 types through a unified pipeline: single upload point with automatic classification (under 200ms), type-specific extraction model activation, universal validation rules (format checks, checksums, date validation), cross-document consistency verification, and unified structured output regardless of source document type.

The Cross-Document Advantage

Processing multiple document types within a single platform enables cross-verification: names must match across Aadhaar, PAN, salary slips, and bank statements; employer details must align between salary slips, Form 16, and bank credits; income must be consistent between salary slips, ITR, and bank statements; and property values must be reasonable for the locality. This catches fraud and errors that single-document processing misses entirely.
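
One of these cross-checks, name consistency, can be sketched as follows. The matching rule (ignore case, word order, and single-letter initials, and accept a name whose tokens are a subset of the reference name) is an assumption for illustration; production matching would also need to handle transliteration and spelling variants.

```python
import re
from typing import Dict, FrozenSet, List


def name_tokens(name: str) -> FrozenSet[str]:
    """Lowercased name parts, ignoring punctuation and single-letter initials."""
    return frozenset(t for t in re.split(r"[^a-z]+", name.lower()) if len(t) > 1)


def inconsistent_documents(names_by_doc: Dict[str, str], reference: str = "aadhaar") -> List[str]:
    """Return the documents whose name does not agree with the reference document."""
    ref = name_tokens(names_by_doc[reference])
    mismatches = []
    for doc, name in names_by_doc.items():
        tokens = name_tokens(name)
        # Compatible if one token set contains the other (e.g. a dropped middle name).
        if not (tokens <= ref or ref <= tokens):
            mismatches.append(doc)
    return mismatches
```

The same pattern extends to the other checks listed above, such as comparing employer names between salary slips and bank credit narrations.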

Frequently Asked Questions

Can document AI handle documents it has never seen before?

Yes, through transfer learning. Foundational models understand document structure broadly and can perform basic extraction (85-90% accuracy) on unseen formats. With 50-100 labelled examples, accuracy reaches production thresholds (98%+). Most platforms onboard new formats within 1-2 weeks.

How does the system handle documents with both printed and handwritten content?

The system separates text regions by type using a segmentation model, routing each to the appropriate recognition pipeline. Printed text achieves higher accuracy; handwritten text achieves 92-97%. Confidence scores allow reviewers to focus on lower-confidence handwritten extractions.
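
Confidence-based review routing can be as simple as a threshold filter. The 0.95 threshold and the input shape (field name mapped to a value and a confidence score) are hypothetical.

```python
from typing import Dict, List, Tuple


def fields_for_review(
    extractions: Dict[str, Tuple[str, float]],
    threshold: float = 0.95,  # hypothetical cut-off; tune per field and document type
) -> List[str]:
    """Return the names of fields whose confidence falls below the threshold."""
    return [name for name, (_value, confidence) in extractions.items() if confidence < threshold]
```

Reviewers then see only the flagged fields, typically the handwritten ones, instead of re-keying the whole document.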

Is document AI useful for documents already in digital/text-based PDF format?

Yes — processing is faster and more accurate. Text-based PDFs skip the OCR step entirely, eliminating recognition errors. The AI still adds value through field identification, validation, cross-document verification, and structured data output.

What volume can a single document AI platform handle?

Cloud-based platforms scale to millions of documents per month with parallelised processing. YuAccess processes over 1 million documents monthly and handles 10,000+ concurrent requests without degradation.

How do I prioritise which document types to automate first?

Prioritise by volume, TAT impact, and error rate: (1) Identity documents — highest volume, quick wins. (2) Bank statements — high processing time. (3) Salary slips — diverse formats benefit most from AI. (4) ITR — complex forms with common manual errors. (5) Property/trade/corporate docs — lower volume but high per-document cost.

Conclusion

AI document processing has moved beyond basic OCR into genuine document intelligence — understanding context, computing lending metrics, verifying consistency, and detecting anomalies across the full spectrum of Indian BFSI documents. The 10 document types covered in this guide represent 90%+ of the document volume in Indian lending and banking operations.

YuAccess supports all 10 of these document types — and 90+ additional types — through a unified processing platform. With 99.9% extraction accuracy on standard documents, support for 10+ Indian languages, cross-document verification, and lending-specific computation (income, FOIR, obligations), the platform handles the complete document intelligence requirement for Indian BFSI institutions.

10 Document Types AI Can Automatically Process for BFSI

1. Aadhaar Card

What It Is

What AI Extracts

Field	Description	Extraction Challenge
Aadhaar Number	12-digit unique identifier	Must handle spacing variations (XXXX XXXX XXXX vs continuous)
Full Name (English)	Applicant name in English	Font and print quality variations across card generations
Full Name (Regional)	Name in state-specific language	Script-specific OCR for 10+ Indian scripts
Date of Birth	DD/MM/YYYY format	Distinguish from other dates on card (issue date)
Gender	Male/Female/Transgender	Standard field with consistent positioning
Address	Full residential address	Multi-line, variable length, mixed scripts
Photograph	Facial image	Extracted for face-match verification
QR Code	Encrypted Aadhaar data	Decoded for verification against printed data

Accuracy Achieved

Field-level accuracy: 99.9%
End-to-end (all fields correct): 99.5%
Verhoeff checksum validation: 100% of extracted numbers verified

Unique Challenges

BFSI Use Cases

Account opening KYC: Primary identity verification for all financial products
Loan application: Name, address, and DOB extraction for origination
Aadhaar-linked services: DBT, AEPS, and re-KYC updates
Insurance onboarding: Identity verification for policy issuance

2. PAN Card

What It Is

What AI Extracts

Field	Description	Extraction Challenge
PAN Number	10-character alphanumeric (AAAAA9999A)	Character confusion between similar letters/numbers (O/0, I/1, B/8)
Full Name	Applicant name	Multiple name format variations across card generations
Father's Name	Father's/parent's name	Distinguished from applicant name by position/label
Date of Birth	DD/MM/YYYY	Located consistently but varying print quality
Photograph	Facial image	Often small and low-resolution on older cards
Signature	Digital signature image	Extracted for signature verification workflows
Issue Date	Card issuance date	Present on newer format cards

Accuracy Achieved

Field-level accuracy: 99.9%
PAN format validation: 100% (algorithmic format check)
Cross-verification with IT database: Instant API verification available

Unique Challenges

BFSI Use Cases

Income tax linkage: Connecting financial products to tax identity for TDS/TCS reporting
High-value transaction validation: Mandatory for transactions above INR 50,000
Loan application: Income verification cross-referencing (PAN to ITR to income)
Cross-referencing: Linking credit bureau data (PAN-based) with application data

3. Salary Slips

What It Is

What AI Extracts

Field	Description	Extraction Challenge
Employee Name	Payee name	Match against application name
Employee ID	Internal employee identifier	Variable positioning and format
Employer Name	Company name	Sometimes abbreviated or in logo form
Pay Period	Month and year of payment	Multiple date format representations
Basic Salary	Basic component	Label variations ("Basic", "Basic Pay", "Basic Salary")
HRA	House Rent Allowance	May be combined with other allowances
DA/Special Allowance	Various allowances	Highly variable across employers
Gross Salary	Total earnings before deductions	Critical for income computation
PF Deduction	Provident fund contribution	Both employee and employer portions
Professional Tax	State-level professional tax	Not present in all states
TDS	Tax deducted at source	Monthly tax deduction
Net Salary	Take-home pay	Critical for bank statement matching
Bank Account Number	Salary credit account	For bank statement cross-verification

Accuracy Achieved

Field-level accuracy: 99.5% (across 500+ employer formats)
Income computation accuracy: 99.7% (gross and net calculations)
Employer identification: 99.2%

Unique Challenges

BFSI Use Cases

Income assessment: Primary basis for determining loan eligibility (typically 3x net salary for personal loans)
FOIR calculation: Fixed Obligation to Income Ratio computation
Employer verification: Confirming employment status and employer identity
Income trend analysis: Comparing 3-6 months of slips for stability assessment
Obligation identification: Existing PF loans or salary advances visible in deductions

4. Income Tax Returns (ITR)

What It Is

What AI Extracts

Field	Description	Extraction Challenge
PAN of Assessee	Taxpayer PAN	Linkage field for cross-verification
Assessment Year	AY for the return	Distinguish from financial year
Filing Date	Date of filing with IT department	Confirms timely filing
Total Income	Gross total income before deductions	Critical lending metric
Income from Salary	Salary head income	For salaried applicants
Income from Business/Profession	Business income	For self-employed/business applicants
Income from House Property	Rental/property income	Additional income source
Income from Capital Gains	Investment gains	Supplementary income
Deductions (80C, 80D, etc.)	Tax deductions claimed	Indicates existing commitments
Tax Payable	Total tax liability	Cross-checks with Form 26AS
Verification Status	Whether verified (e-verified/ITR-V)	Confirms valid filing

Accuracy Achieved

Field-level accuracy: 99.8% (structured government form)
Income computation verification: 99.9% (arithmetic validation)
Form type identification: 99.7% (ITR-1 through ITR-7)

Unique Challenges

BFSI Use Cases

Self-employed income verification: Primary income document for non-salaried applicants
Income trend assessment: 2-3 year ITR comparison for income stability
Tax compliance check: Confirms applicant files taxes (compliance indicator)
Business viability assessment: Business income trends for SME lending
Cross-verification: ITR income matched against bank credits and Form 16 figures

5. Bank Statements

What It Is

What AI Extracts

Field	Description	Extraction Challenge
Account Holder Name	Customer name on account	Match against application
Account Number	Bank account number	Variable length across banks
Bank Name and Branch	Issuing bank details	For cross-referencing
Statement Period	From-to dates	Confirm coverage period
Opening Balance	Balance at period start	Critical for continuity check
Closing Balance	Balance at period end	Cross-check with next month's opening
All Transactions	Date, description, amount, balance	Table extraction from diverse formats
Salary Credits	Regular income deposits	Identified by pattern/employer name
EMI Debits	Existing loan obligations	Identified by narration patterns
Cheque Bounces	Return entries	Critical negative indicator
Average Monthly Balance	Computed metric	Calculated from extracted data
Cash Deposits	Cash credit entries	Risk indicator for anti-money laundering

Accuracy Achieved

Transaction extraction accuracy: 99.7% (across 50+ bank formats)
Categorisation accuracy: 97-99% (salary, EMI, utility, cash, transfer)
Balance reconciliation: 99.9% (opening + credits - debits = closing)

Unique Challenges

BFSI Use Cases

Income verification: Salary credit identification and averaging
Obligation mapping: EMI debit identification for FOIR calculation
Cash flow analysis: Monthly inflow/outflow patterns for business lending
Banking behaviour assessment: Bounce history, minimum balance maintenance, account activity
Fraud detection: Unusual patterns, circular transactions, sudden large deposits before application

6. Property Documents

What It Is

What AI Extracts

Field	Description	Extraction Challenge
Property Owner Name(s)	Current legal owner	Multiple owners, inherited properties
Property Address/Survey Number	Location identification	Non-standardised Indian addressing
Property Type	Residential/commercial/agricultural	Classification from description
Area/Measurement	Built-up, carpet, plot area	Multiple measurement systems (sq ft, sq m, cents, guntas)
Registration Number	Document registration reference	State-specific format
Registration Date	Date of deed registration	For ownership timeline
Sale Consideration	Transaction value	For collateral valuation
Stamp Duty Paid	Government stamp duty	Validates registration legitimacy
Encumbrance Status	Existing charges/mortgages	Critical for lending
Previous Owners	Chain of title	For title clarity assessment
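
Because area can appear in several measurement systems, extracted values typically need normalising to one unit before comparison. The sketch below converts to square feet. It assumes 1 cent = 1/100 acre (435.6 sq ft) and 1 gunta = 1/40 acre (1,089 sq ft), which are the commonly used values. Regional definitions can differ, so the table and alias names are assumptions to verify per state.

```python
# Conversion factors to square feet. Cent and gunta values are the commonly
# used ones (1/100 and 1/40 of an acre); verify against state conventions.
SQFT_PER_UNIT = {
    "sq ft": 1.0,
    "sq m": 10.7639,  # 1 square metre is approximately 10.7639 sq ft
    "cent": 435.6,
    "gunta": 1089.0,
}

# Hypothetical spelling variants seen in deeds, mapped to canonical keys.
ALIASES = {
    "sqft": "sq ft", "sq. ft.": "sq ft", "square feet": "sq ft",
    "sqm": "sq m", "sq. m.": "sq m", "square metres": "sq m",
    "cents": "cent",
    "guntas": "gunta", "guntha": "gunta", "gunthas": "gunta",
}


def to_sqft(value: float, unit: str) -> float:
    """Convert an extracted area value to square feet."""
    key = unit.strip().lower()
    key = ALIASES.get(key, key)
    if key not in SQFT_PER_UNIT:
        raise ValueError(f"unknown area unit: {unit!r}")
    return value * SQFT_PER_UNIT[key]
```

Raising on an unknown unit, rather than guessing, keeps ambiguous measurements visible for manual review.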

Accuracy Achieved

Field-level accuracy: 98.5% (lower than identity documents due to complexity)
Owner name extraction: 99.0%
Area/measurement extraction: 97.5%
Registration details: 99.2%

Unique Challenges

BFSI Use Cases

Home loan origination: Property identification, valuation basis, and ownership verification
LAP (Loan Against Property): Collateral identification and existing encumbrance check
Title verification: Chain of title clarity for legal assessment
Stamp duty cross-check: Validates declared property value against market rates
Re-mortgage processing: Existing property charge information for refinancing

7. Insurance Policies

What It Is

What AI Extracts

Accuracy Achieved

Field-level accuracy: 98.8%
Policy type classification: 99.3%
Financial figure extraction: 99.5%

Unique Challenges

BFSI Use Cases

Loan against policy: Determining surrender value and assignment eligibility
Premium obligation assessment: Existing premium commitments count toward FOIR
Insurance bundling in lending: Verifying credit life insurance assignment for home loans
Nominee/beneficiary verification: Cross-checking against loan documentation

8. Trade Finance Documents

What It Is

What AI Extracts

Accuracy Achieved

Field-level accuracy: 98.0-99.2% (varies by document sub-type)
Amount and currency extraction: 99.5%
Date extraction: 99.3%

Unique Challenges

BFSI Use Cases

LC document examination: Automated checking of presented documents against LC terms
Compliance screening: Extracted party names checked against sanctions lists
Risk assessment: Trade value, route, and goods classification for risk pricing
Receivables financing: Invoice data extraction for supply chain finance
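
As a heavily simplified illustration of checking presented documents against LC terms, the sketch below compares a few extracted fields and lists discrepancies. The data classes, field names, and the three checks are hypothetical. Real document examination depends on the specific LC terms and applicable rules, and covers far more than this.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass
class LCTerms:
    currency: str
    amount: Decimal
    latest_shipment: date


@dataclass
class Presentation:
    invoice_currency: str
    invoice_amount: Decimal
    shipped_on: date  # date extracted from the transport document


def find_discrepancies(lc: LCTerms, p: Presentation) -> list[str]:
    """Return human-readable discrepancies between LC terms and documents."""
    issues = []
    if p.invoice_currency != lc.currency:
        issues.append("currency mismatch")
    if p.invoice_amount > lc.amount:
        issues.append("invoice amount exceeds LC amount")
    if p.shipped_on > lc.latest_shipment:
        issues.append("shipment after latest shipment date")
    return issues


# Illustrative values only.
lc = LCTerms("USD", Decimal("100000.00"), date(2026, 3, 31))
docs = Presentation("USD", Decimal("100500.00"), date(2026, 4, 2))
issues = find_discrepancies(lc, docs)
```

An empty list means none of these sample checks failed. It does not by itself mean the presentation is compliant.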

9. Corporate Financial Statements

What It Is

What AI Extracts

Accuracy Achieved

Primary financial figure extraction: 99.0-99.5%
Ratio computation accuracy: 99.7% (computed from verified extracted figures)
Schedule and note extraction: 96-98%

Unique Challenges

BFSI Use Cases

Business loan assessment: Revenue, profitability, and leverage analysis for SME lending
Corporate credit: Financial health evaluation for term loans and working capital
Covenant monitoring: Automated tracking of financial ratios against loan covenants
Annual review: Periodic credit review using latest financial statements
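
Covenant monitoring can be pictured as computing ratios from extracted figures and comparing them with agreed limits. The sketch below uses two standard ratios (current ratio and debt-to-equity). The dictionary keys, thresholds, and sample figures are hypothetical, since actual covenants are defined in each loan agreement.

```python
def compute_ratios(fin: dict) -> dict:
    """Compute two standard ratios from extracted balance sheet figures."""
    return {
        "current_ratio": fin["current_assets"] / fin["current_liabilities"],
        "debt_to_equity": fin["total_debt"] / fin["shareholders_equity"],
    }


# Hypothetical covenant limits: ("min", x) means the ratio must be >= x,
# ("max", x) means the ratio must be <= x.
COVENANTS = {
    "current_ratio": ("min", 1.25),
    "debt_to_equity": ("max", 2.0),
}


def breaches(ratios: dict, covenants: dict) -> list[str]:
    """Return the names of covenants the ratios fail to meet."""
    failed = []
    for name, (kind, limit) in covenants.items():
        value = ratios[name]
        if (kind == "min" and value < limit) or (kind == "max" and value > limit):
            failed.append(name)
    return failed


# Illustrative figures (in INR lakh), not real company data.
figures = {
    "current_assets": 500.0,
    "current_liabilities": 400.0,
    "total_debt": 900.0,
    "shareholders_equity": 300.0,
}
failed = breaches(compute_ratios(figures), COVENANTS)
```

Here the current ratio is 1.25, which meets its minimum, while debt-to-equity is 3.0, above the 2.0 limit, so only `debt_to_equity` is reported.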

10. Utility Bills

What It Is

Utility bills — electricity, gas, water, telephone/broadband, and mobile postpaid — serve primarily as address proof documents in Indian BFSI KYC, confirming residence at a particular address.

What AI Extracts

Key fields include: consumer name, service address, consumer/account number, bill date, bill amount, service provider, connection type (residential/commercial), and payment status.

Accuracy Achieved

Field-level accuracy: 99.0-99.5%
Address extraction: 98.5% (lower because of Indian address complexity)
Date and amount extraction: 99.7%

Unique Challenges

BFSI Use Cases

Address proof for KYC: Primary or secondary address verification document
Address matching: Cross-verification between Aadhaar address and utility bill address
Residence stability: Duration of utility connection indicates residence tenure
Alternative data: Utility payment history as creditworthiness signal for thin-file customers
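
Address matching between an Aadhaar address and a utility bill address usually needs fuzzy comparison, because the same address is written in many ways. The following is a minimal sketch using Python's standard `difflib`. The abbreviation table and any match threshold are assumptions, and production systems would likely use richer address parsing.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation map; extend per observed data.
ABBREVIATIONS = {
    "rd": "road",
    "st": "street",
    "apt": "apartment",
    "nr": "near",
    "opp": "opposite",
}


def normalise(address: str) -> str:
    """Lowercase, strip punctuation, and expand common abbreviations."""
    tokens = re.sub(r"[^a-z0-9\s]", " ", address.lower()).split()
    return " ".join(ABBREVIATIONS.get(t, t) for t in tokens)


def address_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two addresses after normalisation."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


score = address_similarity(
    "Flat 4B, MG Rd., Pune - 411001",
    "flat 4b mg road pune 411001",
)
```

Both sample strings normalise to the same text, so `score` is 1.0. In practice a threshold below 1.0 would be chosen and borderline scores sent for manual review.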

Comparative Accuracy and Processing Summary

Document Type	Accuracy	Processing Time	Primary Challenge	Volume in Lending
Aadhaar	99.9%	2-3 seconds	Multiple formats, bilingual	Very High
PAN	99.9%	1-2 seconds	Character confusion, old formats	Very High
Salary Slips	99.5%	3-5 seconds	Format diversity (thousands)	High
ITR	99.8%	5-8 seconds	Annual format changes, form complexity	High
Bank Statements	99.7%	15-30 seconds	Multi-page tables, narration parsing	Very High
Property Documents	98.5%	8-15 seconds	Handwriting, state-specific formats, legal terms	Medium
Insurance Policies	98.8%	5-10 seconds	Insurer-specific formats, dense legal text	Medium
Trade Finance	98.0-99.2%	5-12 seconds	International diversity, endorsements	Medium (trade banks)
Corporate Financials	99.0-99.5%	20-45 seconds	Complex tables, notes extraction	Medium
Utility Bills	99.0-99.5%	2-4 seconds	Regional language, format diversity	High