Document AI

Parse, extract, and process structured data from documents.

Category | Segment | Platform / Tool | Plan | Monthly Price (USD unless noted) | Pricing Model | Free Tier / Trial | Included Usage / Limits | Document / OCR Capabilities | File Types / Modalities | Extraction / RAG / LLM Features | API / SDK / Integrations | Storage / Retention | Deployment / Hosting | Team / Governance | Best Fit | Main Limits / Caveats
Document AI | AI-native document parsing | LlamaParse / LlamaCloud | Free | $0 | Credit-based monthly cloud plan |  | Official homepage states 10,000 free credits/month, roughly 1,000 pages, with agentic OCR, schema extraction and document agents | Agentic OCR and document parsing optimized for AI/RAG | 50+ unstructured file types, including complex PDFs, images and handwritten notes per product page | Parse, Extract, Split, Classify and Index features for document agents | LlamaParse API/client and LlamaIndex/LlamaCloud integrations | Cloud handling under LlamaIndex account; privacy/docs per LlamaCloud | LlamaCloud hosted service | Free account; enterprise for SSO/VPC/hybrid | Developers building RAG over complex PDFs without implementing parsers | Credit/page cost varies by parsing mode; official detailed pricing can be hard to inspect without account
Document AI | Cloud document OCR / IDP | Google Cloud Document AI | Enterprise Document OCR Processor | $1.50/1K pages first 5M; $0.60/1K pages after | Per-page processor pricing | New Google Cloud customers get $300 cloud credit | Enterprise OCR Processor: $1.50 per 1,000 pages for 1-5M pages/month; OCR add-ons $6 per 1,000 pages | OCR, handwriting, layout-aware document text digitization | PDF, images and supported processor input formats; page counted by file/page rules | OCR output can feed Document AI Layout Parser, Vertex AI Search, RAG and downstream extraction | Google Cloud APIs, client libraries, Workbench/console and processor endpoints | Google Cloud data handling, processors and storage depend on project configuration | Google Cloud managed service | IAM, audit logging, VPC/security controls through Google Cloud | Teams already on GCP needing scalable OCR for documents | No permanent Document AI free tier was visible on pricing page; other Google Cloud services may add costs
Document AI | Cloud document OCR / IDP | Google Cloud Document AI | Layout Parser / Form Parser / Custom Extractor | $10-$30/1K pages | Per-page processor pricing | New Google Cloud customers get $300 cloud credit | Layout Parser $10/1K pages; Custom Extractor and Form Parser $30/1K pages for first 1M pages/month | Layout parsing, form parsing, custom entity extraction and specialized processors | Document pages, forms, PDFs and supported document formats | Layout Parser includes initial chunking; extractors produce structured entities for RAG/workflows | Google Cloud API and processor endpoints | Cloud project storage/retention depends on processor and app setup | Google Cloud managed processors | IAM, service accounts, audit logs and enterprise controls | Structured extraction from forms, invoices, contracts or domain documents | Per-page extractor cost is much higher than simple OCR; custom processor deployment can add operational cost
Document AI | Cloud document OCR / IDP | Amazon Textract | Free tier | $0 for first 3 months within quota | AWS Free Tier page quotas | Yes, for new AWS customers | 3 months: Detect Document Text 1,000 pages/month; Analyze Document 100 pages/month for Forms/Tables/Layout and query combos; Expense and ID 100 pages/month; Lending 2,000 pages/month | OCR, handwriting, tables, forms, signatures, queries, expenses, IDs and lending docs | Documents and images supported by Textract APIs | Structured extraction from tables/forms/queries; expense and ID-specific extraction | AWS SDKs, CLI, APIs, Lambda/S3/event workflows | Data storage depends on app/S3 usage; async APIs use job outputs | AWS managed service by region | IAM, CloudTrail, VPC endpoints where available and AWS compliance controls | AWS teams testing OCR and IDP without upfront spend | Free tier lasts only 3 months and excludes Custom Queries
Document AI | Cloud document OCR / IDP | Amazon Textract | Detect Document Text | $1.50/1K pages first 1M in US West example | Per-page pay-as-you-go | Free tier exists for first 3 months | Official pricing example: $0.0015/page for first 1M pages in US West (Oregon); $0.0006/page after 1M in example | OCR text and handwriting extraction | Documents and images supported by Textract | Raw text extraction for downstream search/RAG | AWS SDKs, CLI and APIs | Application controlled; output can be stored in S3 | AWS managed regional service | AWS IAM and account governance | High-volume OCR where AWS integration matters | Pricing is regional and feature-specific; examples are not a substitute for calculator
Document AI | Cloud document OCR / IDP | Amazon Textract | Analyze Document / Expense / ID / Lending | Usage-based by feature | Per-page feature pricing | Free tier exists for first 3 months | Examples: Tables $0.015/page, Forms $0.05/page, Queries $0.015/page, Expense $0.01/page, ID $0.025/page, Lending $0.07/page in US West examples | Tables, forms, queries, expense, ID and lending extraction | Forms, tax docs, invoices, IDs, mortgage/lending docs and other document images/PDFs | Feature-specific structured extraction with OCR included in Analyze output | AWS APIs and SDKs; integrates with S3/Lambda/Step Functions | Output retention and storage controlled by application/AWS resources | AWS managed regional service | AWS IAM, monitoring and enterprise account controls | Production IDP workflows on AWS | Feature combinations can get expensive quickly; Custom Queries has no free tier
Document AI | Cloud document OCR / IDP | Azure AI Document Intelligence | Free F0 | $0 | Free monthly page quota |  | F0 supports all Document Intelligence features for testing, with 0-500 pages free per month on pricing page | Read, Layout, prebuilt, custom classification/extraction and add-ons depending on feature availability | Documents/images accepted by Azure Document Intelligence APIs; page-based billing | Prebuilt models for documents, receipts, invoices, ID, tax forms, contracts and query/add-on features | REST API, SDKs, Document Intelligence Studio and Azure integrations | Azure service data handling and region controls | Azure managed service; container option shown on pricing page | Azure RBAC, networking, compliance and enterprise controls | Prototyping document extraction on Azure | Free tier is for testing and has rate/volume limits; paid S0 pricing is region-specific
Document AI | Cloud document OCR / IDP | Azure AI Document Intelligence | Standard S0 | Region-specific PAYG | Per-page pay-as-you-go | F0 free tier exists | Paid page pricing varies by feature and region; pricing page lists Read, Layout/prebuilt, custom extraction/classification, query fields, batch and training dimensions | Read OCR, layout, prebuilt models, custom models and batch processing | PDFs/images and supported document formats | Structured extraction, query fields, classification and custom extraction | REST API, SDKs, Studio and Azure service integrations | Azure data handling by region/resource | Azure managed service or container where available | Azure enterprise governance and networking | Production extraction workloads in Azure environments | Official page may render regional prices dynamically; use Azure pricing calculator for final SKU numbers
Document AI | Document OCR API | Mistral OCR | mistral-ocr-latest / OCR 3 | $2/1K pages | Per-page OCR pricing | No free OCR tier captured on pricing page | Pricing page lists OCR 3 at $2 per 1,000 pages and annotations at $3 per 1,000 pages | OCR and document understanding with markdown, tables, images, layout and confidence scores | PDF, image URL and document URL inputs; docs mention PDFs, images, PPTX/DOCX and more | Extracts text while preserving hierarchy; can output tables, headers/footers, images and document annotations | Mistral OCR API, SDKs and batch inference | Mistral platform file/document handling | Mistral hosted API; batch mode for scale | Enterprise terms and data controls depend on Mistral account/contract | Low-cost OCR for LLM-ready markdown from complex documents | OCR pricing separate from LLM token pricing; no free quota found in official pricing
Document AI | Document OCR API | Mistral OCR | OCR with annotations | $3/1K pages annotations | Per-page annotation add-on | No free OCR tier captured | Annotations priced separately from OCR on pricing page; docs expose document and bbox annotation formats | OCR with bounding boxes, page/word confidence and structured annotation output | PDFs, images and supported document inputs | JSON/schema-style annotations for structured document outputs | Mistral OCR endpoint and SDKs | Platform document/file handling | Mistral hosted API; batch inference recommended for scale | Enterprise controls by contract | Teams needing OCR plus structured annotation for VLM/document datasets | Annotation cost stacks with OCR; validate output schema needs before scaling
Document AI | AI-native document parsing | LlamaParse / LlamaCloud | Starter / Pro | $50/mo Starter; $500/mo Pro | Credit subscription plus PAYG | Free plan exists | Common pricing profile: Starter 40K credits/month, Pro 400K credits/month, with pay-as-you-go credit top-ups; verify current checkout before buying | Higher-volume LlamaParse, LlamaExtract and LlamaCloud document workflows | Complex PDFs, images, tables, charts, forms and multimodal documents | Schema extraction, document agents, indexing and retrieval workflows | API/client, LlamaIndex framework and cloud workflows | LlamaCloud account/project retention and privacy controls | LlamaCloud managed platform | Paid plans add users/support; enterprise adds SSO/VPC/hybrid | Production document-agent teams using LlamaIndex | Credit-to-page mapping depends on parsing mode; verify account dashboard for exact rates
Document AI | Unstructured document processing | Unstructured | Free | $0 | One-time/free page allowance |  | Pricing page lists 15,000 free pages with no expiration | Partitions and cleans unstructured documents for GenAI/RAG | PDFs, Office docs, images and many unstructured formats depending on pipeline | Document partitioning, chunking, cleaning and RAG-ready outputs | API, SDKs, platform workflows and open-source library | Cloud/API data handling; enterprise dedicated/VPC options | Hosted API/platform or self-managed open-source library | Dedicated instance/VPC and multi-user access on enterprise | Teams preparing messy enterprise docs for RAG | Free library and paid API differ in quality/features; enterprise pricing is sales-led
Document AI | Unstructured document processing | Unstructured | Dedicated / Enterprise | Custom | Sales-led dedicated/VPC pricing | Free pages exist | Dedicated instance or VPC with multi-user access, full data isolation, support and tailored pricing | Production document parsing and preprocessing for GenAI at scale | Enterprise file formats and unstructured document corpora | Chunking, cleaning, extraction and data prep for retrieval and agent pipelines | API/platform workflows and enterprise deployments | Dedicated data isolation and custom deployment controls | Dedicated cloud instance or VPC | Multi-user access, data isolation and dedicated support | Enterprise RAG pipelines with private document corpora | No public unit price for dedicated plans; must scope with sales
Document AI | OCR/workflow automation API | Nanonets | Starter | $0 entry with $200 credits | Run/block-based credit pricing |  | Pricing page: Start free with $200 in credits; no platform fees; up to 3 users; data extraction AI, API access, email integration and cloud storage connectors | Data extraction AI for invoices, receipts and document workflows | Invoices, receipts, emails, files and connected storage workflows | OCR/extraction blocks and workflow automation | API access, email integration, cloud storage connectors | Cloud storage connectors and platform handling | Nanonets cloud automation platform | Up to 3 users on Starter; Growth/Enterprise for larger teams | Testing document automation without platform fee | Cost depends on number of workflow runs and block prices; pricing calculator/account needed for exact unit costs
Document AI | OCR/workflow automation API | Nanonets | Growth / Enterprise | Custom / volume pricing | Quote-based volume pricing | Starter credits exist | Growth adds classification AI, barcode/signature detection, generative AI blocks, Python blocks, ERP/database integrations and up to 40% volume discount; Enterprise custom | Document extraction plus end-to-end automation workflows | Invoices, receipts, forms and business documents | Classification, extraction, generative blocks and custom automations | API, email, ERP, database and custom integrations | Platform storage/connectors; enterprise compliance options | Nanonets hosted platform | Growth scales to larger teams; Enterprise for compliance/deployment requirements | High-volume AP/ops teams automating document workflows | Quote-based pricing reduces public cost transparency
Document AI | OCR API / document extraction | Mindee | Starter | EUR 44/mo annual billing | Monthly credit subscription | Free trial available | 500 credits/month billed annually; additional credits EUR 0.05; unlimited models; community support | OCR APIs for invoices, receipts, bank statements, IDs and custom models | Page-based physical documents regardless of type/file format | Pretrained and custom document extraction; confidence scores and polygons on higher tiers | Mindee OCR APIs and integrations | Data processing localization shown in plan comparison by tier | Mindee hosted API | Members/support increase by tier; Enterprise custom SLA/support | Predictable low-volume OCR API usage | EUR annual billing; advanced RAG/features start higher
Document AI | OCR API / document extraction | Mindee | Pro / Business | EUR 179/mo Pro; EUR 584/mo Business annual billing | Monthly credit subscription plus overage | Free trial available | Pro: 2,500 credits/month and RAG for 20 documents; Business: 10,000 credits/month and unlimited RAG; overages EUR 0.04/EUR 0.035 per credit | OCR/document extraction for standard and custom document types | Physical pages across document types and file formats | RAG, polygons, confidence scores, boosted accuracy and priority support by tier | API integrations and workflow access options | Data processing localization and enterprise options by tier | Mindee hosted API | Priority support and Enterprise custom SLAs | Teams that need OCR plus RAG/document-question workflows | Annual billing and per-page credits; enterprise needed for custom volume/SLA
Document AI | Receipt/invoice OCR API | Veryfi | Free | $0 | Monthly document quota |  | Pricing page: process up to 100 docs/month free; all document types, SDKs for development, limited storage, email support | Multi-modal OCR/data extraction for invoices, receipts and business documents | Invoices, receipts, checks and other supported document types | Line-item extraction, OCR 3.0, document capture SDK and data extraction APIs | Veryfi OCR API, SDKs and docs | Limited storage on Free; Vault/custom retention on higher tiers | Veryfi hosted platform and SDKs | Email support on Free; Growth adds SAML/SLA/custom retention | Developers testing invoice/receipt extraction API | Free limit is 100 docs/month; storage and support limited
Document AI | Receipt/invoice OCR API | Veryfi | Starter / Growth | $500/mo minimum Starter; Growth custom | Transaction-based API pricing | Free plan exists | Starter minimum $500/mo buying roughly <5K docs/month; FAQ lists receipt $0.08 and invoice $0.16 in Starter; Growth volume discounts and custom terms | OCR/data extraction APIs plus SDKs, fraud detection and document capture add-ons | Invoices, receipts, checks, purchase orders and other supported docs | Line items, extraction, product matching/workflows on higher tiers | API Hub, SDKs, OpenClaw Skill and add-ons | Limited storage on Starter; Growth has Vault, unlimited storage and custom retention | Veryfi hosted API/platform | Growth adds Slack support, SAML SSO, SLA options, model training | Finance/AP teams needing fast receipt/invoice OCR | Starter has a high monthly minimum; add-ons may increase price
Document AI | PDF services / extraction API | Adobe Acrobat Services API | Free Tier | $0 | Document transactions per month |  | 500 free Document Transactions per month; access to 15+ PDF Services including PDF Extract, Auto-Tag, Electronic Seal and Document Generation; no credit card | PDF extraction, generation, conversion, accessibility tagging and PDF workflows | PDF and document service inputs/outputs supported by Acrobat Services APIs | PDF Extract can extract text/tables/structure for downstream apps/RAG | Adobe PDF Services API and SDKs | Adobe service data handling and transaction limits | Adobe cloud API | Adobe developer credentials; paid plans/support for volume | Developers needing free monthly PDF extraction/conversion quota | Not a full OCR/IDP suite; transaction accounting varies by operation/output
Document AI | PDF services / extraction API | Adobe Acrobat Services API | Paid Plans | Custom / sales | Volume and multi-product discounts | Free tier exists | Paid plans provide scalable high-volume access to 15+ PDF Services and technical support on certain plans | High-volume PDF extraction/generation/conversion/auto-tag workflows | PDF and supported document transformations | Document generation and extract workflows for apps | Adobe APIs and SDKs | Adobe cloud service handling | Adobe managed API | Support available on certain paid plans; enterprise procurement | Companies embedding PDF APIs into production software | Public page does not show self-serve per-transaction paid price
Document AI | Rule-based document parser | Docparser | 14-day free trial | $0 trial | Parsing-credit subscription after trial |  | 14-day free trial, no credit card required; 1 parsing credit equals 1 document with up to 5 pages | PDF/Word/image parsing with parser templates and rules | PDF, Word and image files | Extract fields/tables and export structured data | Google Sheets export plus many integrations; downloads to Excel, CSV, JSON and XML | Document retention add-on available | Docparser hosted service | Teams/managed users on Professional+; MFA/version control add-ons | Trying template/rule-based parsing before subscribing | Trial only; complex layouts may need paid parsing assistant/setup
Document AI | Rule-based document parser | Docparser | Starter / Business | $39/mo Starter monthly; $159/mo Business monthly | Monthly parsing credits | 14-day trial | Starter monthly: 100 parsing credits/month and up to 15 parsers; Business monthly: 1,000 parsing credits/month, 500 parsers, priority support and multi-layout parsers | Template/rule-based document extraction | PDF, Word and image files | Smart checkboxes/tables, multi-layout parsers and parser version control by tier | Google Sheets, CSV, JSON, XML and hundreds of integrations | Extended document retention is a paid add-on or enterprise feature | Docparser cloud | Teams/managed users and MFA/version control by tier | Operations teams with repeatable document templates | One credit covers one document of up to 5 pages; add-ons can materially change cost
Document AI | Math/scientific OCR API | Mathpix | Convert API | Usage-based; no API free trial | API conversion pricing | No Convert API free trial; Snip app has free plan | Official API pricing page says no free trial for Convert API; Snip app can be used to try capabilities | OCR and conversion for math, STEM, PDFs and structured formats | Images, PDFs and math/scientific documents | LaTeX/math OCR, PDF conversion and structured outputs | Mathpix APIs and SDK workflows | Platform/API account handling | Mathpix hosted API | Account/team controls depend on product plan | Scientific PDFs, equations and STEM document conversion | No permanent free API tier captured; source page should be checked for exact endpoint rates
Document AI | Document Q&A / PDF AI | Humata | Free | $0 | Monthly free pages |  | 60 free pages monthly; 1 user; basic features | Chat with PDFs/documents and answer questions from sources | PDF pages and documents uploaded to Humata | Document Q&A with cited context; OCR starts on Team tier per plan table | Web app and plan-based workflow; API not emphasized in pricing page | Humata account/cloud storage | Humata hosted app | Single user; higher tiers add team/security | Students/researchers chatting with small PDFs | No OCR on Free according to pricing table; only 60 pages/month
Document AI | Document Q&A / PDF AI | Humata | Expert / Team | $9.99/mo Expert; $49/user/mo Team | Subscription plus additional page usage | Free plan exists | Expert: 500 free pages/month, 3 users, additional pages $0.02/page; Team: 5,000 pages/month, 10 users, additional pages $0.01/page and OCR/security features | PDF/document Q&A and OCR on Team tier | PDFs/documents uploaded to Humata | GPT-5 support, OCR, response personalization and permissions by tier | Web app workflow; integrations not primary | Cloud account storage/pages | Humata hosted app | Team adds department/folder permissions; Enterprise adds SOC 2/SLA | Small teams doing document research and PDF Q&A | Page overages add cost; OCR only appears at Team tier
Document AI | Notebook/document research assistant | NotebookLM | Free | $0 | Consumer/product usage limits |  | Google help says users can sign up free; limits include 3 Audio Overviews/day on free tier in upgrade table; sources/notebooks limits are subject to change | Grounded research assistant over uploaded sources | Docs, PDFs, websites, Google Docs/Slides and other supported NotebookLM sources | Cited answers, summaries, study guides, Audio Overviews and source-grounded Q&A | Web app; Google Workspace/AI plan integrations for upgraded access | Google account data handling; Workspace/enterprise terms vary | Google hosted product | Upgraded plans through Google AI, Cloud or Workspace; enterprise/admin controls by plan | Personal research, study and document synthesis without API integration | Not an OCR/API product; official page says usage limits are subject to change
Document AI | Notebook/document research assistant | NotebookLM | Plus / Pro / Ultra via Google plans | Varies by Google AI/Workspace/Cloud plan | Bundled subscription access | Free plan exists | Upgrade page lists higher limits and features through Google AI Plans, Google Cloud or qualifying Workspace plans; Audio Overviews examples include 6/day, 20/day and 200/day tiers | Higher-capacity document research assistant | Uploaded/linked sources supported by NotebookLM | More output generation, higher limits and collaboration/controls depending on plan | Google product integrations rather than standalone API | Google account/Workspace/Cloud policies | Google hosted product | Workspace/Cloud can add admin controls | Organizations that want NotebookLM workflows with higher limits | Pricing is tied to broader Google AI/Workspace products, not a standalone page-based OCR API
Document AI | Open-source document parser | Docling | Open source | $0 software | Open-source software; infra/model costs separate |  | No software usage meter; install with pip, run CLI/library locally; Docling Serve and Docling MCP available | Converts messy documents into structured data with tables, formulas, reading order, OCR and chunking | PDF, DOCX, PPTX, XLSX, HTML, images, audio transcripts and other formats listed on site | Exports JSON, Markdown, HTML, text and chunks for AI/RAG/agent systems | Python library, CLI, Docling Serve and MCP | Local/app-owned unless using external OCR/models/services | Local, self-hosted or your own infrastructure | Governance depends on deployment; enterprise support not inherent to OSS | Private/local document conversion for RAG pipelines | You own scaling, OCR engine choice and quality tuning
Document AI | Open-source document converter | Microsoft MarkItDown | Open source | $0 software | Open-source library; optional external service costs |  | No software usage meter; converts files/office docs to Markdown for LLM ingestion | Lightweight document-to-Markdown conversion preserving important structure | Local files, remote URIs and byte streams; Office docs, PDFs and other formats via plugins/dependencies | Markdown output for RAG, prompt context and AI ingestion | Python package/CLI; optional integrations such as Azure Document Intelligence for some conversions | Local unless remote URI/service integrations are used | Local/self-hosted | Governance depends on your environment | Simple file-to-Markdown conversion in AI pipelines | Not a full OCR/IDP platform; quality varies by file type and optional services
Document AI | Open-source PDF/document extraction | MinerU | Open source | $0 software | Open-source document parsing engine; infra/model costs separate |  | No software usage meter; converts complex documents like PDFs and Office docs into LLM-ready Markdown/JSON | High-accuracy document parsing and layout/content extraction | PDFs, images and Office docs per ecosystem docs | Markdown/JSON for LLM pretraining, RAG and agentic workflows | CLI, SDK ecosystem and open-source repository | Local/app-owned unless cloud/API services are used | Local/self-hosted or via ecosystem API if chosen | Governance depends on deployment | Research/document-heavy RAG pipelines needing open-source extraction | Requires local setup/resources; production support and SLAs are self-managed
Document AI | Open-source OCR/document extraction | Datalab Marker | Open source / platform | $0 software for OSS; hosted/platform options may vary | Open-source models plus platform offerings | Yes for OSS | Datalab page describes open-source models for extracting text, tables, images and layouts with OCR in 90+ languages | Advanced OCR and document conversion to structured outputs | PDFs, Office documents and images | Text, tables, images, layouts and GitHub Markdown table conversion | Open-source tooling plus Datalab platform/API options | Local/app-owned for OSS; platform data handling if hosted | Local/self-hosted for OSS or Datalab platform | Governance depends on chosen deployment | Developers needing OCR/table extraction from PDFs into Markdown/JSON | Hosted pricing not captured here; OSS requires GPU/ops for best performance
Document AI | Open-source OCR | PaddleOCR | Open source | $0 software | Open-source OCR toolkit; infra/model costs separate |  | No software usage meter; open-source OCR models and pipelines | OCR, layout/document understanding and multilingual text recognition depending on model/pipeline | Images, documents and OCR datasets/workflows | Text extraction and document understanding for downstream pipelines | Python ecosystem, models and deployment options | Local/app-owned | Local/self-hosted/cloud by user | Governance depends on deployment | Teams needing a mature open-source OCR baseline | Requires engineering and model selection; not a managed extraction API
Document AI | Open-source RAG / document Q&A | AnythingLLM | Open source | $0 software | Open-source app; model/vector/hosting costs separate |  | No software usage meter for self-hosted app; all-in-one AI app with RAG and agent capabilities | Document ingestion and knowledge-base chat | Uploaded documents and data sources supported by AnythingLLM | RAG, agents and chat over documents/data | Web app, integrations and model/provider connectors | Self-hosted or cloud account handling depending on deployment | Self-hosted/local/cloud by user | Governance depends on deployment and edition | Teams wanting a ready document-chat app over private docs | Not an OCR parser by itself; quality depends on document ingestion and chosen models
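
The per-page and subscription prices listed above can be turned into rough monthly estimates. The following is a minimal Python sketch, not a vendor calculator. The names `Tier`, `tiered_cost` and `subscription_cost` are hypothetical, the rates are copied from the Amazon Textract, Google Cloud Document AI, Humata and Mindee rows, and it assumes that volume tiers are marginal (each band is billed at its own rate). Free tiers, regional differences and add-ons are ignored.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional, Sequence


@dataclass(frozen=True)
class Tier:
    """One pricing band: pages up to `limit` (None = no cap) at `per_1k` per 1,000 pages."""
    limit: Optional[int]
    per_1k: Decimal


# Rates copied from the table rows above; confirm them on the vendor pricing pages.
TEXTRACT_DETECT_TEXT = [Tier(1_000_000, Decimal("1.50")), Tier(None, Decimal("0.60"))]
GOOGLE_ENTERPRISE_OCR = [Tier(5_000_000, Decimal("1.50")), Tier(None, Decimal("0.60"))]


def tiered_cost(pages: int, tiers: Sequence[Tier]) -> Decimal:
    """Bill each band at its own rate (assumed marginal tiers) and sum the bands."""
    cost = Decimal("0")
    billed = 0  # pages already priced by earlier bands
    for tier in tiers:
        if pages <= billed:
            break
        upper = pages if tier.limit is None else min(pages, tier.limit)
        cost += Decimal(upper - billed) * tier.per_1k / 1000
        billed = upper
    return cost


def subscription_cost(units: int, base: Decimal, included: int, overage: Decimal) -> Decimal:
    """Flat monthly fee plus a per-unit charge for usage beyond the included quota."""
    extra = max(0, units - included)
    return base + Decimal(extra) * overage


if __name__ == "__main__":
    # Per-page services: 1.5M pages on Textract, 6M pages on Google Enterprise OCR.
    print("Textract Detect Document Text:", tiered_cost(1_500_000, TEXTRACT_DETECT_TEXT))
    print("Google Enterprise OCR:", tiered_cost(6_000_000, GOOGLE_ENTERPRISE_OCR))
    # Subscriptions: Humata Expert (USD, pages) and Mindee Pro (EUR, credits).
    print("Humata Expert, 800 pages:",
          subscription_cost(800, Decimal("9.99"), 500, Decimal("0.02")))
    print("Mindee Pro, 3,000 credits:",
          subscription_cost(3_000, Decimal("179"), 2_500, Decimal("0.04")))
```

Under these assumptions, 1.5M pages of Textract text detection comes to $1,800 (1M pages at $1.50/1K plus 500K at $0.60/1K), and 800 pages on Humata Expert comes to $15.99. `Decimal` is used instead of floats so that small per-page rates add up exactly.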