PDF Table Extraction API
Free tier: 25 requests/mo (not 100). Paid tiers include overage rates; see pricing.
PDF table extractor, parser, and converter API. Pull structured table data from PDFs into JSON, Excel, or CSV. Multipart or base64 upload. Page selection, lattice/stream detection. Text-based PDFs only, no OCR.
Why use this API?
Tables in PDFs are hard to extract programmatically. This API detects and extracts tables for spreadsheets, data pipelines, or automation.
What the API does
Send a POST request to /extract with a PDF, either as a multipart file upload or as a JSON body containing the base64-encoded PDF. The response contains the extracted tables as JSON (default), Excel (.xlsx), or CSV (ZIP). Optional query params: outputFormat, pages, strategy, mergeTablesAcrossPages, confidenceScores.
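As a rough illustration of how the optional query params attach to the endpoint, the sketch below builds the request URL without sending anything. The parameter names and host come from this page; the helper name `build_extract_url` and the lowercase `true`/`false` encoding for booleans are assumptions, not documented behavior.

```python
from urllib.parse import urlencode

# Host and path as listed on this page.
BASE_URL = "https://pdf-table-extraction-api.p.rapidapi.com/extract"


def build_extract_url(**options):
    """Return the /extract URL with every non-None option as a query parameter.

    Hypothetical helper for illustration only.
    """
    params = {}
    for name, value in options.items():
        if value is None:
            continue
        if isinstance(value, bool):
            # Assumption: booleans are sent as lowercase "true"/"false".
            params[name] = str(value).lower()
        else:
            params[name] = str(value)
    return f"{BASE_URL}?{urlencode(params)}" if params else BASE_URL


url = build_extract_url(outputFormat="csv", pages="1,3-5", strategy="lattice")
print(url)
```

Note that `urlencode` percent-encodes the comma in a `pages` value, which is the standard encoding for query strings.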
Try it in the playground
Add your RapidAPI key and run. Key is sent only to RapidAPI.
Optional parameters are sent as a query string. For pages, use all or a comma-separated list of pages and ranges, e.g. 1,3-5.
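To make the page-selection syntax concrete, here is a small client-side sketch that expands a value such as `all` or `1,3-5` into explicit page numbers. It shows one plausible reading of the syntax, useful for validating input before a call; the function name `parse_pages` is hypothetical, and the API's own parsing rules may differ.

```python
def parse_pages(spec, page_count):
    """Expand a spec such as "all" or "1,3-5" into sorted 1-based page numbers."""
    if spec.strip().lower() == "all":
        return list(range(1, page_count + 1))
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # A range such as "3-5" covers both endpoints.
            start, end = (int(x) for x in part.split("-", 1))
        else:
            start = end = int(part)
        if start < 1 or end < start or end > page_count:
            raise ValueError(f"invalid page range: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)


print(parse_pages("1,3-5", page_count=10))
```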
File is sent to RapidAPI; not stored here.
Get code
Host: pdf-table-extraction-api.p.rapidapi.com, POST /extract, multipart field file. JSON by default; xlsx / csv return binary (Excel or ZIP).
Replace YOUR_RAPIDAPI_KEY and the file path. Java example requires OkHttp on the classpath.
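For reference, a minimal Python sketch of the multipart request using only the standard library. The host, path, and the `file` field name are taken from this page, and `X-RapidAPI-Key` / `X-RapidAPI-Host` are the usual RapidAPI headers. The helper name `build_multipart_request` is hypothetical, and the function only builds the request; it does not send it.

```python
import uuid
import urllib.request

HOST = "pdf-table-extraction-api.p.rapidapi.com"


def build_multipart_request(pdf_bytes, filename, api_key, query=""):
    """Build (but do not send) a POST /extract request with the PDF in field "file"."""
    boundary = uuid.uuid4().hex
    # Assumes a plain ASCII filename without quotes.
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/pdf\r\n\r\n",
        pdf_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    url = f"https://{HOST}/extract" + (f"?{query}" if query else "")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "X-RapidAPI-Key": api_key,
            "X-RapidAPI-Host": HOST,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


req = build_multipart_request(b"%PDF-1.4 ...", "report.pdf", "YOUR_RAPIDAPI_KEY",
                              query="outputFormat=json")
print(req.get_method(), req.full_url)
```

To send it, pass the request to `urllib.request.urlopen(req, timeout=60)` and read the response body.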
What to expect
With outputFormat=json you get a JSON object with tables and summary. With xlsx or csv you get a binary file (Excel or ZIP of CSVs). Use your RapidAPI key in headers. Stateless; no data stored. See RapidAPI docs for full response schema.
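The three response shapes described above could be handled along these lines. This is a sketch under assumptions: the function name `parse_extract_response` is hypothetical, the CSV files inside the ZIP are assumed to be UTF-8, and nothing is assumed about the JSON beyond the top-level `tables` and `summary` keys mentioned here.

```python
import io
import json
import zipfile


def parse_extract_response(body, output_format="json"):
    """Decode a raw /extract response body according to the requested format."""
    if output_format == "json":
        # Expected to be an object with "tables" and "summary" keys.
        return json.loads(body)
    if output_format == "csv":
        # The CSV output arrives as a ZIP archive; return {filename: csv_text}.
        with zipfile.ZipFile(io.BytesIO(body)) as archive:
            # Assumption: the CSV files are UTF-8 encoded.
            return {name: archive.read(name).decode("utf-8")
                    for name in archive.namelist()}
    if output_format == "xlsx":
        # Binary Excel workbook; return the bytes so the caller can save them.
        return body
    raise ValueError(f"unsupported output format: {output_format!r}")


result = parse_extract_response(b'{"tables": [], "summary": {}}')
print(sorted(result))
```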
Pricing & tiers (RapidAPI)
Quotas and overage differ from our typical 100 / 10k / 100k / 250k listings. Pro, Ultra, and Mega include per-request overage after the included monthly calls.
Basic: $0/mo, 25 requests/month included. No overage shown (hard cap unless RapidAPI lists otherwise).
Pro: $9.99/mo, 3,500 requests/month included. Overage: ~$0.005 per extra request.
Ultra: $30/mo, 15,000 requests/month included. Overage: ~$0.003 per extra request.
Mega: $99/mo, 100,000 requests/month included. Overage: ~$0.002 per extra request.
Confirm current prices on the live PDF Table Extraction API listing.
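As a worked example of how the base price and overage combine, the sketch below estimates a monthly bill from the figures on this page. The overage rates are approximate ("~"), so treat the result as an estimate; the function name `monthly_cost` is hypothetical, and Basic is modeled as a hard cap because no overage rate is shown for it.

```python
# (base price in USD, included requests, approximate overage per extra request)
TIERS = {
    "Basic": (0.00, 25, None),
    "Pro": (9.99, 3_500, 0.005),
    "Ultra": (30.00, 15_000, 0.003),
    "Mega": (99.00, 100_000, 0.002),
}


def monthly_cost(tier, requests):
    """Estimate the monthly cost in USD for a given number of requests."""
    base, included, overage = TIERS[tier]
    extra = max(0, requests - included)
    if extra and overage is None:
        # No overage rate listed, so requests past the quota are treated as blocked.
        raise ValueError(f"{tier} has no overage; quota is {included} requests")
    return round(base + extra * (overage or 0.0), 2)


# 5,000 requests on Pro: 9.99 + 1,500 extra * 0.005 = 17.49
print(monthly_cost("Pro", 5_000))
```

At 5,000 requests a month, Pro with overage (about $17.49) is still cheaper than Ultra's $30 base price.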
About this API
Who Should Use This API
Data teams and apps extracting tables from PDF reports and documents.
Also Known As
PDF table extractor API, PDF to table, table extraction API.
PDF Table Extraction API
Extract structured table data from PDF documents. The API makes a best-effort attempt to detect and extract tables within PDFs, returning the data in JSON (default), Excel (.xlsx), or CSV (ZIP) formats.
What This API Does
- **Table detection** — Identifies table structures within PDFs
- **Multi-table support** — Extracts all tables found, preserving order
- **Multiple output formats** — JSON (default), Excel (.xlsx), or CSV (ZIP archive)
- **Page selection** — Process all pages or specific page ranges
- **Confidence scores** — Optional confidence metrics per extracted table
- **Flexible input** — Multipart file upload or JSON body with base64-encoded PDF
**Important:** Best-effort detection. Non-table content (titles, paragraphs, headers) is filtered out. Results depend on PDF structure and quality. Text-based PDFs only; no OCR for scanned documents.
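For the JSON input option, the PDF bytes have to be base64-encoded before they go into the request body. The sketch below shows only that encoding step. The JSON key `file` is a placeholder chosen for illustration: this page does not name the field, so check the request schema on RapidAPI for the real key.

```python
import base64
import json


def build_base64_body(pdf_bytes, field="file"):
    """Return a JSON string carrying the PDF as base64 text.

    "file" is a hypothetical key name; the actual field is defined by the
    API's request schema, not by this page.
    """
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    return json.dumps({field: encoded})


body = build_base64_body(b"%PDF-1.4 example bytes")
print(body)
```

Such a body would be sent with a `Content-Type: application/json` header instead of a multipart upload.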
Key Features
- **Stateless** — No data stored; 25MB max, 60s timeout
- **Strategies** — `lattice` (grid lines), `stream` (most tables), `auto`
- **Metadata** — Page ranges, row/column counts, warnings
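Given the 25MB limit and 60s timeout listed above, a quick client-side check before uploading can save a wasted request. This is a minimal sketch: `preflight` is a hypothetical helper, "25MB" is assumed to mean 25 MiB, and the `%PDF-` header test only confirms the file looks like a PDF. It cannot tell a text-based PDF from a scanned one.

```python
MAX_BYTES = 25 * 1024 * 1024  # assumption: the page's "25MB" means 25 MiB
TIMEOUT_SECONDS = 60          # matches the listed timeout; pass it to your HTTP client


def preflight(pdf_bytes):
    """Return a list of problems found before upload (empty if none)."""
    problems = []
    if not pdf_bytes.startswith(b"%PDF-"):
        problems.append("file does not start with a %PDF- header")
    if len(pdf_bytes) > MAX_BYTES:
        problems.append(f"file is {len(pdf_bytes)} bytes; limit is {MAX_BYTES}")
    return problems


print(preflight(b"%PDF-1.7 small example"))
```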
Use Cases
- **Data extraction** — Pull tables from reports, invoices, exports
- **Automation** — Integrate table extraction into pipelines
- **Spreadsheet import** — Get Excel/CSV for downstream tools
Frequently Asked Questions
- Pricing: Basic is free at 25 requests/month. Pro is $9.99 for 3,500/month plus ~$0.005 per extra request. Ultra is $30 for 15,000/month plus ~$0.003 per extra request. Mega is $99 for 100,000/month plus ~$0.002 per extra request. Verify on RapidAPI.
- Data storage: None. The API is fully stateless.
- Output: Structured table data (e.g. JSON).