
HCODX/PDF to Excel
Local-only · No upload · No watermark

PDF to Excel: extract tables to XLSX online free

Free in-browser PDF to Excel converter. Pulls text out of every PDF page, groups it into rows by y-position and into columns by x-position, then writes a fresh .xlsx with one sheet per page. Powered by PDF.js + SheetJS — no upload.

Drop a PDF to convert

Or click to choose. One file at a time. Stays on your device.

Options
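  • Pages. Leave blank to convert all pages.
  • Row tolerance (pt). Text items whose y-positions differ by no more than this distance are placed in the same row.
  • Column tolerance (pt). X-positions that differ by no more than this distance are clustered into the same column.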
Use cases

When to convert PDF to Excel

Bank statements

Pull transaction tables out of monthly bank PDFs into a spreadsheet for budgeting.

Invoices & receipts

Extract line items into accounting software without retyping.

Reports & data tables

Get research data out of PDF reports for re-analysis in Excel or Python.

Inventory lists

Reformat printed inventory sheets into editable spreadsheets.

FAQ

PDF to Excel — frequently asked questions

Drop the PDF in, optionally pick a page range, then click Convert. Each page's text is read with PDF.js, grouped into rows by y-position and split into columns by x-position. The result is written to a fresh .xlsx with one sheet per page.

Conversion is very accurate for PDFs that contain real text (most reports, bank statements, exports). PDFs that are just scanned images need to be OCR'd first with our PDF OCR tool. Complex multi-table layouts or merged cells may need cleanup in Excel after import.

Your PDF is not uploaded to any server. PDF.js parses the file and SheetJS writes the XLSX entirely inside your browser. The PDF never leaves your device.

Scanned PDFs are images of text with no real text data. Run them through our PDF OCR tool first to add a selectable text layer, then convert with this tool.

About

How table detection works

A PDF doesn't have rows and columns — it has text fragments positioned at exact (x, y) coordinates. To rebuild a table we use two heuristics:

  • Rows. Text fragments whose y-coordinates are within the row tolerance (3 pt by default) are treated as one row.
  • Columns. Within each page we cluster every fragment's x-coordinate; x-coordinates within the column tolerance (6 pt by default) merge into one cluster, and each cluster defines a column anchor. Each row's fragments are then assigned to the column whose anchor is closest.
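
The two heuristics above can be sketched in a few lines of Python. This is an illustration only, not the tool's own code (the tool runs in the browser on PDF.js output). The names `Fragment`, `group_rows`, `column_anchors`, and `build_table` are hypothetical, and the sketch assumes that each row and column is anchored at the first coordinate seen in its cluster and that y grows upward, as in standard PDF coordinates.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """A piece of text with its position on the page (hypothetical type)."""
    text: str
    x: float
    y: float


def group_rows(fragments, row_tol=3.0):
    """Group fragments into rows, top to bottom.

    A fragment joins the current row if its y is within row_tol of the
    row's first fragment; otherwise it starts a new row.
    """
    rows = []
    # Assumption: y grows upward, so descending y reads top to bottom.
    for frag in sorted(fragments, key=lambda f: -f.y):
        if rows and abs(rows[-1][0].y - frag.y) <= row_tol:
            rows[-1].append(frag)
        else:
            rows.append([frag])
    return rows


def column_anchors(fragments, col_tol=6.0):
    """Cluster x-coordinates; the leftmost x of each cluster is its anchor."""
    anchors = []
    for x in sorted(f.x for f in fragments):
        if not anchors or x - anchors[-1] > col_tol:
            anchors.append(x)
    return anchors


def build_table(fragments, row_tol=3.0, col_tol=6.0):
    """Return a list of rows, each a list of cell strings."""
    anchors = column_anchors(fragments, col_tol)
    table = []
    for row in group_rows(fragments, row_tol):
        cells = [""] * len(anchors)
        for frag in sorted(row, key=lambda f: f.x):
            # Assign the fragment to the column with the nearest anchor.
            col = min(range(len(anchors)), key=lambda i: abs(anchors[i] - frag.x))
            cells[col] = (cells[col] + " " + frag.text).strip()
        table.append(cells)
    return table


page = [
    Fragment("Date", 50, 700), Fragment("Amount", 200, 701),
    Fragment("2024-01-05", 50.5, 680), Fragment("12.50", 203, 679.5),
]
print(build_table(page))
# [['Date', 'Amount'], ['2024-01-05', '12.50']]
```

Fragments that land in the same cell are joined with a space here; a real converter might handle that case differently.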

When to tweak the tolerances

  • Dense tables with closely-spaced rows → lower row tolerance.
  • Tables with wide columns and varying alignment → raise column tolerance.
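
To see why the tolerances matter, the following sketch counts how many clusters a set of coordinates produces at different tolerance values. The helper `count_clusters` is hypothetical and the coordinates are made up and exaggerated for illustration; it assumes the same anchor-based clustering described above, not necessarily the tool's exact logic.

```python
def count_clusters(values, tol):
    """Count clusters: a value more than tol past the current anchor starts a new one."""
    clusters = 0
    anchor = None
    for v in sorted(values):
        if anchor is None or v - anchor > tol:
            clusters += 1
            anchor = v
    return clusters


# Four closely spaced rows (made-up y-positions, 2.5 pt apart).
ys = [700.0, 697.5, 695.0, 692.5]
print(count_clusters(ys, 3.0))   # 2: neighbouring rows are merged
print(count_clusters(ys, 1.0))   # 4: a lower row tolerance keeps them apart

# One wide column whose entries start at slightly different x-positions.
xs = [300.0, 304.0, 309.0]
print(count_clusters(xs, 6.0))   # 2: the column is split in two
print(count_clusters(xs, 10.0))  # 1: a higher column tolerance rejoins it
```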

Privacy

Everything runs in this tab. No bytes are uploaded.
