Document Parser

v1.2.3
unstructured · Data & Files
16.8K installs · 523 stars · MIT · Updated 2026-03-30

Parse and extract structured data from PDFs, Word docs, spreadsheets, and images with OCR.

pdf · ocr · documents · extraction · parsing
npx agentmag add doc-parser

About

The Document Parser skill handles the messy reality of business documents. It can parse PDFs (including scanned ones via OCR), Word documents, Excel spreadsheets, PowerPoint presentations, and images — extracting text, tables, and metadata into clean, structured formats.

Beyond simple text extraction, it understands document layout, preserves table structure, handles multi-column text, and can extract specific fields from invoices, contracts, and forms.

Capabilities

  • PDF parsing with layout-aware text extraction
  • OCR for scanned documents and images
  • Table extraction preserving row/column structure
  • Word, Excel, PowerPoint, and CSV support
  • Invoice and form field extraction
  • Batch processing for document collections
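
To make the batch-processing and multi-format capabilities more concrete, here is a minimal sketch of how a mixed document collection might be grouped by format before parsing. It does not use the skill's actual interface, which this listing does not document: the `FORMAT_FAMILIES` mapping and the `group_by_format` helper are hypothetical names introduced only for illustration.

```python
from collections import defaultdict
from pathlib import Path

# Hypothetical mapping from file extension to format family. The skill's
# real routing rules are not described in this listing.
FORMAT_FAMILIES = {
    ".pdf": "pdf",
    ".docx": "word",
    ".xlsx": "spreadsheet",
    ".csv": "spreadsheet",
    ".pptx": "presentation",
    ".png": "image",
    ".jpg": "image",
    ".jpeg": "image",
}


def group_by_format(paths):
    """Group file paths by format family.

    Extensions are matched case-insensitively; anything not in
    FORMAT_FAMILIES is collected under "unsupported".
    """
    groups = defaultdict(list)
    for p in paths:
        suffix = Path(p).suffix.lower()
        groups[FORMAT_FAMILIES.get(suffix, "unsupported")].append(str(p))
    return dict(groups)


if __name__ == "__main__":
    batch = ["invoice.pdf", "scan.PNG", "report.docx", "q3.xlsx", "notes.txt"]
    for family, files in group_by_format(batch).items():
        print(family, files)
```

Grouping first lets each format family be handled by the appropriate step (for example, sending images and scanned PDFs through OCR) and surfaces unsupported files early rather than partway through a batch.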

Compatible agents

Claude Code · Cursor · Windsurf
