Skills
unstructured · Data & Files
Document Parser
v1.2.3 · 16.8K installs · 523 stars · MIT · Updated 2026-03-30
Parse and extract structured data from PDFs, Word docs, spreadsheets, and images with OCR.
pdf · ocr · documents · extraction · parsing
npx agentmag add doc-parser
About
The Document Parser skill handles the messy reality of business documents. It can parse PDFs (including scanned ones via OCR), Word documents, Excel spreadsheets, PowerPoint presentations, and images, extracting text, tables, and metadata into clean, structured formats.
The skill goes beyond simple text extraction: it understands document layout, preserves table structure, handles multi-column text, and can extract specific fields from invoices, contracts, and forms.
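To make "extracting specific fields" concrete, here is a minimal Python sketch of pulling a few invoice fields out of text that has already been extracted. It is not the skill's implementation or API: the `InvoiceFields` type, the `extract_invoice_fields` function, and the regular expressions are all hypothetical, and real invoices would likely need layout information in addition to pattern matching.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class InvoiceFields:
    """Hypothetical container for a few common invoice fields."""
    invoice_number: Optional[str]
    date: Optional[str]
    total: Optional[float]


# Illustrative patterns only; real invoices vary widely in wording and layout.
_PATTERNS = {
    # "Number" is listed before "No" so the longer label wins.
    "invoice_number": re.compile(
        r"Invoice\s*(?:Number|No\.?|#)\s*:?\s*([A-Z0-9-]+)", re.IGNORECASE
    ),
    # Assumes ISO-style dates (YYYY-MM-DD).
    "date": re.compile(r"Date\s*:?\s*(\d{4}-\d{2}-\d{2})", re.IGNORECASE),
    # \b keeps "Subtotal" from matching as "Total".
    "total": re.compile(
        r"\bTotal\b\s*(?:Due)?\s*:?\s*\$?\s*([\d,]+\.\d{2})", re.IGNORECASE
    ),
}


def _first_match(name: str, text: str) -> Optional[str]:
    """Return the first captured group for the named pattern, or None."""
    match = _PATTERNS[name].search(text)
    return match.group(1) if match else None


def extract_invoice_fields(text: str) -> InvoiceFields:
    """Extract fields from plain invoice text; missing fields become None."""
    raw_total = _first_match("total", text)
    total = float(raw_total.replace(",", "")) if raw_total else None
    return InvoiceFields(
        invoice_number=_first_match("invoice_number", text),
        date=_first_match("date", text),
        total=total,
    )


sample = (
    "Invoice No: INV-2024-001\n"
    "Date: 2024-05-17\n"
    "Subtotal: $1,100.00\n"
    "Total: $1,234.50"
)
print(extract_invoice_fields(sample))
```

Returning `None` for a missing field, rather than raising, lets a caller decide whether a partially filled record is acceptable.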
Capabilities
- PDF parsing with layout-aware text extraction
- OCR for scanned documents and images
- Table extraction preserving row/column structure
- Word, Excel, PowerPoint, and CSV support
- Invoice and form field extraction
- Batch processing for document collections
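As a rough illustration of batch processing across the formats listed above, the Python sketch below groups a collection of file paths by document family before any parsing happens. The `FAMILIES` mapping and the `group_by_family` function are assumptions made for this example; the listing does not say which file extensions the skill accepts or how it schedules a batch.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, Iterable, List

# Assumed extension-to-family mapping, based only on the formats named in
# the capabilities list; the skill's real supported extensions may differ.
FAMILIES = {
    ".pdf": "pdf",
    ".docx": "word",
    ".xlsx": "excel",
    ".pptx": "powerpoint",
    ".csv": "csv",
    ".png": "image",
    ".jpg": "image",
    ".jpeg": "image",
    ".tiff": "image",
}


def group_by_family(paths: Iterable[str]) -> Dict[str, List[str]]:
    """Group file paths by document family, case-insensitively by suffix."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for path in paths:
        family = FAMILIES.get(Path(path).suffix.lower(), "unsupported")
        groups[family].append(path)
    return dict(groups)


batch = ["q1-report.PDF", "contract.docx", "scan-004.jpg", "notes.txt"]
for family, files in group_by_family(batch).items():
    print(family, files)
```

Grouping first means unsupported files can be reported up front instead of failing partway through a long run.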
Compatible agents
Claude Code · Cursor · Windsurf