PDF Parsing Modes

Firecrawl supports three PDF parsing modes. The core tradeoff is speed vs. robustness across different PDF types.

Picking the right mode

Mode	Best for	Strength	Tradeoff
`fast`	Text-native PDFs	Fastest; reads only the embedded text layer	Won’t extract text from scanned or image-heavy pages
`auto`	Default for most PDFs	Text-first with OCR fallback	More complex, but more robust for mixed PDFs
`ocr`	Scans, photos, and PDFs that `auto` misclassifies	Most robust	Higher cost and latency

// Force OCR and parse at most 20 pages
parsers: [{ type: 'pdf', mode: 'ocr', maxPages: 20 }]

// No mode given: the default mode (auto) applies
parsers: [{ type: 'pdf' }]
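
The two fragments above can be produced by one small typed helper, which keeps mode names from being mistyped. This is a minimal sketch: the `PdfMode` and `PdfParser` types and the `pdfParser` function are hypothetical names inferred from the snippets, not part of any SDK.

```typescript
// Assumed shape of one `parsers` entry, inferred from the snippets above.
type PdfMode = "fast" | "auto" | "ocr";

interface PdfParser {
  type: "pdf";
  mode?: PdfMode;
  maxPages?: number;
}

// Hypothetical helper: builds a PDF parser entry and leaves unset fields out,
// so the service applies its own defaults for them.
function pdfParser(mode?: PdfMode, maxPages?: number): PdfParser {
  const parser: PdfParser = { type: "pdf" };
  if (mode !== undefined) parser.mode = mode;
  if (maxPages !== undefined) parser.maxPages = maxPages;
  return parser;
}

// Equivalent to the explicit OCR fragment above.
const scrapeOptions = { parsers: [pdfParser("ocr", 20)] };
console.log(JSON.stringify(scrapeOptions));
```

Calling `pdfParser()` with no arguments yields the bare `{ type: 'pdf' }` entry, so the mode choice stays with the service default.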

Start with `auto` and switch to `ocr` only for the PDFs that need it.
Use `fast` when you know the PDFs have a clean embedded text layer.
For batch pipelines, sample your PDF distribution first, then choose defaults.
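
For the batch case, the sampling step could be turned into a default along these lines. This is only a sketch under stated assumptions: `PdfSample`, `chooseDefaultMode`, and both thresholds are hypothetical, and the page counts are expected to come from your own inspection of a sample (for example, with a local PDF tool), not from Firecrawl.

```typescript
type PdfMode = "fast" | "auto" | "ocr";

// Hypothetical per-file measurement taken from your own sample.
interface PdfSample {
  totalPages: number;
  pagesWithText: number; // pages that have an embedded text layer
}

// Illustrative thresholds (assumptions); tune them to your data.
const TEXT_NATIVE_MIN = 0.95; // text coverage at or above this counts as text-native
const SCANNED_MAX = 0.2; // text coverage at or below this counts as scanned

function textCoverage(s: PdfSample): number {
  return s.totalPages > 0 ? s.pagesWithText / s.totalPages : 0;
}

function chooseDefaultMode(samples: PdfSample[]): PdfMode {
  // No evidence yet: keep the robust default.
  if (samples.length === 0) return "auto";
  const coverage = samples.map(textCoverage);
  // Every sampled file is text-native: the fast path is safe.
  if (coverage.every((c) => c >= TEXT_NATIVE_MIN)) return "fast";
  // More than half of the sampled files look scanned: default to OCR.
  const scanned = coverage.filter((c) => c <= SCANNED_MAX).length;
  if (scanned > samples.length / 2) return "ocr";
  // Mixed corpus: let auto decide per document.
  return "auto";
}
```

A mixed sample falls through to `auto`, which matches the advice to reserve `ocr` for the PDFs that need it.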