Scrape request fields

Endpoint

POST https://api.firecrawl.dev/v2/scrape

Header:

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `url` | string (uri) | Yes | - | URL to scrape and parse |
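
As a minimal sketch of the smallest valid call, the snippet below builds (but does not send) a POST request whose JSON body carries only the required `url` field. The helper name `build_scrape_request`, the placeholder API key, and the bearer-token `Authorization` header are assumptions for illustration; this section does not state the authentication scheme.

```python
import json
import urllib.request

SCRAPE_ENDPOINT = "https://api.firecrawl.dev/v2/scrape"


def build_scrape_request(url: str, api_key: str, **options) -> urllib.request.Request:
    """Build (but do not send) a POST request for the scrape endpoint."""
    # `url` is the only required body field; everything else is optional.
    body = {"url": url, **options}
    return urllib.request.Request(
        SCRAPE_ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; not specified in this section.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


req = build_scrape_request("https://example.com", api_key="YOUR_API_KEY")
# To actually send it: urllib.request.urlopen(req, timeout=60)
```

Optional fields from the tables that follow can be passed as keyword arguments, for example `build_scrape_request(url, key, formats=["markdown"])`.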

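| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `formats` | array | `["markdown"]` | Output formats; each entry may be a string or an object (see Output formats) |
| `onlyMainContent` | boolean | `true` | Return only the main content, stripping boilerplate such as header, nav, and footer where possible |
| `includeTags` | string[] | - | Keep only matching elements (HTML tag, class, or id selectors) |
| `excludeTags` | string[] | - | Exclude matching elements (HTML tag, class, or id selectors) |
| `removeBase64Images` | boolean | `true` | Remove base64 images from the Markdown output (alt text is kept; the URL is replaced with a placeholder) |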
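
The content options above could be combined as in the following sketch. The helper `content_options` is hypothetical; it omits any option left unset so the server-side defaults from the table apply. The selector strings (`.post-body`, `#comments`) assume CSS-style class and id notation, which the table implies but does not spell out.

```python
import json


def content_options(formats=None, only_main_content=None, include_tags=None,
                    exclude_tags=None, remove_base64_images=None) -> dict:
    """Return only the options that were set; unset ones fall back to defaults."""
    candidates = {
        "formats": formats,
        "onlyMainContent": only_main_content,
        "includeTags": include_tags,
        "excludeTags": exclude_tags,
        "removeBase64Images": remove_base64_images,
    }
    return {key: value for key, value in candidates.items() if value is not None}


payload = {
    "url": "https://example.com/blog/post",
    **content_options(
        formats=["markdown"],
        include_tags=["article", ".post-body"],  # assumed selector syntax
        exclude_tags=["nav", "#comments"],       # assumed selector syntax
    ),
}
print(json.dumps(payload, indent=2))
```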

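| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `maxAge` | number (ms) | `172800000` | Cache freshness window (the default is 2 days); a cache hit is faster but does not reduce the credit cost |
| `minAge` | number (ms) | - | Read from the cache only and never trigger a fresh scrape; if nothing is cached, the API returns 404 with error code `SCRAPE_NO_CACHED_DATA` |
| `waitFor` | number (ms) | `0` | Extra wait time, on top of smart-wait |
| `timeout` | number (ms) | `60000` | Request timeout; allowed range `1000..300000` |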
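
All timing fields are in milliseconds, which is easy to get wrong by a factor of 1000. The sketch below converts from hours and seconds and checks `timeout` against the documented `1000..300000` range before sending. The helper name `timing_options` and its parameter names are illustrative, not part of the API.

```python
MS_PER_SECOND = 1_000
MS_PER_HOUR = 3_600_000

TIMEOUT_MIN_MS = 1_000
TIMEOUT_MAX_MS = 300_000


def timing_options(max_age_hours=None, wait_for_seconds=None,
                   timeout_seconds=None) -> dict:
    """Convert human-friendly durations to the millisecond fields in the table."""
    options = {}
    if max_age_hours is not None:
        options["maxAge"] = int(max_age_hours * MS_PER_HOUR)
    if wait_for_seconds is not None:
        options["waitFor"] = int(wait_for_seconds * MS_PER_SECOND)
    if timeout_seconds is not None:
        timeout_ms = int(timeout_seconds * MS_PER_SECOND)
        # Reject values outside the documented range instead of sending them.
        if not TIMEOUT_MIN_MS <= timeout_ms <= TIMEOUT_MAX_MS:
            raise ValueError(
                f"timeout must be within {TIMEOUT_MIN_MS}..{TIMEOUT_MAX_MS} ms, "
                f"got {timeout_ms}"
            )
        options["timeout"] = timeout_ms
    return options


payload = {
    "url": "https://example.com",
    **timing_options(max_age_hours=1, wait_for_seconds=2, timeout_seconds=90),
}
```

Passing `max_age_hours=48` reproduces the default `maxAge` of `172800000`.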

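| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `headers` | object | - | Custom request headers (cookie, user-agent, etc.); some sensitive parameters may force `storeInCache=false` |
| `proxy` | `"basic" \| "enhanced" \| "auto"` | `"auto"` | Proxy strategy; `enhanced` is more reliable but may cost more |
| `blockAds` | boolean | `true` | Enable ad blocking and cookie-popup blocking |
| `skipTlsVerification` | boolean | `true` | Skip TLS certificate verification |
| `mobile` | boolean | `false` | Emulate a mobile device when scraping |
| `location` | object | - | Location and language preferences (see `country` / `languages`) |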
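
A sketch of the network-related options follows. It validates `proxy` against the three documented values and passes the remaining fields through unchanged. The helper `network_options` is hypothetical, and the `location` value format (`"US"`, `"en-US"`) is an assumption; only the key names `country` and `languages` come from the table.

```python
ALLOWED_PROXIES = {"basic", "enhanced", "auto"}


def network_options(proxy: str = "auto", **extra) -> dict:
    """Validate `proxy` and merge it with any other network fields."""
    if proxy not in ALLOWED_PROXIES:
        raise ValueError(
            f"proxy must be one of {sorted(ALLOWED_PROXIES)}, got {proxy!r}"
        )
    return {"proxy": proxy, **extra}


payload = {
    "url": "https://example.com/pricing",
    **network_options(
        proxy="enhanced",
        headers={"User-Agent": "my-crawler/1.0"},
        skipTlsVerification=False,  # opt back in to certificate checks
        mobile=True,
        location={"country": "US", "languages": ["en-US"]},  # assumed value format
    ),
}
```

Because `skipTlsVerification` defaults to `true`, a caller who wants certificate errors to surface has to set it to `false` explicitly, as above.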

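| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `parsers` | array | `["pdf"]` | Controls file parsing. PDFs are parsed to Markdown by default (billed per page). Passing `[]` skips parsing and returns the PDF as base64 (a flat 1 credit for the whole PDF) |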

PDF parser object form:

parsers: [{ type: 'pdf', mode: 'auto', maxPages: 20 }]
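
The three shapes of `parsers` described above (the default string form, the empty list, and the object form) can be produced by one small helper. This is a sketch: `pdf_parsers` is a hypothetical name, and `"auto"` is the only `mode` value taken from this section.

```python
def pdf_parsers(parse: bool = True, mode=None, max_pages=None) -> list:
    """Build the `parsers` array for a scrape request."""
    if not parse:
        # Empty list: skip parsing and receive the PDF as base64.
        return []
    if mode is None and max_pages is None:
        # String form: same as the documented default.
        return ["pdf"]
    parser = {"type": "pdf"}
    if mode is not None:
        parser["mode"] = mode
    if max_pages is not None:
        if max_pages < 1:
            raise ValueError("max_pages must be at least 1")
        parser["maxPages"] = max_pages
    return [parser]


payload = {
    "url": "https://example.com/report.pdf",
    "parsers": pdf_parsers(mode="auto", max_pages=20),
}
```

Since parsed PDFs are billed per page, capping `maxPages` bounds the cost of a single request.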

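| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `actions` | array | - | Browser actions to run before scraping (click, write, wait, etc.); see Actions fields |
| `profile` | object | - | Enables persistent browser state (cookies, localStorage) so that scrape and interact share a session |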
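
To illustrate how an `actions` array might be assembled, the sketch below builds click, write, and wait steps in order. The per-action key names (`type`, `selector`, `text`, `milliseconds`) are assumptions not stated in this section and should be checked against the Actions fields reference; the helper functions are hypothetical.

```python
def click(selector: str) -> dict:
    """Click the element matching `selector` (assumed key names)."""
    return {"type": "click", "selector": selector}


def write(text: str) -> dict:
    """Type `text` into the focused element (assumed key names)."""
    return {"type": "write", "text": text}


def wait(milliseconds: int) -> dict:
    """Pause for a fixed time before the next action (assumed key names)."""
    if milliseconds < 0:
        raise ValueError("milliseconds must be non-negative")
    return {"type": "wait", "milliseconds": milliseconds}


# Actions run in list order, before the page content is captured.
payload = {
    "url": "https://example.com/search",
    "actions": [
        click("#search-box"),
        write("firecrawl"),
        wait(1000),
    ],
}
```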

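| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `storeInCache` | boolean | `true` | Whether to write the result to Firecrawl's cache and index; using sensitive parameters such as `actions` or `headers` may force it to `false` |
| `zeroDataRetention` | boolean | `false` | Zero-data-retention mode (must be enabled by contacting Firecrawl) |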
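
Because the server may silently override `storeInCache`, a client-side check can make that visible before the request goes out. The sketch below is purely local and hypothetical: it flags payloads that expect caching while also using fields this section names as sensitive. The list is partial, since the table says "such as".

```python
# Partial list: only the fields this section names explicitly.
SENSITIVE_FIELDS = ("actions", "headers")


def may_skip_cache(payload: dict) -> list:
    """Return the sensitive fields present in a payload that expects caching."""
    # If the caller already opted out of caching, there is nothing to flag.
    if payload.get("storeInCache", True) is False:
        return []
    return [field for field in SENSITIVE_FIELDS if payload.get(field)]


payload = {
    "url": "https://example.com/account",
    "headers": {"Cookie": "session=abc123"},
}
flagged = may_skip_cache(payload)
if flagged:
    print(f"storeInCache may be forced to false because of: {', '.join(flagged)}")
```

Setting `"storeInCache": False` explicitly on such requests keeps the client's expectation aligned with what the server is likely to do.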