Parse

Parses a PDF into Markdown with support for chunking, table formatting, and inline image embedding via data URLs. Ideal for content extraction, knowledge base creation, and retrieval-augmented generation (RAG) workflows.

Parse a PDF

POST https://pdf.ai/api/v1/parse

Returns the Markdown for a PDF identified by either a docId or a url.

Headers

| Name | Type | Description |
| --- | --- | --- |
| X-API-Key* | string | <API-Key> |

Request Body

| Name | Type | Description |
| --- | --- | --- |
| docId | string | Document ID obtained after uploading a PDF. |
| url | string | URL of a PDF, supplied instead of a docId. |
| llm | boolean | Improves accuracy by using an LLM. Defaults to false. |
| chunk | "page" | If set to "page", chunks the content by document page. By default, the entire text is returned unchunked. |

{
    "success": true,
    "url": "https://example.com/document.pdf",
    "content": "The page content."
}
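
The request and response described above can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not an official client: the helper names `build_parse_request` and `extract_content` are hypothetical, the body is assumed to be sent as JSON (the reference does not state a content type), and the rule that exactly one of docId or url must be given is a client-side assumption drawn from the parameter descriptions. The shape of an error response is not documented here, so the sketch only checks the `success` flag.

```python
import json
import urllib.request

PARSE_ENDPOINT = "https://pdf.ai/api/v1/parse"


def build_parse_request(api_key, doc_id=None, url=None, llm=False, chunk=None):
    """Build (but do not send) a POST request for the parse endpoint."""
    # Assumption: the PDF is identified by exactly one of docId or url.
    if (doc_id is None) == (url is None):
        raise ValueError("supply exactly one of doc_id or url")
    # "page" is the only documented value for chunk.
    if chunk not in (None, "page"):
        raise ValueError('chunk must be "page" or omitted')

    body = {"docId": doc_id} if doc_id is not None else {"url": url}
    if llm:
        body["llm"] = True  # omitted otherwise, since it defaults to false
    if chunk is not None:
        body["chunk"] = chunk

    return urllib.request.Request(
        PARSE_ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "X-API-Key": api_key,
            # Assumption: the endpoint accepts a JSON-encoded body.
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_content(response_text):
    """Return the "content" field of a successful parse response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise RuntimeError(f"parse request was not successful: {payload}")
    return payload["content"]
```

To send the request, pass the result of `build_parse_request` to `urllib.request.urlopen`, read and decode the response body, and hand the text to `extract_content`. Keeping request construction separate from the network call makes the payload easy to inspect before any credits are spent.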

Credit Usage

Every document processing request uses credits based on the parsing mode:

  • Non-LLM Parsing: 1 credit per block of up to 5 pages, rounded up (e.g., a 12-page document spans 3 blocks and uses 3 credits).

  • LLM Parsing: 2 credits per block of up to 5 pages, rounded up (e.g., a 12-page document spans 3 blocks and uses 6 credits).
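
The credit rules above amount to rounding the page count up to whole 5-page blocks and multiplying by the per-block rate. A minimal sketch of that arithmetic follows; the function name `credits_for` is hypothetical, and rejecting page counts below 1 is an assumption, since the reference does not say how empty documents are billed.

```python
import math

PAGES_PER_BLOCK = 5


def credits_for(pages, llm=False):
    """Estimate the credits used to parse a document with `pages` pages."""
    # Assumption: a document has at least one page.
    if pages < 1:
        raise ValueError("pages must be at least 1")
    # A partial block is billed as a full block, so round up.
    blocks = math.ceil(pages / PAGES_PER_BLOCK)
    # 1 credit per block without the LLM, 2 credits per block with it.
    return blocks * (2 if llm else 1)


print(credits_for(12))            # 3, matching the non-LLM example
print(credits_for(12, llm=True))  # 6, matching the LLM example
```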
