Split

Identify sections in PDF documents based on provided descriptions. Split analyzes the document and returns which pages contain each section, along with confidence levels. Use this endpoint when you need to locate specific sections within a document.

POST https://pdf.ai/api/v2/split

Returns the pages that match each described section, given a docId, url, or file.

Caching

If a docId is passed, no parsing credits will be used. However, if a file or url is used, parsing credits will apply during the split call. The docId will be returned after this call, allowing users to use it in future split requests without incurring additional parsing credits.
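
The caching behavior above suggests a simple pattern: send a url (or file) once, keep the docId that comes back, and send only the docId afterwards. The sketch below builds the form fields for either case. The helper name `build_split_fields` and the example URL are hypothetical; only the field names docId, url, and split_description come from this page.

```python
import json


def build_split_fields(split_description, doc_id=None, url=None):
    """Build form fields for a split request, preferring a cached docId.

    Hypothetical helper, not part of any pdf.ai SDK.
    """
    if doc_id is None and url is None:
        raise ValueError("provide doc_id or url (file uploads are not handled here)")
    fields = {"split_description": json.dumps(split_description)}
    if doc_id is not None:
        # A docId reuses the cached parse, so no parsing credits are used.
        fields["docId"] = doc_id
    else:
        # First request for this document: parsing credits apply.
        fields["url"] = url
    return fields


sections = [{"name": "Introduction", "description": "Opening section"}]

# First request: the document is identified by URL (placeholder address).
first_call = build_split_fields(sections, url="https://example.com/report.pdf")
# Later requests: reuse the docId returned by the first response.
later_call = build_split_fields(sections, doc_id="your-document-id")
```

Preferring docId whenever one is available keeps repeat requests on the cached parse, so only split credits are charged.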

Sample Code

The examples below call the split endpoint with cURL, Python, Node.js (axios), and PHP, in that order.

curl -X POST https://pdf.ai/api/v2/split \
  -H "X-API-Key: YOUR_API_KEY" \
  -F "docId=your-document-id" \
  -F 'split_description=[{"name":"Introduction","description":"Opening section"},{"name":"Conclusion","description":"Summary section"}]'

import requests
import json

url = "https://pdf.ai/api/v2/split"
headers = {"X-API-Key": "YOUR_API_KEY"}

split_description = [
    {"name": "Introduction", "description": "Opening section"},
    {"name": "Conclusion", "description": "Summary section"}
]

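# The endpoint expects multipart/form-data. Passing (None, value) tuples via
# `files` makes requests send plain form fields as multipart parts.
fields = {
    "docId": (None, "your-document-id"),
    "split_description": (None, json.dumps(split_description))
}

response = requests.post(url, headers=headers, files=fields)
response.raise_for_status()  # raise on 4xx/5xx instead of printing an error body
print(response.json())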

const FormData = require('form-data');
const axios = require('axios');

const splitDescription = [
  { name: "Introduction", description: "Opening section" },
  { name: "Conclusion", description: "Summary section" }
];

const form = new FormData();
form.append('docId', 'your-document-id');
form.append('split_description', JSON.stringify(splitDescription));

axios.post('https://pdf.ai/api/v2/split', form, {
  headers: {
    'X-API-Key': 'YOUR_API_KEY',
    ...form.getHeaders()
  }
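}).then(response => {
  console.log(response.data);
}).catch(error => {
  // Log the API error body when available, otherwise the network error
  console.error(error.response ? error.response.data : error.message);
});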

<?php
$url = "https://pdf.ai/api/v2/split";
$apiKey = "YOUR_API_KEY";

$splitDescription = json_encode([
    ["name" => "Introduction", "description" => "Opening section"],
    ["name" => "Conclusion", "description" => "Summary section"]
]);

$postFields = [
    'docId' => 'your-document-id',
    'split_description' => $splitDescription
];

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $postFields);
curl_setopt($ch, CURLOPT_HTTPHEADER, ["X-API-Key: $apiKey"]);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

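$response = curl_exec($ch);
if ($response === false) {
    // Report transport-level failures instead of echoing an empty body
    echo 'cURL error: ' . curl_error($ch);
} else {
    echo $response;
}
curl_close($ch);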
?>

Replace placeholders such as YOUR_API_KEY and your-document-id with actual values.

Headers

| Name | Type | Description |
| --- | --- | --- |
| X-API-Key* | string | <API-Key> |

Request Format

Content type: multipart/form-data

Request Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| split_description | string (JSON) | Yes | JSON array of section descriptions to find. Each object must have a name and optionally a description. |
| docId | string | | Document ID for caching parsed results. |
| url | string | | URL of the PDF to parse (alternative to file upload). |
| file | File | | PDF file to upload (alternative to URL). |
| quality | string | | Quality to use: 'standard' or 'advanced' (default: 'standard'). |
| lang_list | array | | List of languages to detect (default: ['en']). |

Split Description Format

The split_description parameter must be a JSON array of objects:

[
  {
    "name": "Introduction",
    "description": "Opening section that introduces the topic"
  },
  {
    "name": "Conclusion"
  }
]
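
Because split_description travels as a JSON string inside a form field, it can help to check the structure before serializing it. The following is a minimal sketch; `encode_split_description` is a hypothetical helper, and the string type checks are an assumption based on the example above rather than a documented rule.

```python
import json


def encode_split_description(sections):
    """Validate section objects and return the JSON string for the form field.

    Hypothetical helper; it checks that each item is an object with a name
    and, optionally, a description (both assumed to be strings).
    """
    if not isinstance(sections, list):
        raise ValueError("split_description must be a list of objects")
    for index, section in enumerate(sections):
        if not isinstance(section, dict):
            raise ValueError(f"item {index} is not an object")
        if not isinstance(section.get("name"), str):
            raise ValueError(f"item {index} needs a string 'name'")
        if "description" in section and not isinstance(section["description"], str):
            raise ValueError(f"item {index} has a non-string 'description'")
    return json.dumps(sections)


payload = encode_split_description([
    {"name": "Introduction", "description": "Opening section that introduces the topic"},
    {"name": "Conclusion"},
])
```

The returned string is what goes into the split_description form field.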

Response Format

{
  "success": true,
  "result": {
    "splits": [
      {
        "name": "Introduction",
        "pages": [1, 2, 3],
        "conf": "high"
      },
      {
        "name": "Conclusion",
        "pages": [45, 46],
        "conf": "medium"
      }
    ]
  },
  "docId": "string"
}
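
A response of this shape can be reduced to a mapping from section name to pages, optionally dropping low-confidence matches. The sketch below is illustrative: `pages_by_section` is a hypothetical helper, and only the conf values "high" and "medium" appear on this page, so any other value is treated as unknown and filtered out unless explicitly accepted.

```python
def pages_by_section(response_body, accepted_conf=("high", "medium")):
    """Map each section name to its pages, keeping only accepted confidence levels."""
    if not response_body.get("success"):
        raise RuntimeError("split request did not succeed")
    return {
        split["name"]: split["pages"]
        for split in response_body["result"]["splits"]
        if split["conf"] in accepted_conf
    }


# Sample body copied from the response format shown above.
sample = {
    "success": True,
    "result": {
        "splits": [
            {"name": "Introduction", "pages": [1, 2, 3], "conf": "high"},
            {"name": "Conclusion", "pages": [45, 46], "conf": "medium"},
        ]
    },
    "docId": "string",
}

all_sections = pages_by_section(sample)
high_only = pages_by_section(sample, accepted_conf=("high",))
print(high_only)  # {'Introduction': [1, 2, 3]}
```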

Credit Usage

Before a PDF can be split, the document must be parsed, which incurs parse credits unless a cached parsed result is available. See the parse documentation for parse credit usage.

| Component | Condition | Credit Calculation |
| --- | --- | --- |
| Split Credits | Always charged | 2 credits × page count |

Total Credit Formula

Total credits = Parse credits + Split credits

Examples

10-page document, cached
- Parse Credits: 0 credits (cached)
- Split Credits: 2 × 10 = 20 credits
- Total: 20 credits

5-page document, not cached, advanced quality
- Parse Credits: 2 × 5 = 10 credits
- Split Credits: 2 × 5 = 10 credits
- Total: 20 credits
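
The formula and examples above can be reproduced with a small calculator. This is a sketch under stated assumptions: `total_credits` is a hypothetical helper, and the parse rate is passed in as a parameter because parse pricing is defined in the parse documentation, not on this page. The rate of 2 credits per page is taken from the second example only.

```python
SPLIT_CREDITS_PER_PAGE = 2  # from the Split Credits row: 2 credits × page count


def total_credits(page_count, cached, parse_credits_per_page=0):
    """Total credits = parse credits + split credits.

    parse_credits_per_page is supplied by the caller (assumption: a flat
    per-page parse rate, as in the examples on this page).
    """
    parse = 0 if cached else parse_credits_per_page * page_count
    split = SPLIT_CREDITS_PER_PAGE * page_count
    return parse + split


print(total_credits(10, cached=True))                             # 20
print(total_credits(5, cached=False, parse_credits_per_page=2))   # 20
```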
