Document Processing

Endpoint

POST /v1/documents

Authentication

Authorization: Bearer <API_KEY>

Request Body

Required Fields

Field	Type	Description
`model`	string	Model identifier for document processing (e.g., `buddoc-v1`, `pixtral-12b`)
`document`	object	Document input specification with type and URL

Document Object Format

{
  "type": "document_url" | "image_url",
  "document_url": "string",  // Required when type is "document_url"
  "image_url": "string"       // Required when type is "image_url"
}

Optional Fields

Field	Type	Default	Description
`prompt`	string	null	Optional prompt to guide extraction or ask specific document questions
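The request body described above can be assembled with a small helper. The following is a minimal sketch, not part of the API: `build_document` and `build_request` are hypothetical names, and the check that the URL is an absolute http(s) URL is an assumption rather than a documented rule.

```python
from typing import Optional
from urllib.parse import urlparse

VALID_TYPES = ("document_url", "image_url")


def build_document(doc_type: str, url: str) -> dict:
    """Build the `document` object; the URL key is named after the type."""
    if doc_type not in VALID_TYPES:
        raise ValueError(f"type must be one of {VALID_TYPES}, got {doc_type!r}")
    parsed = urlparse(url)
    # Assumption: only absolute http(s) URLs are accepted by the service.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"expected an absolute http(s) URL, got {url!r}")
    return {"type": doc_type, doc_type: url}


def build_request(model: str, doc_type: str, url: str,
                  prompt: Optional[str] = None) -> dict:
    """Build the full request body; `prompt` is left out when not given."""
    body = {"model": model, "document": build_document(doc_type, url)}
    if prompt is not None:
        body["prompt"] = prompt
    return body
```

Calling `build_request("buddoc-v1", "document_url", "https://example.com/document.pdf")` yields the same body as the basic PDF extraction request in this reference.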

Example Requests

Basic PDF Extraction

{
  "model": "buddoc-v1",
  "document": {
    "type": "document_url",
    "document_url": "https://example.com/document.pdf"
  }
}

Image OCR with Guided Extraction

{
  "model": "buddoc-v1",
  "document": {
    "type": "image_url",
    "image_url": "https://example.com/invoice.png"
  },
  "prompt": "Extract the invoice number, date, total amount, and line items as a structured JSON"
}

Multi-page Document Analysis

{
  "model": "buddoc-v1",
  "document": {
    "type": "document_url",
    "document_url": "https://example.com/report.pdf"
  },
  "prompt": "Summarize the key findings and extract all data tables"
}

Response Format

Success Response (200 OK)

{
  "id": "doc_123e4567-e89b-12d3-a456-426614174000",
  "object": "document",
  "created": 1699536000,
  "model": "buddoc-v1",
  "document_id": "550e8400-e29b-41d4-a716-446655440000",
  "pages": [
    {
      "page_number": 1,
      "markdown": "# Document Title\n\nExtracted content from page 1..."
    },
    {
      "page_number": 2,
      "markdown": "## Section 2\n\nContent from page 2 with tables..."
    }
  ],
  "usage_info": {
    "pages_processed": 2,
    "size_bytes": 245678,
    "filename": "document.pdf"
  }
}

Response Fields

Field	Type	Description
`id`	string	Unique identifier for the processing request
`object`	string	Always `"document"`
`created`	integer	Unix timestamp of processing completion
`model`	string	Model used for processing
`document_id`	string	Unique identifier for the processed document
`pages`	array	Array of extracted page content
`pages[].page_number`	integer	Page number (1-indexed)
`pages[].markdown`	string	Extracted content in markdown format
`usage_info`	object	Document processing metadata
`usage_info.pages_processed`	integer	Number of pages processed
`usage_info.size_bytes`	integer	Document size in bytes
`usage_info.filename`	string	Name of the processed file
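To work with these fields in typed form, the response can be mapped onto small data classes. This is a sketch under the assumption that every field listed above is present in a successful response; `Page`, `DocumentResult`, and `parse_response` are illustrative names, not part of any client library.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Page:
    page_number: int
    markdown: str


@dataclass
class DocumentResult:
    id: str
    model: str
    document_id: str
    pages: List[Page]
    pages_processed: int
    size_bytes: int
    filename: str

    def full_markdown(self) -> str:
        """Join all pages in page order, separated by blank lines."""
        ordered = sorted(self.pages, key=lambda p: p.page_number)
        return "\n\n".join(p.markdown for p in ordered)


def parse_response(data: dict) -> DocumentResult:
    """Convert the decoded JSON body of a 200 response into a DocumentResult."""
    usage = data["usage_info"]
    return DocumentResult(
        id=data["id"],
        model=data["model"],
        document_id=data["document_id"],
        pages=[Page(p["page_number"], p["markdown"]) for p in data["pages"]],
        pages_processed=usage["pages_processed"],
        size_bytes=usage["size_bytes"],
        filename=usage["filename"],
    )
```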

Supported Document Types

Document URLs

PDF - Portable Document Format files
DOCX - Microsoft Word documents (provider-dependent)
PPTX - PowerPoint presentations (provider-dependent)
TXT - Plain text files

Image URLs

PNG - Portable Network Graphics
JPEG/JPG - Joint Photographic Experts Group
GIF - Graphics Interchange Format
BMP - Bitmap images
WEBP - Web Picture format
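A client may want to pick the `type` value from the file extension in the URL. The sketch below assumes the extension reflects the actual content, which is not guaranteed, and `infer_document_type` is a hypothetical helper. DOCX and PPTX are included even though their support is provider-dependent.

```python
import posixpath
from urllib.parse import urlparse

# Extensions taken from the lists above.
DOCUMENT_EXTENSIONS = {".pdf", ".docx", ".pptx", ".txt"}
IMAGE_EXTENSIONS = {".png", ".jpeg", ".jpg", ".gif", ".bmp", ".webp"}


def infer_document_type(url: str) -> str:
    """Return "document_url" or "image_url" based on the URL path extension."""
    path = urlparse(url).path  # drops any query string or fragment
    extension = posixpath.splitext(path)[1].lower()
    if extension in DOCUMENT_EXTENSIONS:
        return "document_url"
    if extension in IMAGE_EXTENSIONS:
        return "image_url"
    raise ValueError(f"cannot infer document type from extension {extension!r}")
```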

Markdown Output Structure

Each page's `markdown` field preserves the following structural elements of the source document:

Headers

Document sections are preserved with appropriate heading levels (#, ##, ###).

Tables

Extracted as markdown tables:

| Column 1 | Column 2 | Column 3 |
|----------|----------|----------|
| Data 1   | Data 2   | Data 3   |
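Because tables arrive as markdown text, a client that needs row data has to parse them. The following is a minimal sketch, assuming tables follow the header, separator, rows layout shown above and that cells contain no escaped pipe characters; `parse_markdown_tables` is a hypothetical helper.

```python
import re
from typing import Dict, List

_SEPARATOR_CELL = re.compile(r"^:?-+:?$")


def _split_row(line: str) -> List[str]:
    """Split "| a | b |" into ["a", "b"]."""
    return [cell.strip() for cell in line[1:-1].split("|")]


def parse_markdown_tables(markdown: str) -> List[List[Dict[str, str]]]:
    """Return one list of row dictionaries per table found in the text."""
    blocks, current = [], []
    for line in markdown.splitlines():
        stripped = line.strip()
        if len(stripped) >= 2 and stripped.startswith("|") and stripped.endswith("|"):
            current.append(_split_row(stripped))
        elif current:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)

    tables = []
    for rows in blocks:
        # A table needs a header row followed by a separator row of dashes.
        if len(rows) < 2 or not all(_SEPARATOR_CELL.match(c) for c in rows[1]):
            continue
        header = rows[0]
        tables.append([dict(zip(header, row)) for row in rows[2:]])
    return tables
```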

Lists

Both bullet points and numbered lists are maintained:

- Item 1
- Item 2
  - Nested item

1. First item
2. Second item

Formatting

Bold (**text**), italic (*text*), and inline code (`code`) formatting is preserved.

Links

URLs and references are extracted as markdown links: [text](url)

Model Configuration

A model can serve this endpoint only if its configuration lists `document` in `endpoints`:

[models."buddoc-v1"]
routing = ["buddoc"]
endpoints = ["document"]

[models."buddoc-v1".providers.buddoc]
type = "buddoc"
api_base = "http://buddoc-service:8000"
api_key_location = { env = "BUDDOC_API_KEY" }

Error Handling

Common Error Responses

{
  "error": {
    "type": "invalid_request_error",
    "message": "Model 'gpt-4' does not support document processing"
  }
}

Error Types

Error Type	Description	Common Causes
`invalid_request_error`	Invalid request parameters	Missing required fields, wrong model
`authentication_error`	Authentication failed	Invalid or missing API key
`not_found_error`	Document cannot be accessed	Invalid URL, network issues
`processing_error`	Document processing failed	Corrupted file, unsupported format
`size_limit_error`	Document exceeds limits	File > 100MB, too many pages
`timeout_error`	Processing timeout	Large documents, slow network
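A client can turn the error body into an exception and decide whether a retry is worthwhile. The sketch below is illustrative only: `DocumentAPIError` and `error_from_body` are hypothetical names, and treating only `timeout_error` as retryable is an assumption, since the reference does not state which errors are transient.

```python
# Assumption: only timeouts are worth retrying without changing the request.
RETRYABLE_TYPES = {"timeout_error"}


class DocumentAPIError(Exception):
    """Exception carrying the `type` and `message` of an error response."""

    def __init__(self, error_type: str, message: str):
        super().__init__(f"{error_type}: {message}")
        self.error_type = error_type
        self.message = message

    @property
    def retryable(self) -> bool:
        return self.error_type in RETRYABLE_TYPES


def error_from_body(body: dict) -> DocumentAPIError:
    """Build an exception from a decoded error response body."""
    error = body.get("error") or {}
    return DocumentAPIError(
        error.get("type", "unknown_error"),
        error.get("message", ""),
    )
```

With this in place, a caller could raise `error_from_body(response.json())` for non-200 responses and check `retryable` before trying again.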

Use Cases

Invoice Processing

Extract structured data from invoices:

curl -X POST https://api.example.com/v1/documents \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "buddoc-v1",
    "document": {
      "type": "image_url",
      "image_url": "https://example.com/invoice.png"
    },
    "prompt": "Extract: invoice_number, date, vendor, line_items[], total_amount"
  }'

Contract Analysis

Analyze legal documents:

curl -X POST https://api.example.com/v1/documents \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "buddoc-v1",
    "document": {
      "type": "document_url",
      "document_url": "https://example.com/contract.pdf"
    },
    "prompt": "Extract key terms, obligations, dates, and potential risks"
  }'

Research Paper Summarization

Extract insights from academic papers:

curl -X POST https://api.example.com/v1/documents \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "buddoc-v1",
    "document": {
      "type": "document_url",
      "document_url": "https://arxiv.org/pdf/2301.00001.pdf"
    },
    "prompt": "Summarize the abstract, methodology, key findings, and conclusions"
  }'

Code Examples

Python

import requests

class DocumentProcessor:
    def __init__(self, api_key, base_url="https://api.example.com"):
        self.api_key = api_key
        self.base_url = base_url

    def process_document(self, document_url, doc_type="document_url",
                        model="buddoc-v1", prompt=None):
        """Process a document and extract structured content."""

        url = f"{self.base_url}/v1/documents"
        headers = {
            "Authorization": f"Bearer {self.api_key}",
            "Content-Type": "application/json"
        }

        payload = {
            "model": model,
            "document": {
                "type": doc_type,
                doc_type: document_url  # the URL key is named after the type
            }
        }

        if prompt:
            payload["prompt"] = prompt

        # Without a timeout, requests can wait indefinitely; 60 s is an illustrative value
        response = requests.post(url, headers=headers, json=payload, timeout=60)
        response.raise_for_status()

        return response.json()

    def extract_tables(self, document_url):
        """Extract all tables from a document."""
        return self.process_document(
            document_url,
            prompt="Extract all tables as structured data"
        )

# Usage
processor = DocumentProcessor(api_key="your-api-key")

# Process invoice
invoice_data = processor.process_document(
    "https://example.com/invoice.pdf",
    prompt="Extract invoice details as JSON"
)

# Extract content page by page
for page in invoice_data["pages"]:
    print(f"Page {page['page_number']}:")
    print(page["markdown"])

TypeScript/Node.js

import axios from 'axios';

interface DocumentInput {
  type: 'document_url' | 'image_url';
  document_url?: string;
  image_url?: string;
}

interface PageResult {
  page_number: number;
  markdown: string;
}

interface DocumentResponse {
  id: string;
  object: string;
  created: number;
  model: string;
  document_id: string;
  pages: PageResult[];
  usage_info: {
    pages_processed: number;
    size_bytes: number;
    filename: string;
  };
}

class DocumentProcessor {
  private apiKey: string;
  private baseUrl: string;

  constructor(apiKey: string, baseUrl = 'https://api.example.com') {
    this.apiKey = apiKey;
    this.baseUrl = baseUrl;
  }

  async processDocument(
    documentUrl: string,
    docType: 'document_url' | 'image_url' = 'document_url',
    model = 'buddoc-v1',
    prompt?: string
  ): Promise<DocumentResponse> {
    const url = `${this.baseUrl}/v1/documents`;

    const payload = {
      model,
      document: {
        type: docType,
        [docType]: documentUrl
      } as DocumentInput,
      ...(prompt ? { prompt } : {})
    };

    try {
      const response = await axios.post<DocumentResponse>(url, payload, {
        headers: {
          'Authorization': `Bearer ${this.apiKey}`,
          'Content-Type': 'application/json'
        }
      });

      return response.data;
    } catch (error) {
      if (axios.isAxiosError(error)) {
        throw new Error(`Document processing failed: ${error.response?.data?.error?.message || error.message}`);
      }
      throw error;
    }
  }

  async extractStructuredData<T = any>(
    documentUrl: string,
    schema: string
  ): Promise<T> {
    const result = await this.processDocument(
      documentUrl,
      'document_url',
      'buddoc-v1',
      `Extract data according to this JSON schema: ${schema}`
    );

    // Parse structured data from markdown
    const combinedMarkdown = result.pages
      .map(p => p.markdown)
      .join('\n\n');

    // Extract JSON from markdown (assuming it's in a code block)
    const jsonMatch = combinedMarkdown.match(/```json\n([\s\S]*?)\n```/);
    if (jsonMatch) {
      return JSON.parse(jsonMatch[1]) as T;
    }

    throw new Error('No structured data found in document');
  }
}

// Usage
const processor = new DocumentProcessor(process.env.API_KEY!);

// Process document
async function analyzeDocument() {
  try {
    const result = await processor.processDocument(
      'https://example.com/report.pdf',
      'document_url',
      'buddoc-v1',
      'Summarize key findings'
    );

    console.log(`Processed ${result.usage_info.pages_processed} pages`);

    result.pages.forEach(page => {
      console.log(`Page ${page.page_number}:`);
      console.log(page.markdown);
    });
  } catch (error) {
    console.error('Error:', error);
  }
}

analyzeDocument();

Limitations

Limitation	Details	Recommendation
File Size	Maximum 100MB per document	Split large documents
Page Count	Optimal for < 100 pages	Process in batches for large docs
Processing Time	Varies with document complexity	Implement appropriate timeouts
Format Support	Provider-dependent	Check provider documentation
Language Support	Best for English, varies by model	Use specialized models for other languages
Table Complexity	Complex nested tables may lose structure	Post-process for complex layouts
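The size and page-count guidance can be checked on the client before a document is submitted. This is a rough sketch with stated assumptions: it reads "100MB" as 100 MiB, it takes the size and page count as inputs the caller already knows, and `check_limits` is a hypothetical helper.

```python
from typing import List, Optional

MAX_SIZE_BYTES = 100 * 1024 * 1024  # assumption: "100MB" means 100 MiB
RECOMMENDED_MAX_PAGES = 100         # documents below this count are "optimal"


def check_limits(size_bytes: int, page_count: Optional[int] = None) -> List[str]:
    """Return human-readable warnings; an empty list means no limit is hit."""
    warnings = []
    if size_bytes > MAX_SIZE_BYTES:
        warnings.append("file exceeds the 100MB limit; split the document")
    if page_count is not None and page_count >= RECOMMENDED_MAX_PAGES:
        warnings.append("100 or more pages; consider processing in batches")
    return warnings
```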

Related Endpoints

/v1/chat/completions - For conversational AI about document content
/v1/embeddings - Create vector embeddings of document content
/v1/images/generations - Generate images from extracted text
