Overview

The Batch API enables asynchronous processing of large volumes of requests at reduced costs. It’s ideal for:
  • Bulk content generation
  • Large-scale data analysis
  • Embedding generation for datasets
  • Any task that doesn’t require immediate responses

Endpoints

File Management

POST   /v1/files
GET    /v1/files/{file_id}
GET    /v1/files/{file_id}/content
DELETE /v1/files/{file_id}

Batch Operations

POST   /v1/batches
GET    /v1/batches/{batch_id}
GET    /v1/batches
POST   /v1/batches/{batch_id}/cancel

Authentication

All requests must include an API key in the Authorization header:

Authorization: Bearer <API_KEY>

File Upload

Upload JSONL files containing batch requests.

Request Format

Endpoint: POST /v1/files

Headers:
  • Authorization: Bearer YOUR_API_KEY (required)
  • Content-Type: multipart/form-data (required)

Form Data:

Field     Type     Required  Description
file      file     Yes       JSONL file containing requests (max 100MB)
purpose   string   Yes       Must be "batch"

Response Format

{
  "id": "file-abc123",
  "object": "file",
  "bytes": 2048,
  "created_at": 1699123456,
  "filename": "batch_requests.jsonl",
  "purpose": "batch"
}

JSONL Request Format

Each line in the JSONL file must be a valid JSON object:
{
  "custom_id": "request-1",
  "method": "POST",
  "url": "/v1/chat/completions",
  "body": {
    "model": "gpt-3.5-turbo",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ],
    "temperature": 0.7
  }
}

Request Fields

Field      Type    Required  Description
custom_id  string  Yes       Unique identifier for tracking
method     string  Yes       Must be "POST"
url        string  Yes       Target endpoint (e.g., /v1/chat/completions)
body       object  Yes       Request body for the endpoint
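
The Best Practices section below recommends validating the JSONL file before uploading. Here is a minimal validation sketch against the fields above (the helper name and error messages are illustrative, not part of the API):

import json

REQUIRED_FIELDS = {"custom_id": str, "method": str, "url": str, "body": dict}

def validate_jsonl(path):
    """Report lines that are not valid JSON or are missing required fields."""
    errors = []
    seen_ids = set()
    with open(path) as f:
        for lineno, line in enumerate(f, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                obj = json.loads(line)
            except json.JSONDecodeError as exc:
                errors.append(f"line {lineno}: invalid JSON ({exc})")
                continue
            for field, expected_type in REQUIRED_FIELDS.items():
                if not isinstance(obj.get(field), expected_type):
                    errors.append(f"line {lineno}: missing or invalid '{field}'")
            if obj.get("method") != "POST":
                errors.append(f"line {lineno}: method must be 'POST'")
            if obj.get("custom_id") in seen_ids:
                errors.append(f"line {lineno}: duplicate custom_id")
            seen_ids.add(obj.get("custom_id"))
    return errors

# Usage: print any problems before uploading
for problem in validate_jsonl("batch_requests.jsonl"):
    print(problem)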

Create Batch

Submit a file for batch processing.

Request Format

Endpoint: POST /v1/batches

Request Body:
{
  "input_file_id": "file-abc123",
  "endpoint": "/v1/chat/completions",
  "completion_window": "24h",
  "metadata": {
    "customer_id": "12345",
    "batch_name": "product_descriptions"
  }
}

Parameters

Field              Type    Required  Description
input_file_id      string  Yes       ID of uploaded JSONL file
endpoint           string  Yes       Target API endpoint
completion_window  string  No        Processing window (default: "24h")
metadata           object  No        Key-value pairs (max 16)

Response Format

{
  "id": "batch_abc123",
  "object": "batch",
  "endpoint": "/v1/chat/completions",
  "errors": null,
  "input_file_id": "file-abc123",
  "completion_window": "24h",
  "status": "validating",
  "output_file_id": null,
  "error_file_id": null,
  "created_at": 1699123456,
  "in_progress_at": null,
  "expires_at": 1699209856,
  "finalizing_at": null,
  "completed_at": null,
  "failed_at": null,
  "expired_at": null,
  "cancelling_at": null,
  "cancelled_at": null,
  "request_counts": {
    "total": 0,
    "completed": 0,
    "failed": 0
  },
  "metadata": {
    "customer_id": "12345",
    "batch_name": "product_descriptions"
  }
}

Batch Status

Status Values

Status       Description
validating   Initial validation of batch request
failed       Batch failed to process
in_progress  Currently processing requests
finalizing   Preparing result files
completed    Successfully processed
expired      Exceeded 24-hour window
cancelling   Cancellation in progress
cancelled    Successfully cancelled

Retrieve Batch

Get the current status of a batch.

Endpoint: GET /v1/batches/{batch_id}

Returns the same format as the batch creation response, with updated status and request counts.
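
For long-running batches, a small polling helper keeps this tidy. A minimal sketch (the helper name and backoff policy are illustrative; the terminal statuses come from the table above):

import time
import requests

TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}

def wait_for_batch(base_url, api_key, batch_id, interval=30, max_interval=300):
    """Poll the batch until it reaches a terminal status, backing off gradually."""
    headers = {"Authorization": f"Bearer {api_key}"}
    while True:
        batch = requests.get(f"{base_url}/v1/batches/{batch_id}",
                             headers=headers).json()
        if batch["status"] in TERMINAL_STATUSES:
            return batch
        time.sleep(interval)
        interval = min(interval * 2, max_interval)  # gentle exponential backoff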

List Batches

List all batches with pagination support.

Endpoint: GET /v1/batches

Query Parameters

Parameter  Type     Required  Description
after      string   No        Cursor for pagination
limit      integer  No        Results per page (default: 20, max: 100)

Response Format

{
  "object": "list",
  "data": [
    {
      "id": "batch_abc123",
      "object": "batch",
      "status": "completed",
      ...
    }
  ],
  "first_id": "batch_abc123",
  "last_id": "batch_xyz789",
  "has_more": false
}
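
To retrieve more than one page, pass the current page's last_id as the after cursor until has_more is false. A minimal pagination sketch (the helper name is illustrative; requests is used as in the Python example below):

import requests

def iter_batches(base_url, api_key, limit=100):
    """Yield every batch, following the cursor until has_more is false."""
    headers = {"Authorization": f"Bearer {api_key}"}
    params = {"limit": limit}
    while True:
        page = requests.get(f"{base_url}/v1/batches",
                            headers=headers, params=params).json()
        yield from page["data"]
        if not page.get("has_more"):
            break
        params["after"] = page["last_id"]  # cursor into the next page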

Cancel Batch

Cancel an in-progress batch.

Endpoint: POST /v1/batches/{batch_id}/cancel

Returns the batch object with updated cancellation timestamps.
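
A minimal cancellation sketch (the helper name is illustrative):

import requests

def cancel_batch(base_url, api_key, batch_id):
    """POST to the cancel endpoint; the batch passes through 'cancelling' to 'cancelled'."""
    response = requests.post(f"{base_url}/v1/batches/{batch_id}/cancel",
                             headers={"Authorization": f"Bearer {api_key}"})
    return response.json()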

Output Format

Completed batches produce JSONL output files where each line contains:
{
  "id": "batch_req_123",
  "custom_id": "request-1",
  "response": {
    "status_code": 200,
    "request_id": "req_abc123",
    "body": {
      "id": "chatcmpl-123",
      "object": "chat.completion",
      "created": 1699123456,
      "model": "gpt-3.5-turbo",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "The capital of France is Paris."
          },
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "prompt_tokens": 20,
        "completion_tokens": 8,
        "total_tokens": 28
      }
    }
  },
  "error": null
}
For failed requests:
{
  "id": "batch_req_124",
  "custom_id": "request-2",
  "response": null,
  "error": {
    "code": "invalid_request_error",
    "message": "Model 'invalid-model' not found"
  }
}

Usage Examples

Complete Workflow

# 1. Prepare JSONL file
cat > requests.jsonl << 'EOF'
{"custom_id": "req-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Hello!"}]}}
{"custom_id": "req-2", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "How are you?"}]}}
EOF

# 2. Upload file
FILE_ID=$(curl -X POST http://localhost:3000/v1/files \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F purpose="batch" \
  -F file="@requests.jsonl" \
  | jq -r '.id')

# 3. Create batch
BATCH_ID=$(curl -X POST http://localhost:3000/v1/batches \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input_file_id": "'$FILE_ID'",
    "endpoint": "/v1/chat/completions",
    "completion_window": "24h"
  }' \
  | jq -r '.id')

# 4. Check status
curl http://localhost:3000/v1/batches/$BATCH_ID \
  -H "Authorization: Bearer YOUR_API_KEY" \
  | jq '.status'

# 5. Download results when completed
OUTPUT_FILE_ID=$(curl http://localhost:3000/v1/batches/$BATCH_ID \
  -H "Authorization: Bearer YOUR_API_KEY" \
  | jq -r '.output_file_id')

curl http://localhost:3000/v1/files/$OUTPUT_FILE_ID/content \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -o results.jsonl

Python Example

import requests
import json
import time

API_KEY = "YOUR_API_KEY"
BASE_URL = "http://localhost:3000"
headers = {"Authorization": f"Bearer {API_KEY}"}

# 1. Prepare requests
requests_data = [
    {
        "custom_id": f"req-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-3.5-turbo",
            "messages": [
                {"role": "user", "content": f"Tell me fact #{i} about Python"}
            ]
        }
    }
    for i in range(1, 101)
]

# 2. Write JSONL file
with open("batch_requests.jsonl", "w") as f:
    for req in requests_data:
        f.write(json.dumps(req) + "\n")

# 3. Upload file
with open("batch_requests.jsonl", "rb") as f:
    files = {"file": f}
    data = {"purpose": "batch"}
    response = requests.post(
        f"{BASE_URL}/v1/files",
        headers=headers,
        files=files,
        data=data
    )
    file_id = response.json()["id"]

# 4. Create batch
batch_data = {
    "input_file_id": file_id,
    "endpoint": "/v1/chat/completions",
    "completion_window": "24h",
    "metadata": {
        "project": "python_facts",
        "version": "1.0"
    }
}

response = requests.post(
    f"{BASE_URL}/v1/batches",
    headers={**headers, "Content-Type": "application/json"},
    json=batch_data
)
batch_id = response.json()["id"]

# 5. Poll for completion
while True:
    response = requests.get(f"{BASE_URL}/v1/batches/{batch_id}", headers=headers)
    batch = response.json()

    print(f"Status: {batch['status']}")
    print(f"Progress: {batch['request_counts']['completed']}/{batch['request_counts']['total']}")

    if batch["status"] in ["completed", "failed", "expired", "cancelled"]:
        break

    time.sleep(30)  # Poll every 30 seconds

# 6. Download results
if batch["status"] == "completed":
    output_file_id = batch["output_file_id"]
    response = requests.get(
        f"{BASE_URL}/v1/files/{output_file_id}/content",
        headers=headers
    )

    # Process results
    for line in response.text.strip().split("\n"):
        result = json.loads(line)
        custom_id = result["custom_id"]
        if result["response"]:
            content = result["response"]["body"]["choices"][0]["message"]["content"]
            print(f"{custom_id}: {content[:100]}...")
        else:
            print(f"{custom_id}: Error - {result['error']['message']}")

# 7. Clean up
requests.delete(f"{BASE_URL}/v1/files/{file_id}", headers=headers)
if batch.get("output_file_id"):
    requests.delete(f"{BASE_URL}/v1/files/{batch['output_file_id']}", headers=headers)

JavaScript Example

const fs = require('fs').promises;
const fetch = require('node-fetch');
const FormData = require('form-data');

const API_KEY = 'YOUR_API_KEY';
const BASE_URL = 'http://localhost:3000';

async function processBatch() {
  // 1. Prepare requests
  const requests = Array.from({ length: 50 }, (_, i) => ({
    custom_id: `req-${i + 1}`,
    method: 'POST',
    url: '/v1/chat/completions',
    body: {
      model: 'gpt-3.5-turbo',
      messages: [
        { role: 'user', content: `Generate a product description for item #${i + 1}` }
      ]
    }
  }));

  // 2. Write JSONL file
  const jsonlContent = requests.map(req => JSON.stringify(req)).join('\n');
  await fs.writeFile('batch_requests.jsonl', jsonlContent);

  // 3. Upload file
  const formData = new FormData();
  formData.append('purpose', 'batch');
  formData.append('file', await fs.readFile('batch_requests.jsonl'), 'batch_requests.jsonl');

  const uploadResponse = await fetch(`${BASE_URL}/v1/files`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${API_KEY}`,
      ...formData.getHeaders()
    },
    body: formData
  });

  const { id: fileId } = await uploadResponse.json();

  // 4. Create batch
  const batchResponse = await fetch(`${BASE_URL}/v1/batches`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      input_file_id: fileId,
      endpoint: '/v1/chat/completions',
      completion_window: '24h'
    })
  });

  const { id: batchId } = await batchResponse.json();

  // 5. Poll for completion
  let batch;
  do {
    await new Promise(resolve => setTimeout(resolve, 30000)); // Wait 30 seconds

    const statusResponse = await fetch(`${BASE_URL}/v1/batches/${batchId}`, {
      headers: { 'Authorization': `Bearer ${API_KEY}` }
    });

    batch = await statusResponse.json();
    console.log(`Status: ${batch.status}, Progress: ${batch.request_counts.completed}/${batch.request_counts.total}`);
  } while (!['completed', 'failed', 'expired', 'cancelled'].includes(batch.status));

  // 6. Process results
  if (batch.status === 'completed') {
    const resultsResponse = await fetch(
      `${BASE_URL}/v1/files/${batch.output_file_id}/content`,
      { headers: { 'Authorization': `Bearer ${API_KEY}` } }
    );

    const results = await resultsResponse.text();
    const lines = results.trim().split('\n');

    for (const line of lines) {
      const result = JSON.parse(line);
      console.log(`${result.custom_id}: ${result.response ? 'Success' : 'Failed'}`);
    }
  }
}

processBatch().catch(console.error);

Best Practices

  • Batch Size: Keep batches under 50,000 requests for optimal processing
  • File Size: Ensure JSONL files are under 100MB
  • Custom IDs: Use meaningful identifiers for easy tracking
  • Validation: Validate JSONL format before uploading
  • Polling: Check status every 30-60 seconds to avoid rate limits
  • Error Handling: Process partial failures gracefully
  • Cleanup: Delete processed files to manage storage
  • Metadata: Use metadata fields for additional context and filtering

Limitations

  • Maximum requests per batch: 50,000 (to split larger jobs, see the chunking sketch after this list)
  • Maximum file size: 100MB
  • Completion window: 24 hours
  • Metadata entries: Maximum 16 key-value pairs
  • Supported endpoints: /v1/chat/completions, /v1/embeddings, and other compatible endpoints
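
Jobs larger than these limits can be split into several batches. A chunking sketch that respects the stated 50,000-request and 100MB limits (the helper name is illustrative):

import json

MAX_REQUESTS = 50_000
MAX_BYTES = 100 * 1024 * 1024  # 100MB file limit

def chunk_requests(requests_data):
    """Split requests into groups that each fit in one batch file."""
    chunk, size = [], 0
    for req in requests_data:
        line_bytes = len((json.dumps(req) + "\n").encode("utf-8"))
        if chunk and (len(chunk) >= MAX_REQUESTS or size + line_bytes > MAX_BYTES):
            yield chunk
            chunk, size = [], 0
        chunk.append(req)
        size += line_bytes
    if chunk:
        yield chunk

# Each chunk can then be written to its own JSONL file and submitted as a separate batch.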

Error Handling

The batch API automatically retries transient failures. Failed requests are included in the output file with error details, allowing you to:
  • Identify specific failures
  • Retry failed requests (a retry-file sketch follows this list)
  • Adjust parameters based on error messages
  • Track success rates
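
Because error lines keep the original custom_id, a retry file can be rebuilt by joining failures back to the input. A minimal sketch (the file names and helper are illustrative):

import json

def build_retry_file(input_path, results_path, retry_path):
    """Write a JSONL file containing only the requests that failed."""
    failed_ids = set()
    with open(results_path) as results:
        for line in results:
            if line.strip():
                result = json.loads(line)
                if result.get("error"):
                    failed_ids.add(result["custom_id"])
    with open(input_path) as src, open(retry_path, "w") as dst:
        for line in src:
            if line.strip() and json.loads(line)["custom_id"] in failed_ids:
                dst.write(line)
    return len(failed_ids)

# e.g. build_retry_file("batch_requests.jsonl", "results.jsonl", "retry.jsonl")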

Supported Providers

OpenAI

Native batch API support with cost savings for large-scale processing.

Anthropic

Batch processing for Claude models with automatic retry handling.

Together.AI

Efficient batch inference for open-source models at scale.