Overview

The Responses API provides a next-generation interface for complex AI interactions, supporting:
  • Prompt-based execution: Execute versioned prompt templates with variable substitution
  • MCP tool integration: Access Model Context Protocol tools for extended functionality
  • Structured outputs: JSON schema-validated responses for reliable data extraction
  • Array-based outputs: Multiple output types (messages, tool calls, reasoning, MCP tool lists)
  • Multi-turn conversations with context preservation
  • Parallel tool/function calling
  • Multimodal inputs (text, image, audio)
  • Reasoning model capabilities
  • Streaming responses

Endpoints

POST   /v1/responses
GET    /v1/responses/{response_id}
DELETE /v1/responses/{response_id}
POST   /v1/responses/{response_id}/cancel
GET    /v1/responses/{response_id}/input_items

Authentication

Authorization: Bearer <API_KEY>

Create Response

Generate AI responses with advanced conversational features.

Request Format

Endpoint: POST /v1/responses

Headers:
  • Authorization: Bearer YOUR_API_KEY (required)
  • Content-Type: application/json (required)

Request Body:
{
  "model": "gpt-4o",
  "input": "Explain quantum computing",
  "previous_response_id": "resp_abc123",
  "prompt": {
    "id": "prompt_quantum_explanation",
    "variables": {
      "topic": "quantum computing",
      "difficulty": "beginner"
    },
    "version": "1"
  },
  "instructions": "You are a helpful physics tutor",
  "modalities": ["text"],
  "reasoning": true,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "calculate_quantum_state",
        "description": "Calculate quantum state probabilities",
        "parameters": {
          "type": "object",
          "properties": {
            "qubits": {"type": "integer"},
            "state": {"type": "string"}
          },
          "required": ["qubits", "state"]
        }
      }
    }
  ],
  "tool_choice": "auto",
  "temperature": 0.7,
  "max_tokens": 1500,
  "stream": false,
  "metadata": {
    "user_id": "user123",
    "session": "quantum_tutorial"
  }
}

Parameters

Field                  Type           Required  Description
model                  string         No        Model identifier
prompt                 object         No        Prompt template parameters
input                  string/array   No        Text or multimodal content
previous_response_id   string         No        ID for conversation continuity
instructions           string         No        System instructions
modalities             array          No        Output types: ["text"], ["text", "audio"]
reasoning              boolean        No        Enable reasoning/thinking mode
tools                  array          No        Available functions/tools
tool_choice            string/object  No        Tool selection: auto, none, required
temperature            float          No        Sampling temperature (0.0 to 2.0)
max_tokens             integer        No        Maximum output tokens
stream                 boolean        No        Enable streaming response
metadata               object         No        Custom metadata

Prompt Input Format

{
  "prompt": {
    "id": "prompt_name",
    "version": "1",
    "variables": {
      "variable_1": "Value 1",
      "variable_2": "Value 2"
    }
  },
  "input": "Unstructured input text related to the prompt."
}

Multimodal Input Format

{
  "input": [
    {
      "type": "text",
      "text": "What's in this image?"
    },
    {
      "type": "image_url",
      "image_url": {
        "url": "data:image/jpeg;base64,..."
      }
    }
  ]
}

Response Format

The response contains an array-based output field with multiple item types:
{
  "id": "resp_abc123",
  "object": "response",
  "created": 1699123456,
  "model": "gpt-4o",
  "status": "completed",
  "output": [
    {
      "id": "msg_xyz",
      "type": "message",
      "status": "completed",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "Quantum computing uses quantum mechanical phenomena...",
          "annotations": []
        }
      ]
    }
  ],
  "instructions": [
    {
      "type": "message",
      "role": "system",
      "status": "completed",
      "content": [
        {
          "type": "input_text",
          "text": "You are a helpful physics tutor"
        }
      ]
    },
    {
      "type": "message",
      "role": "user",
      "status": "completed",
      "content": [
        {
          "type": "input_text",
          "text": "Explain quantum computing"
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 25,
    "output_tokens": 150,
    "total_tokens": 175,
    "input_tokens_details": {
      "cached_tokens": 0
    },
    "output_tokens_details": {
      "reasoning_tokens": 0
    }
  },
  "parallel_tool_calls": true,
  "tool_choice": "auto",
  "tools": [],
  "temperature": 0.7,
  "top_p": 0.9,
  "max_output_tokens": 1500,
  "background": false,
  "reasoning": {},
  "text": {
    "format": {
      "type": "text"
    }
  }
}
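
For example, the assistant text can be pulled out of the array-based output by filtering for message items and their output_text parts. A minimal sketch using requests, assuming a local deployment at http://localhost:3000 as in the usage examples later in this document:

import requests

# Placeholder base URL and API key; substitute your own deployment values
resp = requests.post(
    "http://localhost:3000/v1/responses",
    headers={"Authorization": "Bearer YOUR_API_KEY", "Content-Type": "application/json"},
    json={"model": "gpt-4o", "input": "Explain quantum computing"},
)
response = resp.json()

# Concatenate output_text parts from message items in the output array
text = "".join(
    part["text"]
    for item in response["output"]
    if item["type"] == "message"
    for part in item["content"]
    if part["type"] == "output_text"
)
print(text)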

Output Item Types

The output array can contain multiple types of items:

Text Messages

{
  "id": "msg_abc",
  "type": "message",
  "status": "completed",
  "role": "assistant",
  "content": [
    {
      "type": "output_text",
      "text": "Response content...",
      "annotations": [],
      "logprobs": []
    }
  ]
}

MCP Tool Lists

{
  "id": "mcpl_def",
  "type": "mcp_list_tools",
  "server_label": "filesystem",
  "tools": [
    {
      "name": "read_file",
      "description": "Read file contents",
      "input_schema": {
        "type": "object",
        "properties": {
          "path": {"type": "string"}
        }
      }
    }
  ],
  "error": null
}

MCP Tool Calls

{
  "id": "call_123",
  "type": "mcp_call",
  "status": "completed",
  "name": "read_file",
  "server_label": "filesystem",
  "arguments": "{\"path\":\"/data/file.txt\"}",
  "output": "File contents here...",
  "error": null
}

Function Tool Calls

{
  "type": "function_call",
  "call_id": "call_456",
  "name": "get_weather",
  "arguments": "{\"location\":\"Paris\"}",
  "id": "fc_789"
}

Reasoning Items

{
  "id": "rs_abc",
  "type": "reasoning",
  "status": "completed",
  "summary": [
    {
      "type": "summary_text",
      "text": "Let me think through this step by step..."
    }
  ]
}
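
Because the item type determines which fields are present, consumers typically dispatch on type when walking the output array. A small sketch of such a dispatcher, assuming the item shapes documented above:

import json

def handle_output_items(output):
    """Dispatch on the documented output item types."""
    for item in output:
        kind = item["type"]
        if kind == "message":
            for part in item["content"]:
                if part["type"] == "output_text":
                    print("assistant:", part["text"])
        elif kind == "function_call":
            print("function call:", item["name"], json.loads(item["arguments"]))
        elif kind == "mcp_call":
            print("mcp call:", item["name"], "->", item.get("output"), item.get("error"))
        elif kind == "mcp_list_tools":
            print("mcp tools:", [tool["name"] for tool in item["tools"]])
        elif kind == "reasoning":
            for part in item["summary"]:
                print("reasoning:", part["text"])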

Streaming Response Format

When streaming is enabled, responses are returned as Server-Sent Events (SSE) with the following format:
event: {event_type}
data: {json_payload}

Event Lifecycle

1. Initial Events
event: response.created
data: {"type":"response.created","sequence_number":0,"response":{"id":"resp_abc","status":"in_progress","created_at":1699123456}}

event: response.in_progress
data: {"type":"response.in_progress","sequence_number":1,"response":{"id":"resp_abc","status":"in_progress"}}

2. MCP Tool List Events (if MCP tools are configured)
event: response.output_item.added
data: {"type":"response.output_item.added","sequence_number":2,"output_index":0,"item":{"id":"mcpl_xyz","type":"mcp_list_tools","server_label":"filesystem","tools":[]}}

event: response.mcp_list_tools.in_progress
data: {"type":"response.mcp_list_tools.in_progress","sequence_number":3,"output_index":0,"item_id":"mcpl_xyz"}

event: response.mcp_list_tools.completed
data: {"type":"response.mcp_list_tools.completed","sequence_number":4,"output_index":0,"item_id":"mcpl_xyz"}

event: response.output_item.done
data: {"type":"response.output_item.done","sequence_number":5,"output_index":0,"item":{"id":"mcpl_xyz","type":"mcp_list_tools","server_label":"filesystem","tools":[{"name":"read_file","description":"Read file"}]}}

3. Text Output Events
event: response.output_item.added
data: {"type":"response.output_item.added","sequence_number":6,"output_index":1,"item":{"id":"msg_abc","type":"message","status":"in_progress","role":"assistant","content":[]}}

event: response.content_part.added
data: {"type":"response.content_part.added","sequence_number":7,"item_id":"msg_abc","output_index":1,"content_index":0,"part":{"type":"output_text","text":"","annotations":[]}}

event: response.output_text.delta
data: {"type":"response.output_text.delta","sequence_number":8,"item_id":"msg_abc","output_index":1,"content_index":0,"delta":"Quantum","logprobs":[]}

event: response.output_text.delta
data: {"type":"response.output_text.delta","sequence_number":9,"item_id":"msg_abc","output_index":1,"content_index":0,"delta":" computing","logprobs":[]}

event: response.output_text.done
data: {"type":"response.output_text.done","sequence_number":10,"item_id":"msg_abc","output_index":1,"content_index":0,"text":"Quantum computing uses...","logprobs":[]}

event: response.content_part.done
data: {"type":"response.content_part.done","sequence_number":11,"item_id":"msg_abc","output_index":1,"content_index":0,"part":{"type":"output_text","text":"Quantum computing uses...","annotations":[]}}

event: response.output_item.done
data: {"type":"response.output_item.done","sequence_number":12,"output_index":1,"item":{"id":"msg_abc","type":"message","status":"completed","role":"assistant","content":[{"type":"output_text","text":"Quantum computing uses..."}]}}

4. Reasoning Events (for thinking/reasoning models)
event: response.output_item.added
data: {"type":"response.output_item.added","sequence_number":13,"output_index":0,"item":{"id":"rs_xyz","type":"reasoning","status":"in_progress","summary":[]}}

event: response.reasoning_summary_part.added
data: {"type":"response.reasoning_summary_part.added","sequence_number":14,"item_id":"rs_xyz","output_index":0,"summary_index":0,"part":{"type":"summary_text","text":""}}

event: response.reasoning_summary_text.delta
data: {"type":"response.reasoning_summary_text.delta","sequence_number":15,"item_id":"rs_xyz","output_index":0,"summary_index":0,"delta":"Let me think..."}

event: response.reasoning_summary_text.done
data: {"type":"response.reasoning_summary_text.done","sequence_number":16,"item_id":"rs_xyz","output_index":0,"summary_index":0,"text":"Let me think through this..."}

event: response.reasoning_summary_part.done
data: {"type":"response.reasoning_summary_part.done","sequence_number":17,"item_id":"rs_xyz","output_index":0,"summary_index":0,"part":{"type":"summary_text","text":"Let me think through this..."}}

event: response.output_item.done
data: {"type":"response.output_item.done","sequence_number":18,"output_index":0,"item":{"id":"rs_xyz","type":"reasoning","status":"completed","summary":[{"type":"summary_text","text":"Let me think through this..."}]}}

5. MCP Tool Call Events
event: response.output_item.added
data: {"type":"response.output_item.added","sequence_number":19,"output_index":2,"item":{"id":"call_123","type":"mcp_call","status":"in_progress","name":"read_file","server_label":"filesystem","arguments":""}}

event: response.mcp_call.in_progress
data: {"type":"response.mcp_call.in_progress","sequence_number":20,"output_index":2,"item_id":"call_123"}

event: response.mcp_call_arguments.delta
data: {"type":"response.mcp_call_arguments.delta","sequence_number":21,"output_index":2,"item_id":"call_123","delta":"{\"path\"}"}

event: response.mcp_call_arguments.done
data: {"type":"response.mcp_call_arguments.done","sequence_number":22,"output_index":2,"item_id":"call_123","arguments":"{\"path\":\"/file.txt\"}"}

event: response.mcp_call.completed
data: {"type":"response.mcp_call.completed","sequence_number":23,"output_index":2,"item_id":"call_123"}

event: response.output_item.done
data: {"type":"response.output_item.done","sequence_number":24,"output_index":2,"item":{"id":"call_123","type":"mcp_call","status":"completed","name":"read_file","server_label":"filesystem","arguments":"{\"path\":\"/file.txt\"}","output":"file contents..."}}

6. Function Tool Call Events
event: response.output_item.added
data: {"type":"response.output_item.added","sequence_number":25,"output_index":3,"item":{"type":"function_call","call_id":"call_456","name":"get_weather","arguments":"","id":"fc_789"}}

event: response.function_call_arguments.done
data: {"type":"response.function_call_arguments.done","sequence_number":26,"output_index":3,"item_id":"call_456","name":"get_weather","arguments":"{\"location\":\"Paris\"}"}

event: response.output_item.done
data: {"type":"response.output_item.done","sequence_number":27,"output_index":3,"item":{"type":"function_call","call_id":"call_456","name":"get_weather","arguments":"{\"location\":\"Paris\"}","id":"fc_789"}}

7. Completion Event
event: response.completed
data: {"type":"response.completed","sequence_number":28,"response":{"id":"resp_abc","object":"response","created_at":1699123456,"model":"gpt-4","status":"completed","output":[...],"instructions":[...],"usage":{"input_tokens":25,"output_tokens":150,"total_tokens":175}}}

8. Error Event (on failure)
event: response.failed
data: {"type":"response.failed","sequence_number":5,"response":{"id":"resp_abc","status":"failed","error":{"message":"Error description","type":"server_error","code":"execution_failed"}}}

Key Event Fields

  • sequence_number: Monotonically increasing counter for event ordering
  • output_index: Position in the output array (0-indexed)
  • item_id: Unique identifier for the specific item being streamed
  • content_index: Position within the content array (for messages)
  • summary_index: Position within the summary array (for reasoning)
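
These fields are enough to reassemble streamed text incrementally. A minimal sketch that folds output_text deltas back into per-item buffers, assuming an iterable of already-parsed event payloads like those shown above:

from collections import defaultdict

def collect_streamed_text(events):
    """Fold response.output_text.delta payloads into per-item text buffers.

    `events` is any iterable of parsed SSE payloads (dicts) such as the
    examples above; sequence_number guarantees they arrive in order.
    """
    parts = defaultdict(str)
    for event in events:
        if event["type"] == "response.output_text.delta":
            parts[(event["output_index"], event["content_index"])] += event["delta"]
        elif event["type"] == "response.completed":
            break
    return dict(parts)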

Prompt-Based Execution

Execute pre-configured prompt templates using the prompt parameter.

Request Example:
{
  "prompt": {
    "id": "prompt_template_id",
    "variables": {"topic": "quantum computing"},
    "version": "1"
  },
  "input": "Unstructured user input"
}

Fields:
  • prompt.id (required) - Template identifier
  • prompt.variables (optional) - Variable substitutions
  • prompt.version (optional) - Template version (defaults to default version)
  • input (optional) - Unstructured user input

Prompt Configuration (via UI or API): users can pre-configure prompts with:
  • Model deployment and settings (temperature, max_tokens, top_p, etc.)
  • System prompt with Jinja2 template support
  • Conversation messages and context with Jinja2 template support
  • MCP tools (filesystem, web access, custom tools)
  • Input/output schemas for structured data
  • Validation rules and retry limits
  • Streaming configuration
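
The Jinja2 template support mentioned above means prompt variables are substituted into the configured system prompt and messages at execution time. A small illustration of the substitution semantics; the template text here is hypothetical, not a built-in template:

from jinja2 import Template

# Hypothetical system prompt template for a prompt such as prompt_quantum_explanation
system_prompt = Template(
    "You are a helpful physics tutor. Explain {{ topic }} at a {{ difficulty }} level."
)

# The request's prompt.variables would be rendered roughly like this server-side
print(system_prompt.render(topic="quantum computing", difficulty="beginner"))
# -> You are a helpful physics tutor. Explain quantum computing at a beginner level.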

Retrieve Response

Get details of a specific response.

Endpoint: GET /v1/responses/{response_id}

Response Format

Returns the same format as the create response endpoint.

Delete Response

Remove a response from the system.

Endpoint: DELETE /v1/responses/{response_id}

Response Format

{
  "id": "resp_abc123",
  "object": "response",
  "deleted": true
}

Cancel Response

Cancel an in-progress response generation.

Endpoint: POST /v1/responses/{response_id}/cancel

Response Format

{
  "id": "resp_abc123",
  "object": "response",
  "status": "cancelled",
  "cancelled_at": 1699123456
}

List Input Items

Retrieve the input conversation history for a response.

Endpoint: GET /v1/responses/{response_id}/input_items

Response Format

{
  "object": "list",
  "data": [
    {
      "type": "message",
      "role": "system",
      "content": "You are a helpful physics tutor"
    },
    {
      "type": "message",
      "role": "user",
      "content": "Explain quantum computing"
    },
    {
      "type": "message",
      "role": "assistant",
      "content": "I'd be happy to explain quantum computing..."
    }
  ]
}
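
The management endpoints compose naturally with the create endpoint. A minimal sketch using requests against a local deployment (base URL, API key, and response ID are placeholders):

import requests

BASE_URL = "http://localhost:3000"   # placeholder deployment
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

response_id = "resp_abc123"  # id returned by POST /v1/responses

# Retrieve the full response
details = requests.get(f"{BASE_URL}/v1/responses/{response_id}", headers=HEADERS).json()

# List the input conversation history
items = requests.get(f"{BASE_URL}/v1/responses/{response_id}/input_items", headers=HEADERS).json()
for item in items["data"]:
    print(item["role"], ":", item["content"])

# Cancel an in-progress generation, or delete a finished one
requests.post(f"{BASE_URL}/v1/responses/{response_id}/cancel", headers=HEADERS)
requests.delete(f"{BASE_URL}/v1/responses/{response_id}", headers=HEADERS)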

Usage Examples

Basic Response

curl -X POST http://localhost:3000/v1/responses \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "input": "What is machine learning?"
  }'

Prompt Execution

curl -X POST http://localhost:3000/v1/responses \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": {
      "id": "prompt_neural_networks",
      "variables": {
        "question": "Explain neural networks"
      }
    }
  }'

Multi-turn Conversation

# First response
RESPONSE_ID=$(curl -X POST http://localhost:3000/v1/responses \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "input": "Explain neural networks"
  }' | jq -r '.id')

# Follow-up response
curl -X POST http://localhost:3000/v1/responses \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "input": "How do they differ from traditional algorithms?",
    "previous_response_id": "'$RESPONSE_ID'"
  }'

With Tool Calling

curl -X POST http://localhost:3000/v1/responses \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "input": "Calculate the fibonacci sequence up to 10",
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "calculate_fibonacci",
          "description": "Calculate fibonacci numbers",
          "parameters": {
            "type": "object",
            "properties": {
              "n": {"type": "integer", "description": "Number of terms"}
            },
            "required": ["n"]
          }
        }
      }
    ]
  }'

Python Example

import requests
import json

class ResponsesAPI:
    def __init__(self, api_key, base_url="http://localhost:3000"):
        self.api_key = api_key
        self.base_url = base_url
        self.headers = {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json"
        }

    def create_response(self, model, input_text, **kwargs):
        data = {
            "model": model,
            "input": input_text,
            **kwargs
        }

        response = requests.post(
            f"{self.base_url}/v1/responses",
            headers=self.headers,
            json=data
        )
        return response.json()

    def create_conversation(self, model, messages):
        """Create a multi-turn conversation by chaining previous_response_id"""
        response_id = None
        responses = []

        for message in messages:
            # Only include previous_response_id after the first turn
            kwargs = {"previous_response_id": response_id} if response_id else {}

            response = self.create_response(model, message, **kwargs)
            responses.append(response)
            response_id = response["id"]

        return responses

    def stream_response(self, model, input_text, **kwargs):
        """Stream response with SSE"""
        data = {
            "model": model,
            "input": input_text,
            "stream": True,
            **kwargs
        }

        response = requests.post(
            f"{self.base_url}/v1/responses",
            headers=self.headers,
            json=data,
            stream=True
        )

        for line in response.iter_lines():
            if line:
                line = line.decode('utf-8')
                if line.startswith('data: '):
                    data = line[6:]
                    if data == '[DONE]':
                        break
                    yield json.loads(data)

# Usage
api = ResponsesAPI("YOUR_API_KEY")

# Simple response
response = api.create_response(
    "gpt-4o",
    "Explain the theory of relativity",
    temperature=0.7
)
print(response["output"][0]["content"][0]["text"])

# Multi-turn conversation
conversation = api.create_conversation(
    "gpt-4o",
    [
        "What is artificial intelligence?",
        "How does it relate to machine learning?",
        "What are some practical applications?"
    ]
)

# Streaming response
for chunk in api.stream_response("gpt-4o", "Write a short story"):
    if "delta" in chunk and "content" in chunk["delta"]:
        print(chunk["delta"]["content"], end="", flush=True)

# Multimodal input
with open("image.jpg", "rb") as f:
    import base64
    image_data = base64.b64encode(f.read()).decode()

    response = api.create_response(
        "gpt-4o-vision",
        [
            {"type": "text", "text": "Describe this image"},
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_data}"}}
        ]
    )

JavaScript Example

class ResponsesAPI {
  constructor(apiKey, baseUrl = 'http://localhost:3000') {
    this.apiKey = apiKey;
    this.baseUrl = baseUrl;
  }

  async createResponse(model, input, options = {}) {
    const response = await fetch(`${this.baseUrl}/v1/responses`, {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${this.apiKey}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({
        model,
        input,
        ...options
      })
    });

    return await response.json();
  }

  async *streamResponse(model, input, options = {}) {
    const response = await fetch(`${this.baseUrl}/v1/responses`, {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${this.apiKey}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({
        model,
        input,
        stream: true,
        ...options
      })
    });

    const reader = response.body.getReader();
    const decoder = new TextDecoder();
    let buffer = '';

    while (true) {
      const { done, value } = await reader.read();
      if (done) break;

      buffer += decoder.decode(value, { stream: true });
      const lines = buffer.split('\n');
      buffer = lines.pop();

      for (const line of lines) {
        if (line.startsWith('data: ')) {
          const data = line.slice(6);
          if (data === '[DONE]') return;
          yield JSON.parse(data);
        }
      }
    }
  }

  async createConversation(model, messages) {
    let responseId = null;
    const responses = [];

    for (const message of messages) {
      const response = await this.createResponse(
        model,
        message,
        responseId ? { previous_response_id: responseId } : {}
      );

      responses.push(response);
      responseId = response.id;
    }

    return responses;
  }
}

// Usage
const api = new ResponsesAPI('YOUR_API_KEY');

// Simple response
const response = await api.createResponse(
  'gpt-4o',
  'What is the meaning of life?',
  { temperature: 0.9 }
);
console.log(response.output[0].content[0].text);

// Streaming response
for await (const chunk of api.streamResponse('gpt-4o', 'Tell me a joke')) {
  if (chunk.type === 'response.output_text.delta') {
    process.stdout.write(chunk.delta);
  }
}

// Tool calling
const toolResponse = await api.createResponse(
  'gpt-4o',
  'What is the weather in Paris?',
  {
    tools: [{
      type: 'function',
      function: {
        name: 'get_weather',
        description: 'Get weather information',
        parameters: {
          type: 'object',
          properties: {
            location: { type: 'string' }
          },
          required: ['location']
        }
      }
    }]
  }
);

// Handle tool calls (function_call items in the output array)
const toolCalls = toolResponse.output.filter((item) => item.type === 'function_call');
for (const toolCall of toolCalls) {
  console.log(`Calling ${toolCall.name} with:`, JSON.parse(toolCall.arguments));
}

Advanced Features

Reasoning Models

Enable step-by-step reasoning:
{
  "model": "o1-preview",
  "input": "Solve this complex problem...",
  "reasoning": true
}
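
With reasoning enabled, the output array can contain a reasoning item ahead of the final message. A minimal sketch that separates the two, reusing the ResponsesAPI helper defined in the Python example above:

# Uses the ResponsesAPI helper from the Python example above
api = ResponsesAPI("YOUR_API_KEY")

response = api.create_response(
    "o1-preview",
    "Solve this complex problem...",
    reasoning=True,
)

for item in response["output"]:
    if item["type"] == "reasoning":
        for part in item["summary"]:
            print("[reasoning]", part["text"])
    elif item["type"] == "message":
        for part in item["content"]:
            if part["type"] == "output_text":
                print(part["text"])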

Parallel Tool Calling

The API supports calling multiple tools in parallel. Independent calls appear as separate function_call items in the output array:
{
  "output": [
    {
      "type": "function_call",
      "call_id": "call_1",
      "name": "get_weather",
      "arguments": "{\"location\": \"Paris\"}",
      "id": "fc_1"
    },
    {
      "type": "function_call",
      "call_id": "call_2",
      "name": "get_weather",
      "arguments": "{\"location\": \"London\"}",
      "id": "fc_2"
    }
  ]
}
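
Because such calls are independent, a client can execute them concurrently before acting on the results. A minimal sketch with concurrent.futures; get_weather here is a hypothetical local tool implementation, and the function_call item shape follows the output format above:

import json
from concurrent.futures import ThreadPoolExecutor

def get_weather(location):
    """Hypothetical local tool implementation."""
    return {"location": location, "forecast": "sunny"}

LOCAL_TOOLS = {"get_weather": get_weather}

def execute_parallel_calls(output):
    """Run every function_call item in the output array concurrently."""
    calls = [item for item in output if item["type"] == "function_call"]

    def run(call):
        args = json.loads(call["arguments"])
        return call["call_id"], LOCAL_TOOLS[call["name"]](**args)

    with ThreadPoolExecutor() as pool:
        # Returns {call_id: tool_result} for the client to act on
        return dict(pool.map(run, calls))

Calling execute_parallel_calls(response["output"]) on the example above would return results keyed by call_1 and call_2.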

Conversation Context

Maintain context across multiple interactions:
{
  "model": "gpt-4o",
  "input": "Continue our discussion",
  "previous_response_id": "resp_previous",
  "instructions": "You are a helpful tutor who remembers previous conversations"
}

Error Responses

400 Bad Request

{
  "error": {
    "message": "Invalid model specified",
    "type": "invalid_request_error",
    "code": "invalid_model"
  }
}

401 Unauthorized

{
  "error": {
    "message": "Invalid API key",
    "type": "authentication_error",
    "code": "invalid_api_key"
  }
}

429 Rate Limit

{
  "error": {
    "message": "Rate limit exceeded",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded",
    "retry_after": 60
  }
}
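
The retry_after hint makes 429 responses straightforward to handle, and robust retry logic is one of the best practices listed below. A minimal sketch with exponential backoff; the retry budget and backoff factors are arbitrary choices:

import time
import requests

def create_with_retry(payload, api_key, base_url="http://localhost:3000", max_attempts=5):
    """POST /v1/responses, retrying on 429 and transient 5xx errors."""
    headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
    for attempt in range(max_attempts):
        resp = requests.post(f"{base_url}/v1/responses", headers=headers, json=payload)
        if resp.status_code == 429:
            # Prefer the server-provided hint, fall back to exponential backoff
            delay = resp.json().get("error", {}).get("retry_after", 2 ** attempt)
            time.sleep(delay)
        elif resp.status_code >= 500:
            time.sleep(2 ** attempt)
        else:
            resp.raise_for_status()
            return resp.json()
    raise RuntimeError("Exhausted retries for /v1/responses")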

Best Practices

  • Conversation Management: Use previous_response_id for coherent multi-turn conversations
  • Tool Design: Create focused, single-purpose tools for better reliability
  • Streaming: Use streaming for long responses to improve user experience
  • Error Handling: Implement robust retry logic for transient failures
  • Metadata: Use metadata to track conversations and user sessions
  • Context Window: Be mindful of token limits when building long conversations
  • Parallel Tools: Leverage parallel tool calling for independent operations
  • Prompt Templates: Design reusable prompt templates with clear variable names for maintainability
  • Variable Management: Use descriptive variable names and provide defaults where appropriate
  • Version Control: Use prompt versioning to iterate on prompts without breaking existing integrations

Limitations

  • Some advanced retrieval features may not be fully implemented
  • Response management endpoints have limited functionality
  • Conversation history is maintained only through previous_response_id chaining
  • Prompt templates must be pre-configured before use
  • Maximum context window depends on the model used, including when configured in a prompt template

Supported providers

OpenAI

Next-generation Responses API with full support for advanced conversational features, multi-turn interactions, and parallel tool calling.