Overview

Web Fetch retrieves a single URL on behalf of the agent and returns its content as markdown so the LLM can read it directly. Built-in SSRF protection blocks requests to private IP ranges by default, and per-prompt allow/block lists let you constrain the tool to specific domains.

When to Use It

  • You have a URL — from Web Search, a user message, or a known reference — and need its content as readable text.
  • You want consistent markdown output regardless of source HTML quality.
  • You need a safe default: SSRF protection prevents the agent from being tricked into hitting your internal network.

Configuration

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| max_content_length | integer | 50000 | Truncate the returned markdown to this many characters. null disables the cap. |
| allow_local_urls | boolean | false | Allow requests to private/local IP ranges. Leave off in production. |
| timeout | integer | 30 | HTTP request timeout in seconds. |
| allowed_domains | array | null | Whitelist of exact hostnames. When set, only these hosts are reachable. |
| blocked_domains | array | null | Blacklist of exact hostnames. Takes effect even when the URL would otherwise be allowed. |
| headers | object | null | Extra HTTP headers merged with the tool’s defaults. |

allowed_domains and blocked_domains perform exact hostname matching: example.com does not match docs.example.com. List each hostname you want to permit or deny explicitly.

The default max_content_length of 50 000 characters maps to roughly 12 500 tokens at four characters per token, enough for most articles while keeping prompt costs predictable.
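
As a rough illustration of how these fields fit together, the Python sketch below writes a configuration as a plain dictionary and mimics the exact hostname matching described above. The field names come from the table. The `host_permitted` helper, the example hostnames, and the idea of passing the configuration as a dictionary are assumptions made for illustration, not the tool's actual implementation.

```python
from urllib.parse import urlsplit

# Field names come from the configuration table; the values are examples only.
web_fetch_config = {
    "max_content_length": 50000,
    "allow_local_urls": False,
    "timeout": 30,
    "allowed_domains": ["example.com", "docs.example.com"],
    "blocked_domains": ["internal.example.com"],
    "headers": None,
}


def host_permitted(url, allowed_domains=None, blocked_domains=None):
    """Hypothetical helper: exact hostname matching, no subdomain wildcards."""
    host = (urlsplit(url).hostname or "").lower()
    if blocked_domains and host in blocked_domains:
        return False  # the blocklist wins even if the host is also allowed
    if allowed_domains is not None:
        return host in allowed_domains  # allowlist mode: only listed hosts pass
    return True  # no allowlist configured: any non-blocked host passes


# "example.com" alone does not cover its subdomains.
print(host_permitted("https://docs.example.com/guide", ["example.com"]))
print(host_permitted("https://example.com/guide", ["example.com"]))
```

Under these assumptions the first call is rejected and the second is accepted, which is why each subdomain has to be listed explicitly.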

Output Shape

For text responses (HTML, plain text), the tool returns three fields:
  • url — the final URL after redirects
  • title — the page title, when present
  • content — markdown rendering of the page body, truncated to max_content_length
For binary responses (PDFs, images), the tool returns the content type and the raw bytes for downstream handling.
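
To make the text output shape concrete, here is a minimal Python sketch. The three field names are the documented ones. The `TextFetchResult` type, the `truncate` helper, and the sample values are hypothetical, and the sketch assumes truncation is a simple character cut, which the documentation does not spell out. The binary response is left out because its field names are not specified here.

```python
from typing import Optional, TypedDict


class TextFetchResult(TypedDict):
    """Hypothetical type mirroring the three documented text fields."""
    url: str              # final URL after redirects
    title: Optional[str]  # page title, when present
    content: str          # markdown, truncated to max_content_length


def truncate(markdown: str, max_content_length: Optional[int] = 50000) -> str:
    """Cap the markdown at max_content_length characters; None disables the cap."""
    if max_content_length is None:
        return markdown
    return markdown[:max_content_length]


# Sample values are made up for illustration.
result: TextFetchResult = {
    "url": "https://example.com/article",
    "title": "Example Article",
    "content": truncate("# Example Article\n\nBody text...", 20),
}
```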

Security Model

SSRF protection. By default, the tool refuses to fetch URLs that resolve to private IP ranges:
  • 127.0.0.0/8, 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16 (IPv4 loopback and private)
  • ::1/128 and fe80::/10 (IPv6 loopback and link-local)
This protects internal services that share the egress path with the agent. Enable allow_local_urls only when you have a deliberate reason, e.g. a sandboxed test environment where private endpoints are intentional targets.

Domain enforcement. When allowed_domains is set, the tool operates in whitelist mode and rejects any URL whose hostname is not in the list. blocked_domains adds a denylist that applies on top of allowlisting and the default SSRF rules.

Failure mode. All security and network errors surface to the LLM as retry signals: the model sees the failure as a tool error and can decide to try a different URL, ask the user, or give up. There is no silent fallback.
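
The range check behind the SSRF rule can be sketched with Python's standard `ipaddress` module. This is an assumption about how such a check might look, not the tool's code: `is_blocked_address` is a hypothetical name, the IPv6 loopback is written as `::1/128`, and the sketch takes an already-resolved IP address. A real implementation would also need to resolve the hostname and check the address it actually connects to.

```python
import ipaddress

# Private, loopback, and link-local ranges named in the Security Model section.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "127.0.0.0/8",
        "10.0.0.0/8",
        "172.16.0.0/12",
        "192.168.0.0/16",
        "::1/128",
        "fe80::/10",
    )
]


def is_blocked_address(ip: str, allow_local_urls: bool = False) -> bool:
    """Hypothetical check: True when a resolved IP falls in a blocked range."""
    if allow_local_urls:
        return False  # the operator has explicitly opted in to local targets
    addr = ipaddress.ip_address(ip)
    # Membership across IP versions (v4 address, v6 network) is simply False.
    return any(addr in net for net in BLOCKED_NETWORKS)
```

For example, `is_blocked_address("10.1.2.3")` is true under this sketch, while the same call with `allow_local_urls=True` is false.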

Operational Considerations

  • No persistent cache — every call hits the upstream URL.
  • Robots/legal compliance — fetching is not stateful; respect site terms in the deployments and prompts you ship to users.
  • Cost shape — the cost driver is downstream LLM context: a 50 000-character page pushed verbatim into the model is much more expensive than a 5 000-character extract.
  • Header overrides — be careful when adding User-Agent or auth headers via headers — those values are sent on every call, including to URLs the model may discover from other tools.
  • Content trust — markdown returned to the LLM is untrusted. Combine with guardrails when the agent is autonomous and the URL space is open.
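
The cost point above can be checked with quick arithmetic. The sketch below uses the four-characters-per-token rule of thumb from the Configuration section. It is an estimate only, since real tokenizers vary by model and content, and `estimated_tokens` is a hypothetical helper.

```python
CHARS_PER_TOKEN = 4  # rule of thumb from this page, not an exact tokenizer


def estimated_tokens(char_count: int) -> int:
    """Rough token estimate for a given number of characters."""
    return char_count // CHARS_PER_TOKEN


full_page = estimated_tokens(50_000)  # a page at the default cap
extract = estimated_tokens(5_000)     # a trimmed extract
print(full_page, extract, full_page // extract)
```

Under this estimate the full page costs about ten times as much context as the extract, so lowering max_content_length is the most direct lever on cost.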

Next Steps

  • Web Search: Discover candidate URLs before fetching them
  • Code Interpreter: For fetches that need structured parsing or further computation