HTML to Image API for AI Agents

Render agent-generated HTML into PNG, JPEG, or WebP with an HTML-to-image API, without running a browser yourself.

Last updated: 2026-06-28

Try ScreenshotAPI free

200 free screenshots/month. No credit card required.

An AI agent can turn its own generated HTML into a shareable image by sending the markup to an HTML-to-image API, which renders it in a real Chrome browser in the cloud and returns a PNG, JPEG, or WebP. The agent never installs Chromium, manages a browser pool, or runs any rendering infrastructure. It builds self-contained HTML, sends one authenticated POST, and saves the binary response as an artifact. This guide shows the exact request, an agent prompt pattern, a tool schema, and the rules that keep rendering deterministic and safe.

When to render HTML from an agent

Render HTML when the agent already holds the content and there is no live page to visit. Capturing a live URL is a different job covered in the URL to image API for AI agents guide — use that when the visual already exists at an address.

Good fits for HTML rendering from an agent:

Open Graph and social cards generated from a title, summary, or campaign copy.
Email preview images so a human can approve a draft before it sends.
Report and summary snapshots posted to Slack, email, or issue comments.
Invoices, receipts, and certificates built from structured fields.
Lightweight dashboards and status boards rendered from data the agent already has.

In every case the agent owns the markup, so it can make the HTML deterministic: inline CSS, a fixed width and height, no external scripts, and bounded content length.
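
The constraints above can be enforced in code rather than left to the model. The following is a minimal sketch, not part of ScreenshotAPI: `build_card_html`, its parameters, and the character limits are hypothetical names and values chosen for illustration. It escapes the text, bounds its length, and emits a single document with inline CSS and fixed dimensions.

```python
from html import escape


def build_card_html(title: str, subtitle: str, width: int = 1200, height: int = 630,
                    max_title: int = 80, max_subtitle: int = 140) -> str:
    """Return a self-contained HTML document with inline CSS and fixed dimensions."""

    def clip(text: str, limit: int) -> str:
        # Collapse whitespace and bound the length so verbose model output
        # cannot overflow the fixed viewport. The limits are illustrative.
        text = " ".join(text.split())
        return text if len(text) <= limit else text[: limit - 3].rstrip() + "..."

    # Escape after clipping so an HTML entity is never cut in half.
    safe_title = escape(clip(title, max_title))
    safe_subtitle = escape(clip(subtitle, max_subtitle))
    body_style = (
        f"margin:0;width:{width}px;height:{height}px;display:flex;"
        "flex-direction:column;justify-content:center;padding:80px;"
        "box-sizing:border-box;font-family:-apple-system,Segoe UI,Roboto,sans-serif;"
        "background:#0f172a;color:#fff"
    )
    return (
        "<!doctype html><html><body "
        f'style="{body_style}">'
        f'<h1 style="font-size:64px;line-height:1.1;margin:0">{safe_title}</h1>'
        f'<p style="font-size:28px;opacity:.85;margin-top:24px">{safe_subtitle}</p>'
        "</body></html>"
    )
```

Because the template is fixed and only two escaped strings vary, the same inputs always produce the same markup, which makes a bad render straightforward to reproduce.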

Raw HTML request

HTML rendering requires POST with a JSON body containing an html field. A GET request with html is rejected with the message "Use POST /api/v1/screenshot for html rendering." On POST you provide either url or html. The response body is the binary image.

The example below renders a 1200x630 Open Graph card with inline CSS — the canonical agent output shape.


bash
curl "https://screenshotapi.to/api/v1/screenshot" \
  -X POST \
  -H "content-type: application/json" \
  -H "x-api-key: sk_live_your_key_here" \
  --data '{
    "html": "<!doctype html><html><body style=\"margin:0;width:1200px;height:630px;display:flex;flex-direction:column;justify-content:center;padding:80px;box-sizing:border-box;font-family:-apple-system,Segoe UI,Roboto,sans-serif;background:linear-gradient(135deg,#0f172a,#1e3a8a);color:#fff\"><div style=\"font-size:20px;letter-spacing:2px;opacity:.7\">WEEKLY REPORT</div><h1 style=\"font-size:64px;line-height:1.1;margin:16px 0 0\">All 42 checks passed</h1><p style=\"font-size:28px;opacity:.85;margin-top:24px\">Generated automatically by the release agent</p></body></html>",
    "type": "png",
    "width": 1200,
    "height": 630
  }' \
  --output og-card.png

JavaScript example

For agents running on Node or Bun, send the same body with fetch. Authenticate with the x-api-key header (preferred) or Authorization: Bearer sk_live_....


javascript
const html = `<!doctype html><html><body style="margin:0;width:1200px;height:630px;display:flex;flex-direction:column;justify-content:center;padding:80px;box-sizing:border-box;font-family:-apple-system,Segoe UI,Roboto,sans-serif;background:linear-gradient(135deg,#0f172a,#1e3a8a);color:#fff"><div style="font-size:20px;letter-spacing:2px;opacity:.7">WEEKLY REPORT</div><h1 style="font-size:64px;line-height:1.1;margin:16px 0 0">All 42 checks passed</h1><p style="font-size:28px;opacity:.85;margin-top:24px">Generated automatically by the release agent</p></body></html>`;

const response = await fetch('https://screenshotapi.to/api/v1/screenshot', {
  method: 'POST',
  headers: {
    'content-type': 'application/json',
    'x-api-key': process.env.SCREENSHOTAPI_KEY
  },
  body: JSON.stringify({ html, type: 'png', width: 1200, height: 630 })
});

if (!response.ok) {
  const { error, message, fix } = await response.json();
  throw new Error(`${error}: ${message}${fix ? ` (fix: ${fix})` : ''}`);
}

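// node:fs/promises works on both Node and Bun; Bun.write exists only on Bun.
const { writeFile } = await import('node:fs/promises');
await writeFile('og-card.png', Buffer.from(await response.arrayBuffer()));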

Python example

The requests library serializes the body when you pass json=, which also sets the content-type header.


python
import requests

html = """<!doctype html><html><body style="margin:0;width:1200px;height:630px;display:flex;flex-direction:column;justify-content:center;padding:80px;box-sizing:border-box;font-family:-apple-system,Segoe UI,Roboto,sans-serif;background:linear-gradient(135deg,#0f172a,#1e3a8a);color:#fff"><div style="font-size:20px;letter-spacing:2px;opacity:.7">WEEKLY REPORT</div><h1 style="font-size:64px;line-height:1.1;margin:16px 0 0">All 42 checks passed</h1><p style="font-size:28px;opacity:.85;margin-top:24px">Generated automatically by the release agent</p></body></html>"""

response = requests.post(
    "https://screenshotapi.to/api/v1/screenshot",
    headers={"x-api-key": "sk_live_your_key_here"},
    json={"html": html, "type": "png", "width": 1200, "height": 630},
)
response.raise_for_status()

with open("og-card.png", "wb") as f:
    f.write(response.content)
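
Note that `raise_for_status()` raises an exception built from the status line only, so the JSON error fields are not part of its message. The sketch below shows one way to surface the same `error`, `message`, and `fix` fields that the JavaScript example reads. `describe_error` is a hypothetical helper, and the fallback for a non-JSON body is an assumption rather than documented API behavior.

```python
import json


def describe_error(status_code: int, body: bytes) -> str:
    """Turn an error response into one log-friendly line."""
    try:
        payload = json.loads(body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        payload = None
    if not isinstance(payload, dict):
        # Assumption: degrade gracefully if the body is not a JSON object.
        return f"HTTP {status_code}: unparseable error body"
    text = f"HTTP {status_code} {payload.get('error', 'error')}: {payload.get('message', '')}"
    fix = payload.get("fix")
    return f"{text} (fix: {fix})" if fix else text


# Possible use with the requests call above, in place of raise_for_status():
#     if not response.ok:
#         raise RuntimeError(describe_error(response.status_code, response.content))
```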

Agent prompt pattern

When you delegate the HTML generation to a model, constrain the output so it renders predictably. Drop this into the system or task instructions:


text
You are rendering a 1200x630 image from HTML.

Output requirements:
- Return ONE self-contained HTML document. No external stylesheets, fonts, or scripts.
- Use inline CSS only. Set the root element to width:1200px; height:630px; box-sizing:border-box.
- Keep content bounded: one headline, one short supporting line, no long paragraphs.
- Do NOT include API keys, tokens, passwords, customer PII, or any private data.

Then call ScreenshotAPI:
- POST https://screenshotapi.to/api/v1/screenshot
- JSON body: { "html": <your HTML>, "type": "png", "width": 1200, "height": 630 }
- Header: x-api-key: <key>

Save the binary response as the task artifact. Do not paste image bytes back into the conversation.

Recommended tool schema

Expose a narrow tool to your agent framework so the model controls only the render inputs listed below. The endpoint is the same for all callers, so it does not need to be a tool parameter.


json
{
  "name": "render_html_image",
  "description": "Render a self-contained HTML document to a PNG, JPEG, or WebP image. Use for agent-generated content (cards, reports, invoices) when there is no live URL to capture.",
  "parameters": {
    "html": "Self-contained HTML document with inline CSS and no external scripts",
    "type": "png, jpeg, or webp",
    "width": "Viewport width in pixels, 1-1920 (e.g. 1200)",
    "height": "Viewport height in pixels, 1-10000 (e.g. 630)",
    "quality": "Compression quality 1-100, jpeg and webp only",
    "preloadFonts": "Set true only when the design depends on web fonts"
  }
}

Point GPT Actions and other OpenAPI-compatible tool builders at /openapi.json. Point coding assistants at /llms-full.txt for the full agent-readable documentation corpus.
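
The tool schema above describes its parameters in prose, so the model can still send out-of-range values. A minimal sketch of a server-side check before the request is sent might look like the following. `validate_render_args` is a hypothetical helper. The ranges are taken from the schema descriptions, while treating every field except `html` as optional and rejecting `quality` on PNG are assumptions about how strict you want the tool to be.

```python
ALLOWED_TYPES = {"png", "jpeg", "webp"}


def _check_int(name: str, value, upper: int) -> int:
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(value, int) or isinstance(value, bool) or not 1 <= value <= upper:
        raise ValueError(f"{name} must be an integer from 1 to {upper}")
    return value


def validate_render_args(args: dict) -> dict:
    """Check model-supplied tool arguments and return a clean request body."""
    html = args.get("html")
    if not isinstance(html, str) or not html.strip():
        raise ValueError("html must be a non-empty string")
    image_type = args.get("type", "png")  # png is the documented default
    if image_type not in ALLOWED_TYPES:
        raise ValueError(f"type must be one of {sorted(ALLOWED_TYPES)}")

    body = {"html": html, "type": image_type}
    for name, upper in (("width", 1920), ("height", 10000)):
        if name in args:
            body[name] = _check_int(name, args[name], upper)
    if "quality" in args:
        if image_type == "png":
            raise ValueError("quality applies to jpeg and webp only")
        body["quality"] = _check_int("quality", args["quality"], 100)
    if "preloadFonts" in args:
        body["preloadFonts"] = bool(args["preloadFonts"])
    # Keys outside the schema (for example url) are dropped, keeping the tool narrow.
    return body
```

Returning a fresh dictionary instead of mutating the input means anything the model adds beyond the schema never reaches the API.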

Reading the response (agent DX)

A successful render returns the raw image bytes plus headers an agent can use to stay within limits and to log results:

x-credits-remaining — remaining quota, so the agent can self-limit before a 402.
x-cache — HIT or MISS; cached responses do not count against quota.
x-screenshot-id — stable id for the render, useful in logs and issue comments.
x-duration-ms — server render time.

Error bodies are JSON. A 400 or 402 returns { error, message }; a 500 adds a fix field with a remediation hint the agent can act on. Check the status before reading bytes:


javascript
const response = await fetch('https://screenshotapi.to/api/v1/screenshot', {
  method: 'POST',
  headers: { 'content-type': 'application/json', 'x-api-key': process.env.SCREENSHOTAPI_KEY },
  body: JSON.stringify({ html, type: 'png', width: 1200, height: 630 })
});

if (!response.ok) {
  const { error, message, fix } = await response.json();
  // `fix` is present on 500s and tells the agent how to recover
  throw new Error(`${error}: ${message}${fix ? ` (fix: ${fix})` : ''}`);
}

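// headers.get() returns null when the header is absent, and Number(null) is 0,
// so check for presence before comparing.
const remaining = response.headers.get('x-credits-remaining');
if (remaining !== null && Number(remaining) < 10) {
  // back off or notify before exhausting quota
}

// node:fs/promises works on both Node and Bun; Bun.write exists only on Bun.
const { writeFile } = await import('node:fs/promises');
await writeFile('og-card.png', Buffer.from(await response.arrayBuffer()));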
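
For Python agents, the response headers listed above can be collapsed into a small log record. This is a sketch under assumptions: `summarize_render` and its output keys are hypothetical, the numeric headers are assumed to be plain decimal integers, and the threshold of 10 simply mirrors the JavaScript example.

```python
def summarize_render(headers: dict, low_credit_threshold: int = 10) -> dict:
    """Extract the documented response headers into a small log record."""
    # Header names are case-insensitive, so normalise them first.
    lowered = {key.lower(): value for key, value in headers.items()}

    def to_int(name: str):
        raw = lowered.get(name)
        try:
            return int(raw) if raw is not None else None
        except ValueError:
            return None

    remaining = to_int("x-credits-remaining")
    return {
        "screenshot_id": lowered.get("x-screenshot-id"),
        "cache_hit": lowered.get("x-cache", "").upper() == "HIT",
        "duration_ms": to_int("x-duration-ms"),
        "credits_remaining": remaining,
        # A missing header means "unknown", not zero credits.
        "should_back_off": remaining is not None and remaining < low_credit_threshold,
    }
```

With the requests library, `summarize_render(response.headers)` should work directly, since the headers object exposes `items()`.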

Deterministic rendering and safety

The agent controls the markup, so make it boring and repeatable:

Use inline CSS. External stylesheets add network dependencies that can fail or change between renders.
Set explicit width and height (1200x630 for social cards) so layout never reflows.
Enable preloadFonts only when the design depends on a specific web font; skip it otherwise for faster, more predictable output.
Never include secrets, access tokens, passwords, PII, or untrusted third-party scripts in the HTML — the image and its cached copy may be shared.
Keep HTML generation separate from capture. Log the generated HTML, then send it; that makes a bad render easy to reproduce and debug.
Bound the content length so a verbose model cannot overflow the fixed viewport.
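
Several of these rules can be checked mechanically before the HTML is sent. The sketch below uses only the Python standard library. `lint_agent_html` is a hypothetical helper, the `sk_live_` pattern is inferred from the key format shown in this guide's examples, and the 20,000-character cap is an arbitrary illustrative bound. It is a coarse filter and does not replace reviewing what the agent is allowed to put in the markup.

```python
import re
from html.parser import HTMLParser

# Assumption: key prefix as shown in this guide's examples; extend for your own secrets.
SECRET_PATTERN = re.compile(r"sk_live_[A-Za-z0-9_]+")


class _ExternalRefFinder(HTMLParser):
    """Collects tags that make the document depend on the network."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script" and attrs.get("src"):
            self.problems.append("external script")
        elif tag == "link" and "stylesheet" in (attrs.get("rel") or "").lower():
            self.problems.append("external stylesheet")
        elif tag in {"img", "iframe"} and (attrs.get("src") or "").startswith(
            ("http:", "https:", "//")
        ):
            self.problems.append(f"remote {tag}")


def lint_agent_html(html: str, max_chars: int = 20000) -> list:
    """Return a list of problems; an empty list means these checks passed."""
    finder = _ExternalRefFinder()
    finder.feed(html)
    finder.close()
    problems = list(finder.problems)
    if SECRET_PATTERN.search(html):
        problems.append("possible API key")
    if len(html) > max_chars:
        problems.append("html too long")
    return problems
```

Running the lint between generation and capture fits the advice to keep the two steps separate: the HTML is logged, checked, and only then posted.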

URL capture vs HTML rendering

Both modes hit the same endpoint; the difference is the input and the HTTP method.

	URL capture	HTML rendering
Input	`url` (a live page)	`html` (agent-generated markup)
Method	`GET` or `POST`	`POST` only
Best for	Existing pages, deployed UIs, dashboards	Cards, reports, invoices, status boards
Content source	Lives at an address	Produced by the agent
Auth, formats, headers	Identical	Identical

If the visual already exists at a URL, capture it — see the URL to image API for AI agents guide.
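
A tool that supports both modes can make the choice explicit when it builds the request body. This is a minimal sketch: `build_capture_body` is a hypothetical helper, and requiring exactly one of `url` or `html` is an assumed reading of "either url or html". It always targets POST, which the table lists as valid for both modes.

```python
from typing import Optional


def build_capture_body(*, url: Optional[str] = None, html: Optional[str] = None,
                       image_type: str = "png") -> dict:
    """Build the JSON body for POST /api/v1/screenshot from exactly one input."""
    # True when both are missing or both are present; either case is rejected.
    if (url is None) == (html is None):
        raise ValueError("provide exactly one of url or html")
    source = {"url": url} if url is not None else {"html": html}
    return {**source, "type": image_type}
```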

Use it from your stack

ScreenshotAPI plugs into the same places agents already live:

MCP — connect the ScreenshotAPI MCP server so MCP-aware clients (Claude, Cursor, VS Code) can render HTML as a built-in tool.
OpenAPI — import /openapi.json into GPT Actions or any OpenAPI tool builder to generate the render_html_image action automatically.
Frameworks — wrap the POST call as a tool in LangChain, the Vercel AI SDK, or the OpenAI Agents SDK using the schema above.

Full setup notes for each client live in the AI-agent setup guide.

Related guides

URL to PDF API for AI Agents — render HTML or pages to PDF artifacts.
Capture Visual QA Evidence with AI Agents — screenshots as test and review evidence.
How to Convert HTML to Image — browser, server, and API approaches compared.
AI Agent Integrations — every supported client and framework.
Screenshot API Reference — all parameters, responses, and error states.

Frequently asked questions

Can an AI agent render raw HTML to an image?

Yes. Send POST /api/v1/screenshot with a JSON body containing an html field and an image type such as png, jpeg, or webp. HTML rendering requires POST; a GET request with html is rejected with the message 'Use POST /api/v1/screenshot for html rendering.' The response is the binary image body.

What is the best way to make agent HTML render deterministically?

Constrain the agent to self-contained HTML with inline CSS, no external scripts, and bounded content. Set an explicit width and height (1200x630 is a good default for social cards) so layout never reflows unexpectedly. Only enable preloadFonts when the design depends on a specific web font, since extra network work makes output less predictable.

When should an agent render HTML instead of capturing a live URL?

Render HTML when the agent already produced the content and there is no page to visit, such as a summary card, an invoice, or a status board built from data. Capture a live URL when the visual already exists at an address, for example a deployed page or a dashboard. See the companion guide on the URL to image API for AI agents for that path.

What should agents avoid putting in HTML screenshots?

Never include API keys, access tokens, passwords, personally identifiable information, or other secrets, because the rendered image and any cached copy can be shared as an artifact. Do not inject untrusted third-party scripts. Keep generation separate from capture so the HTML can be logged and reviewed safely.

Which image and document formats does the API return?

Set type to png (the default), jpeg, webp, or pdf. quality (1-100) applies only to jpeg and webp. The same POST endpoint that renders HTML to an image also renders HTML to a PDF when type=pdf, which is useful for invoices and receipts.

How does an agent know if a render succeeded and how much quota is left?

Check response.ok and read the response headers. x-credits-remaining lets an agent self-limit before hitting the quota, x-cache reports HIT or MISS, and x-screenshot-id identifies the render for logs. On a 500 the JSON error body includes a fix field with a remediation hint the agent can act on.
