Define JSON Schema extraction schemas for Skyvern tasks to get consistent, typed output. Build schemas for single objects or arrays of items with the interactive schema builder.
By default, Skyvern returns extracted data in whatever format makes sense for the task.
Pass a data_extraction_schema to enforce a specific structure using JSON Schema.
Add the data_extraction_schema parameter to your task with a JSON Schema object:
result = await client.run_task(
    prompt="Get the title of the top post",
    url="https://news.ycombinator.com",
    data_extraction_schema={
        "type": "object",
        "properties": {
            "title": {
                "type": "string",
                "description": "The title of the top post"
            }
        }
    }
)
The description field in each property tells Skyvern what data to extract, so it drives extraction quality. Vague descriptions like “the data” produce vague results. Be specific: “The product price in USD, without currency symbol.”
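To make the contrast concrete, the sketch below puts a vague and a specific description side by side for the same property. The `price` property and the `build_schema` helper are hypothetical names chosen only for illustration; the two schemas differ in nothing but the description text.

```python
def build_schema(description: str) -> dict:
    """Return a one-property extraction schema (hypothetical `price` field)."""
    return {
        "type": "object",
        "properties": {
            "price": {"type": "string", "description": description},
        },
    }


# Vague: gives no hint about which value on the page is wanted.
vague_schema = build_schema("the data")

# Specific: names the value, the unit, and the expected formatting.
specific_schema = build_schema("The product price in USD, without currency symbol")
```

Either dictionary could be passed as data_extraction_schema; the second one gives the extractor far more to work with.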
Extract multiple items with the same structure, such as the top posts from a news site:
result = await client.run_task(
    prompt="Get the top 5 posts",
    url="https://news.ycombinator.com",
    data_extraction_schema={
        "type": "object",
        "properties": {
            "posts": {
                "type": "array",
                "description": "Top 5 posts from the front page",
                "items": {
                    "type": "object",
                    "properties": {
                        "title": {
                            "type": "string",
                            "description": "Post title"
                        },
                        "points": {
                            "type": "integer",
                            "description": "Number of points"
                        },
                        "url": {
                            "type": "string",
                            "description": "Link to the post"
                        }
                    }
                }
            }
        }
    }
)
Output (when completed):
{
  "posts": [
    {
      "title": "Running Claude Code dangerously (safely)",
      "points": 342,
      "url": "https://blog.emilburzo.com/2026/01/running-claude-code-dangerously-safely/"
    },
    {
      "title": "Linux kernel framework for PCIe device emulation",
      "points": 287,
      "url": "https://github.com/cakehonolulu/pciem"
    },
    {
      "title": "I'm addicted to being useful",
      "points": 256,
      "url": "https://www.seangoedecke.com/addicted-to-being-useful/"
    },
    {
      "title": "Level S4 solar radiation event",
      "points": 198,
      "url": "https://www.swpc.noaa.gov/news/g4-severe-geomagnetic-storm"
    },
    {
      "title": "WebAssembly Text Format parser performance",
      "points": 176,
      "url": "https://blog.gplane.win/posts/improve-wat-parser-perf.html"
    }
  ]
}
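Output in this shape can be loaded into typed objects. The following is a minimal sketch using only the standard library. The `Post` dataclass and `parse_posts` helper are hypothetical names, and the sketch assumes the extracted output is already available as a Python dict matching the schema above; how you read it from the task result depends on your client version.

```python
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    points: int
    url: str


def parse_posts(output: dict) -> list[Post]:
    """Convert the extracted `posts` array into Post objects, checking `points`."""
    posts = []
    for item in output.get("posts", []):
        # The schema above declares no required fields, so a missing key
        # raises KeyError here instead of passing through silently.
        post = Post(title=item["title"], points=item["points"], url=item["url"])
        if not isinstance(post.points, int):
            raise TypeError(f"points should be an integer, got {post.points!r}")
        posts.append(post)
    return posts


# Sample data copied from the output shown above.
sample_output = {
    "posts": [
        {
            "title": "Linux kernel framework for PCIe device emulation",
            "points": 287,
            "url": "https://github.com/cakehonolulu/pciem",
        }
    ]
}
posts = parse_posts(sample_output)
print(posts[0].title, posts[0].points)
```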
Arrays without limits extract everything visible on the page. Specify limits in your prompt (e.g., “top 5 posts”) or the array description to control output size.
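One way to apply this is to state the limit in the array description and also trim the result on the client side as a safety net. The sketch below assumes the output is a dict with a `posts` array; `MAX_POSTS` and `truncate_posts` are hypothetical names, and the truncation step is an extra precaution rather than something Skyvern requires.

```python
MAX_POSTS = 5  # hypothetical limit chosen for this example

schema = {
    "type": "object",
    "properties": {
        "posts": {
            "type": "array",
            # State the limit in the description so the extractor sees it.
            "description": f"Top {MAX_POSTS} posts from the front page",
            "items": {
                "type": "object",
                "properties": {
                    "title": {"type": "string", "description": "Post title"},
                },
            },
        }
    },
}


def truncate_posts(output: dict, limit: int = MAX_POSTS) -> dict:
    """Return a copy of the output with at most `limit` posts."""
    return {**output, "posts": output.get("posts", [])[:limit]}
```

Calling `truncate_posts` on a result with more than five entries keeps only the first five and leaves any other keys in the output untouched.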