Free-form text is great for humans and a nightmare for software. Constrained generation forces a model's output into a precise shape — valid JSON, regex-matching strings, schema-validated objects — so downstream code can parse it without defensive heroics. This is what turns a chat model into a real building block of a system.
The first time you try to plug an LLM into a real pipeline, you discover the same lesson developers learned about web scraping in the 2000s: parsing free text is hell. The model puts the JSON inside a markdown code fence on Monday, forgets the trailing brace on Tuesday, and slips in a friendly "Sure, here's the JSON:" on Wednesday. Every shape change breaks your parser.
Constrained generation fixes this by either (a) using API-level features that guarantee the output matches a schema, or (b) writing prompts so disciplined that schema violations become rare enough to handle as exceptions. Most production systems use both.
There are four levels of "constrained" you can apply, in increasing order of strength:
Prose-only format request
Extract the customer's name, intent, and urgency from
the message below. Return as JSON.
Message: """I'd like to cancel my account please.
The product is broken and I've waited three days."""
In a typical week of production traffic you will see all of these variations: the model wraps the JSON in ```json … ```, adds a preface like "Here's the JSON you asked for:", leaves a trailing comma before the closing brace, or uses "urgency": "high" on one call and "urgency": "High" on another. Your parser breaks at 2 a.m.
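To make the cost of the prose-only approach concrete, here is a minimal Python sketch of the kind of defensive parser these variations push you toward. The function name parse_loose_json and the particular repairs are illustrative assumptions, not a recommended library; each repair is a heuristic that can itself misfire, which is part of the point.

```python
import json
import re

# Built from single backticks so this sample can sit inside a fenced block.
TICKS = "`" * 3
FENCE = re.compile(TICKS + r"(?:json)?\s*(.*?)" + TICKS, re.DOTALL)


def parse_loose_json(raw: str) -> dict:
    """Best-effort parse of model output that may not be clean JSON."""
    text = raw.strip()

    # Variation 1: the JSON is wrapped in a markdown code fence.
    fenced = FENCE.search(text)
    if fenced:
        text = fenced.group(1)

    # Variation 2: chatter before or after the object.
    # Keep everything from the first "{" to the last "}".
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    text = text[start:end + 1]

    # Variation 3: a trailing comma before a closing brace or bracket.
    # Caveat: this regex also rewrites string values that contain ",}".
    text = re.sub(r",\s*([}\]])", r"\1", text)

    data = json.loads(text)

    # Variation 4: inconsistent casing of the urgency value.
    if isinstance(data.get("urgency"), str):
        data["urgency"] = data["urgency"].lower()
    return data
```

Every branch above exists only because the output shape is not guaranteed, and each one is a guess about what the model meant.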
Schema + structured output API
// JSON Schema passed to the API alongside the prompt
{
  "name": "support_ticket",
  "schema": {
    "type": "object",
    "properties": {
      "customer_name": { "type": "string" },
      "intent": {
        "type": "string",
        "enum": ["cancel", "complaint", "question", "compliment"]
      },
      "urgency": {
        "type": "string",
        "enum": ["low", "normal", "high", "critical"]
      },
      "needs_human": { "type": "boolean" }
    },
    "required": ["customer_name", "intent", "urgency", "needs_human"],
    "additionalProperties": false
  },
  "strict": true
}
// Prompt
You are a support triage classifier. Extract the fields
defined in the schema. If the customer's name is missing,
use "unknown". Never guess — pick the closest enum value.
Message: """I'd like to cancel my account please.
The product is broken and I've waited three days."""
The API guarantees the output is valid against the schema: urgency can only be one of the four enum values, and needs_human is always a boolean. Your downstream code parses it with JSON.parse and uses the values immediately, with no format-level defensive coding required.
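As a rough illustration of what that buys you downstream, the sketch below (in Python, where json.loads plays the role of JSON.parse) unpacks a schema-valid response straight into a typed record and routes on it. The SupportTicket class, the route function, and the queue names are hypothetical; the sketch assumes the response string has already been returned by a structured-output API using the schema above.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SupportTicket:
    # Field names mirror the "properties" of the support_ticket schema.
    customer_name: str
    intent: str
    urgency: str
    needs_human: bool


def route(raw_response: str) -> str:
    """Pick a (hypothetical) queue for a schema-valid response."""
    # "required" plus "additionalProperties": false means the keys match
    # the dataclass fields exactly, so direct unpacking is safe here.
    ticket = SupportTicket(**json.loads(raw_response))

    if ticket.needs_human or ticket.urgency == "critical":
        return "human_queue"
    if ticket.intent == "cancel":
        return "retention_queue"
    return "auto_reply"
```

For example, route('{"customer_name": "unknown", "intent": "cancel", "urgency": "high", "needs_human": false}') returns "retention_queue". The comparisons against "critical" and "cancel" are exact string matches, which is only reasonable because the enum rules out variants such as "Critical".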
A schema constrains shape, not truth: customer_name can be schema-valid but obviously wrong. Validate after parsing.
Tip: When the API doesn't support structured outputs, the next best thing is "JSON between fences": instruct the model to emit only <json>...</json> blocks and parse what is between the tags. This catches the most common failure (prefatory chatter) without needing any special API support.
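Both ideas, pulling JSON out from between tags and validating values after parsing, can be sketched in a few lines of Python. The function names are hypothetical, and the name check is one possible heuristic for an extraction task (the extracted name should appear in the source message or be the "unknown" fallback), not a general-purpose validator.

```python
import json
import re

JSON_BLOCK = re.compile(r"<json>(.*?)</json>", re.DOTALL)


def extract_tagged_json(raw: str) -> dict:
    """Parse the first <json>...</json> block, ignoring surrounding chatter."""
    match = JSON_BLOCK.search(raw)
    if match is None:
        raise ValueError("no <json>...</json> block in model output")
    return json.loads(match.group(1))


def name_is_plausible(customer_name: str, message: str) -> bool:
    """Post-parse check: a schema cannot tell whether a string is *right*.

    Heuristic (an assumption for this sketch): an extracted name must be
    the "unknown" fallback or literally occur in the customer's message.
    """
    if customer_name == "unknown":
        return True
    return bool(customer_name.strip()) and customer_name.lower() in message.lower()
```

A value that fails name_is_plausible is still valid JSON and still matches the schema, so it has to be caught by a check like this rather than by the parser.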
Pick a task that produces structured output (extraction, classification, routing) and write the JSON Schema before the prompt. Then write the prompt to match. Run it on 30 real inputs and measure the schema-validity rate.
Compare three implementations of the same extraction task: (a) prose-only format request, (b) few-shot prompt with examples, (c) full structured-output API. Measure validity rate and downstream accuracy. The cost difference is usually negligible; the reliability difference is huge.
Build a tiny retry loop: on schema-validation failure, automatically retry once with the validation error appended to the prompt ("Your previous response failed validation: missing required field 'urgency'. Try again."). Measure how often the retry succeeds. This is a cheap and effective safety net.
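One possible shape for that retry loop is sketched below in Python. The call_model parameter is a stand-in for whatever client function you use (it takes a prompt string and returns the raw response text), and validation_error is a small hand-written check for the support_ticket fields rather than a full JSON Schema validator; both are assumptions made for this sketch.

```python
import json
from typing import Callable, Optional, Tuple

# Expected type per required field, mirroring the support_ticket schema.
REQUIRED = {"customer_name": str, "intent": str, "urgency": str, "needs_human": bool}
ENUMS = {
    "intent": {"cancel", "complaint", "question", "compliment"},
    "urgency": {"low", "normal", "high", "critical"},
}


def validation_error(raw: str) -> Optional[str]:
    """Return a short error description, or None if raw is valid."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return f"not valid JSON ({exc.msg})"
    if not isinstance(data, dict):
        return "top-level value is not an object"
    for field, expected_type in REQUIRED.items():
        if field not in data:
            return f"missing required field '{field}'"
        if not isinstance(data[field], expected_type):
            return f"field '{field}' has the wrong type"
        if field in ENUMS and data[field] not in ENUMS[field]:
            return f"field '{field}' must be one of {sorted(ENUMS[field])}"
    extra = set(data) - set(REQUIRED)
    if extra:
        return f"unexpected fields {sorted(extra)}"
    return None


def generate_with_retry(call_model: Callable[[str], str], prompt: str) -> Tuple[dict, int]:
    """Return (parsed ticket, number of attempts used). Retries at most once."""
    raw = call_model(prompt)
    error = validation_error(raw)
    if error is None:
        return json.loads(raw), 1

    # Retry once, appending the validation error to the original prompt.
    retry_prompt = (
        f"{prompt}\n\nYour previous response failed validation: {error}. Try again."
    )
    raw = call_model(retry_prompt)
    error = validation_error(raw)
    if error is not None:
        raise ValueError(f"still invalid after one retry: {error}")
    return json.loads(raw), 2
```

Returning the attempt count makes the measurement straightforward: over a batch of inputs, the share of calls that return 2 (as opposed to raising) is the retry success rate.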