Pick the engine: model, models, or preset
Each run needs an engine. There are three ways to set one, and at least one is required:
| Field | What it is |
|---|---|
model | A single model in provider/model format, for example openai/gpt-5.5. |
models | A fallback chain of up to 5 models, tried in order until one succeeds. |
preset | A named, pre-tuned bundle: model, system prompt, search config, and tools. |
models, it takes precedence over model. A preset gives you a working agent in one field. Override individual settings on top of it as needed.
Set the standing rules: instructions
instructions are the system rules that hold for every turn of the agent loop — role, tone, and grounding rules. They apply regardless of what the user asks on a given turn.
preset ships its own tuned system prompt. Setting instructions replaces that prompt rather than appending to it; omit instructions to keep the preset’s prompt. Keep instructions to durable rules — per-question framing belongs in input, which the next page covers.
Keep
instructions lean. Every token here is re-processed on each step of the loop, so it adds up across tool calls. For machine-readable output or retrieval constraints, prefer request parameters (see Shape the output and Give it tools) over prose rules.Bound the loop: max_steps
An agent run is a loop: the model reasons, optionally calls tools, reads the results, and repeats until it answers. One step is one pass through that cycle — a single model turn that may call tools. max_steps caps how many steps a run may take. Use it to bound runaway loops and to trade latency against depth. When a run reaches the cap, the agent doesn’t error — it makes one final pass to answer from what it has gathered so far.
| You want | Set |
|---|---|
| One pass — a tool can still run, but no looping on its results | max_steps: 1 |
| A few rounds of search-and-refine | max_steps: 3–5 |
| Deep, multi-step research | max_steps: 10+ |
max_steps doesn’t disable tools; it limits how many times the agent can act on what they return. At max_steps: 1 the model still gets one turn and can call a tool — the run just won’t loop back to reason over the result.
If you pass max_steps alongside a preset, it overrides the preset’s value, up to the preset’s own ceiling for chat-style presets. To cap raw generation length rather than loop iterations, set max_output_tokens instead.
Run it in the background
Long agent runs don’t have to block. Setbackground: true to submit the run and retrieve it later by ID with GET /v1/responses/{id} — useful for deep research, sandbox work, or any run that may take minutes. Background and streaming runs are covered in Shape the output.
Next steps
Prompt the agent
Split standing rules from the question, and ground the run.
Give it tools
Enable built-in and function tools, then read their results.
Presets
See the tuned presets you can start from.
Models
Browse the model catalog and token pricing.