Skip to main content
Within a single request the agent does a lot on its own — it runs an internal loop, calls tools (often several in parallel), reads the results, and iterates until it can answer. But all of that work lives inside that one run. Across requests, the Agent API keeps no state: each call sees only its own payload, so nothing from a previous response carries over automatically. To make a follow-up build on an earlier turn, you replay the prior conversation yourself in the input array — the user’s earlier messages, the assistant’s replies, and any function-call round-trips all go back in on the next request. This page shows how to chain turns that way.
The Agent API accepts an OpenAI-compatible previous_response_id field, but it is currently a no-op: it is not yet supported for server-side response continuation. Until it is, carry context yourself by replaying the conversation in input, as shown below.

Chain turns by replaying input

Send the next turn as an input array that includes the prior turns. Each turn is a message item with a role (user or assistant). Append the new question at the end.
from perplexity import Perplexity

client = Perplexity()

first = client.responses.create(
    model="openai/gpt-5.5",
    input="Who won the 2022 FIFA World Cup?",
    tools=[{"type": "web_search"}],
)
print(first.output_text)

# Replay the conversation so far, then add the follow-up.
second = client.responses.create(
    model="openai/gpt-5.5",
    input=[
        {"role": "user", "content": "Who won the 2022 FIFA World Cup?"},
        {"role": "assistant", "content": first.output_text},
        {"role": "user", "content": "Who did they beat in the final, and what was the score?"},
    ],
)
print(second.output_text)
For better prompt-cache affinity across a long chain, pass a stable prompt_cache_key on each request. It helps the backend reuse the shared prefix instead of reprocessing the whole replayed conversation each turn.

Next steps

Give it vision

Attach images to a request.

Shape the output

Stream output as it arrives and enforce structured JSON.