Creates a new model response for the given input and model.
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Model deployment to use for response creation.
"gpt-5.2"
Input content for the model. Can be a simple text string or an array of structured message objects.
"Hello! How are you?"
System/developer message inserted into the model context to guide behavior, tone, and goals.
ID of a previous response to use for multi-turn conversations and context continuation.
Reference to a prompt template with variables.
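A sketch of multi-turn usage, assuming the parameters are named instructions and previous_response_id; the host and token handling are placeholders as above.

```python
import os
import requests

BASE_URL = "https://api.example.com/v1"  # placeholder host
HEADERS = {"Authorization": f"Bearer {os.environ['API_TOKEN']}"}

# First turn: guide behavior with a system/developer-style instruction.
first = requests.post(
    f"{BASE_URL}/responses",
    headers=HEADERS,
    json={
        "model": "gpt-5.2",
        "instructions": "You are a concise, friendly assistant.",
        "input": "Hello! How are you?",
    },
).json()

# Second turn: continue the conversation by referencing the previous response.
followup = requests.post(
    f"{BASE_URL}/responses",
    headers=HEADERS,
    json={
        "model": "gpt-5.2",
        "input": "What can you help me with?",
        "previous_response_id": first["id"],
    },
).json()
print(followup.get("output_text"))
```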
Upper bound for the number of tokens that can be generated (includes reasoning tokens).
x >= 1
Maximum total number of built-in tool calls allowed in the response.
Sampling temperature between 0 and 2. Higher values make output more random, lower values make it more focused and deterministic.
0 <= x <= 2
0.7
Nucleus sampling parameter. Model considers tokens with top_p probability mass. For example, 0.1 means only tokens comprising the top 10% probability mass are considered.
0 <= x <= 1
1
Number of most likely tokens to return at each token position (0-20).
0 <= x <= 20
Allow the model to run tool calls in parallel during tool use.
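A request-body sketch combining the generation and sampling controls above. The JSON field names (max_output_tokens, max_tool_calls, temperature, top_p, top_logprobs, parallel_tool_calls) are assumptions; the values stay within the documented ranges.

```python
payload = {
    "model": "gpt-5.2",
    "input": "Write a haiku about the sea.",
    "max_output_tokens": 256,     # >= 1; includes reasoning tokens
    "max_tool_calls": 3,          # cap on built-in tool calls
    "temperature": 0.7,           # 0 to 2
    "top_p": 1,                   # 0 to 1
    "top_logprobs": 5,            # 0 to 20 most likely tokens per position
    "parallel_tool_calls": True,  # allow tool calls to run in parallel
}
# POST this dict as the JSON body of /v1/responses, as in the earlier sketch.
```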
If true, the response will be streamed back using server-sent events.
If true, the response will run in the background. Use GET /v1/responses/{response_id} to poll for completion.
Whether to store the response for later retrieval. Stored responses can be accessed via the retrieve endpoint.
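A sketch of background execution with polling, using the documented GET /v1/responses/{response_id} path; streaming via server-sent events is omitted here. The background, store, id, and status field names are assumptions.

```python
import os
import time
import requests

BASE_URL = "https://api.example.com/v1"  # placeholder host
HEADERS = {"Authorization": f"Bearer {os.environ['API_TOKEN']}"}

# Start a background response and keep it stored for later retrieval.
created = requests.post(
    f"{BASE_URL}/responses",
    headers=HEADERS,
    json={
        "model": "gpt-5.2",
        "input": "Summarize this quarter's sales trends.",
        "background": True,
        "store": True,
    },
).json()

# Poll until the response reaches a terminal status.
while True:
    current = requests.get(
        f"{BASE_URL}/responses/{created['id']}", headers=HEADERS
    ).json()
    if current["status"] in ("completed", "incomplete", "failed"):
        break
    time.sleep(1)

print(current["status"], current.get("output_text"))
```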
Configuration for text response format. Supports plain text or structured JSON data (Structured Outputs).
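The exact shape of the text-format configuration is not spelled out in this reference; the sketch below assumes a Structured Outputs style json_schema format object, so every name inside the "text" field is an assumption.

```python
payload = {
    "model": "gpt-5.2",
    "input": "Extract the city and country from: 'I moved to Lyon, France last year.'",
    # Assumed shape for a structured (JSON Schema) output format.
    "text": {
        "format": {
            "type": "json_schema",
            "name": "location",
            "schema": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "country": {"type": "string"},
                },
                "required": ["city", "country"],
                "additionalProperties": False,
            },
        }
    },
}
```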
Tools the model may call while generating a response. Supports built-in tools (file_search, web_search, code_interpreter, image_generation), MCP Tools, and custom function calls.
How the model should select which tool (or tools) to use when generating a response.
none, auto, required
Configuration for reasoning models only (e.g., o-series models).
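A request-body sketch combining a custom function tool, a tool_choice value from the list above, and a reasoning configuration. All field names and the tool and reasoning object shapes are assumptions.

```python
payload = {
    "model": "gpt-5.2",
    "input": "What's the weather like in Paris right now?",
    "tool_choice": "auto",  # none | auto | required
    "tools": [
        {
            # Assumed shape for a custom function tool definition.
            "type": "function",
            "name": "get_weather",
            "description": "Look up current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        }
    ],
    # Reasoning configuration applies to reasoning models only.
    "reasoning": {"effort": "medium"},
}
```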
Truncation strategy for handling context window limits.
auto, disabled
Set of up to 16 key-value pairs for storing additional information. Keys are limited to 64 characters and values to 512 characters.
Unique identifier representing the end-user, for abuse monitoring and detection.
Additional output data to include in the response. Examples: "code_interpreter_call.outputs", "file_search_call.results"
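A sketch of the remaining request options. The field names (truncation, metadata, user, include) are assumptions; the include values come from the examples above.

```python
payload = {
    "model": "gpt-5.2",
    "input": "Run the analysis script and report the results.",
    "truncation": "auto",  # auto | disabled
    "metadata": {            # up to 16 key-value pairs
        "project": "quarterly-report",
        "ticket": "OPS-1234",
    },
    "user": "user_8471",     # stable end-user identifier for abuse monitoring
    "include": [
        "code_interpreter_call.outputs",
        "file_search_call.results",
    ],
}
```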
OK
Response object returned after creating a model response.
Unique identifier for the response
"resp_67cb32528d6881909eb2859a55e18a85"
Object type identifier
response
Unix timestamp of when the response was created
1741369938
The model that was used to generate the response
The current status of the response
completed, incomplete, in_progress, failed
Array of output items from the model
Concatenated text from all text outputs in the response
Token usage statistics for the response
Error information if the response failed
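A sketch of reading the response object after a create call. The field names (id, object, created_at, model, status, output_text, usage, error) are assumptions consistent with the descriptions above; example values are taken from this reference.

```python
data = resp.json()  # `resp` from the create request sketch earlier

print(data["id"])          # e.g. "resp_67cb32528d6881909eb2859a55e18a85"
print(data["object"])      # "response"
print(data["created_at"])  # Unix timestamp, e.g. 1741369938
print(data["model"])       # model used to generate the response
print(data["status"])      # completed | incomplete | in_progress | failed

if data["status"] == "completed":
    print(data.get("output_text"))  # concatenated text from all text outputs
    print(data.get("usage"))        # token usage statistics
else:
    print(data.get("error"))        # error information if the response failed
```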