POST /vendors/openai/v1/responses
Create model response
curl --request POST \
  --url https://api.mulerun.com/vendors/openai/v1/responses \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "gpt-5.2",
  "input": "What is the capital of France?"
}
'
{
  "id": "resp_67cb32528d6881909eb2859a55e18a85",
  "object": "response",
  "created_at": 1741369938,
  "model": "gpt-5.2",
  "status": "completed",
  "output": [
    {
      "id": "msg_67cb3252cfac8190865744873aada798",
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "The capital of France is Paris."
        }
      ]
    }
  ],
  "output_text": "The capital of France is Paris.",
  "usage": {
    "prompt_tokens": 33,
    "completion_tokens": 8,
    "total_tokens": 41,
    "prompt_tokens_details": {
      "cached_tokens": 0,
      "audio_tokens": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "audio_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0
    }
  }
}
This API is compatible with OpenAI’s Responses API format. For more details, refer to OpenAI’s official documentation.
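Because the endpoint returns OpenAI's response shape, the sample JSON above can be handled with a small helper. A minimal Python sketch (field names are taken directly from the example response; nothing here is specific to MuleRun):

```python
def response_text(resp: dict) -> str:
    """Return the response's text, preferring the convenience
    `output_text` field and falling back to concatenating the
    text parts of message items in `output`."""
    if resp.get("output_text"):
        return resp["output_text"]
    parts = []
    for item in resp.get("output", []):
        if item.get("type") != "message":
            continue
        for content in item.get("content", []):
            if content.get("type") == "output_text":
                parts.append(content.get("text", ""))
    return "".join(parts)

# The relevant fields of the sample response shown above:
sample = {
    "output_text": "The capital of France is Paris.",
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "content": [
                {"type": "output_text", "text": "The capital of France is Paris."}
            ],
        }
    ],
}
```

The fallback matters when `output_text` is absent or null, for example on responses that interleave reasoning items with message items.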

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json
model
string
required

Model deployment to use for response creation.

Example:

"gpt-5.2"

input

Input content for the model. Can be a simple text string or an array of structured message objects.

Example:

"Hello! How are you?"

instructions
string | null

System/developer message inserted into the model context to guide behavior, tone, and goals.

previous_response_id
string | null

ID of a previous response to use for multi-turn conversations and context continuation.
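For example, a follow-up turn can reference the `id` returned by the first call. A sketch of the two request bodies (the id is the sample one from above; the HTTP calls themselves are omitted so the snippet stays self-contained):

```python
first_request = {
    "model": "gpt-5.2",
    "input": "What is the capital of France?",
}

# Suppose the first call returned the sample response shown above:
first_response_id = "resp_67cb32528d6881909eb2859a55e18a85"

# The follow-up turn carries only the new user input; prior context
# is reconstructed server-side from previous_response_id.
follow_up_request = {
    "model": "gpt-5.2",
    "previous_response_id": first_response_id,
    "input": "What is its population?",
}
```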

prompt
object

Reference to a prompt template with variables.

max_output_tokens
integer | null

Upper bound for the number of tokens that can be generated (includes reasoning tokens).

Required range: x >= 1
max_tool_calls
integer | null

Maximum total number of built-in tool calls allowed in the response.

temperature
number | null

Sampling temperature between 0 and 2. Higher values make output more random, lower values make it more focused and deterministic.

Required range: 0 <= x <= 2
Example:

0.7

top_p
number | null

Nucleus sampling parameter. Model considers tokens with top_p probability mass. For example, 0.1 means only tokens comprising the top 10% probability mass are considered.

Required range: 0 <= x <= 1
Example:

1

top_logprobs
integer | null

Number of most likely tokens to return at each token position (0-20).

Required range: 0 <= x <= 20
parallel_tool_calls
boolean | null

Allow the model to run tool calls in parallel during tool use.

stream
boolean | null

If true, the response will be streamed back using server-sent events.
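When streaming, events arrive as `data:` lines over the SSE connection. A minimal sketch of collecting text deltas (the `response.output_text.delta` event type and `delta` field follow OpenAI's Responses streaming format; the sample lines below are illustrative, not captured output):

```python
import json

def extract_text_deltas(sse_lines):
    """Concatenate text deltas from Responses-style SSE lines.
    Assumes events of type 'response.output_text.delta' carry
    their text fragment in a 'delta' field."""
    chunks = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip event/comment/blank framing lines
        data = line[len("data: "):].strip()
        if data == "[DONE]":
            break
        event = json.loads(data)
        if event.get("type") == "response.output_text.delta":
            chunks.append(event["delta"])
    return "".join(chunks)

# Illustrative stream fragment:
sample_lines = [
    'data: {"type": "response.output_text.delta", "delta": "Paris"}',
    'data: {"type": "response.output_text.delta", "delta": " is the capital."}',
    'data: [DONE]',
]
```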

background
boolean | null

If true, the response will run in the background. Use GET /v1/responses/{response_id} to poll for completion.
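A background response can then be polled until it leaves the in-progress state. A sketch of the loop (the `fetch` callable is a stand-in for whatever HTTP client performs the GET; status values are those listed under the response's `status` field):

```python
import time

def poll_response(fetch, response_id, interval=2.0, max_attempts=60):
    """Poll GET /v1/responses/{response_id} via the supplied `fetch`
    callable until the response is no longer in progress."""
    for _ in range(max_attempts):
        resp = fetch(response_id)
        if resp["status"] != "in_progress":
            return resp  # completed, incomplete, or failed
        time.sleep(interval)
    raise TimeoutError(f"response {response_id} still in progress")
```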

store
boolean | null

Whether to store the response for later retrieval. Stored responses can be accessed via the retrieve endpoint.

text
object

Configuration for text response format. Supports plain text or structured JSON data (Structured Outputs).
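As a sketch of the Structured Outputs shape (assuming the `json_schema` format from OpenAI's Responses API; the schema name and fields are invented for illustration):

```python
structured_request = {
    "model": "gpt-5.2",
    "input": "What is the capital of France?",
    "text": {
        "format": {
            "type": "json_schema",
            "name": "capital_answer",  # hypothetical schema name
            "schema": {
                "type": "object",
                "properties": {
                    "country": {"type": "string"},
                    "capital": {"type": "string"},
                },
                "required": ["country", "capital"],
                "additionalProperties": False,
            },
            "strict": True,
        }
    },
}
```

With this format, the model's `output_text` is a JSON document conforming to the schema rather than free-form prose.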

tools
object[] | null

Tools the model may call while generating a response. Supports built-in tools (file_search, web_search, code_interpreter, image_generation), MCP tools, and custom function calls.

tool_choice

How the model should select which tool (or tools) to use when generating a response.

Available options:
none,
auto,
required
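For example, a request carrying one custom function tool might look like the following (the flat function-tool shape follows OpenAI's Responses API format; `get_weather` and its parameters are invented for illustration):

```python
tools_request = {
    "model": "gpt-5.2",
    "input": "What's the weather in Paris?",
    "tool_choice": "auto",  # let the model decide whether to call the tool
    "tools": [
        {
            "type": "function",
            "name": "get_weather",  # hypothetical function
            "description": "Look up current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        }
    ],
}
```

Setting `tool_choice` to `"required"` instead forces at least one tool call; `"none"` disables tool use for the request.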
reasoning
object

Configuration for reasoning models only (e.g., o-series models).

truncation
enum<string> | null

Truncation strategy for handling context window limits.

Available options:
auto,
disabled
metadata
object

Set of up to 16 key-value pairs for storing additional information. Keys are limited to 64 characters and values to 512 characters.

user
string | null

Unique identifier representing the end-user, for abuse monitoring and detection.

include
string[] | null

Additional output data to include in the response. Examples: "code_interpreter_call.outputs", "file_search_call.results"

Response

200 - application/json

OK

Response object returned after creating a model response.

id
string
required

Unique identifier for the response

Example:

"resp_67cb32528d6881909eb2859a55e18a85"

object
enum<string>
required

Object type identifier

Available options:
response
created_at
number
required

Unix timestamp of when the response was created

Example:

1741369938

model
string
required

The model that was used to generate the response

status
enum<string>
required

The current status of the response

Available options:
completed,
incomplete,
in_progress,
failed
output
(Message output · object | Reasoning output · object | Function call output · object)[]

Array of output items from the model

output_text
string | null

Concatenated text from all text outputs in the response

usage
object

Token usage statistics for the response

error
object

Error information if the response failed