POST /vendors/openai/v1/responses
Create model response
curl --request POST \
  --url https://api.mulerun.com/vendors/openai/v1/responses \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "gpt-5.2",
  "input": "What is the capital of France?"
}
'
{
  "id": "resp_67cb32528d6881909eb2859a55e18a85",
  "object": "response",
  "created_at": 1741369938,
  "model": "gpt-5.2",
  "status": "completed",
  "output": [
    {
      "id": "msg_67cb3252cfac8190865744873aada798",
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "The capital of France is Paris."
        }
      ]
    }
  ],
  "output_text": "The capital of France is Paris.",
  "usage": {
    "prompt_tokens": 33,
    "completion_tokens": 8,
    "total_tokens": 41,
    "prompt_tokens_details": {
      "cached_tokens": 0,
      "audio_tokens": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "audio_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0
    }
  }
}
This API is compatible with OpenAI’s Responses API format. For more details, refer to OpenAI’s official documentation.
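Because the endpoint returns OpenAI's response shape, the sample JSON above can be handled with a small helper. A minimal Python sketch (field names are taken directly from the example response; nothing here is specific to MuleRun):

```python
def response_text(resp: dict) -> str:
    """Return the response's text, preferring the convenience
    `output_text` field and falling back to concatenating the
    text parts of message items in `output`."""
    if resp.get("output_text"):
        return resp["output_text"]
    parts = []
    for item in resp.get("output", []):
        if item.get("type") != "message":
            continue
        for content in item.get("content", []):
            if content.get("type") == "output_text":
                parts.append(content.get("text", ""))
    return "".join(parts)

# The relevant fields of the sample response shown above:
sample = {
    "output_text": "The capital of France is Paris.",
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "content": [
                {"type": "output_text", "text": "The capital of France is Paris."}
            ],
        }
    ],
}
```

The fallback matters when `output_text` is absent or null, for example on responses that interleave reasoning items with message items.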

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json
model
string
required

Model deployment to use for response creation.

Example:

"gpt-5.2"

input

Input content for the model. Can be a simple text string or an array of structured message objects.

Example:

"Hello! How are you?"

instructions
string | null

System/developer message inserted into the model context to guide behavior, tone, and goals.

previous_response_id
string | null

ID of a previous response to use for multi-turn conversations and context continuation.
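For example, a follow-up turn can reference the `id` returned by the first call. A sketch of the two request bodies (the id is the sample one from above; the HTTP calls themselves are omitted so the snippet stays self-contained):

```python
first_request = {
    "model": "gpt-5.2",
    "input": "What is the capital of France?",
}

# Suppose the first call returned the sample response shown above:
first_response_id = "resp_67cb32528d6881909eb2859a55e18a85"

# The follow-up turn carries only the new user input; prior context
# is reconstructed server-side from previous_response_id.
follow_up_request = {
    "model": "gpt-5.2",
    "previous_response_id": first_response_id,
    "input": "What is its population?",
}
```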

prompt
object

Reference to a prompt template with variables.

max_output_tokens
integer | null

Upper bound for the number of tokens that can be generated (includes reasoning tokens).

Required range: x >= 1
max_tool_calls
integer | null

Maximum total number of built-in tool calls allowed in the response.

temperature
number | null

Sampling temperature between 0 and 2. Higher values make output more random, lower values make it more focused and deterministic.

Required range: 0 <= x <= 2
Example:

0.7

top_p
number | null

Nucleus sampling parameter. Model considers tokens with top_p probability mass. For example, 0.1 means only tokens comprising the top 10% probability mass are considered.

Required range: 0 <= x <= 1
Example:

1

top_logprobs
integer | null

Number of most likely tokens to return at each token position (0-20).

Required range: 0 <= x <= 20
parallel_tool_calls
boolean | null

Allow the model to run tool calls in parallel during tool use.

stream
boolean | null

If true, the response will be streamed back using server-sent events.
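When streaming, events arrive as `data:` lines over the SSE connection. A minimal sketch of collecting text deltas (the `response.output_text.delta` event type and `delta` field follow OpenAI's Responses streaming format; the sample lines below are illustrative, not captured output):

```python
import json

def extract_text_deltas(sse_lines):
    """Concatenate text deltas from Responses-style SSE lines.
    Assumes events of type 'response.output_text.delta' carry
    their text fragment in a 'delta' field."""
    chunks = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip event/comment/blank framing lines
        data = line[len("data: "):].strip()
        if data == "[DONE]":
            break
        event = json.loads(data)
        if event.get("type") == "response.output_text.delta":
            chunks.append(event["delta"])
    return "".join(chunks)

# Illustrative stream fragment:
sample_lines = [
    'data: {"type": "response.output_text.delta", "delta": "Paris"}',
    'data: {"type": "response.output_text.delta", "delta": " is the capital."}',
    'data: [DONE]',
]
```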

background
boolean | null

If true, the response will run in the background. Use GET /v1/responses/{response_id} to poll for completion.
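A background response can then be polled until it leaves the in-progress state. A sketch of the loop (the `fetch` callable is a stand-in for whatever HTTP client performs the GET; status values are those listed under the response's `status` field):

```python
import time

def poll_response(fetch, response_id, interval=2.0, max_attempts=60):
    """Poll GET /v1/responses/{response_id} via the supplied `fetch`
    callable until the response is no longer in progress."""
    for _ in range(max_attempts):
        resp = fetch(response_id)
        if resp["status"] != "in_progress":
            return resp  # completed, incomplete, or failed
        time.sleep(interval)
    raise TimeoutError(f"response {response_id} still in progress")
```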

store
boolean | null

Whether to store the response for later retrieval. Stored responses can be accessed via the retrieve endpoint.

text
object

Configuration for text response format. Supports plain text or structured JSON data (Structured Outputs).
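As a sketch of the Structured Outputs shape (assuming the `json_schema` format from OpenAI's Responses API; the schema name and fields are invented for illustration):

```python
structured_request = {
    "model": "gpt-5.2",
    "input": "What is the capital of France?",
    "text": {
        "format": {
            "type": "json_schema",
            "name": "capital_answer",  # hypothetical schema name
            "schema": {
                "type": "object",
                "properties": {
                    "country": {"type": "string"},
                    "capital": {"type": "string"},
                },
                "required": ["country", "capital"],
                "additionalProperties": False,
            },
            "strict": True,
        }
    },
}
```

With this format, the model's `output_text` is a JSON document conforming to the schema rather than free-form prose.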

tools
object[] | null

Tools the model may call while generating a response. Supports built-in tools (file_search, web_search, code_interpreter, image_generation), MCP tools, and custom function calls.

tool_choice

How the model should select which tool (or tools) to use when generating a response.

Available options:
none,
auto,
required
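For example, a request carrying one custom function tool might look like the following (the flat function-tool shape follows OpenAI's Responses API format; `get_weather` and its parameters are invented for illustration):

```python
tools_request = {
    "model": "gpt-5.2",
    "input": "What's the weather in Paris?",
    "tool_choice": "auto",  # let the model decide whether to call the tool
    "tools": [
        {
            "type": "function",
            "name": "get_weather",  # hypothetical function
            "description": "Look up current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        }
    ],
}
```

Setting `tool_choice` to `"required"` instead forces at least one tool call; `"none"` disables tool use for the request.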
reasoning
object

Configuration for reasoning models only (e.g., o-series models).

truncation
enum<string> | null

Truncation strategy for handling context window limits.

Available options:
auto,
disabled
metadata
object

Set of up to 16 key-value pairs for storing additional information. Keys are limited to 64 characters and values to 512 characters.

user
string | null

Unique identifier representing the end-user, for abuse monitoring and detection.

include
string[] | null

Additional output data to include in the response. Examples: "code_interpreter_call.outputs", "file_search_call.results"

Response

200 - application/json

OK

Response object returned after creating a model response.

id
string
required

Unique identifier for the response

Example:

"resp_67cb32528d6881909eb2859a55e18a85"

object
enum<string>
required

Object type identifier

Available options:
response
created_at
number
required

Unix timestamp of when the response was created

Example:

1741369938

model
string
required

The model that was used to generate the response

status
enum<string>
required

The current status of the response

Available options:
completed,
incomplete,
in_progress,
failed
output
(Message output · object | Reasoning output · object | Function call output · object)[]

Array of output items from the model

output_text
string | null

Concatenated text from all text outputs in the response

usage
object

Token usage statistics for the response

error
object

Error information if the response failed