Create model response (Responses format)

curl --request POST \
  --url https://api.getinfinityblue.com/v1/responses \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "gpt-5.4",
  "input": "Introduce yourself in one sentence."
}
'

{
  "id": "<string>",
  "object": "<string>",
  "created_at": 123,
  "model": "<string>",
  "output": [
    {
      "type": "<string>",
      "id": "<string>",
      "status": "<string>",
      "role": "<string>",
      "content": [
        {
          "type": "<string>",
          "text": "<string>"
        }
      ]
    }
  ],
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 123,
    "total_tokens": 123,
    "prompt_tokens_details": {
      "cached_tokens": 123,
      "text_tokens": 123,
      "audio_tokens": 123,
      "image_tokens": 123
    },
    "completion_tokens_details": {
      "text_tokens": 123,
      "audio_tokens": 123,
      "reasoning_tokens": 123
    }
  }
}

POST /v1/responses

Models you can use

Model ID	Notes
`gpt-5.4`	GPT-5 flagship: top reasoning, coding, and agentic performance; 1M context
`gpt-5.4-mini`	Lightweight and balanced; suited to high-volume workloads and fallback use
`deepseek-v4-pro`	DeepSeek cost-effective reasoning model

See GET /v1/models for the full list.

Multi-turn continuation

Pass the id from a previous response as previous_response_id to continue the conversation without resending the full history.
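As a rough sketch of the continuation described above, the follow-up request body only needs the earlier response's id. The helper name `build_follow_up` and the sample id `resp_example_123` are hypothetical, used for illustration only; `input` is left optional here because the field description allows omitting it when `previous_response_id` is set.

```python
import json
from typing import Optional


def build_follow_up(model: str, previous_response: dict,
                    text: Optional[str] = None) -> dict:
    """Build a request body that continues from an earlier response."""
    payload = {
        "model": model,
        # The id of the earlier response stands in for the full history.
        "previous_response_id": previous_response["id"],
    }
    if text is not None:
        payload["input"] = text
    return payload


# Hypothetical earlier response; a real one comes back from POST /v1/responses.
first_response = {"id": "resp_example_123", "object": "response"}

follow_up = build_follow_up("gpt-5.4", first_response, "Now say it in French.")
print(json.dumps(follow_up, indent=2))
```

Only the new message is sent; the earlier turns are referenced by id rather than repeated.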

Reasoning control

For reasoning-capable models, use reasoning.effort (low / medium / high) to set reasoning depth, and reasoning.summary (auto / concise / detailed) to control how much reasoning detail is returned.
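One way to assemble the reasoning object is sketched below. The helper name `build_reasoning_request` is hypothetical, and the client-side validation is an added convenience rather than documented server behavior; the allowed values are the ones listed above.

```python
EFFORT_LEVELS = ("low", "medium", "high")
SUMMARY_MODES = ("auto", "concise", "detailed")


def build_reasoning_request(model: str, text: str,
                            effort: str, summary: str) -> dict:
    """Build a request body with a nested reasoning configuration."""
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"effort must be one of {EFFORT_LEVELS}")
    if summary not in SUMMARY_MODES:
        raise ValueError(f"summary must be one of {SUMMARY_MODES}")
    return {
        "model": model,
        "input": text,
        # reasoning.effort and reasoning.summary are nested under "reasoning".
        "reasoning": {"effort": effort, "summary": summary},
    }


request_body = build_reasoning_request(
    "gpt-5.4", "Plan a three-step migration.", effort="high", summary="concise"
)
```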

Context truncation

Set truncation to auto to let the system drop older context automatically when the context window is exceeded. Set it to disabled to return an error instead.
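A small sketch of setting the truncation strategy on an existing request body follows. The helper name `with_truncation` is hypothetical, and the check simply mirrors the two documented values.

```python
TRUNCATION_MODES = ("auto", "disabled")


def with_truncation(payload: dict, mode: str) -> dict:
    """Return a copy of the request body with the truncation strategy set."""
    if mode not in TRUNCATION_MODES:
        raise ValueError(f"truncation must be one of {TRUNCATION_MODES}")
    # Copy rather than mutate, so the base payload can be reused.
    return {**payload, "truncation": mode}


base = {"model": "gpt-5.4", "input": "Summarize our conversation so far."}
lenient = with_truncation(base, "auto")      # older context may be dropped
strict = with_truncation(base, "disabled")   # an overflow returns an error
```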

Authorizations

Authorization

string

header

required

Bearer token authentication. Format: Authorization: Bearer sk-xxxxxx. Get your API key in the console.
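The curl example at the top of the page could be mirrored in Python roughly as follows, using only the standard library. The helper name `build_request` is hypothetical and `sk-xxxxxx` is the placeholder key from the description above; the sketch builds the request without sending it.

```python
import json
import urllib.request

API_URL = "https://api.getinfinityblue.com/v1/responses"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated POST request."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_request(
    "sk-xxxxxx",
    {"model": "gpt-5.4", "input": "Introduce yourself in one sentence."},
)
# To send it: urllib.request.urlopen(request), then json.load() the result.
```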

Body

application/json

OpenAI Responses API request body.

model

string

required

Model ID, e.g. gpt-5.4. See GET /v1/models for the full list.

Example:

"gpt-5.4"

input

Input content — either a plain text string or an array of messages. Omit when using previous_response_id to continue a prior turn.

instructions

string

System-level instructions, equivalent to a system message in Chat Completions.

max_output_tokens

integer

Maximum number of tokens the model may generate in this response, including reasoning tokens.

temperature

number

Sampling temperature between 0 and 2, controlling output randomness.

Required range: 0 <= x <= 2

top_p

number

Nucleus sampling threshold. Tune this or temperature, not both.

Required range: 0 <= x <= 1
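The documented ranges for temperature and top_p can be checked on the client before sending, as in this sketch. The helper name `sampling_params` is hypothetical, and treating "not both" as a hard error is a choice made here; the text above only recommends it.

```python
from typing import Optional


def sampling_params(temperature: Optional[float] = None,
                    top_p: Optional[float] = None) -> dict:
    """Validate sampling settings against the documented ranges."""
    if temperature is not None and top_p is not None:
        raise ValueError("tune temperature or top_p, not both")
    params = {}
    if temperature is not None:
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must satisfy 0 <= x <= 2")
        params["temperature"] = temperature
    if top_p is not None:
        if not 0 <= top_p <= 1:
            raise ValueError("top_p must satisfy 0 <= x <= 1")
        params["top_p"] = top_p
    return params


payload = {
    "model": "gpt-5.4",
    "input": "Write a haiku about rivers.",
    **sampling_params(temperature=0.2),
}
```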

stream

boolean

default: false

Whether to stream the response as Server-Sent Events.
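When stream is true, the body arrives as Server-Sent Events, which can be read line by line. The sketch below assumes each event carries a JSON object in its `data:` field; the event name and the `delta` key in the sample are hypothetical, since this page does not list the event types. It also ignores multi-line `data:` fields, which the SSE format permits.

```python
import json
from typing import Any, Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[Any]:
    """Yield the decoded JSON payload of each `data:` line in an SSE stream."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # skip event names, comments, and blank separators
        payload = line[len("data:"):].strip()
        try:
            yield json.loads(payload)
        except json.JSONDecodeError:
            continue  # ignore payloads that are not JSON


# Hypothetical stream content, for illustration only.
sample_stream = [
    "event: example.delta\n",
    'data: {"delta": "Hel"}\n',
    "\n",
    'data: {"delta": "lo"}\n',
    "\n",
]
events = list(iter_sse_data(sample_stream))
```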

tools

object[]

A list of tools the model may call.

tool_choice

Tool calling strategy: auto, none, or required as a string, or an object specifying a particular tool.

Available options: auto, none, required

reasoning

object

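Reasoning configuration, only effective for reasoning-capable models. Child attributes: effort (low / medium / high) sets reasoning depth; summary (auto / concise / detailed) controls how much reasoning detail is returned.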

previous_response_id

string

The id of a prior response. When set, the conversation continues from that point without resending history.

truncation

enum<string>

Context truncation strategy. auto drops older context when the window is exceeded; disabled returns an error instead.

Available options: auto, disabled

Response

Successful response

OpenAI Responses API response body.

id

string

Unique identifier for this response, usable as previous_response_id in the next turn.

object

string

Object type, value is response.

Example:

"response"

created_at

integer

Unix timestamp (seconds) of creation.

status

enum<string>

Response status.

Available options: completed, failed, in_progress, incomplete

model

string

The model that actually processed the request.

output

object[]

List of output blocks generated by the model.

usage

object

Token usage statistics for the request.
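Reading the generated text and token counts out of a response might look like the sketch below. It relies only on the shape shown in the example response at the top of the page (output items with a content array of text blocks, and a usage object); the helper names and the sample values are hypothetical.

```python
def collect_text(response: dict) -> str:
    """Concatenate every text block found in the response's output items."""
    parts = []
    for item in response.get("output") or []:
        for block in item.get("content") or []:
            text = block.get("text")
            if isinstance(text, str):
                parts.append(text)
    return "".join(parts)


def token_summary(response: dict) -> dict:
    """Pick the top-level token counts out of the usage object."""
    usage = response.get("usage") or {}
    keys = ("prompt_tokens", "completion_tokens", "total_tokens")
    return {key: usage.get(key) for key in keys}


# Hypothetical response with made-up values, shaped like the example above.
sample_response = {
    "id": "resp_example_123",
    "output": [{"content": [{"type": "example_text", "text": "Hello."}]}],
    "usage": {"prompt_tokens": 9, "completion_tokens": 3, "total_tokens": 12},
}
answer = collect_text(sample_response)
counts = token_summary(sample_response)
```

Filtering on the block or item type is left out because the example response does not enumerate the possible type values.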
