Create chat completion - InfinityBlue API

curl --request POST \
  --url https://api.getinfinityblue.com/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "gpt-5.4",
  "messages": [
    {
      "role": "user",
      "content": "Introduce yourself in one sentence."
    }
  ]
}
'

{
  "id": "<string>",
  "object": "<string>",
  "created": 123,
  "model": "<string>",
  "choices": [
    {
      "index": 123,
      "message": {
        "content": "<string>",
        "name": "<string>",
        "tool_calls": [
          {
            "id": "<string>",
            "type": "<string>",
            "function": {
              "name": "<string>",
              "arguments": "<string>"
            }
          }
        ],
        "tool_call_id": "<string>",
        "reasoning_content": "<string>"
      }
    }
  ],
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 123,
    "total_tokens": 123,
    "prompt_tokens_details": {
      "cached_tokens": 123,
      "text_tokens": 123,
      "audio_tokens": 123,
      "image_tokens": 123
    },
    "completion_tokens_details": {
      "text_tokens": 123,
      "audio_tokens": 123,
      "reasoning_tokens": 123
    }
  },
  "system_fingerprint": "<string>"
}

POST /v1/chat/completions

Models you can use

Pass any chat-capable model ID in model, for example:

Model ID	Notes
`gpt-5.4`	GPT-5 flagship — top reasoning / coding / agentic, 1M context
`gpt-5.4-mini`	Lightweight, balanced — great for high-volume and fallback
`gemini-3.1-pro-preview`	Gemini flagship — strong multimodal, 1M context
`deepseek-v4-pro`	DeepSeek cost-effective reasoning model

See GET /v1/models or the pricing page for the full list.

Streaming

Set stream: true to receive Server-Sent Events (SSE). Each line is data: {json} and the stream ends with data: [DONE]:

data: {"id":"chatcmpl-...","choices":[{"delta":{"content":"He"}}]}
data: {"id":"chatcmpl-...","choices":[{"delta":{"content":"llo"}}]}
data: [DONE]
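
A minimal Python sketch of consuming such a stream, assuming each event arrives as a single `data:` line as in the sample above. The helper name `collect_stream_text` is hypothetical, and the sample list stands in for the lines of a real HTTP response body.

```python
import json
from typing import Iterable


def collect_stream_text(lines: Iterable[str]) -> str:
    """Concatenate delta.content fragments from SSE lines until [DONE]."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data event
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


# Sample lines copied from the stream shown above.
sample = [
    'data: {"id":"chatcmpl-...","choices":[{"delta":{"content":"He"}}]}',
    'data: {"id":"chatcmpl-...","choices":[{"delta":{"content":"llo"}}]}',
    "data: [DONE]",
]
print(collect_stream_text(sample))  # prints "Hello"
```

In a real client the same function could be fed the decoded lines of the response as they arrive, so text can be rendered incrementally instead of joined at the end.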

Reasoning models

For reasoning-capable models, use reasoning_effort (low / medium / high) to control reasoning depth. The model returns its reasoning in the reasoning_content field — render it collapsed in your UI.
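
As a rough illustration, the request body and the handling of the returned reasoning might look like the sketch below. The helper names `build_reasoning_request` and `split_reasoning` are hypothetical, and the sample response is invented data shaped like the response schema on this page.

```python
from typing import Optional, Tuple

ALLOWED_EFFORTS = ("low", "medium", "high")


def build_reasoning_request(model: str, prompt: str, effort: str = "medium") -> dict:
    """Build a request body that sets reasoning_effort."""
    if effort not in ALLOWED_EFFORTS:
        raise ValueError(f"reasoning_effort must be one of {ALLOWED_EFFORTS}")
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": effort,
    }


def split_reasoning(response: dict) -> Tuple[Optional[str], Optional[str]]:
    """Return (reasoning_content, content) from the first choice."""
    message = response["choices"][0]["message"]
    return message.get("reasoning_content"), message.get("content")


body = build_reasoning_request("gpt-5.4", "Is 91 prime?", effort="high")

# Invented response for illustration only.
fake_response = {
    "choices": [
        {
            "index": 0,
            "message": {
                "content": "No, 91 = 7 x 13.",
                "reasoning_content": "Try small divisors: 91 / 7 = 13.",
            },
        }
    ]
}
reasoning, answer = split_reasoning(fake_response)
```

Keeping `reasoning` separate from `answer` makes it straightforward to show the answer by default and put the reasoning behind a collapsed panel.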

Tool calling

Define functions as JSON Schema in tools. The model returns structured tool_calls that your application executes and feeds back in a follow-up request. Use tool_choice to control the strategy (auto / none / required, or a specific function).
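
The round trip can be sketched as follows. This assumes the common OpenAI-compatible shapes: a `tools` entry of the form `{"type": "function", "function": {...}}` and a follow-up message with `"role": "tool"`. Neither shape is spelled out on this page, so treat both as assumptions. `get_weather`, `REGISTRY`, and `run_tool_calls` are hypothetical names, and the assistant message is invented sample data.

```python
import json


def get_weather(city: str) -> dict:
    """Hypothetical local function; a real one would query a weather service."""
    return {"city": city, "forecast": "sunny"}


# Assumed OpenAI-compatible tool definition: parameters are JSON Schema.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the forecast for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

REGISTRY = {"get_weather": get_weather}


def run_tool_calls(assistant_message: dict) -> list:
    """Execute each tool call and build the messages to send back."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        func = REGISTRY[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])  # arguments is a JSON string
        results.append(
            {
                "role": "tool",  # assumed role name for tool results
                "tool_call_id": call["id"],
                "content": json.dumps(func(**args)),
            }
        )
    return results


# Invented assistant message shaped like the response schema on this page.
assistant_message = {
    "content": None,
    "tool_calls": [
        {
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
        }
    ],
}
follow_up = run_tool_calls(assistant_message)
```

The follow-up request would then carry the original messages, the assistant message, and the entries in `follow_up`, so the model can use the tool output in its final answer.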

Authorizations

Authorization

string

header

required

Bearer token authentication, format: Authorization: Bearer sk-xxxxxx. Get your API key in the console.
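
For readers not using curl, a sketch of the same authenticated request with Python's standard library follows. The helper name `build_request` is hypothetical and `sk-xxxxxx` is a placeholder, not a real key. The request is only constructed here, not sent.

```python
import json
import urllib.request

API_URL = "https://api.getinfinityblue.com/v1/chat/completions"


def build_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build a POST request carrying the Bearer token and a JSON body."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request(
    "sk-xxxxxx",  # placeholder key
    {
        "model": "gpt-5.4",
        "messages": [{"role": "user", "content": "Introduce yourself in one sentence."}],
    },
)

# To actually send it (requires network access and a valid key):
# with urllib.request.urlopen(req) as resp:
#     completion = json.load(resp)
```

Reading the key from an environment variable rather than hard-coding it is usually the safer choice in real code.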

Body

application/json

model

string

required

Model ID, e.g. gpt-5.4. See GET /v1/models for the full list.

Example:

"gpt-5.4"

messages

object[]

required

The messages comprising the conversation so far, in order.

temperature

number

default:1

Sampling temperature between 0 and 2. Higher values (e.g. 0.8) make output more random; lower values (e.g. 0.2) make it more focused and deterministic. Tune this or top_p, not both.

Required range: 0 <= x <= 2

top_p

number

default:1

Nucleus sampling. The model considers only tokens within the top top_p cumulative probability mass — e.g. 0.1 means only the top 10%. Tune this or temperature, not both.

Required range: 0 <= x <= 1

n

integer

default:1

Number of completions to generate for each input message.

Required range: x >= 1

stream

boolean

default:false

Whether to stream the response as Server-Sent Events.

stream_options

object

Options for streaming, only used when stream=true.

stop

Up to 4 stop sequences. Generation stops at any of them.

max_tokens

integer

Maximum tokens to generate in the completion (legacy). Use max_completion_tokens for reasoning models.

max_completion_tokens

integer

Maximum tokens to generate, including reasoning tokens.

presence_penalty

number

default:0

Between -2.0 and 2.0. Positive values penalize tokens that have already appeared, increasing the model's likelihood to talk about new topics.

Required range: -2 <= x <= 2

frequency_penalty

number

default:0

Between -2.0 and 2.0. Positive values penalize tokens based on their existing frequency, decreasing verbatim repetition.

Required range: -2 <= x <= 2

logit_bias

object

Bias map adjusting token likelihoods; keys are token IDs, values -100 to 100.

user

string

A unique identifier for your end user, useful for abuse monitoring.

tools

object[]

A list of tools the model may call. Currently only function is supported.

tool_choice

Controls whether and how the model calls tools. none disables calls, auto lets the model decide, required forces at least one call; or pass an object to force a specific function.

Available options: none, auto, required

response_format

object

Controls the format of the model output.

seed

integer

Random seed. Requests that use the same seed and the same parameters return results that are as consistent as possible.

reasoning_effort

enum<string>

Reasoning depth, only effective for reasoning-capable models.

Available options: low, medium, high

modalities

enum<string>[]

The output modalities you want the model to return.

Available options: text, audio

audio

object

Audio output parameters, used when modalities includes audio.
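
The numeric ranges listed in this section can be checked on the client before a request is sent. The sketch below mirrors only the limits stated above. `validate_body` is a hypothetical helper, and it assumes that `stop` may be passed as a list, which this page does not state explicitly.

```python
from typing import List

# Inclusive ranges taken from the parameter descriptions above.
RANGES = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "presence_penalty": (-2, 2),
    "frequency_penalty": (-2, 2),
}
REASONING_EFFORTS = ("low", "medium", "high")


def validate_body(body: dict) -> List[str]:
    """Return a list of problems found in a request body (empty if none)."""
    problems = []
    for key in ("model", "messages"):
        if key not in body:
            problems.append(f"{key} is required")
    for name, (low, high) in RANGES.items():
        if name in body and not low <= body[name] <= high:
            problems.append(f"{name} must be between {low} and {high}")
    stop = body.get("stop")
    if isinstance(stop, list) and len(stop) > 4:  # assumes stop can be a list
        problems.append("stop accepts at most 4 sequences")
    for token_id, bias in body.get("logit_bias", {}).items():
        if not -100 <= bias <= 100:
            problems.append(f"logit_bias[{token_id}] must be between -100 and 100")
    effort = body.get("reasoning_effort")
    if effort is not None and effort not in REASONING_EFFORTS:
        problems.append("reasoning_effort must be low, medium, or high")
    return problems


body = {
    "model": "gpt-5.4",
    "messages": [{"role": "user", "content": "Hi"}],
    "temperature": 2.5,
    "top_p": 0.9,
}
print(validate_body(body))  # prints ['temperature must be between 0 and 2']
```

A check like this only catches obvious mistakes early; the server remains the authority on what it accepts.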

Response

Successful response

id

string

Unique identifier for this completion.

object

string

Example:

"chat.completion"

created

integer

Unix timestamp (seconds) of creation.

model

string

The model that actually processed the request.

choices

object[]

The list of completions generated by the model.

usage

object

Token usage statistics for the request.

system_fingerprint

string
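
To illustrate how these fields fit together, here is a sketch that pulls the generated text and token counts out of a parsed response. `summarize_response` is a hypothetical helper, and the sample dictionary holds invented values arranged in the shape of the response schema at the top of this page.

```python
def summarize_response(response: dict) -> dict:
    """Extract the first choice's text plus a few usage counters."""
    message = response["choices"][0]["message"]
    usage = response.get("usage", {})
    details = usage.get("completion_tokens_details", {})
    return {
        "text": message.get("content"),
        "model": response.get("model"),
        "total_tokens": usage.get("total_tokens"),
        "reasoning_tokens": details.get("reasoning_tokens", 0),
    }


# Invented values for illustration only.
sample_response = {
    "id": "chatcmpl-example",
    "object": "chat.completion",
    "created": 1700000000,
    "model": "gpt-5.4",
    "choices": [{"index": 0, "message": {"content": "Hello, I am an assistant."}}],
    "usage": {
        "prompt_tokens": 12,
        "completion_tokens": 8,
        "total_tokens": 20,
        "completion_tokens_details": {"text_tokens": 8, "reasoning_tokens": 0},
    },
}
summary = summarize_response(sample_response)
```

Using `.get` with defaults keeps the helper tolerant of responses that omit optional fields such as the token detail breakdowns.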
