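Streaming responses
Set stream: true to receive the model's output token by token. This suits chat UIs that render text as it arrives and shortens the wait users perceive.
Quick example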
curl https://api.getinfinityblue.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "stream": true,
    "messages": [{"role": "user", "content": "Count to 10"}]
  }'
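The response is delivered as Server-Sent Events (SSE): each event is a line of the form data: {json}, and the stream ends with data: [DONE].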
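The framing described above can be handled in a few lines. The following is a minimal sketch, not the official client: the helper name iter_sse_events is hypothetical, and it assumes each event fits on a single data: line, as in the format shown here.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON payload of each `data:` line until [DONE].

    Hypothetical helper: assumes one JSON object per `data:` line.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and any non-data lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end-of-stream sentinel, nothing more to parse
        yield json.loads(payload)


# Hand-written sample lines in the documented format (not real API output).
sample = [
    'data: {"choices": [{"index": 0, "delta": {"content": "1"}, "finish_reason": null}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}]}',
    "",
    "data: [DONE]",
]
events = list(iter_sse_events(sample))
print(len(events))
```

In a real client the lines argument would come from the HTTP response body read line by line; the generator stops as soon as it sees the [DONE] sentinel.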
The JSON payload of each event has the following structure:
{
  "id": "chatcmpl-9f3a8b2e1c0d",
  "object": "chat.completion.chunk",
  "created": 1717430000,
  "model": "gpt-5.4",
  "choices": [
    {
      "index": 0,
      "delta": {
        "content": "1"
      },
      "finish_reason": null
    }
  ]
}
In the final chunk, finish_reason changes to "stop" (or "length"), and data: [DONE] follows immediately after.
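To make the chunk semantics concrete, here is a small sketch that assembles the full reply from already-parsed chunks and records the finish reason. The function name collect_stream is hypothetical, and the assumption that the final chunk carries an empty delta is mine rather than something this page states.

```python
from typing import Iterable, Optional, Tuple


def collect_stream(chunks: Iterable[dict]) -> Tuple[str, Optional[str]]:
    """Concatenate delta.content for choice 0 and return (text, finish_reason)."""
    parts = []
    finish_reason = None
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            if choice.get("index", 0) != 0:
                continue  # this sketch only follows the first choice
            content = (choice.get("delta") or {}).get("content")
            if content:
                parts.append(content)
            if choice.get("finish_reason") is not None:
                finish_reason = choice["finish_reason"]  # "stop" or "length"
    return "".join(parts), finish_reason


# Hand-written chunks in the documented shape (not real API output).
sample_chunks = [
    {"choices": [{"index": 0, "delta": {"content": "1"}, "finish_reason": None}]},
    {"choices": [{"index": 0, "delta": {"content": ", 2"}, "finish_reason": None}]},
    {"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}]},
]
text, reason = collect_stream(sample_chunks)
print(text, reason)
```

A finish reason of "length" means the output was cut off by the token limit, so a caller may want to flag the reply as truncated rather than complete.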
Python example
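from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.getinfinityblue.com/v1",
)

# stream=True makes the SDK return an iterator of chunks instead of one response.
stream = client.chat.completions.create(
    model="gpt-5.4",
    stream=True,
    messages=[{"role": "user", "content": "Count to 10"}],
)
for chunk in stream:
    # Some chunks carry no choices or an empty delta, so guard both.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()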
JavaScript / Node example
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://api.getinfinityblue.com/v1",
});

// stream: true returns an async iterable of chunks.
const stream = await client.chat.completions.create({
  model: "gpt-5.4",
  stream: true,
  messages: [{ role: "user", content: "Count to 10" }],
});

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content ?? "";
  process.stdout.write(content);
}
In the browser
EventSource does not support custom request headers (and only issues GET requests), so use fetch with a ReadableStream instead:
const response = await fetch("https://api.getinfinityblue.com/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gpt-5.4",
    stream: true,
    messages: [{ role: "user", content: "Count to 10" }],
  }),
});
if (!response.ok) {
  throw new Error(`Request failed with status ${response.status}`);
}

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = "";

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split("\n");
  buffer = lines.pop() ?? ""; // keep the last, possibly incomplete, line
  for (const rawLine of lines) {
    const line = rawLine.trim(); // tolerate \r\n line endings
    if (line.startsWith("data: ") && line !== "data: [DONE]") {
      const chunk = JSON.parse(line.slice(6));
      const content = chunk.choices?.[0]?.delta?.content ?? "";
      // Update the DOM, for example:
      // document.getElementById("output").textContent += content;
      console.log(content); // or any other browser-side rendering
    }
  }
}
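In streaming mode, the usage field does not appear in intermediate chunks by default. If you need token counts, read them from the final chunk or from a separate non-streaming request.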
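As an illustration of reading usage from the end of the stream, the sketch below scans parsed chunks and keeps the last non-empty usage object it finds. The helper name extract_usage is hypothetical, and the field names inside usage (prompt_tokens, completion_tokens, total_tokens) are an assumption borrowed from the common OpenAI-compatible shape; this page does not confirm them.

```python
from typing import Iterable, Optional


def extract_usage(chunks: Iterable[dict]) -> Optional[dict]:
    """Return the last non-empty `usage` object seen in the stream, if any."""
    usage = None
    for chunk in chunks:
        if chunk.get("usage"):  # absent or null on intermediate chunks
            usage = chunk["usage"]
    return usage


# Hand-written chunks; the usage field names are assumed, not documented here.
sample_chunks = [
    {"choices": [{"index": 0, "delta": {"content": "1"}, "finish_reason": None}]},
    {
        "choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}],
        "usage": {"prompt_tokens": 5, "completion_tokens": 20, "total_tokens": 25},
    },
]
usage = extract_usage(sample_chunks)
print(usage)
```

If the function returns None, the stream carried no usage data, and a separate non-streaming request is the fallback this page describes.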