POST /v1/chat/completions
Example request:

curl --request POST \
  --url http://localhost:9000/v1/chat/completions \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "<string>",
  "messages": [
    {
      "role": "system",
      "content": "<string>"
    }
  ],
  "temperature": 1,
  "top_p": 1,
  "stream": false
}'

Example response:

{
  "id": "<string>",
  "object": "chat.completion",
  "created": 123,
  "model": "<string>",
  "choices": [
    {
      "index": 123,
      "message": {
        "role": "assistant",
        "content": "<string>"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 123,
    "total_tokens": 123
  }
}
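On the client side, the assistant's reply lives at choices[0].message.content and the stop condition at choices[0].finish_reason. A minimal sketch of extracting both from a response body shaped like the example above (pure JSON handling, no server required; the helper name first_reply is ours, not part of the API):

```python
import json

def first_reply(response_body: str) -> tuple[str, str]:
    """Extract the assistant message content and finish_reason
    from a chat.completion response body."""
    data = json.loads(response_body)
    choice = data["choices"][0]
    return choice["message"]["content"], choice["finish_reason"]

# A sample body following the response schema above.
sample = '''
{
  "id": "cmpl-1",
  "object": "chat.completion",
  "created": 123,
  "model": "demo",
  "choices": [
    {"index": 0,
     "message": {"role": "assistant", "content": "Hello!"},
     "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 5, "completion_tokens": 2, "total_tokens": 7}
}
'''
content, reason = first_reply(sample)
print(content, reason)  # Hello! stop
```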

Body

application/json
messages
object[]
required

A list of messages comprising the conversation so far.

model
string

ID of the model to use.

temperature
number
default: 1

What sampling temperature to use, between 0 and 2. Higher values make the output more random, while lower values make it more focused and deterministic.

Required range: 0 <= x <= 2
top_p
number
default: 1

An alternative to sampling with temperature, called nucleus sampling, where the model considers only the tokens comprising the top_p probability mass.

Required range: 0 <= x <= 1
stream
boolean
default: false

If set, partial message deltas will be sent as server-sent events as they become available.
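The ranges and defaults above can be enforced before a request ever leaves the client. A sketch of a payload builder under those documented constraints (the helper name build_chat_request is ours, not part of the API):

```python
def build_chat_request(model, messages, temperature=1, top_p=1, stream=False):
    """Build a JSON body for POST /v1/chat/completions, enforcing the
    documented ranges: temperature in [0, 2] and top_p in [0, 1]."""
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must satisfy 0 <= x <= 2")
    if not 0 <= top_p <= 1:
        raise ValueError("top_p must satisfy 0 <= x <= 1")
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "top_p": top_p,
        "stream": stream,
    }

body = build_chat_request("<string>", [{"role": "system", "content": "<string>"}])
print(body["temperature"], body["stream"])  # 1 False
```

POSTing the resulting dict as JSON (with a Content-Type: application/json header) reproduces the curl call above.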

Response

200 - application/json
OK

Represents a chat completion response returned by the model, based on the provided input.

object
enum<string>
required

The object type, which is always "chat.completion".

Available options:
chat.completion
choices
object[]
required

A list of chat completion choices.

id
string

A unique identifier for the chat completion.

created
integer

The Unix timestamp (in seconds) of when the chat completion was created.

model
string

The model used for the chat completion.

usage
object

Usage statistics for the completion request, as shown in the example response: prompt_tokens, completion_tokens, and total_tokens.
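In the example response, total_tokens is the sum of prompt_tokens and completion_tokens. A small consistency check a client might run on the usage object (our own convention, not mandated by the schema):

```python
def usage_is_consistent(usage: dict) -> bool:
    """Check that total_tokens equals prompt_tokens + completion_tokens,
    as in the example response above."""
    return usage["total_tokens"] == (
        usage["prompt_tokens"] + usage["completion_tokens"]
    )

print(usage_is_consistent(
    {"prompt_tokens": 123, "completion_tokens": 123, "total_tokens": 246}))  # True
```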