Create Chat Completion
Model
Select GPT Model
Messages
A list of messages comprising the conversation so far.
Refer to the "How to make messages?" section later on this page.
Store
Default: false
Reasoning Effort
low, medium (Default), high (o-series models only)
Metadata
Set of up to 16 key-value pairs that can be attached to an object. (Optional)
Keys are strings with a maximum length of 64 characters; values are strings with a maximum length of 512 characters. Refer to the "What is Metadata?" section later on this page.
Frequency penalty
A parameter that penalizes repetition of the same words or phrases. (Optional)
Allowed Range: -2.0 ~ 2.0 (Default: 0)
Logit Bias
Modify the likelihood of specified tokens appearing in the completion. (Optional)
Refer to the "What is Logit Bias?" section later on this page.
Max Completion Tokens
N
The number of responses to generate per request. (Optional)
A positive integer (default is 1)
Presence Penalty
A parameter that encourages the model to explore topics not previously mentioned. (Optional)
Allowed Range: -2.0 ~ 2.0 (Default: 0)
Response Type
Select Response Type (Optional)
Text: Returns the response as a plain string (Default)
JSON Object: Returns the response as a JSON object
JSON Schema: Returns the response in the specified JSON Schema format
JSON Schema
Setting to receive the response as a desired JSON object. (Optional)
Required when the Response Type is set to JSON Schema. Refer to the "What is JSON Schema?" section later on this page.
Service Tier
Specifies the latency tier to use for processing the request. This parameter is relevant for customers subscribed to the scale tier service. (Optional)
Auto (Default)
Default
Stop
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence. (Optional)
Use commas (,) to separate multiple sequences. Example: stop,halt
Temperature
A parameter that controls randomness during the model's response generation. (Optional)
Allowed Range: 0.0 ~ 2.0 (Default: 1)
Top P
Limits sampling to only the top-p portion of the probability distribution during response generation. (Optional)
Allowed Range: 0.0 ~ 1.0 (Default: 1)
Tools
Specifies the list of functions the model is allowed to call. (Optional)
Refer to the "What is Tools?" section later on this page.
Tool Choice
Setting that determines whether and how the model should call a tool. (Optional)
Refer to the "What is Tool Choice?" section later on this page.
Tool Choice Option
Setting to enable or disable parallel tool calls. (Optional)
None (Default), Yes, No
User
Setting to assign a unique ID for identifying the end user. (Optional)
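As a sketch, the parameters above map onto a request payload like the following. Field names follow the Chat Completions API; the model name, stop sequences, and user ID are illustrative values, not requirements.

```python
# Illustrative Chat Completion request payload combining parameters from this
# section. All values are examples only.
payload = {
    "model": "gpt-4o",              # Model: selected GPT model (example name)
    "messages": [                   # Messages: the conversation so far
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    "n": 1,                         # N: responses per request (default 1)
    "temperature": 1.0,             # Allowed range 0.0 ~ 2.0 (default 1)
    "top_p": 1.0,                   # Allowed range 0.0 ~ 1.0 (default 1)
    "frequency_penalty": 0.0,       # Allowed range -2.0 ~ 2.0 (default 0)
    "presence_penalty": 0.0,        # Allowed range -2.0 ~ 2.0 (default 0)
    "stop": ["stop", "halt"],       # Stop: up to 4 stop sequences
    "user": "end-user-1234",        # User: example end-user ID
}
```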
The messages field is structured as an array of message objects; each message follows the format below.
content
Message Content
role
Message Role
system: Defines the purpose or behavior of the AI model for the conversation.
user: Represents the message or question sent by the user to the AI; this is the actual input the AI should respond to.
assistant: Represents the AI model’s response to the user’s input.
developer: Provides additional information to the AI model from the developer.
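A minimal messages array exercising all four roles might look like this (the message contents are illustrative):

```python
# Example messages array showing the four roles described above.
messages = [
    {"role": "system", "content": "You answer concisely."},
    {"role": "developer", "content": "Prefer metric units."},
    {"role": "user", "content": "How far is the Moon?"},
    {"role": "assistant", "content": "About 384,400 km on average."},
]
```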
A field provided to allow users to store additional information related to the API request.
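For example, a metadata object is a plain mapping of short strings; the key names below ("order_id", "channel") are hypothetical, not required names:

```python
# Example metadata: up to 16 pairs, keys <= 64 chars, values <= 512 chars.
metadata = {"order_id": "A-1029", "channel": "web"}
```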
A feature that allows you to forcibly adjust the probability of specific words (or tokens) appearing.
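A logit bias is expressed as a mapping from token IDs (as strings) to a bias value between -100 and 100; the token IDs below are placeholders, not real IDs for any particular word:

```python
# logit_bias maps token IDs (as strings) to a bias from -100 to 100.
# -100 effectively bans a token; large positive values strongly favor it.
logit_bias = {"1234": -100, "5678": 5}
```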
A setting that instructs the model to respond in a specified JSON format.
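A response_format for JSON Schema mode might be sketched as follows; the schema name and fields ("weather_report", "city", "temperature_c") are illustrative:

```python
# Example response_format selecting JSON Schema mode with a strict schema.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "weather_report",       # hypothetical schema name
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "temperature_c": {"type": "number"},
            },
            "required": ["city", "temperature_c"],
            "additionalProperties": False,
        },
    },
}
```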
A feature that allows the model to call specific functions. In other words, the model can generate JSON directly to invoke APIs or interact with external systems.
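A tools list declaring one callable function could look like this; the function name and parameters ("get_weather", "city") are hypothetical, not a built-in API:

```python
# Example tools list: each entry declares a function the model may call.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]
```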
An option that determines how the model chooses to use a tool.
auto
The model will call the tool if needed, and will not call it if unnecessary. (Default setting)
none
The model will never call a tool. Only text responses are allowed.
required
The model must call at least one tool.
{ "type": "function", "function": { "name": "my_function" } }
Forces the model to call a specific tool (my_function).
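The four accepted forms of tool_choice described above can be written out as:

```python
# The four tool_choice values: three string modes plus a forced-tool object.
tool_choice_auto = "auto"          # call a tool only if needed (default)
tool_choice_none = "none"          # never call a tool; text responses only
tool_choice_required = "required"  # must call at least one tool
tool_choice_forced = {             # force a specific tool (my_function)
    "type": "function",
    "function": {"name": "my_function"},
}
```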
id
Unique identifier of the response request
object
The type of the response object; for chat completions this is always "chat.completion".
created
The Unix timestamp (in seconds) when the response was generated.
model
The name of the GPT model used to generate the response.
system_fingerprint
A fingerprint representing the backend configuration the model ran with, useful for tracking backend changes.
choices[].index
The index of the response, used to distinguish between multiple responses in a single request.
choices[].message.role
The role of the entity that generated the message.
choices[].message.content
The content of the generated response.
choices[].message.refusal
The refusal message generated by the model if it declined the request; null otherwise.
choices[].logprobs
Log probabilities used during response generation.
choices[].finish_reason
The reason why the response generation was stopped.
usage.prompt_tokens
The number of tokens used in the input prompt.
usage.completion_tokens
The number of tokens used in the generated response (completion).
usage.total_tokens
The total number of tokens used, including both the input prompt and the generated response.
usage.prompt_tokens_details.cached_tokens
The number of cached prompt tokens.
usage.completion_tokens_details.reasoning_tokens
The number of tokens used for reasoning during response generation.
usage.completion_tokens_details.accepted_prediction_tokens
The number of predicted tokens accepted by the model during response generation.
usage.completion_tokens_details.rejected_prediction_tokens
The number of predicted tokens rejected by the model during response generation.
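Putting the fields above together, a trimmed example response (all values illustrative) can be read with plain dictionary access:

```python
# A trimmed example response and how the documented fields are accessed.
response = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "created": 1700000000,
    "model": "gpt-4o",
    "system_fingerprint": "fp_example",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello!", "refusal": None},
            "logprobs": None,
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 9, "completion_tokens": 3, "total_tokens": 12},
}

reply = response["choices"][0]["message"]["content"]   # the generated text
total = response["usage"]["total_tokens"]              # prompt + completion
```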
Whether or not to store the output of this chat completion request for use in our model distillation or evals products.
Constrains effort on reasoning for reasoning models. (Optional)
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens. (Optional)