Create chat completions
This function processes chat completion requests by determining whether to use streaming or non-streaming response handling based on the request payload. For streaming requests, it configures additional options to track token usage.
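The routing described above can be sketched as follows. This is a minimal illustration, not the service's actual implementation: the `stream` flag and the `stream_options.include_usage` option follow the common OpenAI-compatible convention and are assumptions here, as is the helper name `prepare_request`.

```python
from copy import deepcopy
from typing import Any, Dict, Tuple


def prepare_request(payload: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Pick a handling mode from the payload (hypothetical helper).

    Returns ("stream", payload) or ("json", payload). The caller's
    payload is copied, never mutated.
    """
    prepared = deepcopy(payload)
    if prepared.get("stream", False):
        # Assumed option name: ask the backend to report token usage
        # as part of the stream.
        options = prepared.get("stream_options") or {}
        options["include_usage"] = True
        prepared["stream_options"] = options
        return "stream", prepared
    return "json", prepared
```

A request without `stream`, or with `stream` set to false, passes through unchanged and is routed to the non-streaming path.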
Returns
Returns a Response containing either:
- A streaming SSE connection for real-time completions
- A single JSON response for non-streaming completions
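On the client side, the two response shapes need different handling. The sketch below reads the streaming case, assuming each event arrives as a `data: <json>` line and that the stream ends with a `data: [DONE]` sentinel. Both are conventions of OpenAI-compatible APIs and are not confirmed by this page; `iter_sse_data` is a hypothetical helper name.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield the decoded JSON payload of each SSE `data:` line."""
    for raw in lines:
        line = raw.strip()
        # Skip blank separators, comments, and non-data fields.
        if not line.startswith("data:"):
            continue
        data = line[len("data:"):].strip()
        # Assumed end-of-stream sentinel.
        if data == "[DONE]":
            break
        yield json.loads(data)
```

A non-streaming response needs none of this: its body is a single JSON document that can be passed directly to `json.loads`.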
Errors
Returns an error status code if:
- The request processing fails
- The streaming/non-streaming handlers encounter errors
- The underlying inference service returns an error
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
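As an illustration, a request carrying this header can be built with the Python standard library. The base URL, the `/v1/chat/completions` path, and the payload fields are placeholders assumed for the example, not values taken from this page.

```python
import json
import urllib.request
from typing import Any, Dict


def build_request(base_url: str, token: str, payload: Dict[str, Any]) -> urllib.request.Request:
    """Build (but do not send) an authenticated POST request."""
    return urllib.request.Request(
        # Hypothetical endpoint path; substitute the real one.
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it.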
Body
Represents the create chat completion request.
The request can ask for either a single chat completion or a chat completion stream.
Response
Chat completions
Represents the chat completion response.
The response is either a single chat completion or a chat completion stream, matching the mode requested.