POST /v1/chat/completions

import { AtomaSDK } from "atoma-sdk";

// Read the bearer token from the environment.
const atomaSDK = new AtomaSDK({
  bearerAuth: process.env["ATOMASDK_BEARER_AUTH"] ?? "",
});

async function run() {
  // Create a (non-streaming) chat completion.
  const completion = await atomaSDK.chat.create({
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "Hello!" },
    ],
    model: "meta-llama/Llama-3.3-70B-Instruct",
  });

  // The first choice contains the assistant's message.
  console.log(completion.choices[0]);
}

run();
Example response:

{
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": null,
      "stop_reason": null
    }
  ],
  "created": 1677652288,
  "id": "chatcmpl-123",
  "model": "meta-llama/Llama-3.3-70B-Instruct",
  "object": "chat.completion",
  "service_tier": "auto",
  "system_fingerprint": "fp_44709d6fcb",
  "usage": null
}

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
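
A minimal sketch of the raw HTTP call, assuming the base URL https://api.atoma.network (an assumption; substitute your deployment's host) and the same environment variable as the SDK example above:

// Base URL is assumed, not confirmed by this page.
const response = await fetch("https://api.atoma.network/v1/chat/completions", {
  method: "POST",
  headers: {
    // Bearer authentication header of the form `Bearer <token>`.
    Authorization: `Bearer ${process.env["ATOMASDK_BEARER_AUTH"] ?? ""}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "meta-llama/Llama-3.3-70B-Instruct",
    messages: [{ role: "user", content: "Hello!" }],
  }),
});
console.log(await response.json());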

Body

application/json

Represents the create chat completion request.

This is the request body for creating a chat completion. Depending on the stream flag, the result is either a single chat completion or a chat completion stream.

messages
object[]
required

A list of messages comprising the conversation so far.

Each message is typed by the role of its author: it can be a system message, a user message, an assistant message, or a tool message.

Example:
[
  {
    "role": "system",
    "content": "You are a helpful AI assistant"
  },
  { "role": "user", "content": "Hello!" },
  {
    "role": "assistant",
    "content": "I'm here to help you with any questions you have. How can I assist you today?"
  }
]
model
string
required

ID of the model to use

Example:

"meta-llama/Llama-3.3-70B-Instruct"

frequency_penalty
number | null

Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far

Example:

0

function_call
any

Controls how the model responds to function calls

functions
any[] | null

A list of functions the model may generate JSON inputs for

Example:
[
  {
    "name": "get_current_weather",
    "description": "Get the current weather in a location",
    "parameters": {
      "type": "object",
      "properties": {
        "location": {
          "type": "string",
          "description": "The location to get the weather for"
        }
      },
      "required": ["location"]
    }
  }
]
logit_bias
object | null

Modify the likelihood of specified tokens appearing in the completion.

Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.

Example:
{ "1234567890": 0.5, "1234567891": -0.5 }
max_completion_tokens
integer | null

The maximum number of tokens to generate in the chat completion

Example:

4096

max_tokens
integer | null
deprecated

The maximum number of tokens to generate in the chat completion. Deprecated: prefer max_completion_tokens.

Example:

4096

n
integer | null

How many chat completion choices to generate for each input message

Example:

1

parallel_tool_calls
boolean | null

Whether to enable parallel tool calls.

Example:

true

presence_penalty
number | null

Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far

Example:

0

response_format
object | null

The format to return the response in
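
The schema gives no example for this field. Assuming the OpenAI-compatible shape (an assumption, not confirmed by this page), enabling JSON mode might look like:

{ "type": "json_object" }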

seed
integer | null

If specified, our system will make a best effort to sample deterministically

Example:

123

service_tier
string | null

Specifies the latency tier to use for processing the request. This parameter is relevant for customers subscribed to the scale tier service:

- If set to 'auto', and the Project is Scale tier enabled, the system will utilize scale tier credits until they are exhausted.
- If set to 'auto', and the Project is not Scale tier enabled, the request will be processed using the default service tier with a lower uptime SLA and no latency guarantee.
- If set to 'default', the request will be processed using the default service tier with a lower uptime SLA and no latency guarantee.
- When not set, the default behavior is 'auto'.

Example:

"auto"

stop
string[] | null

Up to 4 sequences where the API will stop generating further tokens

Example:

"json([\"stop\", \"halt\"])"

stream
boolean | null
default:false

Whether to stream back partial progress. Must be false for this request type.

Example:

false

stream_options
object | null

Options for the streaming response. Only set this when stream is true (not applicable to this request type, which requires stream to be false).

temperature
number | null

What sampling temperature to use, between 0 and 2

Example:

0.7

tool_choice

Controls which (if any) tool the model should use.

Available options: none, auto
tools
object[] | null

A list of tools the model may call

A tool that can be used in a chat completion.

Example:
[
  {
    "type": "function",
    "function": {
      "name": "get_current_weather",
      "description": "Get the current weather in a location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {
            "type": "string",
            "description": "The location to get the weather for"
          }
        },
        "required": ["location"]
      }
    }
  }
]
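
As a hedged sketch of passing this tool via the SDK and inspecting a resulting tool call (property names are shown in their wire snake_case form, and the message.tool_calls response shape is assumed OpenAI-compatible rather than confirmed by this page; a generated SDK may expose camelCase equivalents):

import { AtomaSDK } from "atoma-sdk";

const atomaSDK = new AtomaSDK({
  bearerAuth: process.env["ATOMASDK_BEARER_AUTH"] ?? "",
});

const completion = await atomaSDK.chat.create({
  model: "meta-llama/Llama-3.3-70B-Instruct",
  messages: [{ role: "user", content: "What's the weather in Paris?" }],
  tools: [
    {
      type: "function",
      function: {
        name: "get_current_weather",
        description: "Get the current weather in a location",
        parameters: {
          type: "object",
          properties: {
            location: {
              type: "string",
              description: "The location to get the weather for",
            },
          },
          required: ["location"],
        },
      },
    },
  ],
});

const message = completion.choices[0]?.message;
if (message?.tool_calls?.length) {
  // Tool call arguments arrive as a JSON-encoded string.
  const args = JSON.parse(message.tool_calls[0].function.arguments);
  console.log("Model requested get_current_weather for:", args.location);
}
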
top_logprobs
integer | null

An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to true if this parameter is used.

Example:

1

top_p
number | null

An alternative to sampling with temperature

Example:

1

user
string | null

A unique identifier representing your end-user

Example:

"user-1234"

Response

200
application/json
Chat completions

Represents the chat completion response.

This is the response to a create chat completion request. It can be either a single chat completion or, when streaming, a chat completion stream.

choices
object[]
required

A list of chat completion choices.

Each choice represents one candidate completion. It holds either a chat completion message or, when streaming, a chat completion chunk.

Example:
[
  {
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help you today?"
    },
    "finish_reason": null,
    "stop_reason": null
  }
]

created
integer
required

The Unix timestamp (in seconds) of when the chat completion was created.

Example:

1677652288

id
string
required

A unique identifier for the chat completion.

Example:

"chatcmpl-123"

model
string
required

The model used for the chat completion.

Example:

"meta-llama/Llama-3.3-70B-Instruct"

object
string
required

The object type of the chat completion (for example, "chat.completion").

Example:

"chat.completion"

service_tier
string | null

The service tier of the chat completion.

Example:

"auto"

system_fingerprint
string | null

The system fingerprint for the completion, if applicable.

Example:

"fp_44709d6fcb"

usage
object | null

Usage statistics for the completion request.
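
No example is given for this field. Assuming the OpenAI-compatible token-count shape (an assumption, not confirmed by this page), it might look like:

{
  "prompt_tokens": 9,
  "completion_tokens": 12,
  "total_tokens": 21
}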