Token Counter

Count tokens and estimate costs for GPT-4, Claude, Gemini, Llama, and Mistral. Context window detection and overflow alerts.

POST /api/count-tokens
AI & LLM
<50ms avg latency
API Key auth
15+ models

What It Does

The Universal LLM Token Counter provides estimated token counting for text inputs across major language model providers. Whether you're working with OpenAI's GPT models, Anthropic's Claude, Google's Gemini, or other supported models, this API delivers token count estimates with real-time cost estimation.

Built on character-ratio estimation with pricing tables, the API provides consistent token estimates with sub-50ms response times. Perfect for applications that need to manage API costs, monitor context window usage, or prevent token overflow before sending requests to LLM providers.

Key Features

  • Multi-Provider Support — OpenAI, Anthropic, Google, Mistral, and Meta
  • Real-time Cost Estimation — Get USD pricing based on current provider rates
  • Context Window Monitoring — Track usage percentage and overflow risk
  • Deterministic Results — The same input always produces the same token count
  • Lightning Fast — Sub-50ms response times for all operations

Code Examples

curl -X POST https://api.atomicapis.dev/api/count-tokens \
  -H "X-RapidAPI-Proxy-Secret: YOUR_SECRET" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "The quick brown fox jumps over the lazy dog. This is a sample text for token counting across multiple LLM providers.",
    "model": "gpt-4o"
  }'
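For JavaScript clients, the same call can be wrapped in a small helper. This is a hypothetical sketch, not an official SDK; the Use Case snippets later on this page assume a similar countTokens(text, model) function.

```javascript
// Hypothetical helper around the endpoint (not an official SDK).
// YOUR_SECRET is a placeholder for your RapidAPI proxy secret.
async function countTokens(text, model = "gpt-4o") {
  const res = await fetch("https://api.atomicapis.dev/api/count-tokens", {
    method: "POST",
    headers: {
      "X-RapidAPI-Proxy-Secret": "YOUR_SECRET",
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ text, model }),
  });
  if (!res.ok) throw new Error(`Token count failed: HTTP ${res.status}`);
  return res.json(); // resolves to the response shape documented below
}
```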

Request Parameters

Name    Typeatts  Required  Description
text    string    Yes       The text content to count tokens for. Maximum 1 MB in size.
model   string    No        The model identifier (e.g., "gpt-4o", "claude-opus-4", "gemini-2.0-flash"). Defaults to "gpt-4o" if omitted.

Supported Models

gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-3.5-turbo, claude-sonnet-4-5, claude-opus-4, claude-haiku-3.5, gemini-2.0-flash, gemini-1.5-pro, llama-3.1-70b, llama-3.1-8b, mistral-large, mistral-small, o1, o3

Unrecognized model names are accepted but will use default estimation parameters.

Response Format

200 OK - application/json
{
  "estimatedTokens": 23,
  "model": "gpt-4o",
  "tokenizerFamily": "o200k_base",
  "estimatedInputCost": 0.0000575,
  "estimatedOutputCost": 0.00023,
  "contextWindowSize": 128000,
  "contextUsagePercent": 0.02,
  "exceedsContext": false,
  "isExactCount": false,
  "method": "character_ratio_estimation"
}
Field                Type     Description
estimatedTokens      integer  Estimated number of tokens in the input text
model                string   The model used for token estimation
tokenizerFamily      string   The tokenizer family used (e.g., o200k_base, cl100k_base, claude)
estimatedInputCost   number   Estimated input cost in USD
estimatedOutputCost  number   Estimated output cost in USD
contextWindowSize    integer  Maximum context window size for the specified model
contextUsagePercent  number   Percentage of the model's context window used
exceedsContext       boolean  True if the input exceeds the model's context window
isExactCount         boolean  Whether the token count is exact (always false for estimation-based counting)
method               string   Method used for counting (returns "character_ratio_estimation")

Use Cases

Cost Estimation

Preview API costs before sending requests to LLM providers. Perfect for budgeting and client billing, especially when processing large volumes of text or offering usage-based pricing to your customers.

// Show estimated cost to user before processing
const estimate = await countTokens(text, model);
console.log(`Estimated cost: $${estimate.estimatedInputCost}`);

Context Window Management

Monitor context window usage in real-time to prevent overflow errors. Automatically truncate or split content when approaching limits, ensuring reliable LLM interactions without unexpected failures.

// Prevent context overflow before sending
if (result.exceedsContext || result.contextUsagePercent > 90) {
  // Split or truncate content
}
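A rough client-side guard can trim oversized input before it is ever sent. In this sketch, truncateToContext, the 90% safety margin, and the default chars-per-token ratio are illustrative choices, not part of the API.

```javascript
// Illustrative guard (not part of the API): trim text so it fits within
// ~90% of a model's context window, using an assumed chars-per-token ratio.
function truncateToContext(text, contextWindowSize, charsPerToken = 3.9) {
  const maxChars = Math.floor(contextWindowSize * 0.9 * charsPerToken);
  return text.length <= maxChars ? text : text.slice(0, maxChars);
}
```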

API Budgeting

Implement spending limits and usage quotas in your applications. Track cumulative costs across multiple providers and enforce budget constraints before expensive operations.

// Enforce daily spending limits
const dailyBudget = 100.00;
if (dailySpend + estimate.estimatedInputCost > dailyBudget) {
  throw new Error('Daily budget exceeded');
}
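The per-request check extends naturally to tracking cumulative spend across calls. A minimal sketch with illustrative names (BudgetTracker is not provided by the API):

```javascript
// Minimal cumulative budget tracker (illustrative; not provided by the API).
class BudgetTracker {
  constructor(dailyLimitUsd) {
    this.limit = dailyLimitUsd;
    this.spent = 0;
  }
  canAfford(costUsd) {
    return this.spent + costUsd <= this.limit;
  }
  record(costUsd) {
    if (!this.canAfford(costUsd)) throw new Error("Daily budget exceeded");
    this.spent += costUsd;
  }
}
```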

Build Constraints

Character-Ratio Estimation

Uses calibrated characters-per-token ratios for each tokenizer family (e.g., 3.9 for o200k_base, 3.7 for cl100k_base, 3.5 for Claude). Averages character-based and word-based estimates for improved accuracy with sub-50ms response times.
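The blending of character- and word-based estimates might look like the following sketch. The chars-per-token ratios come from this section; the tokens-per-word factor (1.33) and the fallback ratio for unknown families are assumptions, not documented values.

```javascript
// Sketch of character-ratio estimation (not the API's actual implementation).
// Chars-per-token ratios are the ones quoted in this section.
const CHARS_PER_TOKEN = {
  o200k_base: 3.9,
  cl100k_base: 3.7,
  claude: 3.5,
};
const TOKENS_PER_WORD = 1.33; // assumed factor, not documented

function estimateTokens(text, family = "o200k_base") {
  const ratio = CHARS_PER_TOKEN[family] ?? 4.0; // assumed fallback for unknown families
  const charEstimate = text.length / ratio;
  const wordEstimate = text.split(/\s+/).filter(Boolean).length * TOKENS_PER_WORD;
  // Average the two estimates, as described above
  return Math.round((charEstimate + wordEstimate) / 2);
}
```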

Deterministic Counting

Token estimation is deterministic — the same input always produces the same token estimate. All operations complete in sub-50ms, including ratio calculation and cost estimation.

Pricing Tables

The API maintains per-provider pricing tables. Prices vary by model and are quoted per 1M tokens. Tables must be kept in sync with provider pricing changes to ensure accurate cost estimation.
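Per-1M-token rates convert to request costs as in this sketch. The figures shown ($2.50 input / $10.00 output per 1M tokens for gpt-4o) are back-calculated from the sample response above; they are illustrative, not a live pricing table.

```javascript
// Illustrative pricing table (USD per 1M tokens); not the API's live data.
const PRICING_USD_PER_1M = {
  "gpt-4o": { input: 2.5, output: 10.0 },
  "gpt-4o-mini": { input: 0.15, output: 0.6 },
};

// Convert a token count into estimated input/output costs for a model.
function estimateCost(model, tokens) {
  const p = PRICING_USD_PER_1M[model];
  if (!p) return null; // unknown model: no pricing available
  return {
    input: (tokens / 1_000_000) * p.input,
    output: (tokens / 1_000_000) * p.output,
  };
}
```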

MCP Integration

What is MCP?

Model Context Protocol (MCP) allows AI assistants like Claude to call this API as a native tool during conversation. Instead of writing HTTP requests, the AI invokes the tool directly — no API keys or boilerplate needed on the client side.

Tool Details

Tool Class
TokenCounterTools
Method
CountTokens()

Description

Counts tokens and estimates costs for multiple LLM models