Token Counter
Count tokens and estimate costs for GPT-4, Claude, Gemini, Llama, and Mistral. Context window detection and overflow alerts.
/api/count-tokens
What It Does
The Universal LLM Token Counter provides estimated token counts for text inputs across major language model providers. Whether you're working with OpenAI's GPT models, Anthropic's Claude, Google's Gemini, or another supported model, the API returns a token estimate alongside real-time cost estimation.
Built on character-ratio estimation backed by per-provider pricing tables, the API produces consistent token estimates with sub-50ms response times. It is well suited to applications that need to manage API costs, monitor context window usage, or catch token overflow before requests reach LLM providers.
Key Features
- Multi-Provider Support — OpenAI, Anthropic, Google, Mistral, and Meta
- Real-time Cost Estimation — Get USD pricing based on current provider rates
- Context Window Monitoring — Track usage percentage and overflow risk
- Deterministic Results — Same input always produces same token count
- Lightning Fast — Sub-50ms response times for all operations
Code Examples
cURL

```shell
curl -X POST https://api.atomicapis.dev/api/count-tokens \
  -H "X-RapidAPI-Proxy-Secret: YOUR_SECRET" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "The quick brown fox jumps over the lazy dog. This is a sample text for token counting across multiple LLM providers.",
    "model": "gpt-4o"
  }'
```
JavaScript

```javascript
const response = await fetch('https://api.atomicapis.dev/api/count-tokens', {
  method: 'POST',
  headers: {
    'X-RapidAPI-Proxy-Secret': 'YOUR_SECRET',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'The quick brown fox jumps over the lazy dog. This is a sample text for token counting across multiple LLM providers.',
    model: 'gpt-4o'
  })
});
const result = await response.json();
console.log(result);
```
Python

```python
import requests

response = requests.post(
    'https://api.atomicapis.dev/api/count-tokens',
    headers={
        'X-RapidAPI-Proxy-Secret': 'YOUR_SECRET',
        'Content-Type': 'application/json'
    },
    json={
        'text': 'The quick brown fox jumps over the lazy dog. This is a sample text for token counting across multiple LLM providers.',
        'model': 'gpt-4o'
    }
)
result = response.json()
print(result)
```
C#

```csharp
using System.Net.Http.Json;

var client = new HttpClient();
client.DefaultRequestHeaders.Add("X-RapidAPI-Proxy-Secret", "YOUR_SECRET");
var request = new
{
    text = "The quick brown fox jumps over the lazy dog. This is a sample text for token counting across multiple LLM providers.",
    model = "gpt-4o"
};
var response = await client.PostAsJsonAsync(
    "https://api.atomicapis.dev/api/count-tokens",
    request
);
var result = await response.Content.ReadFromJsonAsync<TokenResult>();
Console.WriteLine(result);

// Minimal response type; add the remaining fields from the
// Response Format section as needed.
record TokenResult(int EstimatedTokens, string Model, double EstimatedInputCost);
```
Request Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| `text` | string | Required | The text content to count tokens for. Maximum 1 MB in size. |
| `model` | string | Optional | The model identifier (e.g., "gpt-4o", "claude-opus-4", "gemini-2.0-flash"). Defaults to "gpt-4o" if omitted. |
Supported Models
- OpenAI: gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-3.5-turbo, o1, o3
- Anthropic: claude-sonnet-4-5, claude-opus-4, claude-haiku-3.5
- Google: gemini-2.0-flash, gemini-1.5-pro
- Meta: llama-3.1-70b, llama-3.1-8b
- Mistral: mistral-large, mistral-small
Unrecognized model names are accepted but will use default estimation parameters.
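The fallback behavior can be sketched as a dictionary lookup with defaults. This is a minimal sketch: the gpt-4o entry mirrors the example response on this page, while the other entry and the default values are illustrative assumptions.

```python
# Sketch of the documented fallback: unknown model names are accepted
# but estimated with default parameters. Only the gpt-4o entry reflects
# values shown on this page; the rest are illustrative.
MODEL_TABLE = {
    "gpt-4o": {"tokenizerFamily": "o200k_base", "contextWindowSize": 128000},
    "claude-opus-4": {"tokenizerFamily": "claude", "contextWindowSize": 200000},
}
DEFAULT_PARAMS = {"tokenizerFamily": "o200k_base", "contextWindowSize": 128000}

def resolve_model(model: str) -> dict:
    """Return estimation parameters, falling back to defaults for unknown names."""
    return MODEL_TABLE.get(model, DEFAULT_PARAMS)
```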
Response Format
```json
{
  "estimatedTokens": 23,
  "model": "gpt-4o",
  "tokenizerFamily": "o200k_base",
  "estimatedInputCost": 0.0000575,
  "estimatedOutputCost": 0.00023,
  "contextWindowSize": 128000,
  "contextUsagePercent": 0.02,
  "exceedsContext": false,
  "isExactCount": false,
  "method": "character_ratio_estimation"
}
```
| Field | Type | Description |
|---|---|---|
| `estimatedTokens` | integer | Estimated number of tokens in the input text |
| `model` | string | The model used for token estimation |
| `tokenizerFamily` | string | The tokenizer family used (e.g., `o200k_base`, `cl100k_base`, `claude`) |
| `estimatedInputCost` | number | Estimated input cost in USD |
| `estimatedOutputCost` | number | Estimated output cost in USD |
| `contextWindowSize` | integer | Maximum context window size (in tokens) for the specified model |
| `contextUsagePercent` | number | Percentage of the model's context window used by the input |
| `exceedsContext` | boolean | `true` if the input exceeds the model's context window |
| `isExactCount` | boolean | Whether the token count is exact (always `false` for estimation-based counting) |
| `method` | string | Counting method used (always `"character_ratio_estimation"`) |
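The context fields follow from simple arithmetic on the token estimate and window size. A minimal sketch, assuming the rounding to two decimal places suggested by the 0.02 in the example response:

```python
def context_usage(estimated_tokens: int, context_window_size: int) -> dict:
    """Derive contextUsagePercent and exceedsContext from a token estimate.

    Rounding to two decimals is an assumption based on the example response.
    """
    return {
        "contextUsagePercent": round(estimated_tokens / context_window_size * 100, 2),
        "exceedsContext": estimated_tokens > context_window_size,
    }
```

With the example values, `context_usage(23, 128000)` reproduces the `0.02` and `false` shown above.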
Use Cases
Cost Estimation
Preview API costs before sending requests to LLM providers. Perfect for budgeting and client billing, especially when processing large volumes of text or offering usage-based pricing to your customers.
```javascript
// Show estimated cost to user before processing
const estimate = await countTokens(text, model);
console.log(`Estimated cost: $${estimate.estimatedInputCost}`);
```
Context Window Management
Monitor context window usage in real-time to prevent overflow errors. Automatically truncate or split content when approaching limits, ensuring reliable LLM interactions without unexpected failures.
```javascript
// Prevent context overflow before sending
if (result.exceedsContext || result.contextUsagePercent > 90) {
  // Split or truncate content
}
```
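One way to "split or truncate" is to cap each chunk at an estimated character budget derived from the context window. A sketch under stated assumptions: the 3.9 chars-per-token default matches the o200k_base ratio quoted under Build Constraints, while the fixed-width splitting strategy itself is illustrative, not part of the API.

```python
def split_for_context(text: str, context_window: int,
                      chars_per_token: float = 3.9, max_usage: float = 0.9) -> list:
    """Split text into chunks estimated to fit within max_usage of the window.

    chars_per_token=3.9 mirrors the o200k_base ratio quoted on this page;
    the fixed-width splitting strategy is an assumption for illustration.
    """
    max_chars = max(1, int(context_window * max_usage * chars_per_token))
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)] or [""]
```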
API Budgeting
Implement spending limits and usage quotas in your applications. Track cumulative costs across multiple providers and enforce budget constraints before expensive operations.
```javascript
// Enforce daily spending limits
const dailyBudget = 100.00;
if (dailySpend + estimate.estimatedInputCost > dailyBudget) {
  throw new Error('Daily budget exceeded');
}
```
Build Constraints
Character-Ratio Estimation
The API uses calibrated characters-per-token ratios for each tokenizer family (e.g., 3.9 for o200k_base, 3.7 for cl100k_base, 3.5 for Claude) and averages character-based and word-based estimates for improved accuracy, with sub-50ms response times.
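The averaging described above can be sketched as follows. The chars-per-token ratios come from this page; the tokens-per-word constant is an illustrative assumption, as the actual word-based factor is not documented.

```python
def estimate_tokens(text: str, chars_per_token: float = 3.9,
                    tokens_per_word: float = 1.33) -> int:
    """Average a character-based and a word-based token estimate.

    chars_per_token ratios (3.9 / 3.7 / 3.5) are quoted on this page;
    tokens_per_word=1.33 is an assumed illustrative constant.
    """
    char_estimate = len(text) / chars_per_token
    word_estimate = len(text.split()) * tokens_per_word
    return round((char_estimate + word_estimate) / 2)
```

Because both estimates are pure functions of the input, the result is deterministic: the same text always yields the same count.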
Deterministic Counting
Token estimation is deterministic — the same input always produces the same token estimate. All operations complete in sub-50ms, including ratio calculation and cost estimation.
Pricing Tables
The API maintains per-provider pricing tables. Prices vary by model and are quoted per 1M tokens; the tables must be synchronized with provider pricing changes to keep cost estimates accurate.
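Cost estimation is then a per-1M-token multiplication. A minimal sketch: the gpt-4o rates below are assumptions chosen to reproduce the example response on this page (23 tokens yields 0.0000575 input and 0.00023 output cost); a real table would track current provider pricing.

```python
# Hypothetical per-1M-token pricing table; the gpt-4o rates are chosen
# to reproduce the example response shown under Response Format.
PRICING_PER_MTOK = {
    "gpt-4o": {"input": 2.50, "output": 10.00},
}

def estimate_cost(tokens: int, model: str) -> dict:
    """Convert a token count into estimated USD input/output costs."""
    rates = PRICING_PER_MTOK[model]
    return {
        "estimatedInputCost": tokens / 1_000_000 * rates["input"],
        "estimatedOutputCost": tokens / 1_000_000 * rates["output"],
    }
```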
MCP Integration
What is MCP?
Model Context Protocol (MCP) allows AI assistants like Claude to call this API as a native tool during conversation. Instead of writing HTTP requests, the AI invokes the tool directly — no API keys or boilerplate needed on the client side.
Tool Details
- Tool class: TokenCounterTools
- Method: CountTokens()
- Description: Counts tokens and estimates costs for multiple LLM models