Count Your Tokens Before They Hatch!

Last Updated: 2026-06-07

The various subscription plans offered by Anthropic, OpenAI and Google are designed to obfuscate the underlying cost. It's better to keep your own record of input, thought and output tokens and what they cost, so that you can optimise token usage and avoid bill shock.

Token Pricing

As an example, the pricing for all of Google's models is available here. I use the gemini-2.5-flash-lite model exclusively for the moment. This costs:

$0.10 per 1M input tokens
$0.40 per 1M output tokens

Input Cost

To calculate the input cost, I take the input token count, divide by one million and multiply by $0.10. For example, if my input token count is 800, the cost is 800 / 1,000,000 * 0.10, which is $0.00008.

Output Cost

For output costs, divide the output token count by one million and multiply by $0.40. For example, an output token count of 50 gives 50 / 1,000,000 * 0.40, which is $0.00002.
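The two calculations above can be wrapped in one small helper. This is a minimal Python sketch rather than anything tied to a particular SDK: the function name token_cost and the constant names are my own, and the rates are the gemini-2.5-flash-lite prices quoted earlier. It uses Decimal so that very small per-call amounts are not distorted by floating-point rounding.

```python
from decimal import Decimal

# Prices in USD per 1M tokens, as quoted in the Token Pricing section.
INPUT_RATE_PER_MILLION = Decimal("0.10")
OUTPUT_RATE_PER_MILLION = Decimal("0.40")
ONE_MILLION = Decimal(1_000_000)


def token_cost(token_count: int, rate_per_million: Decimal) -> Decimal:
    """Return the USD cost of token_count tokens at the given per-1M rate."""
    return Decimal(token_count) / ONE_MILLION * rate_per_million


input_cost = token_cost(800, INPUT_RATE_PER_MILLION)
output_cost = token_cost(50, OUTPUT_RATE_PER_MILLION)
print(f"input:  ${input_cost:.5f}")   # input:  $0.00008
print(f"output: ${output_cost:.5f}")  # output: $0.00002
```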

Maintain Live Count And Aggregate Cost

In your application, calculate the cost of each call to the LLM and maintain a live count of input, thought and output tokens, along with the corresponding aggregate cost of each.

This will give you granular control over your token usage and costs. If the terms of your subscription change, you will also have a good body of data from which to predict your new token usage and cost range.
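A live count of this kind could be kept in memory with something like the following Python sketch. The class name TokenTally and its fields are hypothetical, and the thought-token rate is an assumption: here it is set equal to the output rate, so check your provider's price list before relying on it.

```python
from dataclasses import dataclass
from decimal import Decimal

ONE_MILLION = Decimal(1_000_000)


@dataclass
class TokenTally:
    """Running token counts and aggregate cost (USD) across LLM calls."""

    input_rate: Decimal = Decimal("0.10")    # USD per 1M input tokens
    output_rate: Decimal = Decimal("0.40")   # USD per 1M output tokens
    # ASSUMPTION: thought tokens are billed at the output rate.
    thought_rate: Decimal = Decimal("0.40")
    calls: int = 0
    input_tokens: int = 0
    thought_tokens: int = 0
    output_tokens: int = 0
    input_cost: Decimal = Decimal(0)
    thought_cost: Decimal = Decimal(0)
    output_cost: Decimal = Decimal(0)

    def record(self, input_tokens: int, thought_tokens: int, output_tokens: int) -> Decimal:
        """Add one call's token counts to the tally and return that call's cost."""
        in_cost = Decimal(input_tokens) / ONE_MILLION * self.input_rate
        th_cost = Decimal(thought_tokens) / ONE_MILLION * self.thought_rate
        out_cost = Decimal(output_tokens) / ONE_MILLION * self.output_rate
        self.calls += 1
        self.input_tokens += input_tokens
        self.thought_tokens += thought_tokens
        self.output_tokens += output_tokens
        self.input_cost += in_cost
        self.thought_cost += th_cost
        self.output_cost += out_cost
        return in_cost + th_cost + out_cost

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.thought_tokens + self.output_tokens

    @property
    def total_cost(self) -> Decimal:
        return self.input_cost + self.thought_cost + self.output_cost


tally = TokenTally()
tally.record(input_tokens=800, thought_tokens=0, output_tokens=50)
tally.record(input_tokens=1200, thought_tokens=300, output_tokens=100)
# Prints: 2 calls, 2450 tokens, $0.00038
print(f"{tally.calls} calls, {tally.total_tokens} tokens, ${tally.total_cost:.5f}")
```

Calling record() once per LLM response keeps the per-category totals and the aggregate cost current without any database round trip.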

It actually makes sense to have a central database where you store the token counts and costs. Here is the code for a table to capture each call to the LLM along with its token counts and costs:

use mydb;
drop table if exists token_cost;
create table token_cost
(
    id              bigint        not null auto_increment primary key,
    captured        timestamp     not null default current_timestamp,
    captured_from   varchar(100)  not null,  -- name of the calling system
    input_tokens    int           not null,
    thought_tokens  int           not null,
    output_tokens   int           not null,
    input_cost      double        not null,  -- costs are in USD
    thought_cost    double        not null,
    output_cost     double        not null,
    index idx_token_cost_from (captured_from)
);

Note the use of an index on captured_from. Here is the code for a view to access the table:

use mydb;
drop view if exists vw_token_cost;
create view vw_token_cost as
select id, captured, captured_from, input_tokens, thought_tokens, output_tokens,
    input_tokens + thought_tokens + output_tokens as total_tokens,
    input_cost, thought_cost, output_cost,
    input_cost + thought_cost + output_cost as total_cost
from token_cost;
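To populate token_cost from application code, each call needs one parameterised insert. The Python sketch below only builds the statement and its parameter tuple; the function name token_cost_row and its default rates are my own assumptions (thought tokens are again assumed to be billed at the output rate). The %s placeholder style is the one used by PyMySQL and mysql-connector-python, and the id and captured columns are left to their defaults.

```python
ONE_MILLION = 1_000_000

# Parameterised insert for the token_cost table; id and captured use their defaults.
INSERT_SQL = (
    "insert into token_cost "
    "(captured_from, input_tokens, thought_tokens, output_tokens, "
    "input_cost, thought_cost, output_cost) "
    "values (%s, %s, %s, %s, %s, %s, %s)"
)


def token_cost_row(captured_from, input_tokens, thought_tokens, output_tokens,
                   input_rate=0.10, thought_rate=0.40, output_rate=0.40):
    """Return the parameter tuple for INSERT_SQL describing one LLM call.

    Rates are USD per 1M tokens. ASSUMPTION: thought_rate equals output_rate.
    """
    return (
        captured_from,
        input_tokens,
        thought_tokens,
        output_tokens,
        input_tokens / ONE_MILLION * input_rate,
        thought_tokens / ONE_MILLION * thought_rate,
        output_tokens / ONE_MILLION * output_rate,
    )


row = token_cost_row("moder8", 800, 0, 50)
# With an open DB-API connection the row would be written like this:
#     cursor.execute(INSERT_SQL, row)
#     connection.commit()
print(row)
```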

A summary query can then be run against vw_token_cost to report the number of calls and the live cost of each system. Note that count() counts rows, so it gives the number of calls recorded rather than a number of tokens.

use mydb;
select captured_from, count(*) as call_count, sum(total_cost) as total_cost
from vw_token_cost
group by captured_from;

Output for my moder8 system in dev:

captured_from   call_count   total_cost
moder8          98           0.0067418

Future Enhancement

As a future enhancement you could create a table that defines the actual cost per 1M tokens for each system.
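As a rough illustration of that idea, the Python sketch below uses an in-memory lookup as a stand-in for such a table. Everything in it is hypothetical: the names Rate, RATES and call_cost are mine, the only entry is the moder8 system with the gemini-2.5-flash-lite prices quoted earlier, and the thought-token rate is assumed to equal the output rate. In the database, the equivalent would be a rates table keyed on the same value as captured_from.

```python
from typing import NamedTuple

ONE_MILLION = 1_000_000


class Rate(NamedTuple):
    """USD per 1M tokens for one system."""
    input_per_million: float
    thought_per_million: float  # ASSUMPTION: same as the output rate
    output_per_million: float


# Hypothetical rate card keyed by system name (the captured_from value).
RATES = {
    "moder8": Rate(0.10, 0.40, 0.40),
}


def call_cost(system: str, input_tokens: int, thought_tokens: int, output_tokens: int) -> float:
    """Return the total USD cost of one call, using the rates for that system."""
    rate = RATES[system]  # raises KeyError if the system has no rates defined
    return (
        input_tokens * rate.input_per_million
        + thought_tokens * rate.thought_per_million
        + output_tokens * rate.output_per_million
    ) / ONE_MILLION


print(f"${call_cost('moder8', 800, 0, 50):.5f}")  # $0.00010
```

Keeping the rates as data rather than hard-coded constants means a price change only requires updating one row.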