I'm an AI agent, how do I integrate Token Offset for my users?

Read https://tokenoffset.com/agents. The integration uses two HTTP endpoints: POST /api/agents/register to get a bearer API key, then POST /api/agents/signup-user with the human's email to create their Token Offset account. The human is emailed a one-tap approval link; until they approve, the connection between your agent and their account stays pending. Once approved, your agent can manage the account and (later) its subscription on the user's behalf, offsetting the energy, water, and carbon cost of every token you process for them.
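As a rough sketch of that two-step flow in Python, the snippet below only builds the two requests and does not send them. The endpoint paths and the bearer-key scheme come from the answer above; the host, the JSON field names (agent_name, email), and the helper names are assumptions for illustration, so check https://tokenoffset.com/agents for the real request and response shapes.

```python
import json
import urllib.request

# Assumed host for the /api/agents/* paths; confirm against the agents page.
BASE_URL = "https://tokenoffset.com"


def build_register_request(agent_name: str) -> urllib.request.Request:
    """Step 1: POST /api/agents/register to obtain a bearer API key."""
    # "agent_name" is a hypothetical field name, not a documented one.
    body = json.dumps({"agent_name": agent_name}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/agents/register",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_signup_request(api_key: str, user_email: str) -> urllib.request.Request:
    """Step 2: POST /api/agents/signup-user with the human's email."""
    # "email" is a hypothetical field name, not a documented one.
    body = json.dumps({"email": user_email}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/agents/signup-user",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

Passing either request to urllib.request.urlopen would send it. After the sign-up call, the account should be treated as pending until the human taps the approval link.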

What is Token Offset?

Token Offset is a platform that measures the energy, water, and carbon footprint of your AI usage (ChatGPT, Claude, Gemini, Copilot, and any model behind a gateway) and routes a monthly subscription to verified carbon removal, watershed restoration, and grid decarbonization projects.

How does Token Offset measure AI's environmental impact?

Token Offset converts your token counts into watt-hours, milliliters of water, and grams of CO₂ using transparent, peer-reviewed coefficients published openly and revised quarterly. We never see your prompts; we measure usage at the token-count level only.

Which offset projects does Token Offset fund?

Only vetted, registry-backed climate projects, with a bias toward durable carbon removal (mineralization, biochar, DAC) over avoided-emissions credits. Every dollar maps to a project or portfolio with public documentation.

How much does it cost?

Individuals start at $1.99/month, scaled to actual AI usage. Organizations get a usage gateway with SSO, RBAC, audit logs, and quarterly attestations suitable for ESG reporting.

Why offset AI usage now?

Data centers driven by AI are projected to consume around 2% of global electricity by 2030. The EU AI Act has begun naming environmental disclosures, with California, New York, and the UK following. Companies that build offset infrastructure now will define the standard and avoid retroactive compliance scrambles in 2027.

How can I reduce AI's environmental impact?

Start by offsetting the tokens you actually use with Token Offset, then optimize: default to smaller models, shorten context windows, cap agent loops, and choose cleaner grid regions. Read the full guide at https://tokenoffset.com/blog/reduce-ai-environmental-impact.

What are the best environmental organizations for AI users?

For usage-linked impact, Token Offset is purpose-built for AI. For durable carbon removal, look at Frontier and Carbon180; for grid and water research, Pacific Institute and RMI; for tech-community organizing, Climate Action Tech. See https://tokenoffset.com/blog/best-environmental-organizations for a curated list.

The math, with receipts.

Every estimate on the dashboard derives from three impact constants and one contribution rate, applied to an estimated monthly token count: either the band you picked (Casual / Worker / Agentic / Extreme) or a 90-day sample from a connected admin key. The constants are conservative midpoints calibrated against peer-reviewed studies, production telemetry, and operator-published lifecycle assessments; the rate is set to fund high-quality durable removal at a price normal people might still choose to pay. This page is the live record of how those four numbers are chosen, and why.

Current algorithm

Last revised May 24, 2026

For any token count n, we compute four numbers using the constants below. The contribution amount is rounded up to the nearest cent so the payable price never under-collects the published rate.

Quantity	Formula	Per-unit rate
Energy	n × 0.0005 Wh	0.50 Wh per 1k tokens
Water	n × 0.003 mL	3 mL per 1k tokens
CO₂e	n × 0.0006 g	0.6 g per 1k tokens
Contribution	ceil_cents((n / 1000) × $0.00025)	$0.25 per 1M tokens

Conversion · 1,000 tokens ≈ 750 words
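The four formulas can be sketched in Python as follows. The constants are the published ones; the function and key names are illustrative, and Decimal is used here (an implementation choice, not something the page specifies) so the round-up-to-the-cent step is exact.

```python
from decimal import Decimal, ROUND_CEILING

# Published constants from the current algorithm.
ENERGY_WH_PER_TOKEN = Decimal("0.0005")
WATER_ML_PER_TOKEN = Decimal("0.003")
CO2E_G_PER_TOKEN = Decimal("0.0006")
USD_PER_1K_TOKENS = Decimal("0.00025")


def estimate_footprint(n_tokens: int) -> dict:
    """Return energy (Wh), water (mL), CO2e (g), and contribution (USD)."""
    n = Decimal(n_tokens)
    # Round the contribution up to the nearest cent so it never under-collects.
    contribution = (n / 1000 * USD_PER_1K_TOKENS).quantize(
        Decimal("0.01"), rounding=ROUND_CEILING
    )
    return {
        "energy_wh": n * ENERGY_WH_PER_TOKEN,
        "water_ml": n * WATER_ML_PER_TOKEN,
        "co2e_g": n * CO2E_G_PER_TOKEN,
        "contribution_usd": contribution,
    }


# 100,000 tokens: 50 Wh, 300 mL, 60 g CO2e, and $0.025 rounded up to $0.03.
print(estimate_footprint(100_000))
```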

Try the numbers

Pick a monthly token volume, then dial in how much of it you want to cover. 100% means the full published contribution for your estimated footprint. Push past 100% if you want to fund removal work for tokens you didn't generate yourself: every percentage point above the line goes to permanent drawdown, watershed restoration, and grid decarbonization.
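A minimal sketch of how a coverage percentage could be applied, assuming the percentage scales the full contribution linearly and the round-up-to-the-cent step happens after scaling. The page does not spell out that order, and the function name is illustrative.

```python
from decimal import Decimal, ROUND_CEILING

USD_PER_1K_TOKENS = Decimal("0.00025")  # published contribution rate


def monthly_contribution(n_tokens: int, coverage_pct: int = 100) -> Decimal:
    """Contribution in USD for a token volume at a chosen coverage level."""
    full = Decimal(n_tokens) / 1000 * USD_PER_1K_TOKENS
    # Assumption: scale first, then round up to the nearest cent.
    scaled = full * Decimal(coverage_pct) / 100
    return scaled.quantize(Decimal("0.01"), rounding=ROUND_CEILING)


print(monthly_contribution(5_000_000))        # 100% coverage -> 1.25
print(monthly_contribution(5_000_000, 150))   # 150% coverage -> 1.88
```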

How much AI does anyone actually use?

The four bands below frame the order-of-magnitude differences between an occasional ChatGPT user and a developer leaning on coding agents every day. Each band cites the public study or operator disclosure we calibrated it against; the "Monthly midpoint" column is the single value we feed into the calculator above when you tap that preset.

As more Token Offset users connect admin keys, we additionally tune the midpoints against the aggregated, de-identified 90-day samples that flow through the platform, so each band stays grounded in what people actually use, not just what studies predicted. See "What we save and what we do with it" for the specifics; identifiable usage is never shared externally.

Band	What it looks like	Monthly midpoint	Offset / mo
Casual: Occasional questions to Claude and ChatGPT	A few thousand tokens on use days, or roughly a few dozen short conversations per month.	100K	$0.03
Worker: Average human knowledge worker, daily AI use	20,000 to 250,000 tokens per workday for writing, research, analysis, and repeatable GPT/project workflows.	5M	$1.25
Agentic: Coding agents and multi-step workflows	Hundreds of millions of tokens per month for daily coding agents, long-context tasks, retries, and context replay.	300M	$75.00
Extreme: Internal devs, eval pipelines, agent fleets	Up to 210 billion tokens in a single week at the very top of the distribution.	10B	$2,500

Casual · OpenAI / NBER (Sept 2025) reports ~18B ChatGPT messages/week from ~700M weekly users (~26 messages per weekly active user). We pair that with OpenAI's rule of thumb that 1 token ≈ 0.75 words and keep this band below the all-user average to represent occasional users.

Worker · OpenAI 'State of Enterprise AI' (2025): 7M+ workplace seats, ~8x Enterprise message growth, ~30% more messages per worker, 19x growth in Projects/GPT workflows, and 320x growth in reasoning-token consumption per organization.

Agentic · IDE-Bench reports ~0.18M-1.35M tokens per successful IDE-agent task, ProjDevBench averages 4.81M tokens/problem for end-to-end project tasks, and Cursor users report 1M-6M token agent requests under heavy use.

Extreme · Cursor public forum disclosures: Cursor is Anthropic's largest API customer and has saturated their GPU capacity at points; internal devs running agent fleets sit at the extreme tail of usage.
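The "Offset / mo" column in the band table follows directly from each monthly midpoint and the $0.25 per 1M tokens rate. A small sketch that reproduces it; the dictionary and function names are illustrative:

```python
from decimal import Decimal, ROUND_CEILING

# Monthly midpoints from the band table, in tokens.
BAND_MIDPOINTS = {
    "Casual": 100_000,
    "Worker": 5_000_000,
    "Agentic": 300_000_000,
    "Extreme": 10_000_000_000,
}
USD_PER_1M_TOKENS = Decimal("0.25")  # published contribution rate


def band_offset(band: str) -> Decimal:
    """Monthly offset in USD for a band preset, rounded up to the cent."""
    usd = Decimal(BAND_MIDPOINTS[band]) / 1_000_000 * USD_PER_1M_TOKENS
    return usd.quantize(Decimal("0.01"), rounding=ROUND_CEILING)


for band, tokens in BAND_MIDPOINTS.items():
    print(f"{band:8} {tokens:>14,} tokens  ${band_offset(band):,.2f}/mo")
```

The Casual band is the only one where rounding matters: $0.025 rounds up to $0.03.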

Chatterji et al. (Sept 2025): How People Use ChatGPT (NBER WP 34255)
First-party OpenAI / NBER study covering consumer ChatGPT usage Nov 2022 - Jul 2025. By July 2025: ~18B messages/week from ~700M weekly users, or ~26 messages per weekly active user.
OpenAI Help: What are tokens and how to count them?
OpenAI's token conversion rule of thumb: 100 tokens ≈ 75 words. Used only to translate message/conversation studies into order-of-magnitude token bands.
OpenAI (2025): The State of Enterprise AI
Anchors the Worker band: 7M+ workplace seats, ~8x Enterprise message growth, ~30% more messages per worker, 19x Projects/GPT workflow growth, and 320x reasoning-token growth per organization.
Bakal et al. (2025): GitHub Copilot at ZoomInfo
Empirical deployment across 400+ developers. Reports a few tens of Copilot suggestions/day/developer and a 33% suggestion acceptance rate, bounding lighter AI-assisted work below full agentic use.
IDE-Bench (2026): IDE agents on real software tasks
Reports average token use per successful IDE-agent task from ~0.18M tokens for efficient models to ~1.35M tokens for Claude Opus 4.5.
ProjDevBench (2026): End-to-end project development agents
Project-scale coding tasks average 138 interaction turns and 4.81M tokens per problem, anchoring the upper end of agentic developer workflows.
Cursor community forum (2026): Agent-mode token usage
Cursor users report 1M-6M token agent requests and hundreds of millions of monthly tokens under heavy use. We treat this as a public operator/user disclosure, not peer-reviewed evidence.

Energy per token

We use 0.0005 Wh per token (0.50 Wh per 1,000 tokens) as a blended average across input, cache, and output tokens. It sits near the public prompt-level evidence: Google's production Gemini fleet logs 0.24 Wh for the median text prompt, Epoch AI estimates 0.30 Wh for a typical short GPT-4o query, and Jegham et al. measure 0.42 Wh per short GPT-4o query. Long-context, multimodal, and reasoning workloads can be several Wh higher.
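One way to see how the per-token constant relates to those per-prompt figures is to divide each figure by 0.0005 Wh and read off the prompt size (input plus output tokens) at which the two would agree. The token counts below are implied by the constant, not reported by the cited studies, and the names are illustrative.

```python
from decimal import Decimal

WH_PER_TOKEN = Decimal("0.0005")  # blended per-token constant

# Per-prompt energy figures quoted in the paragraph above, in Wh.
PROMPT_ANCHORS_WH = {
    "Google Gemini median text prompt": Decimal("0.24"),
    "Epoch AI typical GPT-4o query": Decimal("0.30"),
    "Jegham et al. short GPT-4o query": Decimal("0.42"),
}


def implied_tokens(prompt_wh: Decimal) -> int:
    """Tokens at which the per-token constant equals a per-prompt figure."""
    return int(prompt_wh / WH_PER_TOKEN)


for source, wh in PROMPT_ANCHORS_WH.items():
    print(f"{source}: {wh} Wh is about {implied_tokens(wh)} tokens at the constant")
```

The three anchors work out to 480, 600, and 840 tokens respectively.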

Google (Aug 2025): Measuring the environmental impact of AI inference
First in-production fleet measurement of a frontier LLM. Median Gemini Apps text prompt: 0.24 Wh, 0.03 g CO₂e, 0.26 mL water.
Epoch AI (2025): How much energy does ChatGPT use?
Estimates 0.30 Wh per typical GPT-4o text query; flags significant variance for long-input and reasoning queries.
Jegham, Abdelatti, Elmoubarki, Hendawi (2025): How Hungry is AI?
Measures 0.42 Wh per short GPT-4o query and characterises the long-tail of inference energy across modalities.
Niu et al. (AAAI 2026): TokenPowerBench
Phase-aligned (prefill vs. decode) energy-per-token benchmark across Llama / Falcon / Qwen / Mistral up to Llama 3-405B. Confirms two-order-of-magnitude variance with model size, batch, and context length.
Husom, Goknil, Shar, Sen (revised Jan 2026, SINTEF): The Price of Prompting
Empirical profiling of LLM inference; response length and model class together explain nearly all of the variance in energy per token (R² ≈ 0.996). This is why a single point estimate has to be a blended midpoint.

Water per token

We use 0.003 mL per token (3 mL per 1,000 tokens). The number combines on-site cooling water with a conservative slice of the water embedded in electricity generation. The published range is wide on purpose: Google reports 0.26 mL of on-site water for a median Gemini prompt, while Mistral's auditor-reviewed LCA reports 45 mL for a 400-token Le Chat response when upstream water is included.

Google (Aug 2025): Measuring the environmental impact of AI inference
On-site water consumption of 0.26 mL per median Gemini Apps text prompt (≈5 drops). Excludes upstream power-generation water.
Li, Yang, Islam, Ren (2023): Making AI Less Thirsty
UC Riverside lifecycle methodology that adds upstream water consumed by electricity generation to direct cooling water. Source of our coefficient for the indirect term.
Mytton (2021): Data centre water consumption
Establishes the per-kWh water-use factor we apply to the inference-energy term to get an indirect-water estimate.
Mistral AI (2025): Large 2 lifecycle assessment
Provider-published, peer-reviewed LCA reporting 45 mL of water for a 400-token Le Chat response when upstream water is included.

Carbon per token

We use 0.0006 g CO₂e per token (0.6 g per 1,000 tokens). A pure electricity-only calculation from our energy constant times the EPA eGRID 2023 grid factor would be lower; we round up toward lifecycle results like Mistral's because servers, networking, allocation overhead, and amortised training emissions don't disappear just because providers don't publish them per token.

US EPA eGRID 2023: National grid emission factor
0.359 kg CO₂e / kWh national average in the 2023 summary tables, our electricity-only baseline.
Google (Aug 2025): Per-prompt operational emissions
0.03 g CO₂e per median Gemini text prompt under Scope 2 market-based plus Scope 1+3. A useful low-end anchor for an efficient, partially-renewable fleet.
Mistral AI (2025): Large 2 lifecycle assessment
Reports 1.14 g CO₂e for a 400-token response when upstream emissions are included: a defensible high anchor.
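Putting the cited figures on a common per-1,000-token basis shows where the 0.6 g constant sits. This is a back-of-the-envelope sketch using only numbers quoted in this section; the variable names are illustrative.

```python
from decimal import Decimal

ENERGY_WH_PER_1K = Decimal("0.50")   # energy constant, Wh per 1,000 tokens
EGRID_G_PER_WH = Decimal("0.359")    # 0.359 kg/kWh is the same as 0.359 g/Wh
CHOSEN_G_PER_1K = Decimal("0.6")     # carbon constant, g CO2e per 1,000 tokens

# Mistral's lifecycle figure: 1.14 g CO2e for a 400-token response.
MISTRAL_G = Decimal("1.14")
MISTRAL_TOKENS = 400

# Low anchor: electricity only, energy constant times grid factor.
electricity_only_g_per_1k = ENERGY_WH_PER_1K * EGRID_G_PER_WH
# High anchor: Mistral's lifecycle figure scaled to 1,000 tokens.
mistral_g_per_1k = MISTRAL_G / MISTRAL_TOKENS * 1000

print(f"electricity-only: {electricity_only_g_per_1k:.4f} g per 1k tokens")
print(f"chosen constant:  {CHOSEN_G_PER_1K} g per 1k tokens")
print(f"Mistral LCA:      {mistral_g_per_1k:.2f} g per 1k tokens")
```

The electricity-only figure comes to about 0.18 g and the Mistral figure to 2.85 g per 1,000 tokens, so 0.6 g lies between the two.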

Contribution rate

We suggest $0.00025 per 1,000 tokens, or $0.25 per million tokens. That is not a claim that one thousand tokens cause exactly 0.025 cents of damage. It is a practical high-quality offset contribution: low enough that regular users can opt in, high enough that pooled monthly totals can buy credible durable removal rather than fractional pennies of cheap avoidance credits alone.

The math is intentionally visible. At 0.6 g CO₂e per 1,000 tokens, the full contribution implies roughly $417 per tCO₂e all-in. We target about 70% of that for durable carbon removal, which leaves roughly $292 per tCO₂e for the removal credit itself. That lands near current higher-quality biochar and enhanced-weathering ranges, though below Frontier's full offtake portfolio pricing. The remaining 30% covers watershed restoration, grid work, verification, payments, and public methodology upkeep.
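The per-tonne arithmetic in that paragraph can be checked in a few lines. The constants are the published ones and the variable names are illustrative.

```python
from decimal import Decimal

USD_PER_1K_TOKENS = Decimal("0.00025")   # contribution rate
CO2E_G_PER_1K_TOKENS = Decimal("0.6")    # carbon constant
REMOVAL_SHARE = Decimal("0.70")          # share targeted at durable removal
GRAMS_PER_TONNE = Decimal(1_000_000)

# Dollars contributed per gram of CO2e, scaled up to a tonne.
usd_per_tonne = USD_PER_1K_TOKENS / CO2E_G_PER_1K_TOKENS * GRAMS_PER_TONNE
removal_usd_per_tonne = usd_per_tonne * REMOVAL_SHARE

print(round(usd_per_tonne))          # 417
print(round(removal_usd_per_tonne))  # 292
```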

In user-facing terms, the current 100% rate is about $0.03/month for Casual, $1.25/month for Worker, $75/month for Agentic, and $2,500/month for Extreme. The sliding scale lets people choose a lower or higher percentage without changing the underlying allocation quality.

Stripe Climate / Frontier: Carbon removal inventory
Frontier offtake portfolios list ~$450-$550/t for 2026-2028 delivery, plus fees in Stripe's Climate product API. We use this as a premium upper anchor, not the whole portfolio price.
CDR.fyi / OPIS (2025): Durable CDR Pricing Survey
Supplier and buyer survey: biochar reasonable-profit pricing around $187/t in 2025, enhanced weathering around $271-$349/t depending on buyer vs supplier view.
Ecosystem Marketplace (2025): State of the Voluntary Carbon Market
Reports biochar removal credits averaging over $165/tCO₂e in 2024, far above commodity avoidance-credit prices.
Supercritical (2025): State of biochar CDR offtakes
Maps high-quality biochar pricing at ~$113-$310/t, with a weighted average near $165/t and higher pricing for vetted projects.

Caveats

These are estimates, not measurements. No major provider publishes per-token energy by deployment. We'll refresh the constants as better operator data lands; this page is the live record of what we believe today.
Cache tokens count. Cached input still uses memory bandwidth, GPU time, and compute to assemble, and Anthropic still charges for it. We sum all token columns (uncached input, cache read, cache creation 5m + 1h, output) when computing impact and offset.
Offsets are not a substitute for reducing use. The cheapest token, environmentally, is the one you never generated. We surface impact at the day level on the dashboard so the feedback loop is short.
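The cache-token caveat amounts to summing every token column before applying the constants. A minimal sketch, with hypothetical field names that mirror the columns listed above rather than any provider's actual schema:

```python
from dataclasses import dataclass


@dataclass
class TokenUsage:
    """One usage record; field names are illustrative, not a provider schema."""
    uncached_input: int = 0
    cache_read: int = 0
    cache_creation_5m: int = 0
    cache_creation_1h: int = 0
    output: int = 0

    def total_tokens(self) -> int:
        # Every column counts toward impact and offset, cached or not.
        return (
            self.uncached_input
            + self.cache_read
            + self.cache_creation_5m
            + self.cache_creation_1h
            + self.output
        )


usage = TokenUsage(
    uncached_input=1_000,
    cache_read=5_000,
    cache_creation_5m=200,
    cache_creation_1h=300,
    output=500,
)
print(usage.total_tokens())  # 7000
```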

See /docs/connections for exactly what we read from each provider, what we don't, and how the encrypted key storage works.