Token
What Is a Token?
In the context of artificial intelligence and Natural Language Processing (NLP), a token is a unit of text that the model reads and processes. Tokens can be individual words, parts of words, punctuation marks, or even characters, depending on how the AI system is designed.
For example, the sentence “AI is evolving fast.” might be broken into the tokens: ["AI", " is", " evolving", " fast", "."]
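As a rough illustration, a naive tokeniser can be sketched in a few lines of Python. This is only a sketch: real LLM tokenisers use learned subword vocabularies and will split the same text differently.

```python
import re

def simple_tokenize(text):
    # Naive sketch: each word becomes a token, punctuation becomes its own token.
    # Real LLM tokenisers (BPE etc.) produce different, subword-level splits.
    return re.findall(r"\w+|[^\w\s]", text)

print(simple_tokenize("AI is evolving fast."))
# ['AI', 'is', 'evolving', 'fast', '.']
```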
Understanding how tokens work is essential for effectively using and optimising AI tools, especially Large Language Models like GPT-4.
Why Tokens Matter in AI
AI language models don’t “read” whole paragraphs the way a human does—they break input down into tokens and interpret them mathematically. Each token is turned into a numerical representation (embedding), which is then analysed by the model to generate a response or prediction.
The number of tokens affects:
How much context the model can handle at once (known as the context window).
How long an output can be.
How much you may be charged if you're using a commercial AI tool (since many charge based on token usage).
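Because context limits and pricing are both denominated in tokens, it is often useful to estimate a prompt's token count before sending it. A common rule of thumb for English text is roughly four characters per token; the helper below is a sketch based on that assumption, not an exact count (exact counts require the model's own tokeniser).

```python
def estimate_tokens(text):
    # Rough heuristic: ~4 characters per token for typical English text.
    # An assumption for budgeting only; not an exact count.
    return max(1, len(text) // 4)

CONTEXT_WINDOW = 4_000  # hypothetical model limit

prompt = "Summarise the quarterly sales report in three bullet points."
fits = estimate_tokens(prompt) <= CONTEXT_WINDOW
```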
Types of Tokenisation
Different models use different tokenisation methods. For example:
Whitespace tokenisation splits text by spaces (common in simple NLP models).
Subword tokenisation breaks down uncommon words into more frequent components, allowing the model to handle rare or misspelled words effectively.
Byte Pair Encoding (BPE) is widely used in modern LLMs; it builds a vocabulary by repeatedly merging the most frequent pairs of characters or character sequences, producing compact subword units for processing.
Example: depending on the model, the word "unhappiness" might be split into ["un", "happiness"] or ["un", "happy", "ness"].
Tokens in Practice
Here’s how token usage appears in real-world AI applications:
1. Writing with AI: If you prompt an AI tool with a long query, the tool will process it as a series of tokens. A typical LLM might have a context window of 4,000–32,000 tokens, meaning there's a hard limit to how much text it can "remember" at once.
2. Billing and Pricing: Many AI platforms price requests based on token count. If an input prompt is 500 tokens and the output is 1,000 tokens, you’ll be charged for 1,500 tokens total.
3. Prompt Engineering: Understanding how your text is tokenised helps you write more efficient prompts and stay within token limits, especially when generating longer documents or summaries.
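The billing arithmetic in point 2 can be sketched directly. The per-token prices below are illustrative placeholders, not any provider's actual rates.

```python
# Illustrative prices only; real providers publish their own per-token rates.
PRICE_PER_1K_INPUT = 0.01   # assumed $ per 1,000 input tokens
PRICE_PER_1K_OUTPUT = 0.03  # assumed $ per 1,000 output tokens

def request_cost(input_tokens, output_tokens):
    # Input and output tokens are often billed at different rates.
    return (input_tokens / 1000) * PRICE_PER_1K_INPUT \
         + (output_tokens / 1000) * PRICE_PER_1K_OUTPUT

cost = request_cost(500, 1000)  # the 1,500-token example: 500 in + 1,000 out
```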
Common Misunderstandings
A token is not always a word. It could be part of a word, a symbol, or even just a space.
Token limits affect both input and output. You can’t just paste a full book into a prompt; token limits apply to the combined size of your question and the model’s answer.
Different models have different token behaviour. A prompt that fits within one model’s token limit may be too long for another.
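The point above that token limits apply to input and output combined can be expressed as a simple budget calculation (the 4,000-token window here is an assumed figure, not a specific model's limit):

```python
def max_output_tokens(context_window, prompt_tokens):
    # The answer can only use whatever the prompt leaves unspent.
    return max(0, context_window - prompt_tokens)

# Assumed 4,000-token window: a 3,500-token prompt leaves room
# for at most a 500-token answer.
budget = max_output_tokens(4_000, 3_500)
```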
Conclusion
Tokens are the fundamental building blocks of how AI models read, understand, and generate language. Although largely invisible to the end user, they directly affect the performance, cost, accuracy, and length of AI-generated outputs.
Whether you're a developer, marketer, or writer using AI tools, understanding tokens will help you craft better prompts, manage costs, and get more precise results from your AI system.