Context Length
Definition
The maximum number of tokens an LLM can process in a single request, including both the input prompt and the generated output. Also called the context window, this limit determines how much information the model can consider at once.
Why It Matters
Context length determines what you can accomplish in a single LLM call:
- Document processing: How much text can be analyzed at once
- Conversation history: How much chat context can be retained
- RAG effectiveness: How many retrieved chunks can be included (see the budget sketch after this list)
- Code understanding: How much codebase context fits in one prompt
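For example, in a RAG pipeline the retrieved chunks must fit in whatever context remains after the system prompt, the user question, and the tokens reserved for the answer. A minimal back-of-the-envelope sketch, where all the numbers (window size, overheads, chunk size) are illustrative assumptions rather than measurements:
# How many retrieved chunks fit in the remaining context budget (illustrative numbers)
CONTEXT_LENGTH = 128_000       # assumed model context window, in tokens
RESERVED_OUTPUT = 4_000        # tokens held back for the generated answer
PROMPT_OVERHEAD = 1_000        # assumed system prompt + user question
CHUNK_TOKENS = 500             # assumed size of one retrieved chunk

available = CONTEXT_LENGTH - RESERVED_OUTPUT - PROMPT_OVERHEAD
max_chunks = available // CHUNK_TOKENS
print(f"Roughly {max_chunks} chunks of {CHUNK_TOKENS} tokens each will fit")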
Modern context lengths have expanded dramatically:
- GPT-3 (2020): 2K tokens
- GPT-4 (2023): 128K tokens
- GPT-5 (2026): 400K+ tokens
- Claude 4.5 (2026): 200K-1M tokens
- Gemini 3 (2026): 2M tokens
Longer contexts enable new use cases but come with trade-offs in cost, latency, and attention quality.
Implementation Basics
Working with context length:
- Token counting: Use tiktoken (OpenAI) or model-specific tokenizers
- Budget allocation: Reserve space for output (typically 4K-8K tokens)
- Truncation strategies: Remove oldest messages, summarize, or chunk
Context management patterns:
# Estimate token usage before sending a request
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4")  # tokenizer matching the target model
text = "Your prompt text here..."
token_count = len(enc.encode(text))         # tokens this text will consume from the window
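A second common pattern is the "remove oldest messages" truncation strategy mentioned above: walk the conversation from newest to oldest and keep only what fits within the input budget. This is a minimal sketch; the budget numbers and message format are assumptions, and real chat formats add a few tokens of per-message overhead:
# Keep the most recent messages that fit within the input token budget (sketch)
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4")

def truncate_messages(messages, context_length=8_000, reserved_output=1_000):
    budget = context_length - reserved_output
    kept, used = [], 0
    for message in reversed(messages):              # newest first
        tokens = len(enc.encode(message["content"]))
        if used + tokens > budget:
            break                                   # everything older is dropped
        kept.append(message)
        used += tokens
    return list(reversed(kept))                     # back to chronological order

history = [
    {"role": "user", "content": "First question..."},
    {"role": "assistant", "content": "First answer..."},
    {"role": "user", "content": "Latest question..."},
]
trimmed = truncate_messages(history)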
Typical context tiers:
- Short context (4K-8K): Basic Q&A, simple tasks
- Medium context (32K-64K): Document analysis, multi-turn conversations
- Long context (128K+): Large document processing, codebase understanding
Considerations:
- Cost: You are billed for both input and output tokens, often at different per-token rates (see the estimate after this list)
- Latency: Longer prompts increase prefill time
- Attention degradation: Models may lose focus with very long contexts
- Lost in the middle: Information in the middle of long contexts is often missed
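To make the cost consideration concrete, a rough estimate multiplies input and output token counts by their per-million-token rates. The prices below are placeholder assumptions; check your provider's current pricing:
# Back-of-the-envelope request cost (placeholder prices per 1M tokens)
INPUT_PRICE_PER_M = 3.00     # assumed $ per 1M input tokens
OUTPUT_PRICE_PER_M = 15.00   # assumed $ per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# A 100K-token prompt with a 2K-token answer at the assumed rates:
print(f"${estimate_cost(100_000, 2_000):.2f}")  # -> $0.33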
For most applications, design for efficient context use rather than relying on maximum length. Focused, relevant context beats raw volume.