Temperature
Definition
Temperature is a parameter that controls LLM output randomness, where lower values (0-0.3) produce deterministic, focused responses and higher values (0.7-1.0) increase creativity and variation.
Why It Matters
Temperature is your reliability dial. At temperature 0, the model always picks the most likely next token: deterministic and consistent, but potentially repetitive. At temperature 1, it samples more broadly from probable tokens: varied and creative, but less predictable.
This matters enormously for production systems. Code generation, data extraction, and factual Q&A need low temperature (0-0.3) for consistency. Creative writing, brainstorming, and chat applications often benefit from higher temperatures (0.7-0.9).
For AI engineers, temperature is one of the first parameters to tune. The wrong setting produces either robotic repetition or unreliable randomness; the right one balances consistency with natural variation.
Implementation Basics
Typical Settings
- 0.0: Deterministic, same input always produces same output
- 0.2-0.3: Slight variation, mostly consistent (code, extraction)
- 0.5-0.7: Balanced (general chat, assistance)
- 0.8-1.0: More creative (writing, brainstorming)
- >1.0: High randomness (usually too chaotic for production)
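To show where this knob sits in practice, here is the shape of a chat-completion request body in the style of the OpenAI API cited below; the model name and prompt are placeholders for illustration, not recommendations.

```python
# Request parameters for a chat-completion call; `temperature` sits alongside
# the model and messages. The model name here is a placeholder.
request = {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Extract the date as JSON."}],
    "temperature": 0.2,  # low: extraction output should stay consistent
}

print(request["temperature"])
```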
How It Works
Before selecting each token, the model calculates probabilities for all possible next tokens. Temperature scales these probabilities:
- Low temperature: High-probability tokens dominate
- High temperature: Lower-probability tokens get more chance
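This scaling can be sketched numerically. The snippet below applies temperature to a softmax over three hypothetical token scores (the logit values are made up for illustration): dividing logits by a small temperature sharpens the distribution, while a large temperature flattens it.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature, then normalize to probabilities."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens

cold = softmax_with_temperature(logits, 0.2)  # low temp: top token dominates
hot = softmax_with_temperature(logits, 1.5)   # high temp: mass spreads out

print(round(cold[0], 3))  # almost all probability on the top token
print(round(hot[0], 3))   # lower-probability tokens get more chance
```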
Practical Guidelines
- Structured output (JSON, code): Temperature 0-0.2
- Factual answers: Temperature 0-0.3
- Conversational responses: Temperature 0.5-0.7
- Creative writing: Temperature 0.7-0.9
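One way to encode these guidelines is a per-task lookup with a balanced fallback; the task labels and helper name below are this sketch's own, not a standard API.

```python
# Illustrative defaults following the guidelines above.
TEMPERATURE_DEFAULTS = {
    "structured_output": 0.0,
    "factual_qa": 0.2,
    "conversation": 0.6,
    "creative_writing": 0.8,
}

def temperature_for(task: str) -> float:
    # Fall back to a balanced middle value for unrecognized tasks
    return TEMPERATURE_DEFAULTS.get(task, 0.5)

print(temperature_for("structured_output"))  # 0.0
```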
Testing Approach
Run the same prompt 10 times at your chosen temperature and inspect the variation. If outputs are too similar for your use case, raise the temperature; if too unpredictable, lower it. Find the sweet spot for your specific task.
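A minimal sketch of that testing loop, with a stubbed `sample_completion` standing in for a real API call (the function, its crude temperature model, and the candidate strings are all hypothetical):

```python
import random

def sample_completion(prompt, temperature, rng):
    """Stand-in for an LLM call: higher temperature yields more varied output."""
    candidates = ["Paris", "Paris.", "The capital is Paris", "paris"]
    if temperature == 0.0:
        return candidates[0]  # deterministic: always the top candidate
    # Crude stand-in: sample from more candidates as temperature rises
    k = min(len(candidates), 1 + int(temperature * 4))
    return rng.choice(candidates[:k])

def variation(prompt, temperature, runs=10, seed=0):
    """Count distinct outputs across repeated runs of the same prompt."""
    rng = random.Random(seed)
    outputs = {sample_completion(prompt, temperature, rng) for _ in range(runs)}
    return len(outputs)

print(variation("Capital of France?", 0.0))  # 1 distinct output
print(variation("Capital of France?", 0.9))  # more variation at high temp
```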
Interaction with Top-P
Temperature and top_p both control randomness, and adjusting both can create unexpected behavior. Pick one to tune and keep the other at its default. Most practitioners prefer temperature for its intuitive behavior.
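To make the overlap concrete, here is a sketch of nucleus (top_p) filtering over an already-computed, sorted probability list (the probabilities are made up): it truncates the candidate set much as low temperature concentrates it, which is why stacking both controls compounds in hard-to-predict ways.

```python
def top_p_filter(probs, top_p):
    """Keep the smallest prefix of descending-sorted probabilities whose
    cumulative sum reaches top_p, then renormalize the survivors."""
    kept, cumulative = [], 0.0
    for p in probs:
        kept.append(p)
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept)
    return [p / total for p in kept]

probs = [0.5, 0.3, 0.1, 0.06, 0.04]  # hypothetical sorted token probabilities
print(len(top_p_filter(probs, 0.9)))  # 3 tokens survive: 0.5 + 0.3 + 0.1 = 0.9
```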
Source
Temperature values between 0 and 2 control sampling randomness, with higher values making output more random and lower values more deterministic.
https://platform.openai.com/docs/api-reference/chat/create