4.3 Randomness & Temperature

🎯 Core Goals

Understand the setting that controls how “creative” an LLM is.
Learn when to use high vs. low temperature.

“Temperature” is a setting you can adjust on most LLMs. A low temperature makes the LLM predictable and consistent. A high temperature makes it more creative but also more likely to go completely off the rails.

👁️ Visuals & Interactives

1. The Guessing Game

There's no single "right" answer. Which word works best?

"My favorite piece of tech is a" ___

2. Adjusting the "Creativity" Slider

Prompt: "Write a short story about a cat."

Slider settings: ❄️ Predictable (Temp 0.0), Temp 0.7, 🔥 Chaos (Temp 1.5)

AI Output:

The cat sat on the mat. It was a very good cat. Every day, the cat slept.

📝 Key Concepts

The Dice Roll: When an LLM predicts the next word, it generates a list of possibilities (e.g., 90% chance of “apple”, 9% chance of “banana”, 1% chance of “shoe”).
Temperature 0.0 (Predictable): The LLM always picks the #1 most likely word. The output becomes highly predictable and sometimes repetitive, which suits tasks like writing code or summarizing data.
Temperature 1.0 (Creative): The LLM rolls the dice using the probabilities as they are, so it sometimes picks the 2nd, 3rd, or 4th most likely word. This introduces variety, making it great for brainstorming, poetry, and storytelling!
Probabilistic Nature: LLMs are fundamentally probabilistic — this is another reason they can’t be 100% reliable. There is almost always more than one “correct” way to finish a sentence.
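
To make the dice roll concrete, here is a minimal Python sketch that uses the example probabilities above (90% “apple”, 9% “banana”, 1% “shoe”). The names NEXT_WORD_PROBS, pick_greedy, and pick_sampled are made up for this illustration and are not part of any real LLM library. A real model chooses among tens of thousands of possible tokens, not three words.

```python
import random
from collections import Counter

# Next-word probabilities taken from the example in the text.
NEXT_WORD_PROBS = {"apple": 0.90, "banana": 0.09, "shoe": 0.01}


def pick_greedy(probs):
    """Temperature 0.0: always return the single most likely word."""
    return max(probs, key=probs.get)


def pick_sampled(probs, rng):
    """Temperature 1.0: roll the dice, weighted by each word's probability."""
    words = list(probs)
    weights = list(probs.values())
    return rng.choices(words, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Ask for the next word 1000 times with each strategy and count the picks.
greedy_counts = Counter(pick_greedy(NEXT_WORD_PROBS) for _ in range(1000))
sampled_counts = Counter(pick_sampled(NEXT_WORD_PROBS, rng) for _ in range(1000))

print(greedy_counts)   # only "apple" ever appears
print(sampled_counts)  # mostly "apple", with "banana" showing up now and then
```

The greedy picker gives the same word every time, while the sampled picker usually says “apple” but occasionally surprises you. That occasional surprise is where the variety in stories and brainstorming comes from.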

If you set the temperature too high (e.g., 2.0), the LLM starts picking from a much wider, less predictable set of words, which can produce gibberish and broken grammar.
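
Why does a high temperature widen the set of words? One common way to describe it: each probability is raised to the power 1/temperature and the results are rescaled to add up to 100%. This is equivalent to dividing the model's raw scores by the temperature before turning them into probabilities. The sketch below applies that rule to the three-word example (90% “apple”, 9% “banana”, 1% “shoe”). The function name apply_temperature is hypothetical, and real LLM services may combine temperature with other sampling settings.

```python
def apply_temperature(probs, temperature):
    """Reweight probabilities: raise each to 1/temperature, then renormalize.

    Temperatures below 1.0 sharpen the distribution (the favorite gets stronger);
    temperatures above 1.0 flatten it (unlikely words get a bigger share).
    """
    if temperature <= 0:
        # Temperature 0 is handled separately: just pick the top word.
        raise ValueError("temperature must be greater than 0")
    scaled = {word: p ** (1.0 / temperature) for word, p in probs.items()}
    total = sum(scaled.values())
    return {word: value / total for word, value in scaled.items()}


probs = {"apple": 0.90, "banana": 0.09, "shoe": 0.01}

for t in (0.5, 1.0, 2.0):
    adjusted = apply_temperature(probs, t)
    print(t, {word: round(p, 3) for word, p in adjusted.items()})
```

At temperature 2.0, “shoe” climbs from a 1% chance to roughly 7%, and “apple” drops from 90% to about 70%. With only three words that looks harmless, but a real model has a very long list of unlikely words, and once they all get a boost the odds of picking a nonsensical one at each step add up quickly.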

For most users: leave temperature alone. The default temperature setting is carefully chosen by the model’s creators. Tweaking it without understanding the task can noticeably degrade response quality.

Have you tried asking an LLM whether it has a favorite number? Have you noticed that it sometimes gives you the same answer? Why is that?
Raising the temperature is one way to introduce more randomness in this scenario.

🧠 QUIZ

What does setting an LLM's temperature to 0 do?

Makes the LLM refuse to answer questions

Makes it always pick the most likely next word, producing consistent and predictable output

Makes the LLM generate shorter responses