Not one of these is the real thing. A forecast
isn't the atmosphere; a GPS estimate isn't the road. Each keeps the few factors that matter,
ignores the rest, and makes a prediction you can check against reality.
What makes a model worth trusting?
It predicts data it has never seen. (Anyone can describe the past.)
It runs on far fewer numbers than the real system has moving parts.
Those numbers mean something. Each one stands for a real idea you
can name.
If the numbers are meaningful, fitting the model lets you
describe reality in an interpretable way.
That is why math matters here: it lets psychology make measurements,
not just descriptions.
Information, then knowledge
Information
2,000 rows like:
trial   RT      choice   signal
1       0.62s   left     weak
2       0.41s   right    strong
3       1.93s   right    none
...
True, complete, and almost impossible to think with.
Knowledge
Four numbers per person:
how sensitive, how cautious, how
biased, how much delay.
Compact enough to compare across people and conditions.
The model is the bridge from information to
knowledge.
The decision you made about 200 times
You all ran this at ryguy.io/dots. A field of dots appears. Some
fraction of them drift together (right or
left); the rest just flicker at random. Your job: call the direction,
as fast and as accurately as you can.
When lots of dots agree, you're quick and correct.
When the signal is faint, you slow down, and you start making mistakes.
This is a standard lab task for studying
perceptual decisions, and it's been run on humans and monkeys for over thirty years.
Difficulty has a name: coherence
Coherence is just the fraction of dots moving together. It's the dial we turned
from trial to trial.
High coherence
→ → → → → ↑ → → → → → ↓
Obvious. Fast, accurate.
Low coherence
↑ ↘ → ↙ ← ↗ ↓ → ↘ ← ↑ ↗
Mostly noise. Slow, error-prone.
At zero coherence there's no signal at
all. Pure guessing. We'll see that show up in the data.
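To make the dial concrete, here is a minimal sketch of how a field of dots at a given coherence could be generated. The function name `dot_directions` and its parameters are made up for illustration, and the task at ryguy.io/dots may build its stimulus differently (for example, by making the noise dots flicker in place rather than giving them random headings).

```python
import random

def dot_directions(coherence, n_dots=100, signal_deg=0.0, seed=None):
    """Return one heading in degrees per dot (0 = right, 90 = up)."""
    rng = random.Random(seed)
    # The coherent fraction all share the signal heading.
    n_signal = round(coherence * n_dots)
    signal = [signal_deg] * n_signal
    # The remaining dots get a heading drawn uniformly at random.
    noise = [rng.uniform(0.0, 360.0) for _ in range(n_dots - n_signal)]
    dots = signal + noise
    rng.shuffle(dots)  # mix signal and noise dots together
    return dots

high = dot_directions(0.8, n_dots=12, seed=1)  # mostly rightward
zero = dot_directions(0.0, n_dots=12, seed=1)  # nothing to detect
```

At coherence 0 no dot carries the signal heading, so there is nothing for an observer to pick up.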
Why do hard decisions take longer?
Because you don't decide all at once. You gather evidence over time, and you
hold off until it tips far enough one way to commit.
Strong evidence reaches that threshold quickly. Fast, confident answers.
Weak evidence creeps up and keeps getting drowned out by noise. Slow
answers, and more of them wrong.
Here's the key: how much evidence you demand before you commit is a
real quantity. It sets your personal speed versus accuracy balance, and we can measure it.
The Drift Diffusion Model
Draw the building evidence as a marker. It starts in the middle and drifts toward one of two
boundaries, pushed by signal and shoved around by noise, until it reaches one. That's the
decision, and when it gets there is the reaction time.
drift (k)
how hard the evidence pulls. This is perceptual sensitivity.
boundary (B)
the separation between the two lines, i.e. how much certainty you demand.
start (a₀)
a lean toward one option before any evidence arrives.
delay (τ)
sensing and moving time that has nothing to do with deciding.
The math is just the story with rules
The cartoon becomes a model when we write exactly how the marker moves:
dX_t = v dt + dW_t
v = s * k * c^alpha
v dt is the signal pushing the marker.
dW_t is random noise jostling it around.
The choice happens when the marker hits 0 or B.
Useful shortcut: when there is no signal and no starting lean, mean reaction time is roughly
τ + B^2/4. So the data already whispers what the knobs might be.
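The rules above are short enough to simulate directly. What follows is a minimal sketch, not the notebook's code: it steps the marker forward in small time increments (an Euler-Maruyama scheme), assumes an unbiased start at B/2 and unit noise, and uses made-up names (`simulate_trial`, `mean_rt`). With v = 0 it lets you check the no-signal shortcut against simulated trials.

```python
import random

def simulate_trial(v, B, tau=0.0, dt=0.001, rng=random):
    """Simulate one decision; return (choice, reaction time)."""
    x = B / 2.0          # assumption: unbiased start, midway between 0 and B
    t = 0.0
    sqrt_dt = dt ** 0.5
    while 0.0 < x < B:
        # One small step of dX = v dt + dW: signal push plus Gaussian jostle.
        x += v * dt + sqrt_dt * rng.gauss(0.0, 1.0)
        t += dt
    choice = "upper" if x >= B else "lower"
    return choice, tau + t   # add the non-decision delay

def mean_rt(v, B, tau, n_trials=2000, seed=0):
    """Average reaction time over many simulated trials."""
    rng = random.Random(seed)
    total = sum(simulate_trial(v, B, tau, rng=rng)[1] for _ in range(n_trials))
    return total / n_trials

# No signal: compare the simulated mean with the shortcut tau + B^2/4.
B, tau = 1.0, 0.3
print("simulated:", mean_rt(0.0, B, tau))
print("shortcut: ", tau + B**2 / 4)
```

The two numbers should land close to each other but not match exactly: the simulation uses a finite number of trials, and the finite step size lets the marker overshoot the boundary slightly, which nudges simulated times upward.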
Before I touch anything, a prediction
If I move the boundaries farther apart, what happens to
accuracy? What happens to speed?
Commit to an answer in your head. Then we'll go find out.
Second prediction: how close can we get?
Before fitting, I'll use the math shortcuts and the sliders to
guess the final parameters by hand, live.
If the hand-tuned values are close, the fitted numbers will feel less
mysterious: the structure was already visible in the data.
LIVE DEMO 1
Explore the model
I'll switch to the notebook tab, "Explore the model".
(Alt-Tab or ⌘-Tab to the Pluto window.)
Raise the boundary B. Accuracy goes up; speed goes down. (Did you call it?)
Set the signal c to 0. Accuracy flatlines at about 50%: pure guessing.
Then I'll jump to section 2 and tune the knobs against the real curves.
This is the speed-accuracy trade-off in one parameter.
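For this simple setup the trade-off can also be written in closed form. The sketch below uses the standard first-passage results for a marker with constant drift v, unit noise, boundaries at 0 and B, and an unbiased start at B/2; those conditions are assumptions here, and the function names are made up for illustration.

```python
import math

def accuracy(v, B):
    """Probability of reaching the correct boundary, for drift v toward it."""
    return 1.0 / (1.0 + math.exp(-v * B))

def mean_decision_time(v, B):
    """Mean time to reach either boundary (the delay tau is not included)."""
    if abs(v) < 1e-12:
        return B * B / 4.0           # no-signal limit
    return (B / (2.0 * v)) * math.tanh(v * B / 2.0)

# Hold the drift fixed and widen the boundaries.
for B in (0.5, 1.0, 2.0):
    print(B, round(accuracy(1.0, B), 3), round(mean_decision_time(1.0, B), 3))
```

Both columns rise together as B grows: more accuracy, bought with more time. With v = 0 the accuracy is exactly 0.5 and the decision time reduces to B^2/4, matching the shortcut from earlier.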
Now for the real thing: your data
We are not going to set those knobs by hand. Instead we hand the
computer all 2,000 of your decisions and ask one question:
"What knob settings would make the model act like this
class?"
Finding those settings is called fitting. It's how we
estimate something we never measured, like how cautious each of you is.
How does "fitting" actually work?
No black box needed. The computer repeats the same question many times:
Start with some knob settings.
Ask: how likely is the real data if these settings were true?
Nudge the knobs to make the data more likely. Repeat thousands of times.
It keeps the settings that make your actual decisions the least surprising.
That's the whole idea behind a lot of modern statistics and machine learning.
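The loop above fits in a few lines. This is a deliberately stripped-down sketch, not the notebook's fitting routine: it fits a single knob (k) to right/wrong outcomes only, ignores reaction times, assumes accuracy follows 1 / (1 + exp(-k c B)) with B fixed at 1, and runs on synthetic data generated inside the script rather than on the class data. All names are made up for illustration.

```python
import math
import random

def p_correct(k, c, B=1.0):
    # Assumed link from knobs to accuracy: drift k * c, unit noise, start at B/2.
    return 1.0 / (1.0 + math.exp(-k * c * B))

def log_likelihood(k, trials):
    """How likely is the observed data if k were the true setting? (log scale)"""
    total = 0.0
    for c, correct in trials:
        p = min(max(p_correct(k, c), 1e-9), 1.0 - 1e-9)  # avoid log(0)
        total += math.log(p if correct else 1.0 - p)
    return total

def fit_k(trials, k=1.0, delta=1.0, n_iter=60):
    """Nudge k up or down, keep the nudge if the data becomes more likely."""
    best = log_likelihood(k, trials)
    for _ in range(n_iter):
        improved = False
        for candidate in (k + delta, k - delta):
            ll = log_likelihood(candidate, trials)
            if ll > best:
                k, best, improved = candidate, ll, True
                break
        if not improved:
            delta /= 2.0   # neither direction helped: take smaller nudges
    return k

# Synthetic data with a known answer, so we can see whether fitting recovers it.
TRUE_K = 8.0
rng = random.Random(0)
trials = []
for _ in range(2000):
    c = rng.choice([0.05, 0.1, 0.2, 0.4])
    trials.append((c, rng.random() < p_correct(TRUE_K, c)))

k_hat = fit_k(trials)
print("recovered k:", round(k_hat, 2), "true k:", TRUE_K)
```

Real fitting tools use smarter nudges and fit all the knobs at once, using reaction times as well as choices, but the question asked at every step is the same.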
LIVE DEMO 2
Fit it to the class
I'll walk through sections 1 to 9 in the notebook.
§2 Guess: I'll tune the knobs by hand and compare against a rough math estimate.
§3 Fit: the knobs land, and the prediction snaps onto the data.
§4 Curves: accuracy versus difficulty, plus why hard trials run slow.
§5 Comparison: does the extra parameter improve prediction enough to justify itself?
§7 Criticism: where is the model still wrong?
§8 Individuals: we'll compare cautious and fast-response strategies.
Do not just admire the fit. Attack it.
A serious model has to survive criticism.
Residuals ask: where is the red line systematically wrong?
AIC asks: did the extra parameter earn its keep?
Simulation asks: can the model generate new data that looks like ours?
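The AIC question is simple arithmetic: AIC = 2 × (number of parameters) − 2 × (log-likelihood at the best fit), and lower is better. A minimal sketch follows; the log-likelihood values are placeholders chosen only to show the comparison, not results from the class data.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: penalizes each extra parameter by 2."""
    return 2 * n_params - 2 * log_likelihood

# Placeholder numbers for illustration only.
simple = aic(log_likelihood=-1000.0, n_params=4)
richer = aic(log_likelihood=-998.5, n_params=5)

# The richer model fits slightly better, but is the gain worth one more knob?
print("simple:", simple, "richer:", richer)
print("extra parameter earned its keep:", richer < simple)
```

An extra parameter has to raise the log-likelihood by more than 1 to lower the AIC. In this placeholder case the gain is 1.5, so it just clears the bar.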
The point is not just to make the line look good. The point is to find a simple story
that makes predictions, then look hard at where it fails.
Step back: what did we just do?
From nothing but button presses, we measured things you can't observe directly:
how cautious someone is, how sharp their perception is, how long their reaction delay runs.
One small model, a handful of numbers, described an entire class at once.
And it could regenerate data that looks just like yours.
That's the win for psychology and neuroscience: behavior became numbers
we can compare, criticize, and use to test different explanations of the mind.
This is a real research tool
The dots task and this model are standard tools in decision neuroscience.
Researchers have even found neurons whose activity looks like accumulating evidence.
The same model has been used to find differences in ADHD, in aging, and after brain injury.
Often the change isn't "worse senses." It's a shifted boundary.
Same behavior, different mechanism. The model is what tells them apart.
And the mindset travels
Simplify. Predict. Test. Keep what survives.
That loop is why math belongs inside psychology and neuroscience. It helps
us move from "that person seems cautious" to "this model predicts they set a higher boundary."
And it is not special to brains. The same loop shows up in climate science,
drug trials, public health, and engineering.
Once you know how to build and test models, you can bring that habit to a lot
of hard questions.
Questions
Would practice change drift rate?
Would sleep loss stretch reaction delay?
Would emphasizing accuracy move the boundary?
How would you design an experiment to find out?