hobby 2020

Colour-preference predictor

A self-directed experiment from upper-secondary school. I built a 20-question colour survey, sent it to ~160,000 inboxes, trained a model on the 20,000 completed responses, and ended up with a small neural net that could guess a participant's age, gender, and self-reported mood from nothing but their colour picks and click patterns.

The question

Does which colours you prefer, and how you click on them, carry enough signal to predict who you are? I had a hunch the answer was yes, and I wanted the dataset to be big enough that the answer wasn't a coincidence.

The survey

Participants answered three demographic questions (age, gender, and a 1–10 mood rating), then completed twenty trials. Each trial presented four colour swatches and asked them to pick the one they liked best.

Alongside the picks, the survey quietly logged meta-signals: how long each response took, where on the swatch the click landed, whether the cursor wavered, and the order in which participants scanned the options.
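As an illustration only, one trial's log could be represented roughly like this. It is a sketch, not the original survey code: every field name is hypothetical, hex strings for colours are an assumption, and "wavering" is assumed here to mean the cursor re-entered a swatch it had already visited.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TrialLog:
    """One of the twenty trials. All field names are hypothetical."""
    shown: Tuple[str, str, str, str]  # the four swatch colours (assumed hex strings)
    chosen: int                       # index 0-3 of the swatch that was picked
    dwell_ms: float                   # time from display to click, in milliseconds
    click_dx: float                   # horizontal click offset from swatch centre
    click_dy: float                   # vertical click offset from swatch centre
    # Swatch indices in the order the cursor entered them (scan order).
    hover_order: List[int] = field(default_factory=list)

    def wavered(self) -> bool:
        """Assumed definition: the cursor re-entered at least one swatch."""
        return len(self.hover_order) != len(set(self.hover_order))
```

A trial with `hover_order=[0, 1, 2, 1, 2]` would count as wavering under that definition; one with `[0, 2]` would not.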

Reach & response

~160,000 emails delivered through the Oslo school directory
~20,000 completed surveys (≈12.5% completion)
Every response stored with a timestamp and a session-local salt

At sixteen, the most surprising part wasn't the model. It was watching a dataset of twenty thousand human responses arrive in two days.

The model

Features

Per-trial: chosen colour (one-hot), three rejected colours, dwell time, click coordinates relative to swatch centre
Per-session: total time, time variance, first-click bias
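The feature list above could be turned into a flat vector along these lines. This is a minimal sketch under stated assumptions, not the original pipeline: the eight-colour `PALETTE` is hypothetical, the three rejected colours are encoded as a multi-hot vector, "total time" is taken as the sum of dwell times, and "first-click bias" is assumed to mean the share of trials in which the first-listed swatch (index 0) was chosen.

```python
from statistics import pvariance

# Hypothetical palette; the real survey's colour set is not specified.
PALETTE = ["red", "orange", "yellow", "green", "blue", "purple", "pink", "grey"]


def one_hot(colour):
    """1.0 at the position of `colour` in PALETTE, 0.0 elsewhere."""
    return [1.0 if c == colour else 0.0 for c in PALETTE]


def multi_hot(colours):
    """1.0 at the position of every colour in `colours`, 0.0 elsewhere."""
    return [1.0 if c in colours else 0.0 for c in PALETTE]


def trial_features(shown, chosen_idx, dwell_ms, dx, dy):
    """Per-trial: chosen colour, rejected colours, dwell time (s), click offset."""
    chosen = shown[chosen_idx]
    rejected = [c for i, c in enumerate(shown) if i != chosen_idx]
    return one_hot(chosen) + multi_hot(rejected) + [dwell_ms / 1000.0, dx, dy]


def session_features(trials):
    """Per-session: total time, time variance, assumed first-click bias."""
    dwell_s = [t[2] / 1000.0 for t in trials]
    first_pos_share = sum(1 for t in trials if t[1] == 0) / len(trials)
    return [sum(dwell_s), pvariance(dwell_s), first_pos_share]


def encode_session(trials):
    """trials: list of (shown, chosen_idx, dwell_ms, dx, dy) tuples."""
    vec = []
    for t in trials:
        vec.extend(trial_features(*t))
    return vec + session_features(trials)
```

With this encoding each trial contributes 8 + 8 + 3 = 19 values, so a full twenty-trial session would yield 20 × 19 + 3 = 383 features.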

Architecture

A small fully-connected network with three targets, trained jointly: age (regression), gender (binary classification), and mood (1–10 ordinal). I leaned on the meta-features as much as on the colour picks themselves; the meta-features turned out to carry a lot of the signal.
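To make "three targets, trained jointly" concrete, here is a rough sketch of a shared hidden layer feeding three output heads, with the three losses simply added together. It is not the original model: the layer sizes, the unweighted sum, and all names are assumptions, and mood is treated as a plain 10-class softmax, which ignores the ordering of the 1–10 scale. Only the forward pass and the loss are shown; the gradient step is left out.

```python
import math
import random


def dense(x, w, b):
    """Affine layer; `w` holds one row of weights per output unit."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def relu(v):
    return [max(0.0, a) for a in v]


def init_layer(n_in, n_out, rng):
    """Small uniform random weights and zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    w = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out


def init_model(n_features, n_hidden=16, seed=0):
    """One shared trunk plus one head per target (sizes are assumptions)."""
    rng = random.Random(seed)
    return {
        "trunk": init_layer(n_features, n_hidden, rng),
        "age": init_layer(n_hidden, 1, rng),      # regression head
        "gender": init_layer(n_hidden, 1, rng),   # binary logit head
        "mood": init_layer(n_hidden, 10, rng),    # one logit per mood level
    }


def forward(model, x):
    h = relu(dense(x, *model["trunk"]))
    age = dense(h, *model["age"])[0]
    gender_logit = dense(h, *model["gender"])[0]
    mood_logits = dense(h, *model["mood"])
    return age, gender_logit, mood_logits


def joint_loss(model, x, age, gender, mood):
    """Sum of the three per-target losses; gender in {0, 1}, mood in 1..10."""
    age_hat, g_logit, m_logits = forward(model, x)
    age_loss = (age_hat - age) ** 2
    # Numerically stable binary cross-entropy computed from the logit.
    gender_loss = (max(g_logit, 0.0) - gender * g_logit
                   + math.log1p(math.exp(-abs(g_logit))))
    # Softmax cross-entropy via the log-sum-exp trick.
    m = max(m_logits)
    log_z = m + math.log(sum(math.exp(a - m) for a in m_logits))
    mood_loss = log_z - m_logits[mood - 1]
    return age_loss + gender_loss + mood_loss
```

In a sum like this the squared age error would likely dominate unless age is rescaled or the terms are weighted, so a real training setup would probably normalise the targets first.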

What I learned

Two things. First, that the way you click is at least as informative as what you click. Second, that running a study at scale is less about the model and more about the boring engineering: surveys, storage, deduplication, abuse-handling, a clean export pipeline. The model was the smallest file in the project.

I'd do this differently now: clearer consent, better aggregation, a published write-up. It was a high-school project and it shows, but it taught me that data beats cleverness, and I've never quite let go of that.
