If you’ve so much as touched on anything involving computers and colour, you’re sure to have encountered some way of modelling it. There’s Red Green Blue, Hue Saturation Lightness, L*ightness a*green-redness b*blue-yellowness (okay, that doesn’t work so well) and even Y luma In-phase Quadrature, amongst quite a few others. Almost all specify a colour with three values, and the only exception I know of, Cyan Magenta Yellow Key, still has the CMY part roughly corresponding to three “axes” of colour. Why is that?

I’m far from an expert in the ways and specifics of colour science, but I’m going to put forward a hypothesis anyway: these tendencies relate far less to the nature of colour, and significantly more to (human) perception of it.

The nature of colour

First we must address what colour actually is in the physical world. The thing is, colour does not exist as a natural phenomenon: it is innately a result of human perception.

As far as colour is concerned, a single photon of light has only one distinguishing property: its frequency (or, equivalently, its wavelength, since the two are inversely related). Anything that emits or reflects light by whatever means is typically giving off vast numbers of photons with all sorts of frequencies, some more prevalent than others.

The part that needs to be stressed here is that the frequency can fall anywhere along a continuous spectrum. The problem for perception is that there are infinitely many possible frequencies, each of which can be present at any relative intensity.

It is possible to measure frequencies individually – that’s exactly what a spectrometer does – but the course of natural selection did not produce an eye that sees the frequencies of light like this.

Colour perception

You might know that the retina contains three distinct types of colour receptors called cone cells. They are often erroneously labelled “red”, “green” and “blue”, but are more accurately referred to as “L”, “M” and “S” respectively. These letters stand for “long”, “medium” and “short”, the relative length of the wavelengths each type responds to most strongly.

Labelling them by colours is misleading because each type is stimulated by a comparatively broad range of wavelengths, not by a single colour.

A chart with wavelengths from 400 to 700 nanometres along the horizontal axis and values from 0 to 1 along the vertical axis. Three curves plot "S", "M" and "L", with S reaching 1 at about 450 nanometres, M at 550 and L at 570.
Normalised responsivity of human cone cells to different wavelengths (nm).
Image by Vanessaezekowitz at en.wikipedia [CC BY 3.0-2.5-2.0-1.0]

There are a few features of the chart above which contrast with a simplified explanation of visual perception. The L cone is far from a “red” receptor, instead responding more strongly to “green” frequencies and peaking adjacent to the M cone. Of the three types it is the one that responds most to actual red, at the very right end of the chart, but even so it responds rather weakly.
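
To make the point about the L cone concrete, here is a rough sketch in Python. It stands in for the real curves with simple bell shapes: the peak wavelengths are the ones read off the chart, but the widths are my own guesses, so the numbers are only illustrative and are not measured cone data.

```python
import math

# (peak wavelength in nm, width in nm) per cone type.
# Peaks come from the chart; the widths are assumptions for illustration.
CONES = {"S": (450.0, 25.0), "M": (550.0, 40.0), "L": (570.0, 45.0)}


def sensitivity(cone, wavelength_nm):
    """Bell-shaped stand-in for a cone's normalised responsivity (0 to 1)."""
    peak, width = CONES[cone]
    return math.exp(-0.5 * ((wavelength_nm - peak) / width) ** 2)


def cone_responses(spectrum):
    """Sum intensity * sensitivity over a {wavelength_nm: intensity} spectrum."""
    return {
        cone: sum(i * sensitivity(cone, wl) for wl, i in spectrum.items())
        for cone in CONES
    }


green = cone_responses({550.0: 1.0})  # a single "green" wavelength
red = cone_responses({650.0: 1.0})    # a single "red" wavelength

for name, resp in (("550 nm", green), ("650 nm", red)):
    print(name, {cone: round(value, 3) for cone, value in resp.items()})
```

Under these assumptions the L cone responds several times more strongly at 550 nm than at 650 nm, yet at 650 nm it still responds more than M or S do.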

The overlap amongst cells is actually crucial because, without it, colour perception would be no better than “there is some total amount of various longer wavelengths, and a smaller total amount of shorter wavelengths.” So instead the retina converts the cone signals into three values representing differences between receptors: because L and M are so close, it makes sense to see how they differ. Because S is so distinct from them, it is contrasted with L and M together. All three, together with the monochromatic rod cells, are combined to yield a measure of brightness. This whole mechanism is termed the opponent process, and it is our best current explanation for many phenomena of visual perception.
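
As a sketch of that conversion, the function below turns three cone responses into three opponent-style values. The function name and the exact weightings (a plain difference, an average of L and M, a plain sum) are my own simplifications, and the rod cells are left out entirely; the real retina is certainly not this tidy.

```python
def opponent_channels(l, m, s):
    """Turn three cone responses into three opponent-style signals.

    The weights here are illustrative assumptions, not physiological values.
    """
    red_green = l - m              # how L and M differ
    blue_yellow = s - (l + m) / 2  # S contrasted with L and M together
    brightness = l + m + s         # all three combined (rods omitted)
    return red_green, blue_yellow, brightness


# A stimulus that favours the long-wavelength cone...
print(opponent_channels(0.9, 0.5, 0.1))
# ...and one that favours the short-wavelength cone.
print(opponent_channels(0.2, 0.3, 0.8))
```

Three numbers go in and three numbers come out, so nothing is lost, but the values are now expressed as contrasts instead of raw totals.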

What’s relevant is that, although it can distinguish a single frequency quite well, visual perception loses precision when it comes to combining frequencies. With only three types of cone, it’s easy to construct disparate sets of frequencies that result in the same differences between receptors and thus an identical perceived colour. Such physically different but visually identical stimuli are known as metamers.

If this seems difficult to comprehend, just think of those with colour blindness: a deuteranope cannot distinguish wavelengths and combinations thereof that appear red from those that appear green, because without an M cone those colours are perceived almost identically. Or consider the idea of “mixing” colours: shining a red and a green light together appears the same as light of a single yellow frequency, but frequencies do not physically combine in this manner. Both scenarios are just perceived as the same colour.
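
The red-plus-green example can be checked numerically with a toy model. The sketch below again assumes bell-shaped cone curves (peaks from the chart, widths guessed by me) and picks 650, 540 and 580 nm as hypothetical "red", "green" and "yellow" lights. It then solves for the red and green intensities that give the L and M cones exactly the same stimulation as the yellow light alone.

```python
import math

# Assumed cone model: (peak nm, width nm). Widths are guesses.
CONES = {"S": (450.0, 25.0), "M": (550.0, 40.0), "L": (570.0, 45.0)}
RED, GREEN, YELLOW = 650.0, 540.0, 580.0  # hypothetical single wavelengths


def sensitivity(cone, wavelength_nm):
    peak, width = CONES[cone]
    return math.exp(-0.5 * ((wavelength_nm - peak) / width) ** 2)


def response(cone, spectrum):
    """Total stimulation of one cone type by a {wavelength: intensity} spectrum."""
    return sum(i * sensitivity(cone, wl) for wl, i in spectrum.items())


# Solve r*red + g*green = yellow for the L and M cones (2x2 Cramer's rule).
a, b = sensitivity("L", RED), sensitivity("L", GREEN)
c, d = sensitivity("M", RED), sensitivity("M", GREEN)
target_l, target_m = sensitivity("L", YELLOW), sensitivity("M", YELLOW)
det = a * d - b * c
r = (target_l * d - b * target_m) / det
g = (a * target_m - c * target_l) / det

mixture = {RED: r, GREEN: g}
pure_yellow = {YELLOW: 1.0}
for cone in ("L", "M", "S"):
    print(cone, round(response(cone, mixture), 4),
          round(response(cone, pure_yellow), 4))
```

In this model the two spectra share no wavelengths at all, yet L and M are stimulated identically and S is barely stimulated by either, so the opponent values derived from them would be all but indistinguishable.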

The relevance to colour spaces and digital representation

Representing a colour as only three values is as incomplete as human perception itself – but that’s all that’s required. While a spectrometer or even a tetrachromatic bird might observe a significant difference between the light from an object and the light from the RGB filters of an LCD showing its image, a human eye does not. The significance of the red, green and blue colours chosen is simply that, in combination, they can stimulate the retina very similarly to the original light itself.
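
Here is a minimal sketch of that idea, under heavy assumptions: bell-shaped cone curves with guessed widths, and a hypothetical display whose three primaries are single wavelengths at 610, 540 and 450 nm (real LCD filters pass much broader bands). Given a broadband "original" light, it solves for the three primary intensities that stimulate the three cone types identically.

```python
import math

# Assumed cone model: (peak nm, width nm). Widths are guesses.
CONES = {"L": (570.0, 45.0), "M": (550.0, 40.0), "S": (450.0, 25.0)}
PRIMARIES = (610.0, 540.0, 450.0)  # hypothetical "R", "G", "B" wavelengths


def sensitivity(cone, wavelength_nm):
    peak, width = CONES[cone]
    return math.exp(-0.5 * ((wavelength_nm - peak) / width) ** 2)


def response(spectrum):
    """[L, M, S] stimulation for a {wavelength: intensity} spectrum."""
    return [sum(i * sensitivity(c, wl) for wl, i in spectrum.items())
            for c in CONES]


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


# Original light: equal intensity every 10 nm from 400 to 700 (31 values).
original = {float(wl): 1.0 for wl in range(400, 701, 10)}
target = response(original)

# matrix[i][j] is the response of cone i to one unit of primary j.
columns = [response({p: 1.0}) for p in PRIMARIES]
matrix = [[columns[j][i] for j in range(3)] for i in range(3)]

# Cramer's rule: swap the target in for one column at a time.
base = det3(matrix)
weights = []
for j in range(3):
    swapped = [[target[i] if k == j else matrix[i][k] for k in range(3)]
               for i in range(3)]
    weights.append(det3(swapped) / base)

display = dict(zip(PRIMARIES, weights))
print("primary intensities:", [round(w, 3) for w in weights])
print("original light:", [round(v, 6) for v in target])
print("display light: ", [round(v, 6) for v in response(display)])
```

The original light is described by 31 numbers and the display by three, but as far as these model cones are concerned the two are the same.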

When you take a photo, the sensors in the camera lose a lot of information from the incoming light. But unless you plan to do spectrometric analysis on the image, what is preserved is all that will be required as long as the end goal is the human visual system.

The only significance of three-valued colour spaces is that three values are the minimum required to represent the range of ways the three types of cone cell in the retina can be stimulated. CIELAB is closely analogous to the retina’s own encoding, with one lightness value and two opponent axes, but any other sufficiently well-defined system of three values can describe almost any point in what is effectively the three-dimensional colour space of human vision.
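
As a small illustration of that interchangeability, Python’s standard `colorsys` module converts between a few of the models mentioned at the start (it does not cover CIELAB). The example colour below is arbitrary; the point is only that each model describes it with three numbers and that converting back recovers the original.

```python
import colorsys

rgb = (0.2, 0.6, 0.9)  # an arbitrary example colour, channels from 0 to 1

hls = colorsys.rgb_to_hls(*rgb)  # note the order: hue, lightness, saturation
yiq = colorsys.rgb_to_yiq(*rgb)  # luma, in-phase, quadrature

# Converting back should land on the same RGB values (up to rounding).
back_from_hls = colorsys.hls_to_rgb(*hls)
back_from_yiq = colorsys.yiq_to_rgb(*yiq)

print("RGB:", rgb)
print("HLS:", tuple(round(v, 4) for v in hls))
print("YIQ:", tuple(round(v, 4) for v in yiq))
print("back from HLS:", tuple(round(v, 4) for v in back_from_hls))
print("back from YIQ:", tuple(round(v, 4) for v in back_from_yiq))
```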