The wonder of it all: February 2024

I recently came across a beautiful puzzle on the YouTube channel of 3Blue1Brown. The puzzle was entertaingly discussed on Stand-up Maths but I confess that I needed to watch both these videos several times before I truly understood the solution. This post is to help reinforce my own understanding and to help those readers who did understand the puzzle but perhaps struggled with the solution.

For completeness here is a description of the puzzle.

The warden of a prison generously offers freedom to two prisoners, Joe and Donald, if they are able to find a solution to a certain puzzle. The puzzle consists of a room containing a chessboard, every square of which contains a coin, each coin being either heads or tails in some arbitrary pattern. Underneath one of the coins is hidden the key to their gaol. Joe will enter the room and will be shown which chessboard square hides the key. Joe must then flip one coin only and retire from the room. Donald will then enter the room and has to deduce the location of the key, thereby winning freedom for him and Joe.

That is the general format of the puzzle and the two inmates have to devise a strategy whereby Joe can reveal the location to Donald.

The problem, of course, is to work out how Joe can somehow manage to tell Donald the location of the key. They know nothing in advance of the disposition of heads and tails, and no further communication between them is allowed.

The language of binary digits, 0 and 1, is going to be convenient as we consider how to proceed. In this language every possible board configuration is equivalent to a vector of length 64 with entries 0 (Heads) and 1(Tails): a bit string of length 64. We could consider this bit string to be laid out 8 bits at a time according to the 8 rows of the chessboard. We can also consider it to be laid out as a long row of 64 bits. The two layouts are equivalent of course but either of them allows us to number the positions of the bit string with the numbers 0, 1, 2, ..., 63.

Notice that when Joe flips one of the coins the bit representing the coin's value changes from 0 to 1, or 1 to 0, and all other bits stay the same. In other words Joe has altered exactly one position of the length 64 bit string and, in the language of coding theory, the new bit string is distance 1 away from the original bit string in the Hamming metric (which measures the distance between two bit strings by the number of positions where they differ). If two bit strings differ in exactly one position we shall call them "neighbours".

So Joe encounters a bit string and a key position. He changes the bit string into one of the 64 possible other strings (neighbours) at Hamming distance 1. Donald now sees this new bit string and has to infer the key position.

Let's introduce a way of referring to a bit string and a position within it (the key location). We'll imagine 64 different colours that will be used for colouring the bit strings: if a bit string has a certain colour that colour picks out a position in the bit string.

So Joe, because he knows the key position, knows the colour of the bit string.

Now let's imagine something utterly extraordinary. We are going to imagine that Joe and Donald have together made a list of all possible bit strings and have given a colour (one of the 64 agreed colours) to each of them. So "all" that Joe has to do is to change the original bit string into a bit string at Hamming distance 1 which has the colour that reveals the key position. Donald then consults the list of all the bit strings and their colours that he and Joe agreed on, finds the colour of the bit string Joe has made for him, and triumphantly exposes the key.

Two considerations immediately come to the fore. The first is that making this list of the bit strings, with a colour for each, is practically infeasible since there are 2⁶⁴ of them. We'll worry about that later.

The second consideration is that Joe and Donald must have constructed a colouring with a very special property: the colours of the 64 neighbours of each bit string must account for all 64 colours. If there were a bit string b which, for example, had no neighbour coloured red, and Joe was presented with a board which defined the bit string b in which the key was hidden in the red position, he would be unable to change it into a bit string that was coloured red.

The fact that all the 64 neighbours (of every bit string) use up all 64 colours is equivalent to saying that the 64 neighbours all have different colours. This is a very strong property of Joe and Donald's colouring and, at the moment, we cannot be sure that such a colouring even exists!

We'll take a little diversion and explore this very point in the more general situation that our bit strings are of some length k, and we are hoping to colour them with k colours so that, for each bit string, its k neighbours are coloured differently (so all k colours occur, each once only). Let's fix on one of the colours Red, say, and let's count all ordered pairs (a, b) where A and B are neighbouring bit strings of length k where bit string A is coloured red. Let's give a name, r, to the number of bit strings coloured Red.

Now, there are r ways in which we might choose the first component a; and for each such choice there are k choices for the neighbouring second component b: rk pairs in all. But we can count these pairs another way. We could choose b first (in 2^k different ways); but, for a given b there is exactly one red neighbour a: 2^k pairs in all.

That means that rk = 2^k and that means that k must be a power of two. It also means that every colour occurs 2^k/k times.

We can return from this diversion somewhat emboldened. We have discovered that colourings of the type we are after can only exist if k is a power of two. In our case k=64 which is a power 2. However, we still are not guaranteed that a suitable colouring exists and now we must address this question (and also solve the consideration of infeasibility that we deferred our anxiety about previously).

What we need is a rule for colouring bit strings that is simple enough that both Joe and Donald can apply it, possibly with a small amount of calculation. Such a rule will obviate the need for a memorised table that gives a colour to each of the 2⁶⁴ bit strings. And, of course, this rule must, when applied to the 64 neighbours of each bit string, deliver the full set of all 64 colours.

Joe uses this rule to select that neighbour of the bit string representing the original board which has the colour that identifies the board position hiding the key. Donald uses the rule to compute the colour from the bit string that Joe leaves for him.

So now, all we need is the rule! To introduce the rule I shall take a greatly simplified version of the puzzle: the case k=4, there are 16 bit strings are of length 4, and so 4 colours are used. Now, right at the start, when I introduced bit strings for board positions I de-emphasised that they arose from an 8 by 8 board. It is convenient to return to the square layout, rather than vector layout. When k=4 we have a 2 by 2 board. Here are the 16 possible boards:

I am going to colour these boards with red, blue, green, brown (and looking ahead these colours will be encoded as 00, 01, 10, and 11). Here is the colouring

Each 2 by 2 board is given a colour according to the codes of the colours, and according to the parities of the initial row and initial column of the board. For example the board in the top right corner of the layout above has initial row 1 1 (parity 0) and initial column 1 0 (parity 1); so its colour code is 01 which is blue.

This colouring rule does ensure that, for each 2 by 2 board, the four boards at distance 1 from it have different colours. In other words, for every 2 by 2 board, we can, by altering exactly one position, alter the parities of its initial row and column, to any of the four parity pairs:- changing the parity of the bottom right bit does not alter the pair of parities, changing the parity of the top right bit changes only the parity of the initial row, changing the parity of the bottom left bit changes only the parity of the initial column, and changing the parity of the top left bit changes both the parities of the initial row and initial column.

This solves the k=4 problem. But it also points the way to how we might generalise. Firstly, the binary string approach looked effective (both for representing boards and for representing colours). Furthermore, parity was playing an important role. When we computed the parity of a binary string we summed its individual bits according to the rules 0+0=0, 0+1=1, 1+0=1, and 1+1=0. This unusual interpretation of addition is either considered to be "addition modulo 2" (if you are a mathematician) or "exclusive or" (if you are a computer scientist). At any rate this is what "+" will mean from now.

The new addition applies also to bit strings: we simply add up the corresponding components of the strings: thus 0110+1010 = 1100. It's easy to see that, when we add two bit strings, their parities add up in the same way: in other words, the parity of the sum is equal to the sum of the parities.

But now let's go back to our colour encoding rule when k=4. Because the colour of a 4-bit string is defined by the parities of two 2-bit strings we can deduce that, for every pair of 4-bit strings a and b, colour (a+b) = colour (a) + colour(b). In mathematical language: the function that maps 4-bit strings to their 2-bit colouring is linear.

This gives us the clue to how we can define a colouring of k-bit strings (boards) for every possible k (recall from our discussion above that k itself has to be a power of 2, k=2^t say, and colours will be t-bit strings).

Our colouring is going to be linear. That means that we only need to define the colour of the k special binary strings which have a single binary digit 1, and all other digits zero (we'll call these special strings the basic strings - mathematicians will know why). Then, we can extend the colouring to every k-bit string using the linearity. For example, when k=4, colour(1101) = colour(1000) + colour (0100) + colour (0001).

The k-neighbours of the binary string of all 0s are just the k basic strings, and we must therefore colour these with the k different colours. (Remember we are looking for colourings where the k neighbours of any k-bit strings are given different colours.) It follows from this that the k-neighbours of every k-bit string a will have different colours. This is because the neighbours of a are obtained by adding each basic string b to a and colour(a+b) = colour(a) + colour(b).

We shall number the components of each vector as 0, 1, ..., k-1 (rather than 1, 2, ..., k for a convenience that will emerge immediately). The k different special strings will be called e₀, e₁, ..., e_k-1 and we shall denote the colour of special string e_i by the t-bit value of i (this is the reason for numbering components from 0 rather than from 1).

Now let's start putting all this notation together to solve the puzzle and we shall demonstrate the solution procedure using an example with k=16 (so t=4). The example "board" we shall use is

b = (0 0 1 1, 1 0 1 0, 1 1 0 1, 0 0 0 1)

(for typographical reasons this is written as 16-vector, in groups of 4 for readability, but equally we could have represented it as a 4 by 4 array).

We'll assume the first prisoner, Joe, is shown that the key is hidden under position 13. Then we have

b = e₂ + e₃ + e₄ + e₆ + e₈ + e₉ + e₁₁ + e₁₅.

Therefore

colour(b) = colour(e₂) + colour(e₃) + colour(e₄) + colour(e₆) + colour(e₈)

+ colour( e₉) + colour(e₁₁) + colour(e₁₅)

= 0010 + 0011 + 0100 +0110 + 1000 + 1001 + 1011 + 1111

= 0110

In the language we have developed, Joe's task is to find one of the special vectors c_u and change b into d = b + c_u so that colour (d) = 1101 (the binary value of 13, the key position). Then Donald simply has to compute colour(d) to locate the key.

So Joe requires that

1101 = colour(d) = colour(b + c_u) = colour(b) + colour(c_u) = 0110 + colour(c_u).

Therefore colour(c_u) = 1101 - 0110 = 1011.

But this means that u=11, the value of the binary string 1011, and that is the board position that Joe changes.

We can verify this really is the right position by computing (as Donald will do) the colour of the new board d = (0 0 1 1, 1 0 1 0, 1 1 0 0, 0 0 0 1). This is simply

colour(e₂) + colour(e₃) + colour(e₄) + colour(e₆) + colour(e₈)

+ colour( e₉) + colour(e₁₅)

which is 0010 + 0011 + 0100 +0110 + 1000 + 1001 + 1111 = 1101, and this is the binary value of 13 as expected.

I hope it is clear that this example generalises. Therefore the general solution is as follows:

Joe computes the colour of the given board b as a t-bit binary number x
Joe converts the position of the key as a t-bit binary number y
Joe calculates z = x + y and flips the bit at position z of the board, giving a new board d
Donald computes the colour of the board d. This is a t-bit binary number w that identifies the position of the key.

In principle the problem is now complete, not only for the original 8 by 8 chessboard but for all boards with 2^k squares (rectangular boards included). However it is worth returning to the original 8 by 8 case to present the solution in a possibly simpler way to actually carry out. The solution involves some binary conversions and these cannot be finessed. The more complicated parts of the solutions are the two colour computations in steps 1 and 4.

We'll revert to the usual chessboard 8 by 8 layout with squares numbered 0 to 63 as shown:

which, with the positions represented by their 6-bit binary strings, is

and consider the computation of colour(b) for some board b = (b₀, b₁, ..., b₆₃). We are calculating a 6-bit binary string:

colour(b) = colour(b₀e₀ + b₁e₁ + ... +b₆₃e₆₃)

= b₀colour(e₀) + b₁colour(e₁) + ... +b₆₃colour(e₆₃)

and on the right-hand side of this equation we have summands b_icolour(e_i). However colour(e_i) is the 6 bit binary string whose value is i and we add these terms component by component.

Let's focus on the first component of these 6-bit strings. The only way in which one of the summands b_icolour(e_i) can contribute to the first component of the result is if b_i=1 and colour(e_i) has first component 1. But the latter condition says that colour(e_i) is a binary number between 32 and 63, that is, i is in the second half of the board. Thus to get the first component of the sum we are adding up (so getting its parity) all the bits in the lower half of the board. To summarise: we get the first bit of the sum by concentrating on the lower half of the board and computing its parity.

How about the second component of the sum? Here what matters is the set of all 6 bit binary numbers which have 1 as second component. These are the numbers 16 to 31 and 48 to 63: a region of the board represented by rows 2, 3, 6, 7 (numbering rows from 0) which look like two thick horizontal stripes. The parity of this region gives the second bit of the sum.

For the third component the relevant region is rows 1, 3, 5, 7 (again numbering from 0).

For the fourth component the region is the entire right half of the board, for the fifth it is columns 2, 3, 6, 7 (numbering columns from 0, and for the last component it is columns 1, 3, 5, 7.

To confirm that these are indeed the regions stated, refer to the array of binary strings above (or argue from first principles).

For convenience the 6 partitions of the board into regions are shown in the diagram below (where I have shaded the region whose parity must be computed).

So, if Joe and Donald want to demonstrate the original escape method, each of them must memorise the 6 diagrams above. When Joe sees the board he computes the 6 bits of its colour by considering each of the diagrams one by one. For each one he computes the parity of the shaded region obtaining a 6 bit colour. He adds the colour to the key position and determines which bit to flip. Donald does the same with the new board and its colour gives him the key position.

The wonder of it all

Monday 19 February 2024

The Impossible Chessboard Puzzle