The wonder of it all

Monday, 15 July 2024

Problems with ChatGPT

I've been an occasional user of ChatGPT for about a year. It occasionally is truly impressive but today I had an interaction with it that, in many ways was both weirdly human and stubbornly wrong. I had been investigating a geometrical question:

What is the condition that the midpoints of a quadrilateral form a square?

I began by telling ChatGPT that I knew that the midpoints of a quadrilateral always formed a parallelogram. It agreed and gave me the proof that requires two applications of the midpoint theorem. Then came this dialogue

Me: For what type of quadrilateral is this parallelogram a square?

ChatGPT: The parallelogram formed by the midpoints of the sides of a quadrilateral is a square if and only if the original quadrilateral is a rectangle.

This was followed by several lines of proof which included the claim that the diagonals of a rectangle intersect at right angles. ChatGPT's claim is false and I pointed this out saying that it could not be true for long thin rectangles. So then....

ChatGPT: You are correct, and I apologize for the confusion. Let's revisit the conditions under which the parallelogram formed by the midpoints of a quadrilateral's sides is a square.

Then came...

ChatGPT: For the parallelogram formed by the midpoints of a quadrilateral's sides to be a square, the original quadrilateral must be an isosceles trapezoid.

More dialogue in which I again refuted this claim with long thin isosceles trapezoids to which I got:

ChatGPT: I apologize for the confusion earlier. You are right; neither rectangles nor isosceles trapezoids in general guarantee that the midpoints of their sides will form a square under all circumstances. Let's clarify the exact condition needed for the parallelogram formed by the midpoints to be a square.

ChatGPT: For the parallelogram formed by the midpoints of a quadrilateral to be a square, the quadrilateral must be a rhombus.

Another refutation by me and then ChatGPT returned to its original declaration that the quadrilateral had to be a rectangle. And finally...

ChatGPT: I appreciate your patience and the opportunity to clarify this correctly.

At that point I could not get much change in ChatGPT's replies. Its geometrical reasoning was superficially plausible in that it set out its arguments in a logical progression of steps. I have to confess I would have been hugely impressed had it got its logic correct. But the weirdly human aspect of it was its insistence, despite all the evidence, that it was right all along.

And I guess the moral of this interaction is that, despite its answers being couched in the most plausible of clothes, it was entirely wrong. And so, be careful, perhaps it is wrong lots of times.

Update 9 February 2025:

I returned to ChatGPT 6 months later and asked the same question. This time it was not only correct but actually extended my knowledge.

Friday, 12 July 2024

Napoleon's theorem

I'm an avid follower of Mathologer videos and his latest on the Petr-Douglas-Neumann theorem was well up to his high standards. I hope to write another post on that video but today I am commenting on a special case - Napoleon's theorem - that many others have also discussed.

Seemingly there is no definite connection with Napoleon Bonaparte, or any other Napoleon, but the theorem does seem to have made its debut in the mathematical literature in the early part of the nineteenth century. So, given that Napoleon had many renowned mathematical friends and was keenly interested in science, it is possible that a connection may exist.

In its basic form Napoleon's theorem is:

Let ABC be any triangle. Then the centroids of the 3 equilateral triangles with bases AB, BC, CA erected on the outside of triangle ABC form an equilateral triangle.

There are several proofs and generalisations. The proof that I give below is similar to the one here but handles one of the known generalisations of the theorem.

First some terminology.

We shall deem two triangles to be similar if one can be transformed into the other by a translation, scaling factor, and rotation. In particular this means that the set of interior angles of one triangle is the same as the other and these angles occur in the same order around their triangle. The second condition is not always part of the definition of similarity.

If we think of our similar triangles as lying in the plane where every point is a complex number then we have the following necessary and sufficient condition that the triangle with vertices (z₁, z₂, z₃) is similar to the triangle with vertices (w₁, w₂, w₃):

This (elementary) result is proved here which refers to G. H. Hardy's classic textbook. It follows immediately from properties of determinants that triangle z₁z₂z₃ is similar to every triangle pz₁+qw₁,pz₂+qw₂,pz₃+qw₃, with p and q not both zero. This extends to three (the case we shall use) or more similar triangles: if we have three similar triangles then any weighted sum of them will be a triangle similar to the summands.

Now we are ready to present the generalisation of Napoleon's theorem. We begin with an arbitrary triangle and erect, on each side, 3 similar triangles as shown below:

The diagram shows the three similar triangles in question as AC'B, CBA', B'AC. The angles and colours are intended to indicate the similarity correspondences. The three black dots are the centroids of the similar triangles.

So we have AC'B ~ CBA' ~ B'AC.

Now we take a weighted sum of these three triangles with weights 1/3, 1/3, 1/3. This gives a triangle with vertices (A+C+B')/3, (C'+B+A)/3, (B+A'+C)/3. From the discussion above this triangle is similar to the three original triangles. But these vertices are also the centroids of the original similar triangles.

I believe this proof is a lot more concise than most of the others and, of course, it proves something more general. One of the interesting aspects of the proof is that the technique of coordinatising the plane was not discovered until Descartes; so the proof could not have been discovered by the classical geometers.

In the case that the 3 similar triangles are equilateral this gives Napoleon's theorem. But different choices for the weights other than 1/3, 1/3, 1/3 give other interesting results. For example with weights 1/2, 0, 1/2 we get a triangle where the vertices are the midpoints of C'B, BA', AC (the midpoints of the green edges); this too is similar to the original similar triangles.

In the case that the triangles are equilateral we get another equilateral triangle.

Thursday, 18 April 2024

Conway's circle theorem

In my previous post on Three Youtube videos I discussed the Mathologer video which states and gives an elegant proof of John Conway's circle theorem. In this post I want to give a more pedestrian proof that requires only simple angle chasing. Some of the video commenters appear to have found similar proofs so my contribution may simply be to present a proof accompanied by diagrams to make it easy to follow along.

For completeness I'll state the theorem. It concerns an arbitrary triangle where, in this diagram, I colour the triangle sides red, green, and blue. The triangle sides are extended by red, green, and blue lines as shown; all red segments are of the same length and, similarly the blue segments and the green segments.

Conway's theorem is that the 6 endpoints lie on a circle and, moreover, the centre of this circle coincides with the in-centre of the original triangle. As shown below.

Let's label the three interior angles of the triangle as p-2a, p-2b, p-2c (p being my typographical simplification for pi). Then, of course,

(p-2a) + (p-2b) + (p-2c) = p

and so

a+b+c = p.

We'll draw a containing hexagon:

You'll see that I've labelled some angles around the hexagon. Their values follow directly from the fact that the diagram has 6 isosceles triangles: their bases are the edges of the hexagon.

In the quadrilateral ABCD there are two opposite angles a and b+c, which sum to p (pi). Thus ABCD is cyclic: A, B, C, D lie on a circle. Thus the unique circle through A, B, C also passes through D. Similarly this is true for the quadrilaterals BCDE and CDEF. Hence all of A, B, C, D, E, F lie on a common circle.

The centre of this circle lies on the perpendicular bisectors of the chords AB, CD, EF. Obviously (isosceles triangles again), each bisector passes through one of the triangle vertices, bisecting the angle there. Therefore these bisectors meet at the centre of the in-circle of the triangle.

Saturday, 13 April 2024

Three Youtube videos

When Youtube was launched in 2005 I didn't expect that an internet service for amateurs to share home videos would be more than a niche interest. I was totally wrong. In less than a year it was attracting 100 million views per day and in December 2006 it was acquired by Google. Nowadays Youtube is one of the first ports of call when searching the internet with videos on topics ranging from "How to fit a mirror to a bicycle" to full-length courses on Moral Philosophy.

In this post I want to discuss three videos I enjoyed today (see how my conversion is complete!) which are loosely linked in that they challenge mainstream societal ideas.

The first of these is a 10 minute presentation by Zach and Kelly Weinersmith on the feasibility of creating settlements on other worlds. I came across this video by accident (a common Youtube occurrence) because I follow Zach's clever daily cartoon "Saturday Morning Breakfast Cereal".

The Weinersmiths have just published a book "A City on Mars". Apparently they began the project with an optimism that one might summarize with "Yes, there are technical problems, but human beings are good at solving such problems. Let's look at what has to be done." However, in their video today they admitted to a complete volte-face. As a result of their research for the book they are now extremely pessimistic about whether these problems will be solved any time soon. Their conclusions are argued at greater detail in their book but this video is a punchy summary of what they found.

I'm pretty sure that scientists working on how human beings can live on Mars or the Moon are well aware of all the difficulties. However, outside that community of experts, I think people generally would presume that our aspirations to colonise other worlds are not just pie in the sky (pun intended). Who can blame them when our entertainment culture routinely serves up movies like "Don't Look Up".

So, all credit to Zach and Kelly for stepping up to fill a hole in the public understanding of science.

My second Youtube video is the latest in the excellent series of mathematics videos by the Mathologer Burkard Polster. It is very hard to make watchable mathematics videos because people's math backgrounds vary from the all too common "I've never got math. It's just not for me" through "I wish I'd paid more attention at high school and could appreciate some of the more important aspects" to "I love math but still find it hard". Polster's videos span a range within the second and third of these categories. He has an expositional style that is clear, conveys genuine enthusiasm without distracting witticisms, and uses computer animations brilliantly.

Today's video, entitled "Conway's Iris Problem", starts from an arbitrary triangle, constructs 6 additional points, and proves that these points lie on a circle. Simple hypotheses, a simple construction, and an easy to understand conclusion. What is not so simple is the reasoning towards the conclusion. We are conducted through two slick proofs supported by the usual helpful animations.

Mathematicians often use phrases like "This is a beautiful proof" which often mystifies a non-mathematician. The great challenge for a math educator is to awaken a sense of wonder in a student when understanding a clever proof. I don't know a better metaphor than "beauty" but I know it when I see it.

Anyway this demonstration is beautiful. Polster knows it and does a great job at helping viewers experience that pleasure.

Nowhere in the video (or in any of the Mathologer videos) does Polster say that what he is describing is useful or economically worthwhile. He is concerned solely with getting across the pleasure of understanding the reason a mathematical statement might be true. When I was a young student myself this was generally the spirit in which mathematics was taught. Things are different today. It has become commonplace for teachers to motivate mathematics by appealing to its utility in the real world. As a result it is widely believed in our society that mathematics is valuable because it is such an essential tool in engineering and science.

Of course we must not lose sight of this great utility of mathematics. But if that is the only reason we study it then something has very definitely been lost. For that reason I loved this video which celebrates the purity in Pure Mathematics.

The final video I'd like to discuss is by the physicist and philosopher Sabine Hossenfelder. I had seen some of her older videos and, while I admired her clarity of thought and precision of language, I had always wondered whether some of her negative comments about the scientific establishment might stem from career frustration. However her opening sentence in her latest video, in which she indicated sadly that her career as a physicist had transitioned into a Youtube poster, was so refreshingly candid that I was hooked immediately.

Hossenfelder gives an account of her education and career. She was a star student who thoroughly enjoyed her time as an undergraduate and it seemed to her that her intellectual life of a physicist would mirror the lives of physicists who had inspired her to begin with. However, despite having grades that a few years earlier would have earned her an academic position, she found it quite hard to make progress and embarked on a punishing life of going from one short-term research contract to another. For young scientists trained in the 1990s and later this is quite a common problem. This treadmill continued essentially till the pandemic arrived whereupon she changed her career and became a Youtuber.

During this time she came to realise that the model of academia was rather different to that of the great generation of physicists in the 1920s and 1930s and she bitterly summarizes her conclusions. A modern university department depends on research funding to survive. Permanent senior staff have to continually apply for research grants. These grants not only fund the planned research project but also come with an "overhead" component that pays for short-term researchers and contributes to the salaries of permanent university staff. Proposals to funding bodies have to be for projects that can be completed in a three to five year period and must not be too risky that they will frighten the funding body. A university department, if it is to flourish, must insist that its staff diligently seek out research funding - and this often influences how research proposals are formulated.

I was a lecturer and professor in four universities from 1970 to 2012 and watched this transition with increasing unease. I completely agree with Hossenfelder's analysis.

It seems that I am not the only one who thinks Hossenfelder's criticisms are valid. In the week since her video appeared it has attracted nearly thirty thousand (!) comments. I have been through some hundreds of these and have not come across any commenter who disagrees with her - in fact the vast majority of them echo her opinions with stories of their own.

This is a video which challenges the way universities (particularly science departments) operate. Hossenfelder has done an excellent job in bringing this to the attention of a large number of critics. It will not win her friends from the establishment but she can take comfort from the fact she has far more readers than even the most renowned senior researchers.

Saturday, 23 March 2024

The James Grimes card trick

I recently came across a Numberphile video posted by James Grimes which I found quite diverting. James described a numerical conjuring trick for which it was not entirely obvious why it worked. His trick (minus all the conjuring patter) goes as follows:

The performer presents 10 playing cards with face values 1, 2, ..., 10 to the spectator and invites them to divide the cards into two arbitrary subsets A and B of 5 cards each.
He then gets them to arrange A into an increasing sequence; and then to arrange B into a decreasing sequence. The second sequence is placed below the first.
Then he gets them to write down the 5 differences of corresponding terms in the two sequences (here "difference" means absolute value of the difference).
Finally he gets them to add these 5 numbers together.

Surprise, surprise: the answer is always 25.

James eventually gave the reason for this surprise. Writing the increasing and decreasing sequences as a₁ a₂ a₃ a₄ a₅, and b₁ b₂ b₃ b₄ b₅ it turns out that, for every pair a_i b_i, (whose difference is to be calculated) one of a_i and b_i is in the range 1 to 5, and the other is in the range 6 to 10.

To see this write L (low) for any of the values from 1 to 5 and H (high) for any of the values from 6 to 10. Suppose the first sequence has r low numbers and s=5-r high numbers so that it looks like L^rH^s. Then the second sequence has 5-r=s low numbers and 5-s=r high numbers so that it looks like H^rL^s. So each pair of positions must have an L and an H.

From this it follows that each difference has the form H-L and so their sum is the sum of the H's minus the sum of the L's. This comes to 25.

I have laboured this little argument for two reasons. The first is that I found it hard to find it myself and was embarrassed by its simplicity. The second is that, from it, one can see a much more general result.

Plainly the same type of argument will apply if one starts with the first 2n integers rather than 10. In this case it turns out that the sum of differences is n²which is rather neat. Also it applies if one starts with any collection of 2n numbers (and I said "collection" rather than "set" because repeats would be allowed). In that case the sum of differences is always the sum of the H's minus the sum of the L's whatever that turns out to be.

Both these observations were made by James himself. But even more is true. All that matters is that the two sequences are of the form L^rH^s and H^rL^s. This is a much weaker condition than that the first sequence is increasing and the second is decreasing. Even a weaker a condition can be required: so long as matching pairs comprise both an L and an H this will be enough.

In his video James asked for variations on his original trick. Here is one that exploits the more general case.

The performer presents a set of cards face down to one spectator and a second set to another spectator. Each spectator draws a card from his set and the two cards are given to the performer who places them also face down to form the first cards of a pair of rows. This is repeated until all the cards have been drawn and placed in the rows thereby producing two rows of cards. The cards are then turned over and the corresponding differences are calculated and summed. The performer predicts the sum.

Before performing the trick the performer has arranged that one set of cards contains low (L) cards and the other high (H) cards. The spectators will not notice this because the performer places each pair of cards randomly in either the first row or the second row; therefore when the cards are turned over all evidence of lowness and highness will have disappeared.

Thursday, 7 March 2024

Sums of Cubes

Monday, 19 February 2024

The Impossible Chessboard Puzzle

I recently came across a beautiful puzzle on the YouTube channel of 3Blue1Brown. The puzzle was entertaingly discussed on Stand-up Maths but I confess that I needed to watch both these videos several times before I truly understood the solution. This post is to help reinforce my own understanding and to help those readers who did understand the puzzle but perhaps struggled with the solution.

For completeness here is a description of the puzzle.

The warden of a prison generously offers freedom to two prisoners, Joe and Donald, if they are able to find a solution to a certain puzzle. The puzzle consists of a room containing a chessboard, every square of which contains a coin, each coin being either heads or tails in some arbitrary pattern. Underneath one of the coins is hidden the key to their gaol. Joe will enter the room and will be shown which chessboard square hides the key. Joe must then flip one coin only and retire from the room. Donald will then enter the room and has to deduce the location of the key, thereby winning freedom for him and Joe.

That is the general format of the puzzle and the two inmates have to devise a strategy whereby Joe can reveal the location to Donald.

The problem, of course, is to work out how Joe can somehow manage to tell Donald the location of the key. They know nothing in advance of the disposition of heads and tails, and no further communication between them is allowed.

The language of binary digits, 0 and 1, is going to be convenient as we consider how to proceed. In this language every possible board configuration is equivalent to a vector of length 64 with entries 0 (Heads) and 1(Tails): a bit string of length 64. We could consider this bit string to be laid out 8 bits at a time according to the 8 rows of the chessboard. We can also consider it to be laid out as a long row of 64 bits. The two layouts are equivalent of course but either of them allows us to number the positions of the bit string with the numbers 0, 1, 2, ..., 63.

Notice that when Joe flips one of the coins the bit representing the coin's value changes from 0 to 1, or 1 to 0, and all other bits stay the same. In other words Joe has altered exactly one position of the length 64 bit string and, in the language of coding theory, the new bit string is distance 1 away from the original bit string in the Hamming metric (which measures the distance between two bit strings by the number of positions where they differ). If two bit strings differ in exactly one position we shall call them "neighbours".

So Joe encounters a bit string and a key position. He changes the bit string into one of the 64 possible other strings (neighbours) at Hamming distance 1. Donald now sees this new bit string and has to infer the key position.

Let's introduce a way of referring to a bit string and a position within it (the key location). We'll imagine 64 different colours that will be used for colouring the bit strings: if a bit string has a certain colour that colour picks out a position in the bit string.

So Joe, because he knows the key position, knows the colour of the bit string.

Now let's imagine something utterly extraordinary. We are going to imagine that Joe and Donald have together made a list of all possible bit strings and have given a colour (one of the 64 agreed colours) to each of them. So "all" that Joe has to do is to change the original bit string into a bit string at Hamming distance 1 which has the colour that reveals the key position. Donald then consults the list of all the bit strings and their colours that he and Joe agreed on, finds the colour of the bit string Joe has made for him, and triumphantly exposes the key.

Two considerations immediately come to the fore. The first is that making this list of the bit strings, with a colour for each, is practically infeasible since there are 2⁶⁴ of them. We'll worry about that later.

The second consideration is that Joe and Donald must have constructed a colouring with a very special property: the colours of the 64 neighbours of each bit string must account for all 64 colours. If there were a bit string b which, for example, had no neighbour coloured red, and Joe was presented with a board which defined the bit string b in which the key was hidden in the red position, he would be unable to change it into a bit string that was coloured red.

The fact that all the 64 neighbours (of every bit string) use up all 64 colours is equivalent to saying that the 64 neighbours all have different colours. This is a very strong property of Joe and Donald's colouring and, at the moment, we cannot be sure that such a colouring even exists!

We'll take a little diversion and explore this very point in the more general situation that our bit strings are of some length k, and we are hoping to colour them with k colours so that, for each bit string, its k neighbours are coloured differently (so all k colours occur, each once only). Let's fix on one of the colours Red, say, and let's count all ordered pairs (a, b) where a and b are neighbouring bit strings of length k where bit string a is coloured red. Let's give a name, r, to the number of bit strings coloured Red.

Now, there are r ways in which we might choose the first component a; and for each such choice there are k choices for the neighbouring second component b: rk pairs in all. But we can count these pairs another way. We could choose b first (in 2^k different ways); but, for a given b there is exactly one red neighbour a: 2^k pairs in all.

That means that rk = 2^k and that means that k must be a power of two. It also means that every colour occurs 2^k/k times.

We can return from this diversion somewhat emboldened. We have discovered that colourings of the type we are after can only exist if k is a power of two. In our case k=64 which is a power 2. However, we still are not guaranteed that a suitable colouring exists and now we must address this question (and also solve the consideration of infeasibility that we deferred our anxiety about previously).

What we need is a rule for colouring bit strings that is simple enough that both Joe and Donald can apply it, possibly with a small amount of calculation. Such a rule will obviate the need for a memorised table that gives a colour to each of the 2⁶⁴ bit strings. And, of course, this rule must, when applied to the 64 neighbours of each bit string, deliver the full set of all 64 colours.

Joe uses this rule to select that neighbour of the bit string representing the original board which has the colour that identifies the board position hiding the key. Donald uses the rule to compute the colour from the bit string that Joe leaves for him.

So now, all we need is the rule! To introduce the rule I shall take a greatly simplified version of the puzzle: the case k=4, there are 16 bit strings are of length 4, and so 4 colours are used. Now, right at the start, when I introduced bit strings for board positions I de-emphasised that they arose from an 8 by 8 board. It is convenient to return to the square layout, rather than vector layout. When k=4 we have a 2 by 2 board. Here are the 16 possible boards:

I am going to colour these boards with red, blue, green, brown (and looking ahead these colours will be encoded as 00, 01, 10, and 11). Here is the colouring

Each 2 by 2 board is given a colour according to the codes of the colours, and according to the parities of the initial row and initial column of the board. For example the board in the top right corner of the layout above has initial row 1 1 (parity 0) and initial column 1 0 (parity 1); so its colour code is 01 which is blue.

This colouring rule does ensure that, for each 2 by 2 board, the four boards at distance 1 from it have different colours. In other words, for every 2 by 2 board, we can, by altering exactly one position, alter the parities of its initial row and column, to any of the four parity pairs:- changing the parity of the bottom right bit does not alter the pair of parities, changing the parity of the top right bit changes only the parity of the initial row, changing the parity of the bottom left bit changes only the parity of the initial column, and changing the parity of the top left bit changes both the parities of the initial row and initial column.

This solves the k=4 problem. But it also points the way to how we might generalise. Firstly, the binary string approach looked effective (both for representing boards and for representing colours). Furthermore, parity was playing an important role. When we computed the parity of a binary string we summed its individual bits according to the rules 0+0=0, 0+1=1, 1+0=1, and 1+1=0. This unusual interpretation of addition is either considered to be "addition modulo 2" (if you are a mathematician) or "exclusive or" (if you are a computer scientist). At any rate this is what "+" will mean from now.

The new addition applies also to bit strings: we simply add up the corresponding components of the strings: thus 0110+1010 = 1100. It's easy to see that, when we add two bit strings, their parities add up in the same way: in other words, the parity of the sum is equal to the sum of the parities.

But now let's go back to our colour encoding rule when k=4. Because the colour of a 4-bit string is defined by the parities of two 2-bit strings we can deduce that, for every pair of 4-bit strings a and b, colour (a+b) = colour (a) + colour(b). In mathematical language: the function that maps 4-bit strings to their 2-bit colouring is linear.

This gives us the clue to how we can define a colouring of k-bit strings (boards) for every possible k (recall from our discussion above that k itself has to be a power of 2, k=2^t say, and colours will be t-bit strings).

Our colouring is going to be linear. That means that we only need to define the colour of the k special binary strings which have a single binary digit 1, and all other digits zero (we'll call these special strings the basic strings - mathematicians will know why). Then, we can extend the colouring to every k-bit string using the linearity. For example, when k=4, colour(1101) = colour(1000) + colour (0100) + colour (0001).

The k-neighbours of the binary string of all 0s are just the k basic strings, and we must therefore colour these with the k different colours. (Remember we are looking for colourings where the k neighbours of any k-bit strings are given different colours.) It follows from this that the k-neighbours of every k-bit string a will have different colours. This is because the neighbours of a are obtained by adding each basic string b to a and colour(a+b) = colour(a) + colour(b).

We shall number the components of each vector as 0, 1, ..., k-1 (rather than 1, 2, ..., k for a convenience that will emerge immediately). The k different special strings will be called e₀, e₁, ..., e_k-1 and we shall denote the colour of special string e_i by the t-bit value of i (this is the reason for numbering components from 0 rather than from 1).

Now let's start putting all this notation together to solve the puzzle and we shall demonstrate the solution procedure using an example with k=16 (so t=4). The example "board" we shall use is

b = (0 0 1 1, 1 0 1 0, 1 1 0 1, 0 0 0 1)

(for typographical reasons this is written as a 16-vector, in groups of 4 for readability, but equally we could have represented it as a 4 by 4 array).

We'll assume the first prisoner, Joe, is shown that the key is hidden under position 13. Then we have

b = e₂ + e₃ + e₄ + e₆ + e₈ + e₉ + e₁₁ + e₁₅.

Therefore

colour(b) = colour(e₂) + colour(e₃) + colour(e₄) + colour(e₆) + colour(e₈)

+ colour( e₉) + colour(e₁₁) + colour(e₁₅)

= 0010 + 0011 + 0100 +0110 + 1000 + 1001 + 1011 + 1111

= 0110

In the language we have developed, Joe's task is to find one of the special vectors c_u and change b into d = b + c_u so that colour (d) = 1101 (the binary value of 13, the key position). Then Donald simply has to compute colour(d) to locate the key.

So Joe requires that

1101 = colour(d) = colour(b + c_u) = colour(b) + colour(c_u) = 0110 + colour(c_u).

Therefore colour(c_u) = 1101 - 0110 = 1011.

But this means that u=11, the value of the binary string 1011, and that is the board position that Joe changes.

We can verify this really is the right position by computing (as Donald will do) the colour of the new board d = (0 0 1 1, 1 0 1 0, 1 1 0 0, 0 0 0 1). This is simply

colour(e₂) + colour(e₃) + colour(e₄) + colour(e₆) + colour(e₈)

+ colour( e₉) + colour(e₁₅)

which is 0010 + 0011 + 0100 +0110 + 1000 + 1001 + 1111 = 1101, and this is the binary value of 13 as expected.

I hope it is clear that this example generalises. Therefore the general solution is as follows:

Joe computes the colour of the given board b as a t-bit binary number x
Joe converts the position of the key as a t-bit binary number y
Joe calculates z = x + y and flips the bit at position z of the board, giving a new board d
Donald computes the colour of the board d. This is a t-bit binary number w that identifies the position of the key.

In principle the problem is now complete, not only for the original 8 by 8 chessboard but for all boards with 2^k squares (rectangular boards included). However it is worth returning to the original 8 by 8 case to present the solution in a possibly simpler way to actually carry out. The solution involves some binary conversions and these cannot be finessed. The more complicated parts of the solutions are the two colour computations in steps 1 and 4.

We'll revert to the usual chessboard 8 by 8 layout with squares numbered 0 to 63 as shown:

which, with the positions represented by their 6-bit binary strings, is

and consider the computation of colour(b) for some board b = (b₀, b₁, ..., b₆₃). We are calculating a 6-bit binary string:

colour(b) = colour(b₀e₀ + b₁e₁ + ... +b₆₃e₆₃)

= b₀colour(e₀) + b₁colour(e₁) + ... +b₆₃colour(e₆₃)

and on the right-hand side of this equation we have summands b_icolour(e_i). However colour(e_i) is the 6 bit binary string whose value is i and we add these terms component by component.

Let's focus on the first component of these 6-bit strings. The only way in which one of the summands b_icolour(e_i) can contribute to the first component of the result is if b_i=1 and colour(e_i) has first component 1. But the latter condition says that colour(e_i) is a binary number between 32 and 63, that is, i is in the second half of the board. Thus to get the first component of the sum we are adding up (so getting its parity) all the bits in the lower half of the board. To summarise: we get the first bit of the sum by concentrating on the lower half of the board and computing its parity.

How about the second component of the sum? Here what matters is the set of all 6 bit binary numbers which have 1 as second component. These are the numbers 16 to 31 and 48 to 63: a region of the board represented by rows 2, 3, 6, 7 (numbering rows from 0) which look like two thick horizontal stripes. The parity of this region gives the second bit of the sum.

For the third component the relevant region is rows 1, 3, 5, 7 (again numbering from 0).

For the fourth component the region is the entire right half of the board, for the fifth it is columns 2, 3, 6, 7 (numbering columns from 0, and for the last component it is columns 1, 3, 5, 7.

To confirm that these are indeed the regions stated, refer to the array of binary strings above (or argue from first principles).

For convenience the 6 partitions of the board into regions are shown in the diagram below (where I have shaded the region whose parity must be computed).

So, if Joe and Donald want to demonstrate the original escape method, each of them must memorise the 6 diagrams above. When Joe sees the board he computes the 6 bits of its colour by considering each of the diagrams one by one. For each one he computes the parity of the shaded region obtaining a 6 bit colour. He adds the colour to the key position and determines which bit to flip. Donald does the same with the new board and its colour gives him the key position.