12.4. Parity and ISBN-numbers#

A simple way to check if a message has been transmitted correctly is to use a check digit. In the case of a binary code, a check digit can the final bit of a message and can be selected so that the XOR of all the bits is always zero. Alternatively, it is the XOR sum of all the non-check digits. For example, given the two code words below, the check bits would be 1 and 0 respectively. This kind of check digit is also called a parity bit.

1000-11100-0

Strategy 3#

Recall the 108 mathematicians standing in a line trying to guess the color of their own hat from only what the mathematicians behind them said and what they can see in front of them? The final strategy guarantees that all will guess their hat color correctly except for the last mathematician in the line. That mathematician counts up all the black hats and shouts “White!” if there is an even number of black hats and “Black!” if there is an odd number of black hats, then all the others will have the information they need to deduce their hat color.

If the second mathematician hears the one behind her yell “White!” and also can see an even number of black hats, then she can deduce that she is wearing a white hat. If she sees an odd number of black hats, then she knows she herself must be wearing the black hat missing from her field of view, but visible to the mathematician behind her. If, on the other hand, he yelled “Black!” (an odd number of black hats) and she can see an even number of black hats she can be sure she is wearing a black hat. But if she sees an odd number of black hats then she is seeing the same number of black hats that the mathematician behind her saw and therefore must be wearing a white hat.

Unfortunately, this kind of check bit would not detect the most common error made in copying, namely a transposition error, when two numbers are reversed.

A common code that employs a check digit which can detect a transposition error is the ISBN (International Standard Book Number) that appears on the back cover of nearly all published books. It is an evolution of the SBN (Standard Book Number), created by Gordon Foster, a professor at Trinity College in Dublin. This code was initially used by WHSmith, a British book and stationary company, before being adapted and adopted worldwide.

Take for example, this ISBN number from The Code Book, by Simon Singh.

0-385-49532-3

This is an ISBN-10 number. The first nine digits identify the book and the edition. The last digit is a check digit. It is calculated according to the following algorithm: Beginning with the left most number, each digit is multiplied by the integers from 10 to 2. The sum is calculated, and the check digit is the digit which will cause the sum to be equal to zero when reduced by modulo 11. If that “digit” is a 10, then it is represented by an X (the symbol for 10 in Roman numerals).

Thus, for our example above:

10 x 0 + 9 x 3 + 8 x 8 + 7 x 5 + 6 x 4 + 5 x 9 + 4 x 5 + 3 x 3 + 2 x 2 = 228

Since 11 x 21 = 231, 231 is zero when reduced modulo 11. So, our check digit should be three, (228 + 3 = 231) which, indeed it is.

The particular digits of the ISBN number represent particular regions, or areas of publication, particular publishers, and titles. By the early 2000s it was clear that availability of unique numbers would soon be limited. In order to solve this problem, books published after 2007 have an ISBN – 13 number, such as this one from a novel about the hazards of 3D printing:

978-0-525-52037-5

In this case, the check digit is calculated a little differently. The first digit is multiplied by 1 and the second by 3. The third is multiplied by one again and the fourth by three and so on until the 12th. Finally, the check digit is calculated so that the sum will be equal to zero when reduced by modulo 10.

1 x 9 + 3 x 7 + 1 x 8 + 3 x 0 + 1 x 5 + 3 x 2 + 1 x 5 + 3 x 5 + 1 x 2 + 3 x 0 + 1 x 3 + 3 x 7 = 95

Since 95 + 5 = 100 = 0 (mod 10), our check digit is 5.

Of course, using a single check digit like this is useless for correcting errors. Next, we look at a code that can detect as well as correct errors and which is not as costly in terms of extra bits as the Minority Report approach.