12.2. Hamming distance#

Imagine you received the following message: “Meet me at thrre PM on the corner of Club and Ninth Stret.” You would easily be able to figure out that the person who sent this message meant to send “Meet me at three PM on the corner of Club and Ninth Street.” You are unlikely to misunderstand the message because there is no such English word as “thrre” or “Stret.” The list of all English words forms a dictionary and it is not the case that every string of five characters is an English word. Errors can be detected and, in some cases, corrected when the correct word requires only a change of a single character.

a handwritten note that says Meet me at thrre PM on the corner of Club and Ninth Stret.

Now, let’s simplify things a little bit. Let’s take A = {0, 1} to be our alphabet and C is a dictionary of four-character long code words. The Hamming distance is the number of places where two code words differ from each other. Therefore, the code word 1111 and 1111 have a Hamming distance of zero. The words 1110 and 1100 have a Hamming distance of one and the code words 1111 and 0000 have a Hamming distance of four.

The minimum distance of a code C is the smallest Hamming distance between two distinct code words, represented as d(“C”). The maximum distance if a code C is the maximum distance between two distinct code words.

A code can detect, for example, two errors if one can make at most two errors without landing on a different code word. This would be the case if the code had a minimum Hamming distance of three. In that case, a code maker would have to make at least three errors to land on another valid code word. Similarly, a code is k error detecting if at least (k + 1) errors have to be made to land on another valid code word.