Cracking KidRSA

11.2. Cracking KidRSA#

The security of KidRSA relies on that it is difficult for an attacker to determine \(d\) if they know \(e\) and \(n\). For small values of \(e\), \(d\), and \(n\), this is certainly not true since checking all possible values of \(d\) to see if \(1 \equiv e \cdot d \pmod{n}\).

For example, using the keys in the first KidRSA example in the previous section, the Python module timeit() and a single for loop in the function bruteForceKidRSA() can determine through the brute force method that \(d=73\) one hundred times in 0.0006409000000076048 seconds using only the information \(e = 103\) and \(n = 537\) on a non-specialized laptop computer in 2019.

import timeit

def bruteForceKidRSA( e, n):
    for d in range(0, e):
        if (d * e) % n == 1:
            return (d)

print(timeit.timeit( 'bruteForceKidRSA(103, 537)', number=100, setup="from __main__ import bruteForceKidRSA"))
print('d =', bruteForceKidRSA(103, 537))

0.0006409000000076048
d = 73

However, when keys get larger it does take quite a bit longer to brute force the value of \(d\).

public key	size of \(n\)	Seconds to brute force
\(\left( 55601, 3727687 \right)\)	22 bits	`0.0069977`
\(\left( 2279259, 316834949 \right)\)	29 bits	`0.3364858`
\(\left( 46508795, 17347912289 \right)\)	35 bits	`9.2515968`
\(\left( 171534261, 97603308101 \right)\)	37 bits	`30.9215976`
\(\left( 412356815, 312154666949 \right)\)	39 bits	`96.3439985`
\(\left( 1072651357, 1108049904433 \right)\)	41 bits	`239.9175115`
\(\left( 2722214743, 3873713550481 \right)\)	42 bits	`1008.8244588`

We can see from the table that for every additional few bits needed to represent the modulus \(n\), it takes much longer to brute force an inverse key. It wouldn’t take much to create a public key with very large integer values that would take your computer days, weeks, months, or years to brute force. It seems like a large key should be sufficient for this system to be secure (enough) for use with information that is time-sensitive. It wouldn’t matter if someone was able to find out your encrypted credit card information 10 years from now, because you’ll have already obtained a new card number by then!

However, while this system seems secure there’s a mathematical method that can determine multiplicative inverses, and thus private key values, very quickly and efficiently. This method is called the Extended Euclidean Algorithm.

The Extended Euclidean Algorithm#

The Extended Euclidean Algorithm is a method that can be used to quickly determine multiplicative inverses of each other. It’s a process that can be done almost entirely by hand, although we’ll see a Python function that can perform it much quicker.

The Algorithm#

Create a table with two rows and two columns. The first row and column should be labeled with the value of \(n\) and the second labeled with the value of \(e\). Fill in the field of the table as follows:

\[\begin{split} \begin{array}{c|cc} & n & e \\ \hline n & 1 & 0 \\ e & 0 & 1 \end{array} \end{split}\]

These rows are meant to represent that \(n = 1(n) + 0(e)\) and \(e = 0(n) + 1(e)\). We’ll use that information to generate other equivalent equations that may be more helpful than this starting information.

Much like when solving systems of linear equations, we can subtract off multiples of the bottom row from the previous row to generate additional rows in the table. We’ll use this step to generate additional rows, such that the values that the multiples of \(n\) and \(e\) sum to decrease. The goal is to continue generating new rows until eventually the last row starts with the value of \(1\). The number found in the second column, the column labeled \(e\), is the multiplicative inverse of \(e\) in modulus \(n\).

Working through the following example indicates why this must be true.

Example#

Let a kidRSA public key be \(\left( 707, 979 \right)\), meaning \(e = 707\) and \(n = 979\). Start by setting up the table.

\[\begin{split} \begin{array}{c|rr} & 979 & 707 \\ \hline 979 & 1 & 0 \\ 707 & 0 & 1 \end{array} \end{split}\]

Which represents that \(979 = 1(979) + 0(707)\) and \(707 = 0(979) + 1(707)\). We can subtract off the second row from the first row to obtain a new row in the table:

\[\begin{split} \begin{array}{c|rr} & 979 & 707 \\ \hline 979 & 1 & 0 \\ 707 & 0 & 1 \\ 272 & 1 & -1 \end{array} \end{split}\]

Which represents \(272 = 1(979) - 1 (707)\). We can then subtract off the third row twice from the second row to obtain a new row:

\[\begin{split} \begin{array}{c|rr} & 979 & 707 \\ \hline 979 & 1 & 0 \\ 707 & 0 & 1 \\ 272 & 1 & -1 \\ 163 & -2 & 3 \end{array} \end{split}\]

Which represents that \(163 = -2(979) + 3(707)\). Continuing to generate additional rows in this manner yields:

\[\begin{split} \begin{array}{c|rr} & 979 & 707 \\ \hline 979 & 1 & 0 \\ 707 & 0 & 1 \\ 272 & 1 & -1 \\ 163 & -2 & 3 \\ 109 & 3 & -4 \\ 54 & -5 & 7 \\ 1 & 13 & -18 \end{array} \end{split}\]

So we now know that \(1 = 13(979) - 18(707)\). We can rearrange this equation to obtain:

\[-18(707) = 1 - 13(979)\]

This implies that \(-18(707)\) is equal to \(1\) plus some multiple of \(979\) (which is our value of \(n\)). Next we can reduce the equation by modulus \(979\), to find an equivalent multiple between \(0\) and \(979\) to obtain the equation:

\[-18(707) \equiv 1 \pmod{979}\]

Which implies that \(-18\) is the multiplicative inverse of \(707\) in modulus \(979\). We typically write our keys as positive integers, so instead of \(-18\) we would write \(961\).

We know that that \(1 \equiv 961 \cdot 707 \pmod{979}\) making the private key to this system \(\left( 961, 979 \right)\).

Using Python#

This algorithm can reasonably be completed by hand, but with large key values it may take several hundred calculations. The mathematics behind the algorithm guarantees that it will never take more than \(2\log_2{n}\) rows to complete the algorithm. For example, even if \(n \approx 10^{200}\), it won’t take more than \(1329\) rows to determine the inverse, which is very reasonable for a computer to calculate quickly. We’ll see how with the Python functions below.

The function egcd() runs the Extended Euclidean Algorithm and returns the last row of the table.

def egcd(a, b):
    if a == 0:
        return (b, 0, 1)
    else:
        g, y, x = egcd(b % a, a)
        return (g, x - (b // a) * y, y)

print( egcd(707,979) )

(1, -18, 13)

The function modinv() takes the output of egcd() and returns only the multiplicative inverse.

def modinv(e, n):
    g, x, y = egcd(e, n)
    if g != 1:
        return ''
    else:
        return x % n

print( modinv( 707, 979) )

This relatively simple looking algorithm is rooted in some very important Number Theory concepts that prove that this method will always work. We won’t cover this theory, but if interested, you should pursue further study in Number Theory.

Efficiency of Extended Euclidean Algorithm#

The hope of this algorithm is to speed up the process of finding private keys from public keys. Let’s see how well this works compared to the brute force method already discussed.

public key	size of \(n\)	Seconds to compute w/ EEA
\(\left( 55601, 3727687 \right)\)	22 bits	`0.000001543`
\(\left( 2279259, 316834949 \right)\)	29 bits	`0.000001562`
\(\left( 46508795, 17347912289 \right)\)	35 bits	`0.000001735`
\(\left( 171534261, 97603308101 \right)\)	37 bits	`0.000001739`
\(\left( 412356815, 312154666949 \right)\)	39 bits	`0.000001745`
\(\left( 1072651357, 1108049904433 \right)\)	41 bits	`0.000001775`
\(\left( 2722214743, 3873713550481 \right)\)	42 bits	`0.000001932`

The Extended Euclidean Algorithm is incredibly efficient at determining private keys, taking fractions of a second regardless of how large the values are in the public key. This one piece of mathematics has made KidRSA very insecure for use. We’ll see that the real RSA algorithm is similar to KidRSA, but does not have a known mathematical algorithm that provides this type of shortcut for determining the private key from a public one.