A hands-on intro to RSA

A simple framework to explain and demonstrate how the most widely used encryption protocol works ... and how to break it!

by Dr. Tomás Silveira Salles

RSA is one of the most common encryption protocols today, and everyone in the world of IT should have at least a basic understanding of it. However, explaining only the mathematics involved makes the experience quite demotivating for a lot of people, so we decided to go for a hands-on approach and wrote a small framework called TextbookRSA that implements all the crucial steps, and that you can read and play around with.

TextbookRSA is written in swift and uses only its base standard library: Foundation. This means you can try it out on any Mac or Linux machine (as long as swift is installed).

As a bonus, we also included a working implementation of how quantum computers can be used to break RSA encryption, to fulfil a promise made in a previous article.

Warning: Not cryptographically secure

Keep out of the reach of your production code!

TextbookRSA was never meant to be safe for real-world applications, but rather to be easy to read and understand. The most obvious weakness is the fact that the public keys in this framework are at most 32 bits long (on 64-bit machines, and at most 16 bits long on 32-bit machines). This is because we didn't want to implement "big-integer" types and arithmetic functions. We see big-integer arithmetics as a separate problem that is independent of the RSA protocol, and it would distract the reader/user from the key concepts that we wanted to convey. Other weaknesses are for example the lack of protection against timing attacks, the lack of a check against weak prime pairs, and using electronic codebook as the block cypher mode.

On the other hand, the framework fulfils its purpose of demonstrating the main concepts, it's very easy to use and the code is quite clean and simple.

A quick dive into the RSA protocol and our framework

Note: If you're already familiar with RSA encryption, you can probably understand the project more quickly by looking at the README file and browsing through the sources.

To begin a message exchange through RSA, the receiver, usually named Bob in textbooks, chooses two prime numbers $p$ and $q$. Their product $N=pq$ is often called the RSA modulo, and is part of the public key, but the two prime numbers are private and Bob never shares them with anyone. In TextbookRSA the interface for this triple of numbers is defined in the protocol RSAKeysProtocol and we have only one implementation, named UIntRSAKeys (which uses UInt as the integer type). In the code, you can see that we use our method UInt.randomPrimeInRange(...) to choose $p$ and $q$, which picks random numbers in a given interval and tests whether they are prime. In real-world applications, where $p$ and $q$ must be hundreds or even thousands of bits long, it is unrealistic to check this with absolute certainty, but there are tests that can quickly tell with high probability whether the chosen numbers are prime or not.

The next step is to choose an exponent $e$ to be used for encryption, and which is the second and last part of the public key. It is important that $e$ not have any common factors with $\left(p-1\right)\left(q-1\right)$, but that is really the only condition. As far as is known, it does not matter if the same exponent is used many times or by many different people. Many implementations of RSA simply hard-code $e=3$, for example, because it is the smallest (and therefore the most efficient) exponent you can choose. An even more common value is $e=65537$, which has some mathematical properties that can be used to make encryption more efficient and is a prime number, which makes it unlikely to share factors with $\left(p-1\right)\left(q-1\right)$. In our library we stick to the original idea and choose $e$ at random. This is done by calling keys.generateEncryptionParameters(), which essentially returns the pair $\left(N,e\right)$.

Since the pair $\left(N,e\right)$ is public, Bob can transmit it to the message sender, usually named Alice in textbooks, through any public channel. To facilitate these transmissions, the relevant types in TextbookRSA implement the protocol Codable, which is a new and extremely helpful Swift 4 feature. This means you can easily convert these types into well known formats such as JSON and back.

Now Alice can encrypt her message by calling Encrypter.encrypt(message, parameters: parms). You will find the implementation of two overloads of this method in EncrypterProtocol+DecrypterProtocol.swift and Encrypter+Decrypter.swift. One of them expects message to be of type Data (so you can encrypt arbitrary files) and the other takes message of type String (which is easier to use for quick tests, for example on a Swift Playground).

The reason why Encrypter is defined in the directory ECB is because RSA really only encrypts individual integers, and only integers smaller than $N$. This would be very boring for a beginner trying to learn something by using our framework, so we added ECB (short for electronic codebook) as a way of chopping longer messages into  small blocks so that these blocks can be individually encrypted via RSA. Again, this is not a good idea in real-world applications, because when you use ECB, equal blocks are transformed into other equal blocks, which reveals patterns in the data. In actual encryption systems RSA would be used either for signing (where it is applied only to a short hash of the message) or to encrypt the private keys for a different protocol (such as AES-GCM), which would then be used to encrypt the actual message.

The RSA part of the encryption process is the transformation of each individual block. A block (i.e. a number) $b$ is transformed into the number ${b}^{e}modN$. This happens in the method UIntRSA.transform(...) and uses the class Math.PowerOracle, which is probably the most interesting piece of mathematics in the project. The numbers involved in this exponentiation can be very large, so we have to worry about two things: Computing the result efficiently and avoiding an overflow of our integer type. To do this, we use the classic approach of breaking the exponent into a sum of powers of 2 and calculating the exponentiation in several steps.

This may seem like a long detour to compute something simple, but in practice it is very efficient. The key point is that after each step we apply the modulo again, so we're always working with small numbers. This way we don't have to worry about overflows, and the process is also faster because small numbers are easier to multiply.

When Bob receives Alice's message, he can create a Decrypter object using his keys and call decrypter.decrypt(encryptedMessage) (if the original message was of type Data) or decrypter.decryptText(encryptedMessage) (if the original message was of type String). Once again, you'll find the code in the ECB directory because the decryption is applied to each block individually and then the blocks are put together to get the original message. The RSA protocol only defines what happens to a single block. Recall that for a block $b$, Alice sends Bob the number $c={b}^{e}modN$. To decrypt, Bob first computes the inverse of $e$ in the $mod\left(p-1\right)\left(q-1\right)$ number system. That means, Bob finds a number $d$ such that $edmod\left(p-1\right)\left(q-1\right)=1$. Then he computes ${c}^{d}modN$, and a well-known mathematical theorem guarantees that this number is exactly the original block $b$.

Finding the number $d$ is easy if you know the primes $p$ and $q$. This is done by calling keys.generateDecryptionParameters(...), which internally uses Euclides's Greatest Common Divisor algorithm (yes, the one you learned in school). But as far as we know it is very very hard if you don't know $p$ and $q$, and this is the assumption RSA is built on.

Breaking RSA with quantum computers

Earlier this year we published an article about quantum computers and mentioned that they can be used to break RSA encryption using Shor's algorithm. We also mentioned that this algorithm does not find the prime factors of the RSA modulo, which left a lot of people confused and waiting for a well-deserved explanation.

To be clear, quantum computers can factor large integers quickly, and Shor's algorithm can be used as part of the process. What we want to show is that Shor's algorithm alone is not a factoring routine, and that you don't have to factor the RSA modulo to decrypt messages.

What is Shor's algorithm for?

The famous quantum computing algorithm written by Peter Shor is neither a factorization nor a decryption algorithm. It computes the period of a specific function. The function is of the form $f\left(x\right)={c}^{x}modN$ where $c$ and $N$ are fixed and $x$ is the variable. As you increment $x$, this function takes on different values for a while, but eventually starts to repeat a pattern. The number of distinct values before the function starts to repeat itself is what is called its period, and is the output of Shor's algorithm. For example, if $c=2$ and $N=5$, the values will be (starting at $x=0$): 1, 2, 4, 3, 1, 2, 4, 3, ... so we can see that the period is 4. When $N$ is large, finding the period is extremely difficult, but a quantum computer can do it.

In our framework, due to the current lack of a quantum computer here at the office, we compute periods by trial and error, also called the brute force method, and therefore we can only do it for very small $N$. The class that takes care of this is the PeriodOracle. We also added an extension to UIntRSAKeys with the static method small(maxPrime:) (obviously not a part of the RSA protocol), which creates RSA keys with small prime numbers so that you can actually try out the PeriodOracle (check out Period/SmallKeys.swift).

How can the period be used for decryption?

Going back to our usual example, we have the original block $b$, the RSA modulo $N$ and the exponent $e$ and from these Alice computed the encrypted block $c={b}^{e}modN$. Notice that $N$, $e$ and $c$ are all public, so an attacker, usually named Eve in textbooks, would know them. If Eve used a quantum computer with Shor's algorithm, she could compute the period $r$ of the function $f\left(x\right)={c}^{x}modN$. Similarly to what Bob did for decryption, she would then compute the inverse of $e$ in the $modr$ number system. In other words, Eve would find a number $d$ such that $edmodr=1$ and then compute ${c}^{d}modN$, which is exactly the original block $b$.

This whole process is implemented by our class PeriodDecrypter, which has the same interface as Decrypter (namely the DecrypterProtocol), except you don't need the private keys to initialize it, but only the RSA modulo instead.

Using the framework

The home page of the project on GitHub gives a more detailed description of the public API, without going into implementation details or the theory of RSA encryption. It also shows how to install the framework using CocoaPods (Mac) or the Swift Package Manager (Mac and Linux). Finally, if you want to start testing right away, the repository also contains a Swift Playground where the entire public API is used.

Want to try it out?