A hands-on intro to RSA
A simple framework to explain and demonstrate how the most widely used encryption protocol works ... and how to break it!
by Dr. Tomás Silveira Salles
9 minute read
RSA is one of the most common encryption protocols today, and everyone in the IT world should have at least a basic understanding of it. However, explaining only the mathematics involved makes the experience quite demotivating for a lot of people, so we decided to go for a hands-on approach and wrote a small framework called TextbookRSA that implements all the crucial steps, and that you can read and play around with.
TextbookRSA is written in swift and uses only its base standard library: Foundation. This means you can try it out on any Mac or Linux machine (as long as swift is installed).
As a bonus, we also included a working implementation of how quantum computers can be used to break RSA encryption, to fulfil a promise made in an article earlier this year.
TextbookRSA was never meant to be safe for real-world applications, but rather to be easy to read and understand. The most obvious weakness is the fact that the public keys in this framework are at most 32 bits long (on 64-bit machines, and at most 16 bits long on 32-bit machines). This is because we didn't want to implement "big-integer" types and arithmetic functions. We see big-integer arithmetics as a separate problem that is independent of the RSA protocol, and it would distract the reader/user from the key concepts that we wanted to convey. Other weaknesses are for example the lack of protection against timing attacks, the lack of a check against weak prime pairs, and using electronic codebook as the block cypher mode.
On the other hand, the framework fulfils its purpose of demonstrating the main concepts, it's very easy to use and the code is quite clean and simple.
Note: If you're already familiar with RSA encryption, you can probably understand the project more quickly by looking at the README file and browsing through the sources.
To begin a message exchange through RSA, the receiver, usually named Bob in textbooks, chooses two prime numbers *p* and *q*. Their product *N = pq* is often called the RSA modulo, and is part of the public key, but the two prime numbers are private and Bob never shares them with anyone. In TextbookRSA the interface for this triple of numbers is defined in the protocol `RSAKeysProtocol` in and we have only one implementation, named `UIntRSAKeys` (which uses `UInt` as the integer type). You can see that we use our method `UInt.randomPrimeInRange(...)` to choose *p* and *q*, which picks random numbers in a given interval and tests whether they are prime. In real-world applications, where *p* and *q* must be hundreds or even thousands of bits long, it is unrealistic to check this with absolute certainty, but there are tests that can quickly tell with high probability whether the chosen numbers are prime or not.
The next step is to choose an exponent *e* to be used for encryption, and which is the second and last part of the public key. It is important that *e* not have any common factors with *(p - 1)(q - 1)*, but that is really the only condition. As far as is known, it does not matter if the same exponent is used many times or by many different people. Many implementations of RSA simply hard-code *e = 3*, for example, because it is the smallest (and therefore the most efficient) exponent you can choose. An even more common value is *e = 65537*, which has some mathematical properties that can be used to make encryption more efficient, and is a prime number, which makes it unlikely to share factors with *(p - 1)(q - 1)*. In our library we stick to the original idea, which is to choose *e* at random. This is done by calling `keys.generateEncryptionParameters()`, which essentially returns the pair *(N, e)*.
Since the pair *(N, e)* is public, it can be sent from the receiver (Bob) to the sender, usually named Alice in textbooks, through any public channel. To facilitate these transmissions, the relevant types in TextbookRSA implement the protocol `Codable`, which is a new and extremely helpful Swift 4 feature. This means you can easily convert these types into well known formats such as JSON and back.
➕➖ Encoding of the RSA public key
There are several different official formats to encode RSA public keys, depending on the toolkit you use (for example OpenSSH or OpenSSL) but the essential content is always the same: the RSA modulo and the encryption exponent. Additionally, many formats include an identification string to document which algorithm is being used (to make sure you don't get your keys mixed up).
Now Alice can encrypt her message by calling `Encrypter.encrypt(message,parameters: parms)`. You will find the implementation of two overloads of this method in EncrypterProtocol+DecrypterProtocol.swift and Encrypter+Decrypter.swift. One of them expects `message` to be of type `Data` (so you can encrypt arbitrary files) and the other takes `message` of type `String` (which is easier to use for quick tests, for example on a Swift Playground).
The reason why `Encrypter` is defined in the directory ECB is because RSA really only encrypts individual integers, and only integers smaller than *N*. This would be very boring for a beginner trying to learn something by using our framework, so we added ECB (short for electronic codebook) as a way of chopping longer messages into small blocks so that these blocks can be individually encrypted via RSA. Again, this is not a good idea in real-world applications, because with ECB equal blocks are transformed into other equal blocks, which reveals patterns in the data. In actual encryption systems RSA would be used either for signing (where it is applied only to a short hash of the message) or to encrypt the private keys for a different protocol (such as AES-GCM), which would then be used to encrypt the actual message.
The RSA part of the encryption process is the transformation of each individual block. A block (i.e. a number) *b* is transformed into the number *b^e^ mod N*. This happens in the method `UIntRSA.transform(...)` and uses the class `Math.PowerOracle`, which is probably the most interesting piece of mathematics in the project. The numbers involved in this exponentiation can be very large, so we have to worry about two things: Computing the result efficiently and avoiding an overflow of our integer type. To do this, we use the classic approach of breaking the exponent into a sum of powers of 2 and calculating the exponentiation in several steps. This is easier to show than to explain:
➕➖ Easier done than said: Efficient modular exponentiation
Let's say we want to compute *7^10^ mod 11*. First we write 10 as a sum of powers of 2: *10 = 8 + 2*. This means that *7^10^* is the same as *7^8^7^2^*. Then we compute *7^2^ mod 11 = 5*. Now we know that *7^4^ mod 11 = (7^2^)^2^ mod 11*, so we substitute *5* for *7^2^* and get *7^4^ mod 11 = 5^2^ mod 11 = 3*. Similarly, we know that *7^8^ mod 11 = (7^4^)^2^ mod 11*, so we substitute *3* for *7^4^* and get *7^8^ mod 11 = 3^2^ mod 11 = 9*. Finally, to compute *7^10^ mod 11 = 7^8^7^2^ mod 11* we substitute *9* for *7^8^* and *5* for *7^2^* and get *7^10^ mod 11 = (9)(5) mod 11 = 1*.
This may seem like a long detour to compute something simple, but in practice it is very efficient. The key point is that after each step we apply the modulo again, so we're always working with small numbers. This way we don't have to worry about overflows, and the process is also faster because small numbers are easier to multiply.
When Bob receives Alice's message, he can create `Decrypter` object using his keys and call `decrypter.decrypt(encryptedMessage)` or `decrypter.decryptText(encryptedMessage)` depending on whether the original message was in `Data` or `String` form. Once again, you'll find the code in the ECB directory because the decryption is applied to each block individually and then the blocks are put together to get the original message. The RSA protocol only defines what happens to a single block. Recall that for a block *b*, Alice sends Bob the number *c = b^e^ mod N*. To decrypt, Bob first computes the inverse of *e* in the *mod (p - 1)(q - 1)* number system. That means, Bob finds a number *d* such that *ed mod (p - 1)(q - 1) = 1*. Then he computes *c^d^ mod N*, and a well-known mathematical theorem guarantees that this number is exactly the original block *b*.
Finding the number *d* is easy if you know the primes *p* and *q*. This is done by calling `keys.generateDecryptionParameters(...)`, which internally uses Euclides's Greatest Common Divisor algorithm (yes, the one you learned in school). But it is very very hard if you don't know *p* and *q*. This is where quantum computers can be used.
Earlier this year we published an article about quantum computers and mentioned that they can be used to break RSA encryption using Shor's algorithm. We also mentioned that this algorithm does not find the prime factors of the public key, which left a lot of people confused and waiting for a well-deserved explanation.
To be clear, quantum computers can factor large integers quickly, and Shor's algorithm can be used as part of the process. What we want to show is that Shor's algorithm alone is not a factoring routine, and that you don't have to factor the RSA modulo to decrypt messages.
Unfortunately, the explanation involves some tricky math, so to make things more interesting for everyone, we decided to go for a hands-on approach and wrote a small library called TextbookRSA that implements the crucial steps, and that you can read and play around with. It is written in swift and uses only its base standard library: Foundation. This means you can try out our framework on any Mac or Linux machine (as long as swift is installed).
The famous quantum computing algorithm written by Peter Shor is neither a factorization nor a decryption algorithm. It computes the period of a specific function. The function is of the form *f(x) = c^x^ mod N* where *c* and *N* are fixed and *x* is the variable. As you increment *x*, this function takes on different values for a while, but eventually starts to repeat a pattern. The number of distinct values before the function starts to repeat itself is what is called its period, and is the output of Shor's algorithm. For example, if *c = 2* and *N = 5*, the values will be (starting at *x = 0*): *1, 2, 4, 3, 1, 2, 4, 3, 1, ...* so we can see that the period is 4. When *N* is large, finding the period is extremely difficult, but a quantum computer can do it.
In our framework, due to the current lack of a quantum computer here at the office, we are computing periods by trial and error, also called the brute force method, and therefore we can only do it for small *N*. The class that takes care of this is the `PeriodOracle`. We also added an extension to `UIntRSAKeys` with the static method `small(maxPrime:)` (obviously not a part of the RSA protocol), which creates RSA keys with small prime numbers so that you can actually try out the `PeriodOracle` (check out Period/SmallKeys.swift).
Going back to our usual example, we have the original block *b*, the RSA modulo *N* and the exponent *e* and from these Alice computed the encrypted block *c = b^e^ mod N*. Notice that *N*, *e* and *c* are all public, so an attacker, usually named Eve in textbooks, would know them. If Eve used a quantum computer with Shor's algorithm, she could compute the period *r* of the function *f(x) = c^x^ mod N*. Similarly to what Bob did for decryption, she would then compute the inverse of *e* in the *mod r* number system. In other words, Eve would find a number *d* such that *ed mod r = 1* and then compute *c^d^ mod N = b^ed^ mod N*.
➕➖ Gimme more math. I can take it!
First, notice that . Since , there is some integer such that , so .
It is easy to see that the period with base is the same as the period with base , so . In particular, . Eve would have thus decrypted Alice's message and never knew or needed Bob's secret prime factors and .
This whole process is implemented by our class `PeriodDecrypter`, which has the same interface as `Decrypter` (namely the `DecrypterProtocol`), except you don't need the private keys to initialize it, but only the RSA modulo instead.
The home page of the project on GitHub describes it focusing solely on the public API, without going into implementation details or the theory of RSA encryption. It also shows how to install the framework using CocoaPods (Mac) or the Swift Package Manager (Mac and Linux). The repository also contains a Swift Playground where the entire API is already used, and which can serve as a starting point for your tests.
Want to try it out?
Go to GitHub