Cryptanalysis of the Worm's Stream Cipher

Introduction
Cipher Definition
Modified Cipher
The Attack
Conclusion

Introduction

The worm involved in the November Scan of the Month uses a simple stream cipher to obscure its network traffic. The cipher key used by the worm is fixed, so that there is no difficulty in decrypting intercepted messages. However, the worm could be modified to use a variable key. The cipher is subject to a trivial known-plaintext attack whereby a portion of the keystream used to encrypt a message can be recovered easily. If another message is encrypted with the same key, the corresponding part of that message can be decrypted with no trouble.

This document describes a somewhat more advanced known-plaintext attack which finds information about the cipher key. It may be useful if only part of the plaintext of a message is known, since it allows the known portion of the keystream to be extended. It may also be useful if the key is changed in some predictable fashion for each message. The attack can usually succeed in a few seconds with about 50 bytes of known plaintext. It may work with smaller amounts of plaintext, but the running time increases dramatically, and spurious keys become a problem.

A note on notation: subscripts are written with an underscore, so x_i is x with subscript i and key_{i % 4} is key with subscript i % 4. Where a subscript is written as i+1, the entire expression i+1 is the subscript. Superscripts denote exponents, as in 2³².

Cipher Definition

The cipher makes use of a modified linear congruential pseudorandom number generator. The update function of the generator is given below.

update(x) = (((x * 0x39906773) % 2³²) % 0xffff) >> 1

Where * represents multiplication, % represents the modulus operator, >> represents binary right-shift, and 0x indicates hexadecimal notation, as in the C programming language. Note that the input to this function can be any 32-bit number, while the output is always a 15-bit number. All values from 0 to 0x7fff are possible outputs, but 0x7fff is about half as likely as the others, given uniform random inputs.
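The update function is small enough to check directly. The following Python sketch (the names MULT, MASK32 and preimages are chosen here for illustration, not taken from the worm) implements it and counts how many residues mod 0xffff map to each 15-bit output, which illustrates why 0x7fff is about half as likely as the other values.

```python
from collections import Counter

MULT = 0x39906773    # multiplier from the update function
MASK32 = 0xFFFFFFFF  # x & MASK32 is the same as x % 2**32


def update(x: int) -> int:
    """One step of the modified linear congruential generator."""
    return (((x * MULT) & MASK32) % 0xFFFF) >> 1


# The reduction mod 0xffff yields residues 0 .. 0xfffe. Shifting right by
# one bit pairs them up, except that 0xfffe is the only residue that maps
# to 0x7fff, so that output has one preimage instead of two.
preimages = Counter(r >> 1 for r in range(0xFFFF))
print(preimages[0x7FFE], preimages[0x7FFF])  # prints "2 1"
```

Python integers do not wrap at 32 bits, so the mask is applied explicitly where C would rely on unsigned overflow.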

There are four 32-bit key values, denoted key₀ to key₃. These keys are used to determine the initial state of the cipher, x₀.

x₀ = (((key₀ + key₁) * key₂) % 2³²) ^ key₃

Where ^ denotes the bitwise exclusive-or operation. Next, the keys are used in turn to produce successive values of the internal state x_i.

y_{i+1} = update(x_i) + i
x_{i+1} = update(y_{i+1} + key_{i % 4})

Here y_i is a sequence of temporary values that will be used later in the analysis. Although x₀ is a 32-bit quantity, the other values of x_i are only 15 bits. The cipher state is then used to produce a sequence of bytes, which forms the keystream.

b_i = x_i % 2⁸

For i > 0. These bytes are added to the plaintext, mod 2⁸, to produce the ciphertext.

c_i = (p_i + b_i) % 2⁸

Given c_i and p_i, the keystream byte follows directly: b_i = (c_i - p_i) % 2⁸.
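The whole cipher, as defined in this section, can be sketched in a few lines of Python. The key values in EXAMPLE_KEYS are made up for the example and are not the worm's fixed key. The sketch also assumes that the first plaintext byte is paired with b_1. The function recover_keystream shows the trivial known-plaintext step mentioned in the introduction.

```python
MULT = 0x39906773
MASK32 = 0xFFFFFFFF


def update(x):
    return (((x * MULT) & MASK32) % 0xFFFF) >> 1


def keystream(keys, n):
    """Return the keystream bytes b_1 .. b_n for four 32-bit keys."""
    x = (((keys[0] + keys[1]) * keys[2]) & MASK32) ^ keys[3]  # x_0
    out = []
    for i in range(n):
        y = update(x) + i            # y_{i+1}
        x = update(y + keys[i % 4])  # x_{i+1}, a 15-bit state
        out.append(x & 0xFF)         # b_{i+1}
    return out


def encrypt(keys, plaintext):
    ks = keystream(keys, len(plaintext))
    return bytes((p + b) & 0xFF for p, b in zip(plaintext, ks))


def decrypt(keys, ciphertext):
    ks = keystream(keys, len(ciphertext))
    return bytes((c - b) & 0xFF for c, b in zip(ciphertext, ks))


def recover_keystream(ciphertext, known_plaintext):
    """Known-plaintext step: b_i = (c_i - p_i) % 2**8."""
    return [(c - p) & 0xFF for c, p in zip(ciphertext, known_plaintext)]


# Hypothetical key values, chosen only for this example.
EXAMPLE_KEYS = (0x01234567, 0x89ABCDEF, 0x0F1E2D3C, 0x4B5A6978)
message = b"known plaintext for the example"
ciphertext = encrypt(EXAMPLE_KEYS, message)
```

Applying recover_keystream to ciphertext and message returns the same bytes as keystream(EXAMPLE_KEYS, len(message)), without any knowledge of the keys.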

Modified Cipher

This section introduces a modified version of the cipher which produces the same output, though its inner workings are somewhat different. The modified cipher has the advantage that certain sets of keys which produce "nearly" the same effect on the keystream will be numerically close to each other. Observe that

x_{i+1} = ((((y_{i+1} + key_{i % 4}) * 0x39906773) % 2³²) % 0xffff) >> 1
= (((((y_{i+1} * 0x39906773) % 2³²) + ((key_{i % 4} * 0x39906773) % 2³²)) % 2³²) % 0xffff) >> 1
= (((z_{i+1} + w_{i % 4}) % 2³²) % 0xffff) >> 1

Where z_i and w_i have their natural definitions.

z_i = (y_i * 0x39906773) % 2³², w_i = (key_i * 0x39906773) % 2³²

Because z_i and w_i are both less than 2³², their sum must be less than 2 * 2³². Thus, the final mod 2³² step in the expression for x_{i+1} removes at most one multiple of 2³². The number of such multiples is

d_{i+1} = 0 if z_{i+1} + w_{i % 4} < 2³²
d_{i+1} = 1 otherwise

Now x_{i+1} can be rewritten using d_{i+1} as follows.

x_{i+1} = ((z_{i+1} + w_{i % 4} - 2³² d_{i+1}) % 0xffff) >> 1
= ((z_{i+1} + (w_{i % 4} % 0xffff) - d_{i+1}) % 0xffff) >> 1

The second line follows because 2³² ≡ 1 (mod 0xffff). Notice that, for fixed z_{i+1}, the value of x_{i+1} is "mostly" determined by the key-dependent value w_{i % 4} % 0xffff, which is no more than 16 bits in size. The full 32-bit value of w_{i % 4} contributes only one more bit, through d_{i+1}. For this reason, the modified cipher replaces each key_i value with a primary 16-bit quantity, k_i, and a 32-bit quantity, u_i, which is of secondary importance. The expression for d_{i+1} can be restated in terms of u_i.

k_i = w_i % 0xffff, u_i = 2³² - w_i - 1
d_{i+1} = 0 if z_{i+1} <= u_{i % 4}
d_{i+1} = 1 otherwise

The seemingly extraneous "1" in the definition of u_i ensures that the resulting value will fit in a 32-bit word, which is useful in a software implementation on a 32-bit microprocessor.
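The rewriting step can be checked numerically. The Python sketch below compares the direct computation of x_{i+1} from z and w with the version that uses k, u and d, over randomly chosen 32-bit values. The function names and the fixed random seed are choices made for this illustration only.

```python
import random

M32 = 1 << 32  # 2**32, which is congruent to 1 mod 0xffff


def direct(z, w):
    """x_{i+1} computed with the explicit reduction mod 2**32."""
    return (((z + w) % M32) % 0xFFFF) >> 1


def via_k_and_u(z, w):
    """The same value computed from k = w % 0xffff and u = 2**32 - w - 1."""
    k = w % 0xFFFF
    u = M32 - w - 1
    d = 0 if z <= u else 1
    return ((z + k - d) % 0xFFFF) >> 1


def identity_holds(trials=100_000, seed=1):
    """Compare both forms on pseudo-random 32-bit pairs (z, w)."""
    rng = random.Random(seed)
    pairs = ((rng.randrange(M32), rng.randrange(M32)) for _ in range(trials))
    return all(direct(z, w) == via_k_and_u(z, w) for z, w in pairs)
```

A random test is not a proof, but it catches sign and off-by-one slips in the definitions of u and d.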

The modified cipher is presented in a simplified form below. For clarity, some of the intermediate definitions have been removed.

k_i = ((key_i * 0x39906773) % 2³²) % 0xffff
u_i = 2³² - ((key_i * 0x39906773) % 2³²) - 1

z_{i+1} = ((update(x_i) + i) * 0x39906773) % 2³²
x_{i+1} = ((z_{i+1} + k_{i % 4}) % 0xffff) >> 1 if z_{i+1} <= u_{i % 4}
x_{i+1} = ((z_{i+1} + k_{i % 4} - 1) % 0xffff) >> 1 otherwise
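As a sanity check, the modified cipher can be run side by side with the original. The Python sketch below is one way to do so; the function names are invented for this example, and any key values passed to equivalent are arbitrary test inputs.

```python
MULT = 0x39906773
MASK32 = 0xFFFFFFFF


def update(x):
    return (((x * MULT) & MASK32) % 0xFFFF) >> 1


def initial_state(keys):
    """x_0 as defined for the original cipher."""
    return (((keys[0] + keys[1]) * keys[2]) & MASK32) ^ keys[3]


def original_keystream(keys, n):
    x, out = initial_state(keys), []
    for i in range(n):
        x = update(update(x) + i + keys[i % 4])  # x_{i+1}
        out.append(x & 0xFF)
    return out


def derive_k_u(keys):
    """Replace each key_i by the pair (k_i, u_i)."""
    w = [(key * MULT) & MASK32 for key in keys]
    k = [wi % 0xFFFF for wi in w]
    u = [MASK32 - wi for wi in w]  # 2**32 - w_i - 1
    return k, u


def modified_keystream(k, u, x0, n):
    x, out = x0, []
    for i in range(n):
        z = ((update(x) + i) * MULT) & MASK32  # z_{i+1}
        d = 0 if z <= u[i % 4] else 1
        x = ((z + k[i % 4] - d) % 0xFFFF) >> 1  # x_{i+1}
        out.append(x & 0xFF)
    return out


def equivalent(keys, n=200):
    """True if both ciphers produce the same first n keystream bytes."""
    k, u = derive_k_u(keys)
    return original_keystream(keys, n) == modified_keystream(
        k, u, initial_state(keys), n
    )
```

For instance, equivalent((1, 2, 3, 4)) compares the first 200 keystream bytes of the two formulations.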

The Attack

Before describing the attack, some words about its limitations are in order. As mentioned earlier, the values of u_i have a very small effect on the resulting keystream. Therefore, it is not possible to determine them completely using a small amount of known plaintext. In practice, it takes several hundred kilobytes of data to do so. Instead of finding u_i, bracketing values min_i and max_i are found such that

min_i <= u_i <= max_i

Since the u_i values are not completely known, the precise values of key_i cannot be determined. In turn, the initial state x₀ cannot be calculated by the ordinary method. That presents no difficulty, as the value of x₁ can be found, and that is enough information to reproduce the keystream.

The attack assumes that a number of the keystream bytes b_i are known. The values k_i are initially considered independently of each other. Each of the roughly 2¹⁶ possible key values is checked for feasibility using the algorithm below. In order to be considered further, a key must be feasible with some seed at every position in the message at which that key is reused. The lower 8 bits of the seed at any position i are known to be b_i. The upper 7 bits of the seed are filled in by an exhaustive search.

is_feasible(i, key, seed, min, max)
    state <- ((update(seed) + i) * 0x39906773) % 2³²
    next1 <- ((state + key) % 0xffff) >> 1
    next2 <- ((state + key - 1) % 0xffff) >> 1
    if next1 = next2 and next1 % 2⁸ = b_{i+1}
        key is feasible
        next_seed <- next1
    else if next1 % 2⁸ = b_{i+1} and state <= max
        /* it is possible that state <= u_{i % 4} */
        key is feasible
        min can be increased to state
        next_seed <- next1
    else if next2 % 2⁸ = b_{i+1} and state > min
        /* it is possible that state > u_{i % 4} */
        key is feasible
        max can be lowered to state - 1
        next_seed <- next2
    else
        key is not feasible
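The pseudocode above translates almost line for line into Python. The version below is a sketch and not the code from crypt.c. It passes b_{i+1} explicitly as b_next, renames min and max to lo and hi to avoid shadowing Python built-ins, and returns the possibly tightened bounds instead of updating them in place. All of these are choices made for this illustration.

```python
MULT = 0x39906773
MASK32 = 0xFFFFFFFF


def update(x):
    return (((x * MULT) & MASK32) % 0xFFFF) >> 1


def is_feasible(i, key, seed, lo, hi, b_next):
    """Test a candidate k value at position i with the state guess `seed`.

    lo and hi are the current bounds on u_{i % 4}; b_next is b_{i+1}.
    Returns (feasible, next_seed, lo, hi) with the bounds tightened
    whenever the matching branch reveals on which side of u the state lies.
    """
    state = ((update(seed) + i) * MULT) & MASK32
    next1 = ((state + key) % 0xFFFF) >> 1
    next2 = ((state + key - 1) % 0xFFFF) >> 1
    if next1 == next2 and (next1 & 0xFF) == b_next:
        # Both cases agree, so nothing is learned about u.
        return True, next1, lo, hi
    if (next1 & 0xFF) == b_next and state <= hi:
        # Consistent with state <= u, so u is at least state.
        return True, next1, max(lo, state), hi
    if (next2 & 0xFF) == b_next and state > lo:
        # Consistent with state > u, so u is at most state - 1.
        return True, next2, lo, min(hi, state - 1)
    return False, None, lo, hi
```

A caller would start with lo = 0 and hi = 2³² - 1 for each of the four key positions and carry the returned bounds forward to the next position at which the same key is used.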

Note that next1 = next2 with a probability of approximately 1/2, namely whenever (state + key) % 0xffff is odd. Thus, running this procedure with 128 different seed values can be expected to produce about 128*(1/2*1 + 1/2*2) = 192 different values of next1 and next2. Heuristically, about 1 in 256 of these can be expected to produce the correct value of b_{i+1}. Thus, about 192/256 = 3/4 of the possible keys will be accepted as feasible at each position that is checked. Experiment shows that the process is somewhat better at rejecting incorrect keys than this simplistic analysis would suggest, due in part to the fact that different values of next1 and next2 are sometimes equal mod 2⁸, and also because the min and max values were ignored. With 50 bytes of plaintext, usually only a few hundred candidate keys survive the elimination.
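The arithmetic in this estimate can be reproduced with a short Python calculation. The first part counts exactly how often next1 equals next2 over all residues mod 0xffff. The last line is an additional estimate that is not part of the original analysis: it assumes the 192 candidate values behave like independent uniform bytes, which accounts for collisions mod 2⁸ but still ignores the min and max bounds.

```python
# Residues r = (state + key) % 0xffff for which next1 == next2, that is,
# for which r >> 1 equals ((r - 1) % 0xffff) >> 1. This holds for odd r.
equal = sum(
    1 for r in range(0xFFFF) if (r >> 1) == (((r - 1) % 0xFFFF) >> 1)
)
p_equal = equal / 0xFFFF  # very close to 1/2

seeds = 128  # exhaustive search over the upper 7 bits of the seed
candidates = seeds * (p_equal * 1 + (1 - p_equal) * 2)  # about 192

# Simplistic acceptance rate used in the text: 192 / 256 = 3/4.
naive_accept = candidates / 256

# Assumption of this sketch: treat the candidates as independent uniform
# bytes, so that several of them may hit the same value mod 2**8.
collision_aware = 1 - (1 - 1 / 256) ** candidates
```

Under that independence assumption the acceptance rate per position comes out near 0.53 instead of 0.75, which points in the same direction as the experimental observation that incorrect keys are rejected faster than the simple estimate predicts.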

Once candidates for each k_i have been found, the next step is to make sure that they are feasible when used together. This is done by using the next_seed value calculated with one key as the seed value for the next key. If that proves to be infeasible for every possible initial seed, then that pair of keys can be rejected. In most cases, this process suffices to eliminate all but a few hundred viable key pairs when 50 output bytes are known. The pairs can then be joined together into larger groups.

The final step is to check that each set of candidate keys produces the correct stream of output bytes. This step tries all 2⁷ possible values for x₁ and determines which one is correct. With 50 known bytes, there are often several keys that reproduce the same output. This problem becomes increasingly severe as the number of known bytes is reduced.
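A minimal Python sketch of this final check follows. It assumes that a full candidate set is available as two lists k and u of four values each, where each u value is some representative taken from the bracketing interval. That choice, and the names step and find_x1, are assumptions of this example and not details taken from crypt.c.

```python
MULT = 0x39906773
MASK32 = 0xFFFFFFFF


def update(x):
    return (((x * MULT) & MASK32) % 0xFFFF) >> 1


def step(x, i, k, u):
    """Advance the modified cipher from x_i to x_{i+1}."""
    z = ((update(x) + i) * MULT) & MASK32
    d = 0 if z <= u[i % 4] else 1
    return ((z + k[i % 4] - d) % 0xFFFF) >> 1


def find_x1(k, u, known):
    """Return every 15-bit x_1 that reproduces the known bytes.

    known[0] is b_1, known[1] is b_2, and so on. The low 8 bits of x_1
    are fixed by b_1, so only the upper 7 bits are searched.
    """
    matches = []
    for high in range(1 << 7):
        x1 = (high << 8) | known[0]
        x = x1
        ok = True
        for i in range(1, len(known)):
            x = step(x, i, k, u)  # x is now x_{i+1}
            if (x & 0xFF) != known[i]:
                ok = False
                break
        if ok:
            matches.append(x1)
    return matches
```

An empty result means the candidate key set can be discarded. More than one match means that the known bytes alone do not pin down x_1 for that key set.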

This attack is implemented in the file crypt.c.

It may now be possible to extend the known sequence of bytes. At each step, if z_{i+1} <= min_{i % 4} or z_{i+1} > max_{i % 4}, then one more output byte b_{i+1} is determined unambiguously. If there is some doubt about the next byte, it may be possible to resolve it by examining a captured message which is known to have a certain format.
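The extension step can be sketched in Python as follows. The function next_byte is a name invented for this example. It takes the current state x_i, the recovered k values and the bracketing lists lo and hi (standing for min and max), and returns None when the bounds do not decide the next byte. As a small addition that follows from the feasibility test, it also treats the case next1 = next2 as unambiguous, since both possibilities then give the same state.

```python
MULT = 0x39906773
MASK32 = 0xFFFFFFFF


def update(x):
    return (((x * MULT) & MASK32) % 0xFFFF) >> 1


def next_byte(x, i, k, lo, hi):
    """Return (b_{i+1}, x_{i+1}) if it is determined, otherwise None.

    lo[j] <= u_j <= hi[j] are the bounds found by the attack.
    """
    z = ((update(x) + i) * MULT) & MASK32  # z_{i+1}
    next1 = ((z + k[i % 4]) % 0xFFFF) >> 1
    next2 = ((z + k[i % 4] - 1) % 0xFFFF) >> 1
    if z <= lo[i % 4] or next1 == next2:
        # Either z <= u for certain, or both cases agree anyway.
        return next1 & 0xFF, next1
    if z > hi[i % 4]:
        # z > u for certain.
        return next2 & 0xFF, next2
    return None  # the bounds leave two possibilities
```

When None is returned, both next1 and next2 remain possible, and the choice has to be made by other means, such as the known message format mentioned above.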

Conclusion

An attack has been demonstrated which quickly determines information about the cipher key using a small amount of known plaintext. This shows that the cipher is weak and should not be used to encrypt important confidential information.