ZIP Attacks with Reduced Known-Plaintext

ZIP Attacks with Reduced Known-Plaintext

Michael Stay

AccessData Corporation

2500 N. University Ave. Ste. 200

Provo, UT 84606

staym@accessdata.com

published by +Tsehp March 2001

Abstract. In [BK94] Biham and Kocher demonstrated that the PKZIP stream cipher was weak and presented an attack requiring thirteen bytes of plaintext. The deflate algorithm “zippers” now use to compress the plaintext before encryption makes it difficult to get known plaintext. We consider the problem of reducing the amount of known plaintext by finding other ways to filter key guesses. In most cases we can reduce the amount of known plaintext from the archived file to two or three bytes, depending on the zipper used and the number of files in the archive. For the most popular zippers on the Internet, there is a fast attack that does not require any information about the files in the archive.

1 Introduction

PKZIP is a compression / archival program created by Phil Katz. Katz had the foresight to document his file format completely in the file APPNOTE.TXT, distributed with every copy of PKZIP; there are now literally hundreds of “zipper” programs available, and the ZIP file format has become a de facto standard on the Internet.

In [BK94] Biham and Kocher demonstrated that the PKZIP stream cipher was weak and presented an attack requiring thirteen bytes of plaintext. Eight bytes of the plaintext must be contiguous and all of the bytes must be the text that was encrypted, which is usually compressed plaintext. [K92] shows that the compression method used at the time, implode, produces many predictable bytes suitable for mounting their attack.

Most zippers available today only implement one of the compression methods defined in APPNOTE.TXT, called deflate. This is in part due to [BK94]’s results, and in part to the better compression ratio it provides. Deflate uses Huffman coding followed by a variant of Lempel-Ziv. Once the dictionary reaches a certain size, the process starts over. It’s easy to see that the codes for any of the plaintext depend on a great deal of surrounding plaintext, and that there’s almost no way to get it right unless you have the beginning of the file. The difficulty of getting known plaintext was one reason Phil Zimmerman decided to use it in PGP [PGP98]. Practically speaking, if one has enough of the original file to get the thirteen bytes of plaintext required for the attack in [BK94], one has enough to break the encryption almost instantly.

Without the original file, all is not lost; we have the file’s type as indicated by its extension, and we have its size. We have at least one byte of known plaintext used for filtering incorrect passwords. Zippers also encrypt output from a pseudorandom number generator that may be vulnerable to attack.

It is the author’s opinion that the only reason the PKZIP cipher has held up so well in light of [BK94] is due to the deflate algorithm and the difficulty of getting enough plaintext. This paper treats the question of how far we can reduce the plaintext requirement and still break the cipher in a reasonable amount of time.

1.1 The PKZIP Stream Cipher

The PKZIP stream cipher was designed by Roger Schaffely and is fully described in the file APPNOTE.TXT found in most PKZIP distributions. The internal state of the cipher consists of three 32-bit words: key0, key1, and key2. These values are initialized to 0x12345678, 0x23456789, and 0x34567890, respectively. The internal state is updated by mixing in the next plaintext byte; we follow [BK94] in our description, combining the decrypt_byte() and update_keys() functions described in APPNOTE.TXT into a single function:

unsigned char PKZIP_stream_byte (unsigned char pt)

{

unsigned short temp;

key0 = crc32 (key0, pt);

key1 = (key1+(key0 & 0xFF)) * 0x08088405 + 1;

key2 = crc32 (key2, key1 >> 24);

temp = (key2 & 0xFFFC) | 2;

return ( (temp * (temp ^ 1)) >> 8)&0xFF;

}

where ^ is XOR, | is OR, & is AND, and >> is right shift.

For the purposes of this paper, we define crc32() to be

#define crc32(crc, b) ((crc >> 8) ^ crctab[(crc & 0xFF) ^ b])

The old CRC state is shifted right eight bits and XORed with the 32-bit entry of a byte-indexed table to produce the new state. The index is the low byte of the old state XORed with b. The function is linear; that is,

crctab[x ^ y] = crctab[x] ^ crctab[y].

The cipher is keyed by encrypting the user’s password and throwing away the corresponding stream bytes. The stream bytes produced after this point are XORed with the plaintext to produce the ciphertext.

The crux of all of our attacks is the fact that there is almost no diffusion in the internal state. Of the ninety-six bits of internal state, eight bits of key0 affect key1; eight bits of key1 affect key2; and fourteen bits of key2 affect the output stream byte.

1.2 Encrypted File Format

Zippers must prepend twelve bytes to the beginning of the file to be encrypted. The ZIP file format specifies that the first eleven should be random and that the last should be the low byte of the archived file’s CRC. The entire CRC is stored in plaintext, and this byte serves as a password filter. Some zippers, like InfoZIP [IZ] and WinZip [WZ], store ten random bytes and the low two bytes of the CRC.

We assume that deflate was the algorithm used to compress the underlying data. The author did a crude statistical test on a few hundred files of varying types and sizes and found that given the file type, as indicated by its extension, and its size, one can guess about the first two and a half bytes of the compressed file. Since the checksum bytes are at the very end of the prepended header, we can use them to augment the plaintext from the file in mounting our attacks.

2 Biham and Kocher’s Attack

For completeness, we review [BK94]’s results. We begin with some terminology. Bits are numbered from right to left: bit 0 is the ones’ place, bit 1 is the twos’ place, bit 2 is the fours’ place, etc. Let p_i be the ith known-plaintext byte, i = 1, 2, 3, ... Let s_i be the ith stream byte. Let key0_i, key1_i, and key2_i be the value of key0, key1, and key2 after processing p_i. Note that s₁ is a function of the random header and the password; it is independent of the plaintext. In general, bits 2 through 15 of key2_i determine s_i+1.

Their attack proceeds as follows:

XOR ciphertext and known plaintext to get known stream bytes s₁ through s₁₃.

Guess 22 bits of key2₁₃.

This guess combined with s₁₃ is enough to fill in eight more bits of key2₁₃, for a total of thirty. s₁₂ provides enough information to derive 30 bits of key2₁₂ and the most significant of key1₁₃. In general, each stream byte s_iallows us to calculate thirty bits of key2_i-1 and the most significant byte of key1_i.

We continue using stream bytes to make a list of the most significant bytes of key1₁₃ through key1₈.

For each list, we find 2¹⁶ possibilities for the low 24 bits of key1₁₃ through key1₉ by calculating the low byte of (key1_i+LSB(key0_i+1)) such that you get the right high byte of key1_i+1.

From each of the 2¹⁶ lists of complete key1’s, derive the low bytes of key0₁₃ through key0₁₀.

Once we have the low bytes of key0₁₀, key0₁₁, key0₁₂, and key0₁₃, we can use our knowledge of the plaintext bytes to invert the CRC function, since it’s linear, and find the complete internal state at one point along the encryption.

Once we have the complete internal state, we can decrypt backwards as far as we want; we decrypt the ciphertext corresponding to p₁ through p₅ and filter out wrong keys.

We can break a file with work equivalent to encrypting around 2³⁸ bytes and negligible memory. We need a total of thirteen bytes of known plaintext: eight for the attack, and five to filter the 2³⁸ keys that remain.

2.1 Minor Improvement in the Amount of Plaintext Required

[BK94] throws away six bits in key1₇. By using them, we can reduce the plaintext requirement to twelve bytes at the cost of increasing the work factor by four.

2.2 More Files in the Archive

If we have more than one file in the archive, we can make the reasonable assumption that they were encrypted with the same password. “Zippers” encrypt at least one check byte into every encrypted file to verify that the user entered the correct password. Once we have the complete internal state of the cipher, we can run it backwards to the beginning of the file and read out key0, key1, and key2. Since this state is the same at the beginning of each file (it only depends on the password), we can decrypt the check byte in each file and use it to filter with instead of known plaintext from a single file. This also works if the files are in different archives, but have the same password.

Since the attack is so fast, we can afford to guess a couple of stream bytes. If we guess N of them, we get 2^{40 + 8N} keys to filter; this increases the amount of work and the number of additional files we need, but reduces the amount of known plaintext required from a single file.

Consider the N=1 case: if the file was created in a zipper with two checksum bytes, we can break the file with work equivalent to encrypting 2⁴⁸ bytes. We need two checksum bytes followed by only four more known plaintext bytes in one file, and three other files in the archive (six check bytes) to filter the 2⁴⁸ possible keys.

If there is only one checksum byte, we can break the file with the same amount of work, but we need seven files in the archive and five bytes of known plaintext in addition to the checksum byte.

3 Divide and Conquer

The limited diffusion of the internal state prompts us to ask how much of the state we need to guess to process one byte. If it is small enough, we can guess it and filter out keys that won’t work with our known stream bytes, then proceed to the next part.

It turns out that we can get by with as few as 23 bits. Note that we don’t need to guess 16 bits of key0₀to calculate the low byte of key0₁: if we distribute the XOR in the definition of crc32(), we see that we only need to guess 8 bits of crc32(key0₀, 0):

LSB(key0₁) = LSB(crc32(key0₀, p₁))

= LSB(crc32(key0₀, 0)) ^ LSB(crctab[p₁]).

Now we distribute the multiplication across the addition in the next step:

MSB(key1₁) = MSB( (key1₀ + LSB(key0₁)) * 0x08088405 + 1 )

(A) = MSB(LSB(key0₁) * 0x08088405) +

(B) MSB(key1₀ * 0x08088405) + possible carry bit.

We separate the equation into parts we know (A) and parts we need to guess (B), and find we need to guess nine bits, including a possible carry bit. Note that since we know the low bits of (LSB(key0₁) * 0x08088405), the carry bit will usually give us more than one bit of information in the form of an upper or lower bound on the rest of (key1₀ * 0x08088405) that we haven’t guessed yet.

Given a stream byte s_i+1, we can find sixty four values for bits 2..15 of key2_i. It’s easy to see why: fourteen bits of key2_i produce eight bits of s_i+1, so there are six left over. We can create a table of 256 x 64 bytes such that given s_i+1 and bits 10..15 of key2_i, we can look up bits 2..9 of key2_i. We call this the preimage table.

We guess bits 10..15 of crc32(key2₀, 0) and use s₂, the preimage table, and crctab[MSB(key1₁)] to find bits 2..9 of crc32(key2₀, 0). We end up with 2²³ key guesses.

To find the next part of the internal state, we have to guess about the same amount. This guess is not illustrated, but we basically guess about eight more bits of information in each of the three keys. The only complicated part is separating what we know about key1 from what we don’t.

We guess bits 8..15 of crc32(key0₀, 0) directly; the next guess involving key1 is a little more complicated:

MSB(key1₂) = MSB( (key1₁ + LSB(key0₂)) * 0x08088405 + 1 )

= MSB( LSB(key0₂) * 0x08088405 ) +

MSB( key1₁ * 0x08088405 ) + possible carry bit

= MSB( LSB(key0₂) * 0x08088405 ) +

MSB( (key1₀ + LSB(key0₁)) * 0xD4652819 ) +

possible carry bit

(A) = MSB( LSB(key0₂) * 0x08088405 ) +

(A) MSB( LSB(key0₁) * 0xD4652819 ) +

(B) MSB( key1₀ * 0xD4652819 ) + possible carry bit.

Again, (A) is known and we have to guess (B) nine bits, including a possible carry bit. The carry bit establishes an upper or lower bound on (key1₀ * 0xD4652819). We end this filter by guessing bits 16..23 and bits 0..1 of crc32(key2₀, 0) and calculating a stream byte. We have guessed 27 more bits, but the output byte has to match s₃, so we expect 2^23+27-8 = 2⁴² key guesses to pass this filter.

At this point, we have guessed 24 bits of crc32(key2₀, 0) and we know s₁. From this we can calculate, on average, one full value of key2₀. There are also only around 2¹³ possibilities for key1₀ due to the restrictions from the carry bits. So the third stage consists of guessing bits 16..23 of crc32(key0₀, 0) and running through the 2¹³ possible values for key1₀. We expect 2^42+13+8-8 = 2⁵⁵ key pieces to pass this filter.

Finally, we guess the last eight bits of key0₀ and we have a complete internal state. We will have 2⁶³ complete keys to filter with other bytes, whether they are in the archived file or in checksum bytes in other files. The cost is approximately the same as encrypting 2⁶³ bytes under the stream cipher. The plaintext requirement is four bytes total; at least one of these may come from the file’s own check byte(s).

This is 128 times faster than guessing three stream bytes and using [BK94].

4 Random Number Generation

InfoZIP is a cross-platform freeware zipper distribution. Because the C source code is readily available and is free, it forms the basis of most non-PKZIP zippers, including the very popular WinZip and NetZip^*.

APPNOTE.TXT does not specify how to generate the prepended random bytes; it only says that they are used to scramble the internal state of the cipher and are discarded after decryption. InfoZIP implements it as follows:

srand(time(NULL) ^ getpid())

For each file in the order they are stored,

Generate ten random bytes by calling rand() ten times and discarding all but the high eight bits of each return value.

Initialize the cipher with the password.

Encrypt the ten random bytes.

Append the low two bytes of the checksum.

Reinitialize the cipher with the password and encrypt the twelve-byte header and the compressed file.

rand() is usually implemented as a truncated linear congruential generator. WinZip and NetZip use Microsoft Visual C++’s implementation, which has a 31-bit seed:

unsigned long seed;

void srand( unsigned long s ) { seed = s; }

unsigned short rand()

{

seed = 0x343FD * seed + 0x269EC3;

return ( ( seed >> 16) & 0x7FFF );

}

Let r_i,
j be the jth random byte in the ith archived file; i, j = 1, 2, 3, ... Note that the internal state of the cipher is the same both times r_{i, 1} is encrypted. Since XOR is its own inverse, r_{i, 1} is decrypted for all i. Also, every r_1,
j reveals the high eight bits of the internal state of the random number generator.

Since rand() is linear, we can compute two new constants for a generator such that it outputs every tenth output of the original. We know the upper eight bits of the generator, so we guess the low 23 bits and start generating every tenth output and comparing them to the leaked bytes. Four archived files suffice to uniquely determine the seed that was used in the random number generator, and therefore every r_i,j.

Let us emphasize that we do not have known plaintext at this point, in the sense that [BK94] requires, because the plaintext was encrypted twice, and we do not know the actual values of the stream bytes. What we can derive is the XOR of the stream bytes in the first and second encryption.

5 Parallel Divide and Conquer

We can adapt the divide-and-conquer algorithm from section 3 to use this information. Once we know the “random” headers, we can exploit the fact that the internal state was the same at the beginning of each embedded file and filter guesses with multiple known plaintext bytes in parallel, instead of being restricted to one byte as in section 3.

Let s_i,j,k be the jth stream byte of the kth encryption of the bytes in the ith archived file; i, j = 1, 2, 3, ...; k = 1, 2. We guess the same 23 bits as in section 3, but since we don’t know the actual value of s_1,2,1, we have to guess it, too. It is equivalent, and more convenient, to guess bits 2..9 of crc32(key2₀, 0). Now we have a prediction for s_1,2,1, and can derive s_1,2,2. We don’t have any information at all about s_i,1,1, since it’s the same as s_i,1,2 and cancels out. We guess it, and check to see that the second encryption spits out s_1,2,2. We have to guess a carry bit for the second encryption, too, so of the 2^23+8+8+1 = 2⁴⁰ key guesses, we expect 2^40-8 = 2³² key pieces to pass this filter on this file.

We want to filter out all but the correct guess at this stage; fortunately, we know that the state we are trying to guess was the same at the start of each encryption. We have an eight-bit value to filter with in each file, s_i,2,1 XOR s_i,2,2, but we also guess two carry bits, so with four more files in the archive, we can reduce the number of false positives to around 2^{32 - 6*4} = 256. Note that we now have ten carry bits putting restrictions on key1₀ instead of just one.

We continue to the next byte of each file. This time we guess the same 26 bits as in section 3 plus two carry bits, one for each encryption. With five files, we have 30 bits to filter with. We expect that 2^8+26+2-30 = 2⁶ = 64 key guesses survive the second stage. Total work for this stage is 2⁸⁺²⁶⁺² = 2³⁶ byte encryptions.

At this point we can derive key2₀ as before. Due to all the carry bit restrictions, we only have on the order of 2⁸ possible key1₀’s. We guess eight more bits of crc32(key1₀, 0) and run through all the remaining key1₀’s. Since we aren’t guessing carry bits any more, we have 40 bits to filter with. 2^6+16-40 < 1, and we expect that only the correct guess survives. Finally, we guess the last eight bits of crc32(key1₀, 0) and only the correct guess survives.

Experimentally, we have found that a key guess passes the second stage only if our guess for s_i,1,1 is correct. This usually occurs about one quarter of the way through the first 40-bit keyspace. After that, we only try one value for s_i,1,1 instead of 256 and the rest of the attack takes at most a few minutes.

The work done in the first stage dwarfs the rest of the work needed. The total work is therefore about the same as encrypting 2³⁹ bytes. Cracking a file created with this kind of weak PRNG usually takes about two hours on a 500 MHz Pentium II. One can then take the three keys and use [BK94]’s second algorithm to derive a password, if one desires, although the three keys suffice to decrypt the files.

6 Conclusion

The PKZIP stream cipher is very weak. The deflate algorithm makes it harder to get plaintext, but in most cases we can reduce the plaintext requirement to the point where one can guess enough plaintext based on file type and size alone. The most popular zippers on the internet are also susceptible to an attack that runs in two hours on a single PC based on known plaintext provided by the application and independent of the archived files themselves.

Bibliography

[BK94] Biham, Eli and Paul Kocher. “A Known Plaintext Attack on the PKZIP Stream Cipher.” Fast Software Encryption 2, Proceedings of the Leuven Workshop, LNCS 1008, December 1994.

[DL] http://download.cnet.com/downloads/0,10151,0-10097-

106-0-1-5,00.html?tag=st.dl.10097_106_1.lst.lst&

[IZ] ftp://ftp.freesoftware.com/pub/infozip/

[K92] Kocher, Paul. ZIPCRACK 2.00 Documentation. 1992.

http://www.bokler.com/bokler/zipcrack.txt

[PGP98] User’s Guide, Version 6.0. Network Associates, Inc., 1998. p.145.

http://www.nai.com

[WZ] http://www.winzip.com