ZIP
Attacks with Reduced Known-Plaintext
Michael Stay
AccessData Corporation
2500 N. University Ave. Ste. 200
Provo, UT 84606
staym@accessdata.com
Abstract. In [BK94] Biham and Kocher demonstrated
that the PKZIP stream cipher was weak and presented an attack requiring thirteen
bytes of plaintext. The deflate
algorithm “zippers” now use to compress the plaintext before encryption makes
it difficult to get known plaintext. We consider the problem of reducing the amount
of known plaintext by finding other ways to filter key guesses. In most cases we can reduce the amount of known
plaintext from the archived file to two or three bytes, depending on the zipper
used and the number of files in the archive. For the most popular zippers on the Internet, there is a fast attack
that does not require any information about the files in the archive.
1 Introduction
PKZIP
is a compression / archival program created by Phil Katz. Katz had the foresight to document his file
format completely in the file APPNOTE.TXT, distributed with every copy of
PKZIP; there are now literally hundreds of “zipper” programs available, and the
ZIP file format has become a de facto standard on the Internet.
In [BK94] Biham and Kocher demonstrated that the PKZIP
stream cipher was weak and presented an attack requiring thirteen bytes of
plaintext. Eight bytes of the plaintext
must be contiguous and all of the bytes must be the text that was encrypted,
which is usually compressed plaintext. [K92]
shows that the compression method used at the time, implode, produces
many predictable bytes suitable for mounting their attack.
Most zippers available today only implement one of
the compression methods defined in APPNOTE.TXT, called deflate. This is in part due to [BK94]’s results, and
in part to the better compression ratio it provides. Deflate uses Huffman coding followed by a variant of
Lempel-Ziv. Once the dictionary reaches
a certain size, the process starts over.
It’s easy to see that the codes for any of the plaintext depend on a
great deal of surrounding plaintext, and that there’s almost no way to get it
right unless you have the beginning of the file. The difficulty of getting known plaintext was one reason Phil
Zimmerman decided to use it in PGP [PGP98].
Practically speaking, if one has enough of the original file to get the
thirteen bytes of plaintext required for the attack in [BK94], one has enough
to break the encryption almost instantly.
Without the original file, all is not lost; we have
the file’s type as indicated by its extension, and we have its size. We have at least one byte of known plaintext
used for filtering incorrect passwords.
Zippers also encrypt output from a pseudorandom number generator that
may be vulnerable to attack.
It is the author’s opinion that the only reason the
PKZIP cipher has held up so well in light of [BK94] is due to the deflate
algorithm and the difficulty of getting enough plaintext. This paper treats the question of how far we
can reduce the plaintext requirement and still break the cipher in a reasonable
amount of time.
1.1 The PKZIP Stream Cipher
The
PKZIP stream cipher was designed by Roger Schaffely and is fully described in
the file APPNOTE.TXT found in most PKZIP distributions. The internal state of the cipher consists of
three 32-bit words: key0, key1, and key2. These values are initialized to 0x12345678,
0x23456789, and 0x34567890, respectively.
The internal state is updated by mixing in the next plaintext byte; we
follow [BK94] in our description, combining the decrypt_byte() and update_keys()
functions described in APPNOTE.TXT into a single function:
unsigned char PKZIP_stream_byte (unsigned char
pt)
{
unsigned
short temp;
key0
= crc32 (key0, pt);
key1
= (key1+(key0 & 0xFF)) * 0x08088405 + 1;
key2
= crc32 (key2, key1 >> 24);
temp
= (key2 & 0xFFFC) | 2;
return
( (temp * (temp ^ 1)) >> 8)&0xFF;
}
where ^ is XOR, | is OR, & is AND, and
>> is right shift.
For the purposes
of this paper, we define crc32() to be
#define crc32(crc, b) ((crc >> 8) ^
crctab[(crc & 0xFF) ^ b])
The old CRC state is shifted right eight
bits and XORed with the 32-bit entry of a byte-indexed table to produce the new
state. The index is the low byte of the
old state XORed with b. The function is
linear; that is,
crctab[x ^ y] =
crctab[x] ^ crctab[y].
The cipher is
keyed by encrypting the user’s password and throwing away the corresponding
stream bytes. The stream bytes produced after this point are XORed with the
plaintext to produce the ciphertext.
The crux of all of
our attacks is the fact that there is almost no diffusion in the internal state. Of the ninety-six bits of internal state,
eight bits of key0 affect key1; eight bits of key1 affect key2;
and fourteen bits of key2 affect the output stream byte.
1.2
Encrypted File Format
Zippers must prepend twelve bytes to the
beginning of the file to be encrypted.
The ZIP file format specifies that the first eleven should be random and
that the last should be the low byte of the archived file’s CRC. The entire CRC is stored in plaintext, and
this byte serves as a password filter.
Some zippers, like InfoZIP [IZ] and WinZip [WZ], store ten random bytes
and the low two bytes of the CRC.
We assume that deflate
was the algorithm used to compress the underlying data. The author did a crude
statistical test on a few hundred files of varying types and sizes and found
that given the file type, as indicated by its extension, and its size, one can
guess about the first two and a half bytes of the compressed file. Since the checksum bytes are at the very end
of the prepended header, we can use them to augment the plaintext from the file
in mounting our attacks.
2 Biham and Kocher’s
Attack
For completeness, we review [BK94]’s
results. We begin with some
terminology. Bits are numbered from
right to left: bit 0 is the ones’ place, bit 1 is the twos’ place, bit 2 is the
fours’ place, etc. Let pi
be the ith known-plaintext byte, i = 1, 2, 3, ... Let si be the ith
stream byte. Let key0i,
key1i, and key2i be the value of key0,
key1, and key2 after processing pi. Note that s1 is a function
of the random header and the password; it is independent of the plaintext. In general, bits 2 through 15 of key2i
determine si+1.
Their attack
proceeds as follows:
XOR
ciphertext and known plaintext to get known stream bytes s1
through s13.
Guess
22 bits of key213.
This
guess combined with s13 is enough to fill in eight more bits
of key213, for a total of thirty. s12 provides enough information to derive 30
bits of key212 and the most significant of key113. In general, each stream byte si allows
us to calculate thirty bits of key2i-1 and the most
significant byte of key1i.
We
continue using stream bytes to make a list of the most significant bytes of key113
through key18.
For
each list, we find 216 possibilities for the low 24 bits of key113
through key19 by calculating the low byte of (key1i+LSB(key0i+1))
such that you get the right high byte of key1i+1.
From
each of the 216 lists of complete key1’s, derive the low
bytes of key013 through key010.
Once
we have the low bytes of key010, key011, key012,
and key013, we can use our knowledge of the plaintext bytes
to invert the CRC function, since it’s linear, and find the complete internal
state at one point along the encryption.
Once
we have the complete internal state, we can decrypt backwards as far as we
want; we decrypt the ciphertext corresponding to p1 through p5
and filter out wrong keys.
We can break a
file with work equivalent to encrypting around 238 bytes and
negligible memory. We need a total of
thirteen bytes of known plaintext: eight for the attack, and five to filter the
238 keys that remain.
2.1
Minor Improvement in the Amount of Plaintext Required
[BK94] throws away six bits in key17. By using them, we can reduce the plaintext
requirement to twelve bytes at the cost of increasing the work factor by four.
2.2
More Files in the Archive
If we have more than one file in the
archive, we can make the reasonable assumption that they were encrypted with
the same password. “Zippers” encrypt at
least one check byte into every encrypted file to verify that the user entered
the correct password. Once we have the
complete internal state of the cipher, we can run it backwards to the beginning
of the file and read out key0, key1, and key2. Since this state is the same at the
beginning of each file (it only depends on the password), we can decrypt the
check byte in each file and use it to filter with instead of known plaintext
from a single file. This also works if
the files are in different archives, but have the same password.
Since the attack
is so fast, we can afford to guess a couple of stream bytes. If we guess N of them, we get 240 + 8N
keys to filter; this increases the amount of work and the number of additional
files we need, but reduces the amount of known plaintext required from a single
file.
Consider the N=1
case: if the file was created in a zipper with two checksum bytes, we can break
the file with work equivalent to encrypting 248 bytes. We need two checksum bytes followed by only
four more known plaintext bytes in one file, and three other files in the
archive (six check bytes) to filter the 248 possible keys.
If there is only
one checksum byte, we can break the file with the same amount of work, but we
need seven files in the archive and five bytes of known plaintext in addition
to the checksum byte.
3 Divide and Conquer
The limited diffusion of the internal state
prompts us to ask how much of the state we need to guess to process one
byte. If it is small enough, we can
guess it and filter out keys that won’t work with our known stream bytes, then
proceed to the next part.
It turns out that
we can get by with as few as 23 bits.
Note that we don’t need to guess 16 bits of key00 to
calculate the low byte of key01: if we distribute the XOR in
the definition of crc32(), we see that we only need to guess 8 bits of crc32(key00,
0):
LSB(key01) = LSB(crc32(key00,
p1))
= LSB(crc32(key00, 0)) ^ LSB(crctab[p1]).
Now we distribute
the multiplication across the addition in the next step:
MSB(key11) = MSB( (key10
+ LSB(key01)) * 0x08088405 + 1 )
(A) = MSB(LSB(key01) *
0x08088405) +
(B)
MSB(key10 * 0x08088405) + possible carry bit.
We separate the
equation into parts we know (A) and parts we need to guess (B), and find we
need to guess nine bits, including a possible carry bit. Note that since we know the low bits of (LSB(key01)
* 0x08088405), the carry bit will
usually give us more than one bit of information in the form of an upper or
lower bound on the rest of (key10 * 0x08088405) that we
haven’t guessed yet.
Given a stream
byte si+1, we can find sixty four values for bits 2..15 of key2i. It’s easy to see why: fourteen bits of key2i
produce eight bits of si+1, so there are six left over. We can create a table of 256 x 64 bytes such
that given si+1 and bits 10..15 of key2i,
we can look up bits 2..9 of key2i. We call this the preimage table.
We guess bits
10..15 of crc32(key20, 0) and use s2, the
preimage table, and crctab[MSB(key11)] to find bits
2..9 of crc32(key20, 0). We end up with 223 key guesses.
To find the next
part of the internal state, we have to guess about the same amount. This guess
is not illustrated, but we basically guess about eight more bits of information
in each of the three keys. The only complicated
part is separating what we know about key1 from what we don’t.
We guess bits
8..15 of crc32(key00, 0) directly; the next guess
involving key1 is a little more complicated:
MSB(key12) = MSB( (key11 + LSB(key02))
* 0x08088405 + 1 )
= MSB( LSB(key02)
* 0x08088405 ) +
MSB( key11
* 0x08088405 ) + possible carry bit
= MSB( LSB(key02)
* 0x08088405 ) +
MSB( (key10
+ LSB(key01)) * 0xD4652819 ) +
possible carry bit
(A) = MSB( LSB(key02) * 0x08088405 ) +
(A) MSB( LSB(key01)
* 0xD4652819 ) +
(B) MSB( key10 *
0xD4652819 ) + possible carry bit.
Again, (A) is
known and we have to guess (B) nine bits, including a possible carry bit. The carry bit establishes an upper or lower
bound on (key10 * 0xD4652819). We end this filter by guessing bits 16..23 and bits 0..1 of crc32(key20,
0) and calculating a stream byte.
We have guessed 27 more bits, but the output byte has to match s3,
so we expect 223+27-8 = 242 key guesses to pass this
filter.
At this point, we
have guessed 24 bits of crc32(key20, 0) and we know s1. From this we can calculate, on average, one
full value of key20.
There are also only around 213 possibilities for key10
due to the restrictions from the carry bits.
So the third stage consists of guessing bits 16..23 of crc32(key00,
0) and running through the 213 possible values for key10. We expect 242+13+8-8 = 255
key pieces to pass this filter.
Finally, we guess
the last eight bits of key00 and we have a complete internal
state. We will have 263
complete keys to filter with other bytes, whether they are in the archived file
or in checksum bytes in other files.
The cost is approximately the same as encrypting 263 bytes
under the stream cipher. The plaintext
requirement is four bytes total; at least one of these may come from the file’s
own check byte(s).
This is 128 times
faster than guessing three stream bytes and using [BK94].
4 Random Number
Generation
InfoZIP is a cross-platform freeware zipper
distribution. Because the C source code
is readily available and is free, it forms the basis of most non-PKZIP zippers,
including the very popular WinZip and NetZip*.
APPNOTE.TXT does
not specify how to generate the prepended random bytes; it only says that they
are used to scramble the internal state of the cipher and are discarded after
decryption. InfoZIP implements it as
follows:
srand(time(NULL) ^
getpid())
For
each file in the order they are stored,
Generate
ten random bytes by calling rand() ten times and discarding all but the high
eight bits of each return value.
Initialize
the cipher with the password.
Encrypt
the ten random bytes.
Append
the low two bytes of the checksum.
Reinitialize
the cipher with the password and encrypt the twelve-byte header and the
compressed file.
rand()
is usually implemented as a truncated linear congruential generator. WinZip and NetZip use Microsoft Visual C++’s
implementation, which has a 31-bit seed:
unsigned long seed;
void srand( unsigned long s ) { seed = s; }
unsigned short rand()
{
seed = 0x343FD * seed + 0x269EC3;
return ( ( seed >> 16) & 0x7FFF );
}
Let ri,
j be the jth random byte in the ith archived file; i,
j = 1, 2, 3, ... Note that the internal state of the cipher is the same
both times ri, 1 is encrypted. Since XOR is its own inverse, ri, 1 is decrypted
for all i. Also, every r1,
j reveals the high eight bits of the internal state of the random number
generator.
Since rand()
is linear, we can compute two new constants for a generator such that it
outputs every tenth output of the original.
We know the upper eight bits of the generator, so we guess the low 23
bits and start generating every tenth output and comparing them to the leaked
bytes. Four archived files suffice to
uniquely determine the seed that was used in the random number generator, and therefore
every ri,j.
Let us emphasize
that we do not have known plaintext at this point, in the sense that
[BK94] requires, because the plaintext was encrypted twice, and we do not know
the actual values of the stream bytes.
What we can derive is the XOR of the stream bytes in the first
and second encryption.
5 Parallel Divide and
Conquer
We can adapt the divide-and-conquer
algorithm from section 3 to use this information. Once we know the “random” headers, we can exploit the fact that
the internal state was the same at the beginning of each embedded file and
filter guesses with multiple known plaintext bytes in parallel, instead of
being restricted to one byte as in section 3.
Let si,j,k
be the jth stream byte of the kth encryption of the bytes in the ith
archived file; i, j = 1, 2, 3, ...; k = 1, 2. We guess the same 23 bits as in section 3,
but since we don’t know the actual value of s1,2,1, we have
to guess it, too. It is equivalent, and
more convenient, to guess bits 2..9 of crc32(key20, 0). Now we have a prediction for s1,2,1,
and can derive s1,2,2.
We don’t have any information at all about si,1,1,
since it’s the same as si,1,2 and cancels out. We guess it, and check to see that the
second encryption spits out s1,2,2. We have to guess a carry bit for the second encryption, too, so
of the 223+8+8+1 = 240 key guesses, we expect 240-8
= 232 key pieces to pass this filter on this file.
We want to filter
out all but the correct guess at this stage; fortunately, we know that the
state we are trying to guess was the same at the start of each encryption. We have an eight-bit value to filter with in
each file, si,2,1 XOR si,2,2, but we also guess
two carry bits, so with four more files in the archive, we can reduce the
number of false positives to around 232 - 6*4 = 256. Note that we now have ten carry bits
putting restrictions on key10 instead of just one.
We continue to the
next byte of each file. This time we
guess the same 26 bits as in section 3 plus two carry bits, one for each
encryption. With five files, we have 30
bits to filter with. We expect that 28+26+2-30
= 26 = 64 key guesses survive the second stage. Total work for this stage is 28+26+2
= 236 byte encryptions.
At this point we
can derive key20 as before.
Due to all the carry bit restrictions, we only have on the order of 28
possible key10’s. We
guess eight more bits of crc32(key10, 0) and run
through all the remaining key10’s. Since we aren’t guessing carry bits any more, we have 40 bits to
filter with. 26+16-40 <
1, and we expect that only the correct guess survives. Finally, we guess the last eight bits of crc32(key10,
0) and only the correct guess survives.
Experimentally, we
have found that a key guess passes the second stage only if our guess for si,1,1
is correct. This usually occurs about
one quarter of the way through the first 40-bit keyspace. After that, we only try one value for si,1,1
instead of 256 and the rest of the attack takes at most a few minutes.
The work done in
the first stage dwarfs the rest of the work needed. The total work is therefore about the same as encrypting 239
bytes. Cracking a file created with this
kind of weak PRNG usually takes about two hours on a 500 MHz Pentium II. One can then take the three keys and use
[BK94]’s second algorithm to derive a password, if one desires, although the
three keys suffice to decrypt the files.
6 Conclusion
The PKZIP stream cipher is very weak. The deflate algorithm makes it harder
to get plaintext, but in most cases we can reduce the plaintext requirement to
the point where one can guess enough plaintext based on file type and size
alone. The most popular zippers on the
internet are also susceptible to an attack that runs in two hours on a single
PC based on known plaintext provided by the application and independent of the
archived files themselves.
Bibliography
[BK94]
Biham, Eli and Paul Kocher. “A Known Plaintext Attack on the PKZIP Stream
Cipher.” Fast Software Encryption 2, Proceedings of the Leuven Workshop, LNCS
1008, December 1994.
[DL]
http://download.cnet.com/downloads/0,10151,0-10097-
106-0-1-5,00.html?tag=st.dl.10097_106_1.lst.lst&
[IZ] ftp://ftp.freesoftware.com/pub/infozip/
[K92] Kocher, Paul. ZIPCRACK 2.00
Documentation. 1992.
http://www.bokler.com/bokler/zipcrack.txt
[PGP98] User’s Guide, Version 6.0. Network Associates, Inc., 1998. p.145.
http://www.nai.com
[WZ] http://www.winzip.com