The “Encryption” section of the PDF Reference (section 3.5) mentions that when the encryption dictionary entry with a key of /V has a value of 3, then document de/encryption is via “an unpublished algorithm that permits encryption key lengths ranging from 40 to 128 bits.” As far as I can tell, this algorithm is in fact unpublished – by anyone. The closest I could find was a reference to it in one of Dmitri Sklyarov’s 2001 DEFCON slides. Yeah, that Sklyarov, those DEFCON slides. Maybe he described the whole algorithm in his talk, but the DEFCON A/V archives for that year seem to be down. So I sighed, put on my reversing cap, and figured it out.
The standard object-key-derivation algorithm (section 3.5.1, “General Encryption Algorithm”) accepts as inputs the file encryption key, the object number, and the generation number, and produces as out put a key for a symmetric cipher. The “unpublished” algorithm accepts the same inputs and also produces a symmetric cipher key. It presumably could be used with either RC4 or AES as documented for /V values of 1 and 2, although I’ve so far only seen RC4 used.
The unpublished algorithm in use when /V is 3 is as follows (mimicking algorithm 3.5.1):
1. Obtain the object number and generation number from the object identifier of the string or stream to be encrypted. If the string is a direct object, use the identifier of the indirect object containing it. Substitute the object number with the result of exclusive-or-ing it with the hexadecimal value 0x3569AC. Substitute the generation number with the result of exclusive-or-ing it with the hexadecimal value 0xCA96.
2. Treating the substituted object and generation numbers as binary integers, extend the original n-byte encryption key to n + 5 bytes by appending the low-order byte of the object number, the low-order byte of the generation number, the second-lowest byte of the object number, the second-lowest byte of the generation number, and third-lowest byte of the object number in that order, low-order byte first. Extend the encryption key an additional 4 bytes by adding the value "sAlT", which corresponds to the hexadecimal values 0x73, 0x41, 0x6C, 0x54.
3. Initialize the MD5 hash function and pass the result of step 2 as input to this function.
4. Use the first (n + 5) bytes, up to a maximum of 16, of the output from the MD5 hash as the key for the symmetric-key algorithm, along with the string or stream data to be encrypted.
Now hopefully Google will be kind enough to index this in a way that lets other people find it.