Monday, February 23, 2009

Adobe PDF V=3 Encryption

The “Encryption” section of the PDF Reference (section 3.5) mentions that when the encryption dictionary entry with a key of /V has a value of 3, then document de/encryption is via “an unpublished algorithm that permits encryption key lengths ranging from 40 to 128 bits.” As far as I can tell, this algorithm is in fact unpublished – by anyone. The closest I could find was a reference to it in one of Dmitri Sklyarov’s 2001 DEFCON slides. Yeah, that Sklyarov, those DEFCON slides. Maybe he described the whole algorithm in his talk, but the DEFCON A/V archives for that year seem to be down. So I sighed, put on my reversing cap, and figured it out.

The standard object-key-derivation algorithm (section 3.5.1, “General Encryption Algorithm”) accepts as inputs the file encryption key, the object number, and the generation number, and produces as output a key for a symmetric cipher. The “unpublished” algorithm accepts the same inputs and also produces a symmetric cipher key. It presumably could be used with either RC4 or AES as documented for /V values of 1 and 2, although I’ve so far only seen RC4 used.
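For comparison, the standard derivation can be sketched in Python roughly as follows. This is a minimal sketch based on a reading of Algorithm 3.1 in the PDF Reference, not code taken from any Adobe product; the function name `object_key_standard` and the `aes` flag are illustrative assumptions.

```python
import hashlib
import struct


def object_key_standard(file_key: bytes, objnum: int, gennum: int,
                        aes: bool = False) -> bytes:
    """Sketch of the standard per-object key derivation (Algorithm 3.1).

    file_key is the n-byte file encryption key; objnum and gennum come
    from the identifier of the indirect object being processed.
    """
    # Append the low-order 3 bytes of the object number and the low-order
    # 2 bytes of the generation number, each low-order byte first.
    material = file_key
    material += struct.pack("<L", objnum)[:3]
    material += struct.pack("<L", gennum)[:2]
    # The standard algorithm adds the "sAlT" suffix only for AES.
    if aes:
        material += b"sAlT"
    digest = hashlib.md5(material).digest()
    # Keep the first n + 5 bytes of the hash, capped at 16.
    return digest[:min(len(file_key) + 5, 16)]
```

With a 5-byte (40-bit) file key this yields a 10-byte object key; with a 16-byte (128-bit) file key the result is capped at the full 16-byte MD5 digest.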

The unpublished algorithm in use when /V is 3 is as follows (mimicking algorithm 3.5.1):

1. Obtain the object number and generation number from the object identifier of the string or stream to be encrypted. If the string is a direct object, use the identifier of the indirect object containing it. Substitute the object number with the result of exclusive-or-ing it with the hexadecimal value 0x3569AC. Substitute the generation number with the result of exclusive-or-ing it with the hexadecimal value 0xCA96.

2. Treating the substituted object and generation numbers as binary integers, extend the original n-byte encryption key to n + 5 bytes by appending, in this order: the low-order byte of the object number, the low-order byte of the generation number, the second-lowest byte of the object number, the second-lowest byte of the generation number, and the third-lowest byte of the object number. Then extend the encryption key by an additional 4 bytes by appending the value "sAlT", which corresponds to the hexadecimal values 0x73, 0x41, 0x6C, 0x54.

3. Initialize the MD5 hash function and pass the result of step 2 as input to this function.

4. Use the first (n + 5) bytes, up to a maximum of 16, of the output from the MD5 hash as the key for the symmetric-key algorithm, along with the string or stream data to be encrypted.
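The four steps above can be sketched in Python as follows. This is a minimal illustration of the steps as written, not a tested drop-in for any particular tool; the function name `object_key_v3` and its parameter names are assumptions made for the example.

```python
import hashlib
import struct


def object_key_v3(file_key: bytes, objnum: int, gennum: int) -> bytes:
    """Sketch of the /V 3 per-object key derivation described above."""
    # Step 1: mask the object and generation numbers with the fixed
    # constants, then lay them out as little-endian 32-bit integers so
    # that index 0 is the low-order byte.
    obj = struct.pack("<L", objnum ^ 0x3569AC)
    gen = struct.pack("<L", gennum ^ 0xCA96)

    # Step 2: append five interleaved bytes (obj0, gen0, obj1, gen1, obj2)
    # followed by the four bytes of "sAlT".
    material = file_key
    material += bytes([obj[0], gen[0], obj[1], gen[1], obj[2]])
    material += b"sAlT"

    # Step 3: hash the extended key with MD5.
    digest = hashlib.md5(material).digest()

    # Step 4: keep the first n + 5 bytes of the digest, at most 16.
    return digest[:min(len(file_key) + 5, 16)]


if __name__ == "__main__":
    # Hypothetical inputs: an all-zero 40-bit file key, object 1, generation 0.
    print(object_key_v3(b"\x00" * 5, 1, 0).hex())
```

The returned bytes would then be used as the RC4 (or other symmetric cipher) key for that object's strings and streams.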

Now hopefully Google will be kind enough to index this in a way that lets other people find it.

Wednesday, February 18, 2009

Circumventing Adobe ADEPT DRM for EPUB

By way of a concrete reverse-engineering contribution, I have successfully circumvented Adobe's ADEPT DRM scheme for EPUB files. The same circumvention probably also allows decryption of ADEPT-encrypted PDF files, although I haven't looked into it yet.

ADEPT is pretty close to faultless as a crypto system -- a per-user RSA key encrypts a per-book AES key which encrypts the content. It uses AES in CBC mode with a random IV. It uses RSA with PKCS#1 v1.5 padding, which is perfectly adequate for this case. Unfortunately for Adobe, this isn't a crypto system, but a DRM system. DRM systems ultimately depend not on the strength of their cryptography, but the complexity of their obfuscation. There is very little obfuscation in how Adobe Digital Editions hides and encrypts the per-user RSA key, allowing fairly simple duplication of exactly the same process Digital Editions uses to retrieve it.

In practical terms, this breaks ADEPT circumvention into two components: key retrieval and decryption. Key retrieval depends only on the details of Digital Editions and can change seamlessly with an update to the same. Decryption, however, is a property of the architecture of the system as a whole. Preventing decryption with previously retrieved keys would require changes to both DE and Adobe Content Server, and would take quite some time to propagate to all ACS customers. The upshot is that if you want to decrypt ADEPT books in the future, grab your key now -- there are no guarantees that you'll still be able to retrieve it later, but a previously retrieved key should keep on working.

Here are the scripts:

Key-retrieval script: ineptkey (version 5)

Decryption script: ineptepub (version 5.2)

To use, install Python 2.6 (and, on Windows, PyCrypto), run the key-retrieval script, then run the decryption script using the retrieved key.

And on a preachy note, please don't be a jerk with these. DRM is bad, but piracy is wrong, kids, and only validates the opinions of those who think they need DRM in the first place.

Edit:
Script links will change to reflect dropped pastebins and new versions.

Tuesday, February 17, 2009

“Not for Distribution”

Don’t you love it when documents marked “Not for Distribution” and “<Company> Confidential Information” end up indexed by Google? Even better is when those documents are served up by the company in question. And so for now we have the following for Adobe Content Server 4: the Quick Start Guide, User Manual, and Technical Reference Manual. Mostly sausage factory stuff, but there's some helpful-looking info on ADEPT.

And we are live in 3... 2...

This is still another reverse engineering blog. Watch this space for updates...