I’ve managed to figure out the Barnes & Noble user EPUB AES key derivation algorithm. Rather than fragment things too much, I’ve just edited my previous post on B&N EPUBs and added the (for any platform) key-derivation script.
The algorithm is interesting mostly in that it isn’t very good. I mentioned in my post analyzing Adobe’s ADEPT system that ADEPT is a fairly well-designed cryptographic system, even if not a very effective DRM system. Adobe used all standard algorithms applied in standard ways and glued together with standard plumbing. Barnes & Noble’s key-derivation algorithm, in contrast, neither adds much obfuscation nor shows much awareness of cryptographic standards.
Cryptography is hard. Do you know how different configurations of your chosen algorithms can make them more or less resistant to side-channel attacks? Do you understand why RSA Laboratories switched the recommended RSA padding scheme in PKCS#1 to OAEP? Do you know the implications of different block cipher modes of operation and why you should or should not choose a mode which includes integrity protection? These questions represent a random sampler platter of things you should just know in your bones before you do anything close to designing a cryptosystem. For everyone else – a category in which I include myself – there are well-reviewed, well-tested algorithms presented in standards documents published by such organizations as RSA Laboratories and the IETF.
For key derivation from non-random source data, there is in particular the PBKDF2 (Password-Based Key Derivation Function) published in PKCS#5. My point is not that the key-derivation algorithm used by B&N necessarily has cryptographically undesirable properties (although its computation speed makes brute-force attacks more plausible), but that the fact that they rolled their own instead of reaching for PBKDF2 shows a marked lack of cryptographic savvy.
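For comparison, here is a minimal sketch of what reaching for PBKDF2 might look like, using the `hashlib.pbkdf2_hmac` function from the Python 3 standard library. This is not B&N's algorithm and not code from ignoblekeygen; the function name `derive_key`, the random salt, the iteration count, and the 16-byte output length are all assumptions chosen for illustration.

```python
import hashlib
import os


def derive_key(secret: bytes, salt: bytes,
               iterations: int = 100_000, length: int = 16) -> bytes:
    """Derive a fixed-length key from low-entropy input with PBKDF2-HMAC-SHA1.

    The iteration count is the knob that makes each brute-force guess
    expensive; the default here is an illustrative assumption.
    """
    return hashlib.pbkdf2_hmac("sha1", secret, salt, iterations, dklen=length)


if __name__ == "__main__":
    # Placeholder inputs, obviously not real purchaser data.
    salt = os.urandom(16)
    secret = "Jane Doe|0000000000000000".encode("utf-8")
    print(derive_key(secret, salt).hex())
```

The point of the iteration count is exactly the weakness noted above: a fast derivation makes guessing cheap, while a deliberately slow one raises the cost of every guess.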
On the DRM-as-obfuscation end this algorithm similarly comes up short. The multiple steps involved demonstrate some level of attempt to introduce complexity, but in the resulting binary all those steps are in one place. The entire algorithm boils down to one function calling a handful of easily-labeled subroutines (sha1_wrapper, normalize_name, aes_encrypt, etc).
I really am starting to think that e-book format providers actively want their DRM schemes to be broken.
Tuesday, December 22, 2009
Sunday, December 20, 2009
Circumventing Barnes & Noble DRM for EPUB
In a move sure to leave consumers scratching their heads (especially the ones already wondering why they bought a Nook), Barnes & Noble has decided to implement their own DRM scheme for EPUB books. They partnered with Adobe to do it (it’s a variant of their ADEPT scheme), and all the Adobe SDK users will get access to it eventually. But for now much hilarity ensues as consumers buy books they can’t read on their devices.
The basic idea behind the B&N EPUB scheme is the same as that used by the ill-fated eReader format B&N acquired – step 1: generate an encryption key from the book-purchaser’s name + credit card #; step 2: hope that they don’t like giving that information out to strangers. They like to call this a form of “social DRM,” although I believe a more appropriate term is “silly.”
It would be very bad form for an application to keep user CC#s just sitting around on disk, so the Windows version of the Barnes & Noble Desktop Reader application (BDReader) just holds on to the generated key and not the source info. A wise decision, for which I congratulate them. It then stores this key in plain text in a sqlite3 database. An... interesting... decision, for which I thank them. Update: and then it turned out the key-generation algorithm was pretty easy too...
So now three scripts:
A Windows-only key-retrieval script: ignoblekey (version 2)
An any-platform key-generation script: ignoblekeygen (version 1)
And an any-platform book-decryption script: ignobleepub (version 1)
You need the decryption script and one of either the key-retrieval or key-generation scripts. They produce the same output, and the key-generator works on any platform, but I released the Windows key-retrieval script first and will leave it up for Windows users who’d rather not give their credit card number to random programs they download off the Internet (despite being a source-visible script and all).
For good only, please.
Thursday, December 17, 2009
Circumventing Kindle For PC DRM (updated)
Amazon actually put a bit of effort behind the DRM obfuscation in their Kindle for PC application (K4PC). The Kindle proper and Kindle for iPhone/iPod app both use a single "device" encryption key for all DRMed content. K4PC uses the same encryption algorithms, but ups the ante with a per-book session key for the actual en/decryption. And they seem to have done a reasonable job on the obfuscation. Way to go Amazon! It's good enough that I got bored unwinding it all and just got lazy with the Windows debugging APIs instead.
So here you go: unswindle v7 (previous versions: v6 v5 v4 v3).
You'll also need a copy of darkreverser's mobidedrm (check the most recent comments for the newest links).
Put those kids together (in the same directory) and run unswindle.pyw. It launches KindleForPC.exe. Pick the book you want to decrypt. Close KindleForPC. Pick your output file. And enjoy the sweet taste of freedom.
Script name in honor of rms and The Right to Read. Don't use this to steal, or I'm taking my toys and going home.
Updates. It came to my attention that unswindle version 1 did not work if KindleForPC was installed as a non-administrator and did not work on versions of Windows other than XP. Version 2 should fix these issues. Version 3 fixes an intermittent path-getting issue. Version 4 fixes an exception related to opening thread handles, detects Topaz format books, and detects that you have the proper version of Kindle For PC installed. Version 5 works with the new (20091222) version of the K4PC executable. Version 6 cleanly handles already DRM-free files.
Update 2009-12-22. Amazon has demonstrated that they (unlike Adobe) take their DRM seriously: they've already pushed out a new version of K4PC which breaks this particular script. As you can clearly see via their SHA-1 hashes:
fd386003520f7af7a15d77fcc2b859dd53e44bc1 KindleForPC-installer-20091217.exe
13a816a3abf7a71e7b6a55228099b03b1dc3789b KindleForPC-installer-20091222.exe
The application doesn't seem to auto-update, so if you can find a copy of the original installer you should be fine. Otherwise you'll have to hang tight. Newest unswindle version detects if you have the wrong K4PC executable installed.
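If you want to check which installer build you have on hand, something like the following Python 3 sketch computes a file's SHA-1 and compares it against the two hashes listed here. It is not part of unswindle; the names `KNOWN_INSTALLERS`, `sha1_of_stream`, and `identify_installer` are made up for this example.

```python
import hashlib

# SHA-1 hashes of the two installer builds quoted in this post.
KNOWN_INSTALLERS = {
    "fd386003520f7af7a15d77fcc2b859dd53e44bc1": "KindleForPC-installer-20091217.exe",
    "13a816a3abf7a71e7b6a55228099b03b1dc3789b": "KindleForPC-installer-20091222.exe",
}


def sha1_of_stream(stream, chunk_size=65536):
    """Return the hex SHA-1 of a binary file-like object, read in chunks."""
    digest = hashlib.sha1()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def identify_installer(path):
    """Return the known installer name matching the file's hash, or None."""
    with open(path, "rb") as f:
        return KNOWN_INSTALLERS.get(sha1_of_stream(f))
```

Calling `identify_installer` on a downloaded installer returns the matching build name, or `None` if the file is neither of the two builds.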
Update 2009-12-22 (2). The K4PC update may not actually have been targeted at unswindle, as Amazon seems to have done nothing in particular to make the basic approach more difficult. In any case, I've updated unswindle to handle the 20091222 version of the executable. We'll see if Amazon throws out another new build in short order, and I'll put some more elbow grease into figuring out the PID-generation algorithm.
Wednesday, March 4, 2009
No Free Speech for You
Well, that didn't take very long:
Hello,
Blogger has been notified, according to the terms of the Digital Millennium Copyright Act (DMCA), that content in your blog:
i-u2665-cabbages.blogspot.com
allegedly infringes upon the copyrights of others. The content in question is located in the following posts:
http://i-u2665-cabbages.blogspot.com/2009/02/circumventing-adobe-adept-drm-for-pdf.html
The notice that we received, with any personally identifying information removed, will be posted online by a service called Chilling Effects at http://www.chillingeffects.org/notice.cgi?sID=9961 . We do this in accordance with the Digital Millennium Copyright Act (DMCA). Please note that it may take Chilling Effects up to several weeks to post the notice online at the link provided.
The DMCA is a United States copyright law that provides guidelines for online service provider liability in case of copyright infringement. Please see http://www.educause.edu/Browse/645?PARENT_ID=254 for more information about the DMCA, and see http://www.google.com/blogger_dmca.html for the process that Blogger requires in order to make a DMCA complaint.
We are asking that you please remove the allegedly infringing content in your blog. If you do not do this within the next 3 days (by 3/5/09), we will be forced to remove the posts in question. If we did not do so, we would be subject to a claim of copyright infringement, regardless of its merits.
We can reinstate this content into your blog upon receipt of a counter notification pursuant to sections 512(g)(2) and (3) of the DMCA. For more information about the requirements of a counter notification and a link to a sample counter notification, see http://www.google.com/blogger_dmca.html#counter .
Please note that repeated violations to our Terms of Service may result in further remedial action taken against your Blogger account.
If you have legal questions about this notification, you should retain your own legal counsel. If you have any other questions about this notification, please let us know.
Thank you for your understanding.
Sincerely,
The Blogger Team
Any guesses as to why only the PDF decryption tool and not the EPUB tool? Especially what with the actual break and all being extracting the key. Given the key and the unpublished V=3 PDF key-generation algorithm, PDF decryption is a simple matter of programming to a spec.
In any case, I'll be arranging a different way of hosting these tools.
Monday, February 23, 2009
Adobe PDF V=3 Encryption
The “Encryption” section of the PDF Reference (section 3.5) mentions that when the encryption dictionary entry with a key of /V has a value of 3, then document de/encryption is via “an unpublished algorithm that permits encryption key lengths ranging from 40 to 128 bits.” As far as I can tell, this algorithm is in fact unpublished – by anyone. The closest I could find was a reference to it in one of Dmitri Sklyarov’s 2001 DEFCON slides. Yeah, that Sklyarov, those DEFCON slides. Maybe he described the whole algorithm in his talk, but the DEFCON A/V archives for that year seem to be down. So I sighed, put on my reversing cap, and figured it out.
The standard object-key-derivation algorithm (section 3.5.1, “General Encryption Algorithm”) accepts as inputs the file encryption key, the object number, and the generation number, and produces as output a key for a symmetric cipher. The “unpublished” algorithm accepts the same inputs and also produces a symmetric cipher key. It presumably could be used with either RC4 or AES as documented for /V values of 1 and 2, although I’ve so far only seen RC4 used.
The unpublished algorithm in use when /V is 3 is as follows (mimicking algorithm 3.5.1):
1. Obtain the object number and generation number from the object identifier of the string or stream to be encrypted. If the string is a direct object, use the identifier of the indirect object containing it. Substitute the object number with the result of exclusive-or-ing it with the hexadecimal value 0x3569AC. Substitute the generation number with the result of exclusive-or-ing it with the hexadecimal value 0xCA96.
2. Treating the substituted object and generation numbers as binary integers, extend the original n-byte encryption key to n + 5 bytes by appending the low-order byte of the object number, the low-order byte of the generation number, the second-lowest byte of the object number, the second-lowest byte of the generation number, and the third-lowest byte of the object number, in that order. Extend the encryption key an additional 4 bytes by adding the value "sAlT", which corresponds to the hexadecimal values 0x73, 0x41, 0x6C, 0x54.
3. Initialize the MD5 hash function and pass the result of step 2 as input to this function.
4. Use the first (n + 5) bytes, up to a maximum of 16, of the output from the MD5 hash as the key for the symmetric-key algorithm, along with the string or stream data to be encrypted.
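The four steps above can be sketched in Python roughly as follows. This is an illustrative reading of the description, not code lifted from any of the scripts; the function names `v3_key_extension` and `v3_object_key` are invented for the example, and it targets Python 3 rather than Python 2.6.

```python
import hashlib


def v3_key_extension(objnum: int, gennum: int) -> bytes:
    """Steps 1-2: XOR-substitute the numbers, then pick out five bytes."""
    objnum ^= 0x3569AC
    gennum ^= 0xCA96
    return bytes([
        objnum & 0xFF,          # low-order byte of the object number
        gennum & 0xFF,          # low-order byte of the generation number
        (objnum >> 8) & 0xFF,   # second-lowest byte of the object number
        (gennum >> 8) & 0xFF,   # second-lowest byte of the generation number
        (objnum >> 16) & 0xFF,  # third-lowest byte of the object number
    ])


def v3_object_key(file_key: bytes, objnum: int, gennum: int) -> bytes:
    """Steps 2-4: extend the key, append "sAlT", hash, and truncate."""
    material = file_key + v3_key_extension(objnum, gennum) + b"sAlT"
    digest = hashlib.md5(material).digest()
    # Keep the first n + 5 bytes of the digest, capped at 16.
    return digest[:min(len(file_key) + 5, 16)]
```

The result would then be used as the key for the symmetric cipher (RC4 in every file I have seen so far) applied to the string or stream data.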
Now hopefully Google will be kind enough to index this in a way that lets other people find it.
Wednesday, February 18, 2009
Circumventing Adobe ADEPT DRM for EPUB
By way of a concrete reverse-engineering contribution, I have successfully circumvented Adobe's ADEPT DRM scheme for EPUB files. The same circumvention probably also allows decryption of ADEPT-encrypted PDF files, although I haven't looked into it yet.
ADEPT is pretty close to faultless as a crypto system -- a per-user RSA key encrypts a per-book AES key which encrypts the content. It uses AES in CBC mode with a random IV. It uses RSA with PKCS#1 v1.5 padding, which is perfectly adequate for this case. Unfortunately for Adobe, this isn't a crypto system, but a DRM system. DRM systems ultimately depend not on the strength of their cryptography, but the complexity of their obfuscation. There is very little obfuscation in how Adobe Digital Editions hides and encrypts the per-user RSA key, allowing fairly simple duplication of exactly the same process Digital Editions uses to retrieve it.
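To make the key-wrapping layer concrete, here is a rough Python 3 sketch of recovering a per-book key from an RSA-wrapped blob with PKCS#1 v1.5 encryption padding, using only built-in integer arithmetic. It is not taken from ineptepub; the function names are hypothetical, and a real implementation would use a vetted library rather than raw `pow`.

```python
def pkcs1_v15_unpad(block: bytes) -> bytes:
    """Strip PKCS#1 v1.5 encryption padding.

    Expected layout: 0x00 || 0x02 || PS (at least 8 nonzero bytes) || 0x00 || M
    """
    if len(block) < 11 or block[:2] != b"\x00\x02":
        raise ValueError("not a PKCS#1 v1.5 encryption block")
    sep = block.find(b"\x00", 2)
    if sep < 10:  # also covers find() returning -1 (no separator)
        raise ValueError("padding string too short or separator missing")
    return block[sep + 1:]


def rsa_unwrap_key(wrapped: bytes, d: int, n: int) -> bytes:
    """Decrypt an RSA-wrapped key with private exponent d and modulus n."""
    k = (n.bit_length() + 7) // 8  # modulus length in bytes
    m = pow(int.from_bytes(wrapped, "big"), d, n)
    return pkcs1_v15_unpad(m.to_bytes(k, "big"))
```

The unwrapped bytes would be the per-book AES key, which then decrypts the content in CBC mode; that last step needs an AES implementation such as PyCrypto and is left out here.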
In practical terms, this breaks ADEPT circumvention into two components: key retrieval and decryption. Key retrieval depends only on the details of Digital Editions and can change seamlessly with an update to the same. Decryption, however, is a property of the architecture of the system as a whole. Preventing decryption with previously retrieved keys would require changes to both DE and Adobe Content Server and would take quite some time to propagate to all ACS customers. The upshot is that if you want to decrypt ADEPT books in the future, grab your key now -- no guarantees that you'll be able to do so in the future, but a previously-retrieved key should keep on working.
Here are the scripts:
Key-retrieval script: ineptkey (version 5)
Decryption script: ineptepub (version 5.2)
To use, install Python 2.6 (and on Windows PyCrypto), run the key-retrieval script, then run the decryption script using the retrieved key.
And on a preachy note, please don't be a jerk with these. DRM is bad, but piracy is wrong, kids, and only validates the opinions of those who think they need DRM in the first place.
Edit: script links will change to reflect dropped pastebins and new versions.
Tuesday, February 17, 2009
“Not for Distribution”
Don’t you love it when documents marked “Not for Distribution” and “<Company> Confidential Information” end up indexed by Google? Even better is when those documents are served up by the company in question. And so for now we have the following for Adobe Content Server 4: the Quick Start Guide, User Manual, and Technical Reference Manual. Mostly sausage factory stuff, but there's some helpful-looking info on ADEPT.
And we are live in 3... 2...
This is still another reverse engineering blog. Watch this space for updates...