...allows eBook subscribers to access their titles on an iPad
I had this...
Browser view
This is what the web version of the app looks like
The eBook iPad app isn't much different
I wanted this...
Which is to say:
I just wanted a PDF that didn't require using LearnCo's kludgy software
So here's how I got it!
No, this isn't illegal... But it is for educational use only!
LearnCo's Legal Notice
You are authorized to download one copy of the material on this Web site on one computer for your personal, non-commercial use only. In doing so, you may not remove or in any way alter any trademark, copyright, or other proprietary notice. Except as allowed in the preceding sentence, you may not modify, copy, distribute, republish, commercially exploit, or upload any of the material on the Web site without the prior written consent of [LearnCo].
What about this presentation?
There is nothing illegal about sharing information
...and no, I won't give you the book. I bought it, so should you
Is it ethical?
There is nothing unethical about sharing information
What you do with this information, if anything, is entirely your responsibility
Step 1: Analyze
Early assumptions:
LearnCo probably didn't invent their own file format
The files are probably already in PDF format
The iPad app would be fairly easy to attack
Most of what hackers do...
Is analyze stuff
We observe behavior, we pay attention to interaction...
Like countless other professions, we look for opportunities to exploit
iPad backups are a great place to start
It's a snapshot of your device
Files are heavily obfuscated, but not necessarily encrypted
If you know how to look (with the right tools) you can unlock many secrets
WebKit!
WebKit is the engine that powers Safari
This tells me the app is really just a glorified website
Step 2: Observe
Eavesdropping on your own devices is easy
Eavesdropping?
Knowing that the eBook app is just a website under the covers reveals some things:
Websites use 'HTTP', which is a relatively simple protocol to understand
HTTP uses an atomic 'Request' → 'Response' model
Everything the app does should be observable
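To make the request → response idea concrete, here is a minimal Python sketch that takes one raw HTTP/1.x request apart. The host, path, and User-Agent in the sample are made-up placeholders, not values from LearnCo's app.

```python
def parse_http_request(raw: bytes):
    """Split a raw HTTP/1.x request into (method, target, version, headers)."""
    # The head ends at the first blank line; anything after it is the body.
    head, _, _body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers


# Hypothetical request, shaped like what a WebKit-based app might send.
SAMPLE_REQUEST = (
    b"GET /books/example/page-0001.pdf HTTP/1.1\r\n"
    b"Host: ebooks.example.com\r\n"
    b"User-Agent: ExampleReader/1.0\r\n"
    b"Accept: */*\r\n"
    b"\r\n"
)

if __name__ == "__main__":
    method, target, version, headers = parse_http_request(SAMPLE_REQUEST)
    print(method, target, version)
    print("http://" + headers["host"] + target)
```

Each request is self-describing plain text, which is why watching the wire is enough to learn what an app like this asks for.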
Enable "Internet Sharing" (no encryption) on a PC
Provided for reference
This is just how you do it on a Mac
This step isn't really necessary, except that I wanted a low-noise environment
You can actually 'snoop' on public networks, even WiFi
But that would require an annoying decryption step that was easily avoided
Tell the iPad to use that network
This means that all data exchanged between the iPad and the Internet passes through the PC
Run a packet capture
tcpdump
One of the most powerful network diagnostic tools in existence
Allows you to make a recording of the network "frames" exchanged between two or more devices
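A capture is typically started with something like `tcpdump -i en1 -s 0 -w capture.pcap` (the interface name `en1` and the file name are assumptions; they depend on your machine). As a rough sketch of what that "recording" contains, the Python below walks the records of a classic pcap file. It assumes the classic format with microsecond timestamps, does not handle pcapng, and builds a tiny fake capture in memory purely for demonstration.

```python
import io
import struct

PCAP_MAGIC = 0xA1B2C3D4  # classic pcap, microsecond timestamps


def read_pcap_frames(stream):
    """Yield (timestamp_seconds, frame_bytes) for each record in a classic pcap stream."""
    global_header = stream.read(24)
    if len(global_header) < 24:
        raise ValueError("truncated pcap global header")
    # The magic number reveals the byte order the file was written in.
    if struct.unpack("<I", global_header[:4])[0] == PCAP_MAGIC:
        endian = "<"
    elif struct.unpack(">I", global_header[:4])[0] == PCAP_MAGIC:
        endian = ">"
    else:
        raise ValueError("not a classic pcap file (pcapng is not handled here)")
    while True:
        record_header = stream.read(16)
        if len(record_header) < 16:
            return  # end of capture
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(endian + "IIII", record_header)
        frame = stream.read(incl_len)
        if len(frame) < incl_len:
            return  # truncated final record
        yield ts_sec + ts_usec / 1_000_000, frame


def build_demo_capture(frames):
    """Build an in-memory pcap holding the given frames (demonstration only)."""
    out = io.BytesIO()
    # magic, version 2.4, thiszone, sigfigs, snaplen, link type 1 (Ethernet)
    out.write(struct.pack("<IHHiIII", PCAP_MAGIC, 2, 4, 0, 0, 65535, 1))
    for second, frame in enumerate(frames):
        out.write(struct.pack("<IIII", second, 0, len(frame), len(frame)))
        out.write(frame)
    out.seek(0)
    return out


if __name__ == "__main__":
    for timestamp, frame in read_pcap_frames(build_demo_capture([b"frame one", b"frame two"])):
        print(timestamp, len(frame))
```

With a real file you would pass `open("capture.pcap", "rb")` instead of the demo stream. Real frames still carry Ethernet, IP, and TCP headers in front of any HTTP text.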
Download stuff from the app
Inside the packet capture...
HTTP headers
This is a fragment of an HTTP transaction
It shows the eBook application asking the LearnCo server for one page of the book
...which is a PDF file, like I thought!
Step 3: Evaluate
The packet capture is a "recording" of the data exchanged between the eBook app and the LearnCo server
It has a copy of each request that was sent by the app
It also has a copy of each response sent by the server
Option 1: Extract the data
I could reassemble the entire book, page by page, out of the responses from LearnCo that were sent to my iPad
Pros: wouldn't have to download anything again
Cons: keeping stuff in order might be harder
Forensic tools
I considered using a forensic tool designed to extract data from packet captures
Unfortunately, the app asks for one page at a time, and the files have 'obfuscated names'
Extracting the files and renaming them in the right order would have been tedious
Option 2: Extract the URLs and re-download
I could re-download the book one page at a time and save the pages in order
This is how I did it
Re-download?
Turns out, it's pretty easy to get just the list of URLs
And it's also easy to fetch them in order and name them correctly as I go
Since there was no security or 'cookie' to keep track of, it was very easy
Also makes for a better example
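Getting "just the list of URLs" can be sketched roughly as follows. This assumes the HTTP request payloads have already been pulled out of the capture as byte strings, that the traffic is plain HTTP, and that page requests end in `.pdf`. That suffix filter and the example host are my assumptions, not details from the actual capture.

```python
import re

REQUEST_LINE = re.compile(rb"^GET (\S+) HTTP/1\.[01]\r\n", re.MULTILINE)
HOST_HEADER = re.compile(rb"^Host:[ \t]*(\S+)\r\n", re.MULTILINE | re.IGNORECASE)


def extract_pdf_urls(payloads, suffix=".pdf"):
    """Return matching URLs in first-seen order from an iterable of raw request payloads."""
    urls = []
    seen = set()
    for payload in payloads:
        request = REQUEST_LINE.search(payload)
        host = HOST_HEADER.search(payload)
        if not request or not host:
            continue  # not an HTTP GET, or only a partial segment
        path = request.group(1).decode("ascii", "replace")
        # Ignore any query string when checking the file suffix.
        if not path.split("?", 1)[0].lower().endswith(suffix):
            continue
        url = "http://" + host.group(1).decode("ascii", "replace") + path
        if url not in seen:  # keep the first occurrence, drop repeats
            seen.add(url)
            urls.append(url)
    return urls
```

Keeping first-seen order matters here: if the pages were requested in reading order, the position in the list becomes the page number, so the obfuscated file names stop being a problem.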
Step 4: Hack
The goal here is to:
Generate a list of pages, in order
Download each page one by one
Assemble them into a single PDF
Optimize the file if necessary
Wireshark makes this part easy
Wireshark
A graphical packet-capture analysis tool
I don't usually do GUIs, but it made for a nice screenshot
A little script to automate downloading
#!bash
This is a 'script' or 'very small program'
It runs through a list of URLs, downloads every file, and saves them with sequential numbers
Note that it's pretending to be the eBook app (a trivial lie)...
...but that it never logs in!
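The original script was bash and is not reproduced in this text. As a stand-in, here is a hedged Python sketch of the same idea: walk a list of URLs, send the app's User-Agent, and save each response under a sequential name. The User-Agent string and file-name pattern are hypothetical placeholders.

```python
import urllib.request
from pathlib import Path

# Hypothetical value: the real app's User-Agent string would come from the capture.
USER_AGENT = "ExampleReader/1.0"


def page_filename(index, width=4):
    """Zero-padded names sort correctly: page-0001.pdf, page-0002.pdf, ..."""
    return f"page-{index:0{width}d}.pdf"


def fetch(url):
    """Download one URL while presenting the app's User-Agent; no login, no cookies."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()


def download_all(urls, out_dir, fetch=fetch):
    """Fetch every URL in order and save each one under a sequential file name."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, url in enumerate(urls, start=1):
        target = out_dir / page_filename(index)
        target.write_bytes(fetch(url))
        saved.append(target)
    return saved
```

The `fetch` parameter exists only so the loop can be exercised without touching the network. Nothing in the loop authenticates, which matches the observation that the script never logs in.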
Downloads in progress
The result
Pages
I've now got each page, in order, as a separate PDF
Which I assembled into chunks, then all together
Which we can easily assemble
Two things here
'pdftk' splices the 1000+ individual PDFs together into one
'gs' compresses the file to a more portable size
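The two commands can be sketched as below. The exact options used in the talk are not shown in this text, so the Ghostscript flags (in particular the `/ebook` preset) and all file names are assumptions. Both tools must be installed for `build_book` to run.

```python
import subprocess


def chunked(items, size):
    """Split a long file list into smaller batches to keep command lines short."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def pdftk_merge_command(page_files, merged_path):
    """pdftk syntax: pdftk <inputs...> cat output <merged.pdf>"""
    return ["pdftk", *map(str, page_files), "cat", "output", str(merged_path)]


def gs_compress_command(merged_path, compressed_path, preset="/ebook"):
    """Ghostscript rewrites the PDF; the preset trades image quality for size."""
    return [
        "gs",
        "-sDEVICE=pdfwrite",
        f"-dPDFSETTINGS={preset}",
        "-dNOPAUSE",
        "-dBATCH",
        "-dQUIET",
        f"-sOutputFile={compressed_path}",
        str(merged_path),
    ]


def build_book(page_files, merged_path, compressed_path):
    """Run both steps; requires pdftk and gs to be installed and on PATH."""
    subprocess.run(pdftk_merge_command(page_files, merged_path), check=True)
    subprocess.run(gs_compress_command(merged_path, compressed_path), check=True)
```

With over a thousand inputs, merging in batches via `chunked` and then merging the batch outputs mirrors the "chunks, then all together" approach.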
Producing a nice convenient file
That I can read at my leisure
Step 5: Realize
So, that's nice, you turned your finance book into a PDF
Big whoop
Right?
Vulnerability Assessment
Unrestricted PDF
Print Quality
Security Through Obscurity
Unrestricted PDF
A bad guy could redistribute this file with ease
It's untraceable... there's no watermark
Print Quality
The individual PDF pages are print quality
This is what I'd send off to a printing press for manufacturing
Security Through Obscurity
Page downloads require no authorization
The list of pages in each book is readily available
The catalog of books is also available...
Ergo:
None of their publications are safe from theft!
Implications
Why it matters...
LearnCo
Is a multi-billion dollar public MNC
Values its reputation
(Lots of Goodwill on the Balance Sheet)
From their 2012 annual report:
Are they under-appreciating risk?
Diagnostic
Speculation based on 15+ years of industry experience
How it happens
Fast. Cheap. Robust.
Pick two
It's called the "Iron Triangle," and it bit them in the ass
Exposures
Usually created as a result of lack of understanding
Rapid Application Development (R.A.D.) focuses on 'fast' (quick to market)
NPV/IRR project assessments tend to over-emphasize cost reduction to produce value
Which means, of course
Many businesses will sacrifice 'robustness' in favor of a project that can be completed faster and cheaper
Because
Our financial system incentivises leaders to think about the short term
The Internet is fickle. One day you're in...
Again...
These are just my own observations!
This exposure
Probably a lack of sufficient governance/oversight
Probably not "outsourced to the cheapest bidder" (LPTA problem)
May represent a fundamental cultural problem with their IT
LearnCo
Doesn't discuss IT or DRM in their disclosures except to say they think they're important
Has 100+ open IT job reqs (2013Q2)
Specific technical details
Is a result of relying, for security, on a commonly used 'app development model' that has absolutely no security/DRM protection facility
Conclusion
LearnCo is not adequately protecting their I.P.
And they probably don't know it
Why you can't have nice things...
This illustrates why security experts like to say "it's not if you get hacked, it's when."