Textbook Hacking

A lesson about appreciating risk from technology
in today's business world.

2013-04-25

By Chris Niemira for BUSI621 at R.H. Smith School of Business, University of Maryland

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Note: Names have been changed and some images altered or redacted to protect the ignorant.

Story Time with Chris

  • A demonstration of "grey hat" hacking
  • An example of DRM failure
  • A discussion of real world implications

What is a hacker?

  • Generally, just a computer enthusiast
  • In security terms, we categorize them into different groups including:
    • Black hat
    • White hat
    • Grey hat

Black hat

  • The stereotypical media portrayal of a hacker is a "black hat" hacker
  • This is someone who seeks to bypass some form of computer security constraint either to cause damage or to steal something
  • "Bad guys"

White hat

These are good guys, often "security professionals"

Grey hat

  • These are people who operate in the middle ground, obviously
  • This presentation illustrates a "grey hat" approach to searching out and exploiting a security vulnerability
  • I do not condone theft or malicious behavior
  • I already own a license to the content I'm working with
  • More @ Wikipedia

Anyone recognize this?

eBook App

It's the LearnCo eBook App

...allows eBook subscribers to access their titles on an iPad

I had this...

eBook App

Browser view

  • This is what the web version of the app looks like
  • The eBook iPad app isn't much different

I wanted this...

PDF

Which is to say:

I just wanted a PDF that didn't require using LearnCo's kludgy software

So here's how I got it!

No, this isn't illegal...
But it is for educational use only!

LearnCo's Legal Notice

You are authorized to download one copy of the material on this Web site on one computer for your personal, non-commercial use only. In doing so, you may not remove or in any way alter any trademark, copyright, or other proprietary notice. Except as allowed in the preceding sentence, you may not modify, copy, distribute, republish, commercially exploit, or upload any of the material on the Web site without the prior written consent of [LearnCo].

What about this presentation?

  • There is nothing illegal about sharing information
  • ...and no, I won't give you the book. I bought it, so should you

Is it ethical?

  • There is nothing unethical about sharing information
  • What you do with this information, if anything, is entirely your responsibility

Step 1: Analyze

Early assumptions:

  • LearnCo probably didn't invent their own file format
  • The files are probably already in PDF format
  • The iPad app would be fairly easy to attack

Most of what hackers do...

  • Is analyze stuff
  • We observe behavior, we pay attention to interaction...
  • Like countless other professions, we look for opportunities to exploit

iPad backups are a great place to start

iPad Backup

It's a snapshot of your device

  • Files are heavily obfuscated, but not necessarily encrypted
  • If you know how to look (with the right tools) you can unlock many secrets

WebKit!

Webkit

WebKit is the engine that powers Safari

This tells me the app is really just a glorified website

Step 2: Observe

Eavesdropping on your own devices is easy

Eavesdropping?

Knowing that the eBook app is just a website under the covers reveals some things:

  1. Websites use 'HTTP' which is a relatively trivial protocol to understand
  2. HTTP uses an atomic 'Request' → 'Response' model
  3. Everything the app does should be observable

Enable "Internet Sharing" (no encryption) on a PC

share the network

Provided for reference

  • This is just how you do it on a Mac
  • This step isn't really necessary, except that I wanted a low-noise environment
  • You can actually 'snoop' on public networks, even WiFi
  • But that would require an annoying decryption step that was easily avoided

Tell the iPad to use that network

internet connection

This means that all data exchanged between the iPad and the Internet passes through the PC

Run a packet capture

tcpdump

tcpdump

  • One of the most powerful diagnostic network tools in existence
  • Allows you to make a recording of the network "frames" exchanged between two or more devices
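
The "recording" is just a file of timestamped frames. As a rough sketch, assuming the capture was saved in the classic pcap format (a 24-byte file header followed by 16-byte record headers), the frames can be walked in Python like this. The function name is my own; it is not part of tcpdump.

```python
from __future__ import annotations

import struct
from typing import Iterator


def iter_pcap_frames(data: bytes) -> Iterator[tuple[float, bytes]]:
    """Yield (timestamp, frame_bytes) for each record in a classic pcap file."""
    if len(data) < 24:
        raise ValueError("too short to be a pcap file")
    # The magic number tells us which byte order the writer used.
    magic = struct.unpack("<I", data[:4])[0]
    if magic == 0xA1B2C3D4:
        endian = "<"
    elif magic == 0xD4C3B2A1:
        endian = ">"
    else:
        raise ValueError("not a classic (microsecond) pcap file")
    offset = 24  # skip the global file header
    while offset + 16 <= len(data):
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", data[offset:offset + 16]
        )
        offset += 16
        frame = data[offset:offset + incl_len]
        offset += incl_len
        yield ts_sec + ts_usec / 1_000_000, frame
```

Typical use would be `for ts, frame in iter_pcap_frames(open("capture.pcap", "rb").read())`, where `capture.pcap` is a hypothetical file name.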

Download stuff from the app

download

Inside the packet capture...

PDFs!

HTTP headers

  • This is a fragment of an HTTP transaction
  • It shows the eBook application asking the LearnCo server for one page of the book
  • ...which is a PDF file, like I thought!

Step 3: Evaluate

  • The packet capture is a "recording" of the data exchanged between the eBook app and the LearnCo server
  • It has a copy of each request that was sent by the app
  • It also has a copy of each response sent by the server

Option 1: Extract the data

I could reassemble the entire book, page by page, out of the responses from LearnCo that were sent to my iPad

Pros: wouldn't have to download anything again

Cons: keeping stuff in order might be harder

Forensic tools

  • I considered using a forensic tool designed to extract data from packet captures
  • Unfortunately, the app asks for one page at a time, and the files have 'obfuscated names'
  • Extracting the files and renaming them in the right order would have been tedious

Option 2: Extract the URLs and re-download

  • I could re-download the book one page at a time and save the pages in order
  • This is how I did it

Re-download?

  • Turns out, it's pretty easy to get just the list of URLs
  • And it's also easy to fetch them in order and name them correctly as I go
  • Since there was no security or 'cookie' to keep track of, it was very easy
  • Also makes for a better example
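
A hedged sketch of the "get just the list of URLs" idea, in Python. It assumes the request headers have already been pulled out of the capture as text, that the capture order matches page order, and that page URLs end in `.pdf`; the host and paths in the usage note are invented for illustration.

```python
from __future__ import annotations

from typing import Iterable, Optional


def request_to_url(request_head: str) -> Optional[str]:
    """Rebuild the URL of a captured GET request from its request line and Host header."""
    lines = request_head.split("\r\n")
    parts = lines[0].split(" ")
    if len(parts) != 3 or parts[0] != "GET":
        return None
    path = parts[1]
    for line in lines[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            # Plain HTTP is assumed, matching the unencrypted capture.
            return f"http://{value.strip()}{path}"
    return None


def ordered_pdf_urls(request_heads: Iterable[str]) -> list[str]:
    """Return unique PDF URLs in the order they first appear in the capture."""
    seen: set[str] = set()
    urls: list[str] = []
    for head in request_heads:
        url = request_to_url(head)
        # Assumption: page requests can be recognized by a .pdf suffix.
        if url and url.lower().endswith(".pdf") and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

For example, feeding it requests for `/books/abc/p001.pdf`, a stylesheet, and `/books/abc/p002.pdf` on a made-up host would keep only the two page URLs, in that order.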

Step 4: Hack

The goal here is to:

  1. Generate a list of pages, in order
  2. Download each page one by one
  3. Assemble them into a single PDF
  4. Optimize the file if necessary

Wireshark makes this part easy

Wireshark

Wireshark

  • A graphical packet-capture analysis tool
  • I don't usually do GUIs, but it made for a nice screenshot

A little script to automate downloading

curl script

#!bash

  • This is a 'script' or 'very small program'
  • It runs through a list of URLs, downloads every file, and saves them with sequential numbers
  • Note that it's pretending to be the eBook app (a trivial lie)...
  • ...but that it never logs in!
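
The original script was a few lines of shell around curl and is only shown as a screenshot. The following is a Python sketch of the same idea, not a transcription of it: walk a list of URLs, send a chosen User-Agent header, and save each response under a sequential file name. The User-Agent value and the file-name pattern are hypothetical.

```python
from __future__ import annotations

import urllib.request
from pathlib import Path
from typing import Callable, Iterable

# Hypothetical value; the real app's User-Agent string is not reproduced here.
USER_AGENT = "ExampleReader/1.0"


def fetch(url: str) -> bytes:
    """Download one URL while presenting the chosen User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()


def download_in_order(
    urls: Iterable[str],
    out_dir: str,
    fetch_page: Callable[[str], bytes] = fetch,
) -> list[Path]:
    """Save each URL's body as page-0001.pdf, page-0002.pdf, ... in list order."""
    target_dir = Path(out_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    saved: list[Path] = []
    for number, url in enumerate(urls, start=1):
        target = target_dir / f"page-{number:04d}.pdf"
        target.write_bytes(fetch_page(url))
        saved.append(target)
    return saved
```

Zero-padded numbers keep the files in page order when sorted by name. Note that nothing here sends a cookie or credential, which is the point of the slide.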

Downloads in progress

output

The result

files

Pages

  • I've now got each page, in order, as a separate PDF
  • Which I assembled into chunks, then all together

Which we can easily assemble

assembly

Two things here

  1. 'pdftk' splices the 1000+ individual PDFs together into one
  2. 'gs' compresses the file to a more portable size
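
The exact commands are in the screenshot, so the following is only a sketch of how the two steps could be planned from Python. It builds argument lists for `pdftk ... cat output ...` and for Ghostscript's `pdfwrite` device, in chunks first and then all together. The chunk size, file names, and the `/ebook` quality preset are my assumptions.

```python
from __future__ import annotations


def pdftk_cat_command(inputs: list[str], output: str) -> list[str]:
    """Argument list that concatenates the input PDFs into one output file."""
    return ["pdftk", *inputs, "cat", "output", output]


def gs_compress_command(source: str, output: str, preset: str = "/ebook") -> list[str]:
    """Argument list that rewrites a PDF through Ghostscript's pdfwrite device."""
    return [
        "gs",
        "-sDEVICE=pdfwrite",
        f"-dPDFSETTINGS={preset}",  # assumed preset, trades quality for size
        "-dNOPAUSE",
        "-dBATCH",
        "-dQUIET",
        f"-sOutputFile={output}",
        source,
    ]


def plan_assembly(
    pages: list[str],
    chunk_size: int = 100,
    merged: str = "book-full.pdf",
    final: str = "book.pdf",
) -> list[list[str]]:
    """Plan the commands: merge pages into chunks, merge chunks, then compress."""
    commands: list[list[str]] = []
    chunk_files: list[str] = []
    for index in range(0, len(pages), chunk_size):
        chunk_name = f"chunk-{index // chunk_size + 1:03d}.pdf"
        commands.append(pdftk_cat_command(pages[index:index + chunk_size], chunk_name))
        chunk_files.append(chunk_name)
    commands.append(pdftk_cat_command(chunk_files, merged))
    commands.append(gs_compress_command(merged, final))
    return commands
```

Each planned command could then be run with `subprocess.run(command, check=True)`, provided pdftk and Ghostscript are installed.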

Producing a nice convenient file

in PDF

That I can read at my leisure

pages

Step 5: Realize

So, that's nice, you turned your finance book into a PDF

Big whoop

Right?

Vulnerability Assessment

  • Unrestricted PDF
  • Print Quality
  • Security Through Obscurity

Unrestricted PDF

  • A bad guy could redistribute this file with ease
  • It's untraceable... there's no watermark

Print Quality

  • The individual PDF pages are print quality
  • This is what I'd send off to a printing press for manufacturing

Security Through Obscurity

  • Page downloads require no authorization
  • The list of pages in each book is readily available
  • The catalog of books is also available...

Ergo:

None of their publications are safe from theft!

Implications

Why it matters...

LearnCo

  • Is a multi-billion-dollar public MNC
  • Values its reputation
  • (Lots of Goodwill on the Balance Sheet)

From their 2012 annual report:

LearnCo Risk Matrix

Are they under-appreciating risk?

LearnCo Priorities

Diagnostic

Speculation based on 15+ years of industry experience

How it happens

  • Fast. Cheap. Robust.
  • Pick two

It's called the "Iron Triangle," and it bit them in the ass

Exposures

  • Usually created as a result of lack of understanding
  • Rapid Application Development (R.A.D.) focuses on 'fast' (quick to market)
  • NPV/IRR project assessments tend to over-emphasize cost reduction to produce value

Which means, of course

Many businesses will sacrifice 'robustness' in favor of a project that can be completed faster and cheaper

Because

  • Our financial system incentivises leaders to think about the short term
  • The Internet is fickle. One day you're in...

Again...

These are just my own observations!

This exposure

  • Probably a lack of sufficient governance/oversight
  • Probably not "outsourced to the cheapest bidder" (LPTA problem)
  • May represent a fundamental cultural problem with their IT

LearnCo

  • Doesn't discuss IT or DRM in their disclosures except to say they think they're important
  • Has 100+ open IT job reqs (2013Q2)

Specific technical details

This is the result of relying on a commonly used 'app development model' that has absolutely no security/DRM protection facility, and expecting it to be secure

Conclusion

  • LearnCo is not adequately protecting their I.P.
  • And they probably don't know it

Why you can't have nice things...

This illustrates why security experts like to say "it's not if you get hacked, it's when."

But it doesn't have to be that way.