
Giz Explains: How You're Gonna Get Screwed By Ebook Formats

“We use the epub format: It is the most popular open book format in the world.” That’s how Steve Jobs announced the iPad. And wow, that sounds like all the ebooks you own will just work on anything. Um, no.

The idea of an open ebook format that works on any reader sounds nice. Buy it from any source, read it on any device. In a few cases, it’s true, and that open format thing can work for you. But, in reality, right now? You’re pretty much going to be stuck reading books you buy for one device or ecosystem in that same little puddle, thanks to DRM. And well, Amazon.

The Hardware
OK, so the easiest way to put this in perspective is to quickly list what formats the major ebook readers support. (Why these four? Well, they’re the ones due to sell over two million units this year, except for Barnes & Noble’s, which we’re including as a direct contrast to Kindle just because.)

• Amazon Kindle: Kindle (AZW, TPZ), TXT, MOBI, PRC and PDF natively; HTML and DOC through conversion

• Apple iPad: EPUB, PDF, HTML, DOC (plus iPad Apps, which could include Kindle and Barnes & Noble readers)

• Barnes & Noble Nook: EPUB, PDB, PDF

• Sony Reader: EPUB, PDF, TXT, RTF; DOC through conversion

You’ll notice a pattern there: Everybody (except for Amazon) supports EPUB as their primary ebook format. Turns out, there’s a good reason for that.
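If you'd rather check that pattern than squint at it, here's a minimal sketch that treats the list above as data. The format sets are copied straight from the list (conversion-only formats are left out); the names `NATIVE_FORMATS`, `readers_supporting` and `common_formats` are made up for illustration.

```python
# Native format support, copied from the list above.
# Formats that only work "through conversion" are deliberately left out.
NATIVE_FORMATS = {
    "Amazon Kindle": {"AZW", "TPZ", "TXT", "MOBI", "PRC", "PDF"},
    "Apple iPad": {"EPUB", "PDF", "HTML", "DOC"},
    "Barnes & Noble Nook": {"EPUB", "PDB", "PDF"},
    "Sony Reader": {"EPUB", "PDF", "TXT", "RTF"},
}


def readers_supporting(fmt):
    """Return the readers that open the given format natively, sorted by name."""
    return sorted(name for name, formats in NATIVE_FORMATS.items() if fmt in formats)


def common_formats(devices=None):
    """Return the formats every listed device (default: all four) opens natively."""
    names = devices or NATIVE_FORMATS.keys()
    return set.intersection(*(NATIVE_FORMATS[name] for name in names))


print(readers_supporting("EPUB"))  # ['Apple iPad', 'Barnes & Noble Nook', 'Sony Reader']
print(sorted(common_formats()))    # ['PDF']
```

The only format all four open natively is PDF, and that's before DRM enters the picture.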

EPUB, the MP3 of Book Publishing
The reason just about every ebook reader supports EPUB is that the vast majority of the publishing industry has decided that EPUB is the industry standard file format for ebooks. It’s a free and open standard, based on open specifications. The successor to Open eBook, it’s maintained by the International Digital Publishing Forum, which has a pretty lengthy list of members, both of the dead-tree persuasion (HarperCollins and McGraw Hill) and of the technological kind (Adobe and HP). Google’s million-book library is all in EPUB too.

It’s based on XML – extensible markup language – which you see all over the place, from RSS to Microsoft Office, ’cause it lays out rules for storing information. And it’s actually made up of three open components: Open Publication Structure is basically about the formatting, how it looks; Open Packaging Format is how it’s tied together using navigation and metadata; and Open Container Format is a zip-based container format for the file, which is where you get the .epub file extension. When you toss those three components together, you have the EPUB ebook format.
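Since an .epub is just a zip file, you can poke at those three pieces with nothing but Python's standard library. What follows is a rough sketch, not a validator: it relies on the container layout as it's commonly documented (a `mimetype` entry plus `META-INF/container.xml` pointing at the package file). The helper names `describe_epub` and `build_demo_epub` are invented here, and the demo book is a hollow stand-in so the code runs without a real purchase.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by META-INF/container.xml in the container format.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}

DEMO_CONTAINER_XML = (
    '<container version="1.0" '
    'xmlns="urn:oasis:names:tc:opendocument:xmlns:container">'
    "<rootfiles>"
    '<rootfile full-path="OEBPS/content.opf" '
    'media-type="application/oebps-package+xml"/>'
    "</rootfiles>"
    "</container>"
)


def describe_epub(source):
    """Return (mimetype, path of the package file, all member names).

    `source` can be a filename or a file-like object holding an .epub.
    """
    with zipfile.ZipFile(source) as zf:
        mimetype = zf.read("mimetype").decode("ascii").strip()
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        return mimetype, rootfile.get("full-path"), zf.namelist()


def build_demo_epub():
    """Build a tiny in-memory stand-in (placeholder contents, not a valid book)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/epub+zip")         # container format
        zf.writestr("META-INF/container.xml", DEMO_CONTAINER_XML)
        zf.writestr("OEBPS/content.opf", "<package/>")          # packaging: metadata, navigation
        zf.writestr("OEBPS/chapter1.xhtml", "<html/>")          # publication structure: the content
    buf.seek(0)
    return buf


mimetype, package_path, members = describe_epub(build_demo_epub())
print(mimetype)      # application/epub+zip
print(package_path)  # OEBPS/content.opf
print(members)
```

Point `describe_epub` at an unprotected .epub on disk and you should get the same kind of answer. A DRM'd file still unzips, but the chapters inside are encrypted.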

While we’ve only seen EPUB on black-and-white, e-ink-based readers so far, like Sony’s Readers or the B&N Nook, the capabilities of the file format go way “beyond those types of things,” says Nick Bogaty, Adobe’s senior development manager for digital publishing. Unlike PDF, which is a fixed page, EPUB provides reflowable text, a page layout that can adjust itself to a device’s screen size. With EPUB, content producers can use cascading style sheets, embedded fonts, and yes, embed multimedia files like colour images, SVG graphics, interactive elements, even full video – the kind of stuff Steve promised in the iPad keynote. So, we haven’t seen the full extent of EPUB’s capabilities, and won’t, until at least April 3 and presumably much later. Even if the books you buy from Apple’s iBooks Store worked on other devices – and as you will soon see, there’s little chance of that – don’t count on the coolest stuff, like video, to be somehow compatible with current-generation black-and-white e-ink readers.

But let’s not get too excited seeing the words “free” and “open” so much in conjunction with EPUB. It’s like MP3 or AAC, and not only because it’s become a semi-universal industry standard. Make no mistake, these files can be totally unencrypted and unmanaged, or they can be wrapped up in any kind of digital rights management a distributor wants.

So far, according to Bogaty, the DRM every EPUB distributor currently uses is Adobe Content Server, which conveniently also wraps around PDF files. Sony and Barnes & Noble both use it on their readers, though since Adobe’s DRM doesn’t allow for sharing books between accounts, B&N actually uses a slightly custom version, and manages the Nook’s lending feature using its own back end. (Adobe is working on a sharing provision.) Adobe’s DRM does, however, support expiration, which is how Sony’s vaunted library lending feature works.

The plus side of all this compatibility is that it’s actually possible to move files from a Sony Reader to a Nook, using Adobe Digital Editions to authorise the transfer. (Though according to some reviewers, that would be like moving pelts from a dead horse to a rotting bear.)

Apple, on the other hand, chose EPUB as the preferred file format, but will be wrapping DRM’d files from its iBooks Store in the FairPlay DRM, which is used to protect movies and apps (and formerly music) in the iTunes Store. As always, expect them to be the only company using it.

(There’s a precursor to EPUB’s dilemma: Audible downloads. You can buy Audible audiobooks from an enormous number of sources, but the ones you buy from iTunes aren’t going to play on any other Audible-capable device, no matter how many logos they slap on the box.)

You may be thinking that it’s just a matter of time before ebook stores all go DRM free. That would be wishful thinking at best. While ebooks might seem a lot like digital music circa 2005, you can’t rip a book, so the only way to get a bestseller on your reader is to buy it legally, or to steal it. It’s pretty much that simple. There will be free books, there will be unencrypted books, and the torrents will rage with bestsellers (as they already do). Still, DRM’s gonna be a hard fact of life with every major bookstore, since they’re going to at least try to keep you from stealing their books. You don’t see Hollywood giving up DRM, do you?

Kindle, Barnes & Noble, and How The Dead PDA Business Affects the Live Ebook War
Did you know that Amazon owns Mobipocket, which mainly targeted ebooks for PDAs and smartphones, and had its own file format with roots in the PalmDOC format? The Mobipocket format, consequently, has two extensions: .mobi and .prc. I bring it up, not because you should care about Mobipocket – you really shouldn’t – but because the Kindle’s preferred AZW format is actually a very slightly modified version of MOBI, which is why it’s easy to convert files from one format to the other. Unprotected AZW files can be renamed with a .mobi or .prc extension and simply work with Mobipocket readers.

The problem with Mobipocket is that it’s not a very capable format, since it was originally designed for ancient-ass PDAs and all. So there’s another special Amazon format that’s a little more mysterious, called Topaz, which is more capable than MOBI, with powers like the ability to have embedded fonts. It’s used for fewer books, and carries the file suffix .tpz or .azw1. For what it’s worth, some people complain books in the Topaz format are less responsive than the standard AZW files. In truth, none of this may matter if and when the Super Kindle arrives.

In terms of DRM, Amazon uses its own DRM on both formats. Both have been cracked, though it apparently took longer with Topaz. This may be good news for pirates, but matters not at all from a cross-platform point of view, since that format is completely proprietary, and nothing but the Kindle or Kindle software will read it anyway.

But the old PDA legacy crap doesn’t stop with Amazon. Palm once owned its own ebook platform, which it sold to a company that called it eReader. Eventually, the format and the software platform came to be owned by Barnes & Noble. I’m only dragging you into this because Barnes & Noble actually still sells many books in this format, even while they transition to the more popular and “open” EPUB format. You can spot an eReader-format book because the file ends in .pdb – but you only see that after you’ve bought the damn thing. That is to say, even if you care enough about formats to go with the reader that supports the one you like, you still might get stuck with a limited, if not completely proprietary, stack of books.
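File extensions lie, which is part of the mess. A slightly more honest way to tell what you've got is to peek at the first few bytes of the file. The sketch below is a best-effort guess, not a spec: the signatures are the ones commonly reported for these formats (Palm-style files carry an eight-byte type/creator tag at byte offset 60), so treat every value in it as an assumption. The names `sniff_ebook` and `sniff_file` are invented for illustration.

```python
# Type/creator tags reportedly found at byte offset 60 of Palm database files.
# These values are assumptions taken from commonly cited signatures.
PALM_SIGNATURES = {
    b"BOOKMOBI": "Mobipocket (also unprotected Kindle AZW)",
    b"PNRdPPrs": "eReader PDB",
    b"TEXtREAd": "PalmDOC",
}


def sniff_ebook(header: bytes) -> str:
    """Guess an ebook format from the first 68 bytes of a file."""
    if header.startswith(b"PK\x03\x04"):
        return "ZIP container (possibly EPUB)"
    if header.startswith(b"%PDF"):
        return "PDF"
    if header.startswith(b"TPZ"):
        return "Topaz"
    return PALM_SIGNATURES.get(header[60:68], "unknown")


def sniff_file(path) -> str:
    """Read just enough of the file at `path` to run sniff_ebook on it."""
    with open(path, "rb") as fh:
        return sniff_ebook(fh.read(68))


# Fake headers so the sketch runs without any real books lying around.
print(sniff_ebook(bytes(60) + b"BOOKMOBI"))  # Mobipocket (also unprotected Kindle AZW)
print(sniff_ebook(bytes(60) + b"PNRdPPrs"))  # eReader PDB
print(sniff_ebook(b"%PDF-1.4"))              # PDF
```

None of this tells you whether a file is DRM'd, only which family of dead PDA software it descends from.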

PDF, I Still Love You
In comparison to EPUB, PDF is simple. Developed over 15 years ago by Adobe, the portable document format has been an open standard since 2008. You’re probably pretty damn familiar with it, but the main thing about it versus these other formats is that everything is fixed – fonts, graphics, text, etc – so it looks the same everywhere, versus the reflowable format that adjusts to the screen size. Hence, Amazon offers PDF without zoom on its Kindle DX, which has the screen real estate to (usually) not muck it up too much. On screens smaller than the PDF’s native page size, it requires some pan-and-zoom voodoo, and it still usually looks pretty disgusting.
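To make the fixed-versus-reflowable difference concrete, here's a toy sketch using Python's `textwrap`. It is nothing like a real EPUB rendering engine (no fonts, no CSS, no hyphenation); it only shows the idea that the same text gets re-broken into lines to fit whatever width the screen offers, where a fixed page would need panning. The "screen widths" in characters are made-up numbers.

```python
import textwrap

PASSAGE = (
    "Unlike PDF, which is a fixed page, EPUB provides reflowable text, "
    "a page layout that can adjust itself to a device's screen size."
)


def reflow(text, columns):
    """Re-break `text` into lines no wider than `columns` characters."""
    return textwrap.wrap(text, width=columns)


# Pretend these are a big reader and a phone-sized one (widths are illustrative).
for columns in (60, 30):
    lines = reflow(PASSAGE, columns)
    print(f"--- {columns} columns, {len(lines)} lines ---")
    print("\n".join(lines))
```

The words never change, only where the lines break. A PDF bakes those breaks in once, which is exactly why it looks great on a big screen and miserable on a small one.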

Zoom issues notwithstanding, having a fixed format has advantages. For instance, a lot of “electronic newspapers” were transmitted via PDF back in the day, because it retained their design. It’s really nice for comics. (Consequently, you can bet on scanned-comic piracy exploding when the iPad arrives, unless Marvel and DC come up with killer strategies to get their comics on a device that’s clearly begging for it.) Wikipedia covers a lot of the technical ground, surprisingly thoroughly, even if the usual Wiki caveats apply. As mentioned above, it can be protected with Adobe Content Server DRM, just like EPUB.

The Great Shiny Hope: Apps
The other path for digital publishers: Build an app to hold your books and magazines. This is the route magazines are taking, because they’re envisioning some fancy digital jujitsu. With Adobe AIR, which is what Wired and the NYT are using in various incarnations for their respective rags, they’re able to do more advanced layouts, richer multimedia, Flash craziness, and other designer bling that EPUB can’t handle, says Adobe’s Bogaty. Also, importantly, you can dynamically update content, like when new issues arrive, which you can’t really do with EPUB.

Interestingly, the publisher Penguin is also taking the app route for its books, building apps using web technologies like HTML5 for the iPad, so its books are, in fact, way more like games and applications than mere books. So it’s another tack publishers could take.

But the app business can help with the openness of the big ebook file formats, too. Many people read Amazon’s proprietary formats on their iPhone, because Amazon wants to sell books, and Apple wants people to use apps. Barnes & Noble has a reader app, too; while not great, it at least somewhat helps get over the PDB/EPUB confusion. It’s pretty likely that these and many other ebook apps will turn up on the iPad, unless Jobs decides that they “duplicate” his “functionality”. Since iBooks itself is an app you have to download, it probably won’t be an issue. Here’s hoping.

The Upshot
The idea of an open ebook format that works on any reader sounds really nice. And in some cases, if you pay really really close attention, it’s true. That open format thing actually can work for you. But the reality? You’re pretty much going to be stuck with the books you buy in one device working only in that same ecosystem, or at least hoping and praying for an assortment of proprietary reader apps to appear on all your devices. Now, where’d I put that copy of Infinite Jest? Was it in my Kindle library, my B&N library or my iBooks library?