From ScienceWriters: A brief guide to self-republishing

By Jeff Hecht

We’ve heard a lot about self-publishing new books. But what about self-republishing out-of-print books? Having some time and some available books, I tested the process, and came to the conclusion it can work, but not for all books, and not in the formats used by e-book readers unless you have a clean digital copy. This article shares what I’ve learned the hard way to save you time and trouble.

The starting point — the old book

In principle, any book is suitable for self-republishing. In practice, your chance of success depends on your starting point and your desired end point. Your starting point is the original book, and the key issues are whether you have a good digital copy, how complex the text format is, and the use of graphics.

If you have a digital copy that incorporates all the copy editing, and if the book is entirely text, such as a novel, count yourself very lucky. Most self-publishing services have software that can prepare it for republication with a minimum of fuss and bother.

More likely, your digital copy may not include all the copy edits. Your book may include graphics, which must be rearranged on pages for an e-reader edition. You may have to change page formatting in other ways. Or you may not have any usable digital copies, as was the case for all four books I have self-republished. I will focus mainly on scanned books.

Book scanning and scannos

Printed books can be scanned, and that’s a job for a professional service that has special equipment to scan pages quickly, carefully, and cleanly. I used BlueLeaf Book Scanning for three of my books. They generate PDF page images and run them through OCR (optical character recognition) software that recognizes the printed characters, generating both a plain text file and a separate Word file that includes fonts and formatting such as bold, italics, superscripts, and subscripts. You also get a searchable PDF (which includes an embedded text file) and can get extras including a PDF configured for online publishing, a file of images in the book, and a version of the text formatted for e-readers. The extras are not expensive, and worth getting for possible future use.

Although OCR is remarkably good in many ways, errors that I call “scannos” are inevitable, particularly for font and type style. The OCR built into Adobe Acrobat Pro repeatedly set the font of “If” at the beginning of sentences in Arial rather than Times New Roman. It also couldn’t tell a superscript from a smaller font size raised above the baseline, causing problems when preparing e-reader files. Text errors were common for certain letter pairs, notably “rn” which the OCR often mistook for the letter “m,” converting “burn” into “bum.” A spellchecker can catch some spelling errors, but won’t spot the wrong font or legitimate words, like “bum.”
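One way to hunt for scannos that a spellchecker misses is to flag correctly spelled words that would become a different real word if an "m" were actually a misread "rn." The following is a minimal Python sketch of that idea, not a tool used for the books described here. The function name `suspect_scannos`, the `CONFUSIONS` table, and the tiny vocabulary are all hypothetical; in practice you would load a full word list and add whatever other confusions your own scans show.

```python
import re

# Pairs of (what the OCR produced, what the page may really have said).
# Only the "rn" -> "m" confusion comes from the article; extend as needed.
CONFUSIONS = [("m", "rn")]


def suspect_scannos(text, vocabulary):
    """Return (word_as_scanned, possible_original) pairs worth a manual check."""
    suspects = []
    for match in re.finditer(r"[A-Za-z]+", text):
        word = match.group().lower()
        for seen, intended in CONFUSIONS:
            # Try the substitution at every position where `seen` occurs.
            start = word.find(seen)
            while start != -1:
                candidate = word[:start] + intended + word[start + len(seen):]
                if candidate in vocabulary:
                    suspects.append((match.group(), candidate))
                start = word.find(seen, start + 1)
    return suspects


if __name__ == "__main__":
    # Hypothetical miniature vocabulary, for illustration only.
    vocabulary = {"burn", "bum", "modern", "modem", "the", "rate", "of"}
    for scanned, original in suspect_scannos("The bum rate of the modem", vocabulary):
        print(f"check '{scanned}': could the page say '{original}'?")
```

The output is only a list of places to look. Many flagged words will be correct as scanned, so each one still has to be checked against the page image.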

Illustrations are a problem. Black-and-white scanning gives clean sharp line art and letters, but makes an ugly mess of photos or other grayscale art. Scanning in grayscale creates shaded areas that make the page look blurred. Look carefully at your book before you choose which to use. In theory you can scan grayscale images separately and paste them into the black-and-white scan, but that will cost you time and/or money.

Self-republishing options

The three main options for self-republishing are print-on-demand (POD) of paper copies, distribution of electronic files in e-reader format, and distribution of electronic files in PDF format.

POD books are paperbacks printed on demand from image files, usually stored in PDF format. Many conventional publishers use POD to keep backlist books in print. These are essentially photocopies of the original printed book and may be taken from original digital layout files. But as in any photocopying, the print quality suffers, and if the book has been scanned, sharpness of lines and letters often suffers. The costs of printing and distribution also cut into your share of sales income. POD has been around since about 2000, and is a well-established part of the print publishing and distribution system.

E-reader format books are electronic files formatted to be read on e-book readers like Kindle and Nook. Like the HTML code used on web pages, they combine text, formatting codes, and images and flow the combination into the available screen space. The two main formats are MOBI (used by Kindle) and ePUB (used in most other e-readers). Books in those formats benefit from an extensive marketing and distribution infrastructure. However, accessing that infrastructure requires converting the scanned book into a word-processing file, and that can be a major project if your book includes illustrations, references, or formatted text. E-reader books have become an important market in recent years, but their success has been uneven, with sales high in genre fiction but low in illustrated and professional books.

PDF e-books are electronic files generated from scanned images or by printing a word-processing document to a PDF file. The scanning and OCR used in self-publishing generate two-layer PDFs. The reader sees the image file, but can also search a hidden OCR-produced text file. The big advantages are ease of generating PDFs, low distribution overhead, and high royalty rates. The main drawbacks are that major e-book retailers generally don’t sell them and that they don’t display well on smartphones and e-book readers. However, PDFs are well-accepted for many types of publishing.

Processing, pitfalls, and tradeoffs

The easiest way to self-republish a scanned book is to post the PDF at a service that specializes in electronic distribution. Sometimes you can post the scanned copy without change, but you may have to make some minor changes, such as removing the name of the previous publisher from the copyright page.

Directly editing pages in PDF files is difficult, but you can make changes on individual pages from the text version in a word processor, then print the pages to PDF and replace individual pages with the new versions. Apple’s Preview program is easier to use than Adobe Acrobat Pro, and is available for free. (I don’t do Windows.) Be careful in looking for PDF editors; many programs called “editors” are designed for creating or filling out forms, and can’t edit page contents.

The major practical challenge is finding a suitable service, because Amazon and other big e-book companies prefer selling e-reader formats. I chose Payhip.com and Google Books. Payhip is easier to use and offers higher royalties; Google has better name recognition but is awkward to use. Payhip has little search-engine presence, but my website has a page devoted to the book with a link to Payhip, and it ranks second on a Google search for my best-selling book by my name and its title. (Google Books is in fifth place.)

Self-republishing in POD is almost as easy because it also uses scanned PDF images. POD services store digital images of the book, and print copies when needed. Two basic classes of service are available: do-it-yourself services, where posting is nominally free, and full-service companies, which charge for posting and formatting. I chose do-it-yourself, but looking back I think paying for full-service would have saved considerable time and trouble.

The two major do-it-yourself options are Createspace (an Amazon subsidiary) and Ingram Spark. You post the files on their site, their software checks that the files meet quality and content standards, and you revise as necessary until they’re satisfied. Then they sell copies through on-line and brick-and-mortar bookstores and pay you royalties based on sliding scales that depend on the price you select. I chose Createspace because it offered a simpler process.

Preparation was simple and easy for the 160-page young adult science book. Createspace’s quality scan caught a couple of minor problems that were easy to fix, and with minimal effort the book was on sale.

In contrast, preparation of my 800-page introductory textbook on fiber optics was difficult and frustrating. The first problems I encountered were scannos that introduced bogus fonts into the PDF file. I was able to fix those with an evaluation copy of Adobe Acrobat Pro, but the process was very tedious and time-consuming. I also had to pay $149 for Createspace to adjust the position of the printed area on the pages to allow for the wide gutter (binding margin) needed for an 800-page book, because Acrobat Pro could not fix that problem. To be fair, 800 pages is close to the maximum size for POD, but Acrobat Pro was very disappointing.

Self-republishing for e-readers is a far more difficult problem because you have to convert the whole book into a new format. That means you have to check the entire book for text scannos, reformat it to make the fonts uniform, rearrange the art so it falls between paragraphs and within page boundaries, clean up any scannos introduced into the art when the OCR tried to read labels, and clean up references and formats such as subscripts and superscripts. That also means you have to fight Microsoft Word’s default settings, which do very odd things to bulleted lists or numbered lists of references. It also means you have to learn about other arcane formatting tools in Word, such as “styles” for paragraphs, headings, references, and so forth. Essentially, it turns you into a production manager charged with typesetting a book submitted in very messy format.
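Before making fonts uniform by hand, it can help to see which fonts the OCR actually put into the Word file. The following is a minimal Python sketch, assuming the file is a .docx (a zip archive whose main text lives in `word/document.xml`) and that stray fonts show up as explicit `w:rFonts` settings on individual runs of text. The function names and the sample string are hypothetical, and the tally ignores fonts set through styles or themes, so treat it as a rough survey rather than a complete audit.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace used by WordprocessingML, the XML inside .docx files.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def font_tally(document_xml):
    """Count characters of text per explicitly assigned run font."""
    root = ET.fromstring(document_xml)
    tally = Counter()
    for run in root.iter(f"{{{W}}}r"):
        text = "".join(t.text or "" for t in run.findall("w:t", NS))
        if not text:
            continue
        fonts = run.find("w:rPr/w:rFonts", NS)
        name = fonts.get(f"{{{W}}}ascii") if fonts is not None else None
        # Runs with no explicit font take theirs from the paragraph style.
        tally[name or "(inherited from style)"] += len(text)
    return tally


def font_tally_from_docx(path):
    """Open a .docx file and tally the fonts in its main document part."""
    with zipfile.ZipFile(path) as archive:
        return font_tally(archive.read("word/document.xml"))


if __name__ == "__main__":
    # Hypothetical fragment: "If" set in Arial, the rest left to the style.
    sample = (
        f'<w:document xmlns:w="{W}"><w:body><w:p>'
        '<w:r><w:rPr><w:rFonts w:ascii="Arial"/></w:rPr><w:t>If</w:t></w:r>'
        '<w:r><w:t> you burn</w:t></w:r>'
        '</w:p></w:body></w:document>'
    )
    for font, chars in font_tally(sample).most_common():
        print(f"{font}: {chars} characters")
```

A font that accounts for only a handful of characters is a likely scanno. Applying a single paragraph style in Word and clearing direct formatting is still the actual fix; the tally only shows how much cleanup to expect.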

I did that for one book, using Acrobat Pro’s OCR to produce a Word file from a PDF I already had. That book included end-note references, bulleted lists, line drawings, photos and other gray-scale illustrations, and lots of subscripts and superscripts. Apple’s iBooks Author app did a poor job of importing the Word file, so I spent long hours editing the Word file for submission to Createspace, which converted it to MOBI for Kindle. I also submitted the edited Word file to Smashwords to format for other e-readers.

Is self-republishing worthwhile?

Was the whole effort worthwhile? Yes, for the fiber-optics book. In the first five months, I’ve sold over 100 copies, three times what the previous publisher sold a year earlier. That’s thanks to cutting the POD price to $39, about one-fourth of the previous publisher’s list, and selling the PDF for $9.95, even without much promotion. Originally published in 2006, the book is dated, but still usable as an introduction to the field, and I’m planning a new edition.

Yes, for the 160-page young adult book. Originally published in 1987, it’s old enough that I didn’t feel right charging for the PDF, so I’ve posted it for free at Payhip. A few people have bought the POD edition, and I’m thinking about an update.

In contrast, the e-reader edition of the third book, from 1984, took much more time than it was worth. I added a short epilogue, but the book remains largely of historic interest and sales are near zero.

The bottom line is that you should take a long, hard look at the market before you self-republish. POD and PDF will keep your old books available with little effort, but don’t expect to make much money. Unless your book is text-only or you start with a complete digital version, republication for e-readers will take much more effort, and is unlikely to be worth your time. As an experiment, I am self-publishing a longer version of this guide.

Jeff Hecht is a science and technology writer based in the greater Boston area.

(NASW members can read the rest of the Winter 2015-16 ScienceWriters by logging into the members area.)
