John Maxwell, Simon Fraser University
Don't underestimate the web
In 2010, if you ask the average publisher, “What is the Web good for?” you'll probably get a response that includes these sorts of things:
But what about editorial process? production workflows? If you're like most, you probably don't think of it that way. For even though we've all been working digitally for a couple of decades, we've grown up in a world where ‘serious’ publishing processes are oriented to print on paper. So we begin with word processors; do our editing and composition markup on paper or paper-like software; produce galleys and page proofs, bluelines, and so on. If at some point along the line we need to move content onto the Web, it's usually an afterthought.
But the Web is twenty years old this year. It's not really a new technology anymore. In the past decade, the scale and scope of web publishing have grown to global proportions. Google indexes something like 50 billion pages; every business in the world has some kind of web presence; we turn to the web for almost all our day-to-day tasks, from personal communications to research to organizing our work and private lives. It is not too much of an exaggeration to say that in the 21st century, all content lives online; the exceptions to the rule are few and far between. But your publishing workflow is probably one of those exceptions.
The reason for that likely isn't about security; just because things are on the Web does not mean they're wide open for the taking. Think of your online banking, your airline bookings, your Gmail. The reason publishers don't do their work online has more to do with a legacy of software and workflow processes that are still solidly mainstream, even if they ignore both the overwhelming popularity of the Web as a work platform and a host of functional benefits the Web offers.
Over the past decade, ‘web publishing’ tools have grown into robust systems for facilitating complex editorial and production processes: multi-authored sites, distributed pools of authors and editors, and multi-mode outputs. Much of what web content management systems (CMS) support isn't so different from traditional publishing processes, except that they assume an entirely different set of underlying tools and technologies. Instead of Word, web publishers write and edit using online tools; instead of e-mailing document versions around, they use the versioning capabilities in a Web CMS; instead of Quark or InDesign, designers create CSS stylesheets and templates. These tools are no longer the poor cousins of their ‘professional’ print counterparts, but instead comprise an alternative toolchain.
Once we accept that online tools are capable of handling serious publishing tasks, we start to realize that there are considerable advantages in the online way of doing things: materials are available to an entire team simultaneously, wherever they may happen to be physically; version control and revision tracking can be centrally managed, instead of the mess of files that typically exist in different versions in different places; and—what should be the deal-maker all by itself—we can work in open file formats rather than the proprietary word processing and DTP files that make format conversions so difficult and which ultimately turn into an archival rubbish heap.
At this point you may say, this is all fine and good for Web publishing, but we produce books. What good are Web publishing tools in a book production context? And so we end up where we are today, where books exist separate from the web, and where any attempt to reach out to the web or to digital formats is confusing and difficult.
But what if we turned the process around? What if we embraced the Web for most of the editorial, production, and management process, and thought of print output only as a final destination?
The default process for most publishers today is the print-oriented one. When web or digital content is required, it typically comes from an export step near the proofing stage, from Quark or InDesign. We propose to reverse this, so that the process begins and stays online, with the print version coming from the export step at the end. You could think of this as a “just-in-time” production model, in which everything is kept in a relatively fluid, easily editable form, with print proofs generated on demand.
Such a model is well suited to the 21st-century market environment: web content for marketing and production purposes is easily created, as are digital products like ebook formats, and print-on-demand production is facilitated. But it is also critical not to lose any functionality or quality from the traditional print production context; no one wants to sacrifice quality in order to serve online needs.
At Simon Fraser University's Master of Publishing program, we have been prototyping such a workflow, with a robust online editorial environment and a straightforward path to well-known print production in Adobe InDesign. Moving to such a system requires some changes both in the tools and in how we use them, but more importantly, in how we think about the process.
For instance, authors are still going to use MS Word; they have been using it for decades and will probably keep using it for decades to come, for better or worse. So our otherwise web-native process accepts Word as a starting point, but we try to get out of it as early as possible. This means importing Word content as straightforwardly as possible, but not using it as an editorial tool.
Instead, we handle editing online with a tool called TinyMCE, a web-based WYSIWYG editor that you have probably already encountered, even if you've never heard its name before, as it is built in to most common blogging software. TinyMCE is free software, ubiquitous, and robust. It is also nicely extensible and customizable for particular editorial environments.
TinyMCE sits on top of a web-based Content Management System (CMS) which provides access control (e.g., only editors are allowed to edit), version control, revision tracking, and so on. Our early prototyping at SFU used a very simple wiki to provide these functions, but we are currently experimenting with using WordPress, the immensely popular blogging and web-publishing system. More feature-rich CMS tools like Drupal or Joomla would also work, but we've so far avoided these, sticking with the simplest systems rather than becoming mired in complex web development (and requiring complex web developers!).
Regardless of the CMS used, the end point is clean, edited HTML content. This becomes our master content source. It can be edited at any time, but also serves well for the long term because it is an open, transparent file format (unlike, say, old Word or Quark files).
Having the master content store in HTML leads to very easy conversions to published formats. Publishing material on the Web itself—in whole or in part, for whatever reasons—is easy, because it's already on the web; it only needs to be placed in an accessible location.
Ebook publishing is easy, because most common ebook formats (especially ePub) are, at heart, HTML themselves. Since the content is already in the right kind of format, it need only be wrapped up in the appropriate packaging.
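To make the “packaging” idea concrete, here is a minimal Python sketch that wraps a single XHTML chapter in an ePub-style container. It is only an illustration under stated assumptions, not the workflow used at SFU: the function name package_epub, the file names, and the placeholder identifier are all hypothetical, and the sketch omits the navigation file (NCX) that a fully valid ePub 2 package also requires.

```python
import zipfile
from xml.sax.saxutils import escape

# META-INF/container.xml tells a reading system where the package file lives.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def package_epub(target, title, xhtml,
                 book_id="urn:uuid:00000000-0000-0000-0000-000000000000"):
    """Wrap one XHTML document in a bare-bones ePub container.

    `target` is a file path or a writable binary file object.
    `book_id` is a placeholder; a real book needs its own unique identifier.
    """
    opf = f"""<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>{escape(title)}</dc:title>
    <dc:language>en</dc:language>
    <dc:identifier id="bookid">{escape(book_id)}</dc:identifier>
  </metadata>
  <manifest>
    <item id="chapter1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="chapter1"/>
  </spine>
</package>
"""
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as zf:
        # The mimetype entry must come first and must be stored uncompressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML)
        zf.writestr("OEBPS/content.opf", opf)
        # The master HTML content goes in unchanged.
        zf.writestr("OEBPS/chapter1.xhtml", xhtml)
```

The point of the sketch is that the chapter content itself is not converted at all; everything else in the archive is a small amount of descriptive wrapping around it.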
Print output, particularly the high-quality output you will likely insist on, would seem to be the most difficult, as there have historically been no good ways to print web-based content. Automated web-to-PDF generation tools do exist, but they tend to be pretty alien to most book publishers' ways of working, and you wouldn't necessarily trust them with your high typographic standards.
The print output problem was effectively solved by Adobe with the release of Creative Suite 4. In CS4, Adobe InDesign has an open XML-based file format called IDML that is fully equivalent to the native .indd file format. The result is that InDesign documents can be created outside of InDesign, or external content can be merged easily into an InDesign-based production process.
Because InDesign's new IDML file format is XML, and the web's HTML is also XML (at least when it is written as well-formed XHTML), what is required is an XML-to-XML conversion: a transformation. Turning one kind of XML into another kind is a straightforward, unambiguous process, and achievable with free scripting tools. At SFU last year, we developed an XML transformation to convert web-based HTML content into InDesign-ready IDML. There are potentially many ways to approach this, but our method simply converts web-based content into a “story” that can be “placed” in an InDesign template, inheriting the layouts, master pages, styles, and colour definitions defined in InDesign in your normal production process.
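The general shape of such an HTML-to-story conversion can be sketched in a few lines of Python. This is a simplified illustration, not the transformation developed at SFU: the function name html_to_story, the STYLE_MAP table, and the paragraph style names are assumptions that would have to match styles defined in your own InDesign template. The sketch produces only the inner Story element; a real IDML package wraps this in further packaging, and inline formatting such as italics is flattened here for brevity.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Hypothetical mapping from HTML element names to paragraph style names
# that are assumed to exist in the InDesign template.
STYLE_MAP = {"h1": "Heading 1", "h2": "Heading 2", "p": "Body"}


def html_to_story(xhtml, story_id="u100"):
    """Convert block-level XHTML elements into a simplified IDML Story."""
    root = ET.fromstring(xhtml)  # requires well-formed XHTML
    story = ET.Element("Story", {"Self": story_id})

    # Collect (style name, text) pairs for every element we know how to map.
    blocks = []
    for el in root.iter():
        tag = el.tag.replace(XHTML_NS, "")
        if tag in STYLE_MAP:
            # itertext() also gathers text inside inline children (em, strong),
            # so the words survive even though the inline styling does not.
            blocks.append((STYLE_MAP[tag], "".join(el.itertext()).strip()))

    for index, (style, text) in enumerate(blocks):
        para = ET.SubElement(story, "ParagraphStyleRange", {
            "AppliedParagraphStyle": "ParagraphStyle/" + style})
        run = ET.SubElement(para, "CharacterStyleRange", {
            "AppliedCharacterStyle": "CharacterStyle/$ID/[No character style]"})
        ET.SubElement(run, "Content").text = text
        if index < len(blocks) - 1:
            # A Br element marks the paragraph break between blocks.
            ET.SubElement(run, "Br")

    return ET.tostring(story, encoding="unicode")
```

Because the output refers to paragraph styles only by name, all typographic decisions stay in the InDesign template, which is what lets the placed story inherit the template's look.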
This HTML-to-IDML conversion takes just a second or two, which means that print proofs can be generated so quickly and easily that they become “disposable.” That is to say, editorial changes that are spotted at the print proofing stage can be made in the master web-based copy (where they should be) and a new InDesign proof regenerated in a few seconds. The result is that you don't save the InDesign layout as the archival version; it is merely a stage on the way to the final print output (or PDF). The content master is the web-based one.
Working with InDesign means that your existing print production people retain control over the quality of your publications, because they are working with the same tools and techniques they're comfortable with: they control pagination, widows and orphans, copyfitting, layout, and so on. What has changed is simply the way the content finds its way into InDesign in the first place: it's coming from the web, not from a tangle of Word files. Of course, not all books are the same, and the relationship will be different for a novel than for, say, a lavishly illustrated coffee table book. But in both cases there are advantages in the content residing on the web for editorial and archival purposes.
The web is not going away. It is already the dominant publishing platform on the planet, and it shows no signs of fading any time soon. And as time goes on, the toolsets and methodologies for writing, editing, and producing content on the web get more and more sophisticated. It is not coincidental that ebook formats largely build on existing web technologies. It seems safe to say that book and magazine publishing in the decades to come will not exist despite the web, but rather in partnership with it.
The opportunity, then, seems to be to build bridges between the online world and the traditional world. By moving publishing workflows to a web-first stance, we take full advantage of a networked environment and the innovations that arise there. There is no need to lose our high-quality print production systems; rather the opportunity is there to better integrate these two ways of working, and even to do it cheaply and easily, working with the toolsets we already know.
For more information about Simon Fraser University's research and development in this area, see http://thinkubator.ccsp.sfu.ca/wikis/xmlProduction