Electronic publishing/editing services
for hybrid paper or online publication

Last updated: February 8, 2011
Antenna House, Inc.

Slide

Speech at Page 2011, February 3, 2011, 12:30-14:30, G3 [Future of eBook and the use of EPUB format]
☞  PowerPoint slides (PDF file)

The following are notes prepared for the speech.


Purpose of today's speech

  1. First, I will explain that the workflows for producing printed books and e-books are inherently different, and that we must reconcile them.
  2. Next, I will explain what to do to reconcile them.
  3. Finally, I will introduce and demonstrate "Cloud Authoring Service for Universal Book (CAS-UB)", which we are now developing as one of the solutions.

Company profile

I will describe our company profile, subsidiaries, and locations.

XML Project

I will describe the XML-related projects among Antenna House's businesses.

Let's write a book / Let's publish a book

Publishing is an important mechanism for the development of society, as a way to transfer information and spread ideas and thoughts. Only in a democratic society can we freely express our thoughts in our own language and share them with many people.

As for the publishing industry, total sales are two trillion yen or less even when book and magazine sales are combined. That is less than the sales of a single automaker, yet the industry's social importance is in no way inferior to that of the automobile industry.

The book "State of chaos in the transition period/End of the economic growth myth" (by Katsumi Hirakawa, Chikuma Publishing, November 10), published last year, describes how the population of Japan has started to decrease greatly. Population decline has a large negative impact on industries such as publishing, where profit is tied directly to the number of copies reproduced and distributed. Manufacturing and the software industry look to overseas markets as a countermeasure against population decline, but publishing depends on the Japanese language, which makes internationalization difficult. If things continue as they are, publishing books in Japanese will gradually become more difficult.

I believe CAS-UB will allow authors and editors to publish their ideas more freely and easily in the form of various e-books, and thereby to contribute to Japanese society.

EPUB overview

I will briefly explain how EPUB works.

EPUB 2.0 mainly expresses content in XHTML and specifies layout with CSS. The specification defines which of the XHTML elements (tags) an EPUB reader should support, and likewise which CSS2 properties it should support.

Both XHTML and CSS are Web standards, so producing EPUB content is quite close to producing Web pages.

In EPUB 3.0, XHTML will change to XHTML5 and part of CSS Level 3 will be adopted, which will increase the expressive power of layout. It is said that the format of the table of contents will also change.

Although today's talk is based on EPUB 2, it will remain applicable under EPUB 3.

Differences between EPUB and the Web

EPUB also differs from the Web in some respects. The differences are as follows.

The table of contents is expressed in a hierarchical XML format called an NCX file. The EPUB reader reads the NCX and displays the table of contents on its own screen. You can jump from a table of contents entry to the body text.
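
As a rough illustration of the hierarchical NCX format, the following Python sketch builds a navMap from nested table of contents entries. The function name build_ncx and the sample file names are assumptions made for this example, and the head metadata that a complete NCX file needs is omitted for brevity.

```python
import xml.etree.ElementTree as ET

NCX_NS = "http://www.daisy.org/z3986/2005/ncx/"


def build_ncx(title, entries):
    """Return NCX markup as a string.

    entries is a list of (label, href, children) tuples; children uses
    the same shape, so the table of contents can nest to any depth.
    Hypothetical helper: the <head> metadata of a full NCX is left out.
    """
    ET.register_namespace("", NCX_NS)
    ncx = ET.Element(f"{{{NCX_NS}}}ncx", {"version": "2005-1"})
    doc_title = ET.SubElement(ncx, f"{{{NCX_NS}}}docTitle")
    ET.SubElement(doc_title, f"{{{NCX_NS}}}text").text = title
    nav_map = ET.SubElement(ncx, f"{{{NCX_NS}}}navMap")
    counter = [0]  # running number shared by all nesting levels

    def add_point(parent, label, href, children):
        counter[0] += 1
        point = ET.SubElement(
            parent,
            f"{{{NCX_NS}}}navPoint",
            {"id": f"nav{counter[0]}", "playOrder": str(counter[0])},
        )
        nav_label = ET.SubElement(point, f"{{{NCX_NS}}}navLabel")
        ET.SubElement(nav_label, f"{{{NCX_NS}}}text").text = label
        # src is the jump target in the body text
        ET.SubElement(point, f"{{{NCX_NS}}}content", {"src": href})
        for child in children:
            add_point(point, *child)

    for entry in entries:
        add_point(nav_map, *entry)
    return ET.tostring(ncx, encoding="unicode")


toc = [
    ("Chapter 1", "ch1.xhtml", [("Section 1.1", "ch1.xhtml#s1", [])]),
    ("Chapter 2", "ch2.xhtml", []),
]
ncx_xml = build_ncx("Sample book", toc)
```

Each navPoint pairs a label with a content target, which is what lets the reader jump from an entry to the body text.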

One book is composed of many articles, and each article is an XHTML file. Some EPUB readers are low-priced dedicated devices; the Sony Reader, for instance, seems to fail when it is given a large XHTML file. You therefore need to split the XHTML into small units. The spine section lists the reading order of the contents, and the EPUB reader steps through the XHTML files in the order given in the spine.
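
The relationship between the split XHTML files and the spine can be sketched as follows, assuming the book has already been divided into one file per chapter. The function name build_manifest_and_spine and the item ids are assumptions made for this example; the metadata section and the unique-identifier attribute of a complete OPF package are omitted.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"


def build_manifest_and_spine(xhtml_files, ncx_href="toc.ncx"):
    """Return an OPF package fragment listing files in reading order.

    Hypothetical helper: a complete OPF file also needs <metadata>.
    """
    ET.register_namespace("", OPF_NS)
    package = ET.Element(f"{{{OPF_NS}}}package", {"version": "2.0"})
    manifest = ET.SubElement(package, f"{{{OPF_NS}}}manifest")
    # The NCX table of contents is itself a manifest item.
    ET.SubElement(
        manifest,
        f"{{{OPF_NS}}}item",
        {"id": "ncx", "href": ncx_href, "media-type": "application/x-dtbncx+xml"},
    )
    spine = ET.SubElement(package, f"{{{OPF_NS}}}spine", {"toc": "ncx"})
    for number, href in enumerate(xhtml_files, start=1):
        item_id = f"item{number}"
        # The manifest declares the file ...
        ET.SubElement(
            manifest,
            f"{{{OPF_NS}}}item",
            {"id": item_id, "href": href, "media-type": "application/xhtml+xml"},
        )
        # ... and the spine fixes its position in the reading order.
        ET.SubElement(spine, f"{{{OPF_NS}}}itemref", {"idref": item_id})
    return ET.tostring(package, encoding="unicode")


opf_xml = build_manifest_and_spine(["ch1.xhtml", "ch2.xhtml", "ch3.xhtml"])
```

The spine refers to manifest items by id rather than by file name, so reordering the book means reordering itemref elements only.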

The book is described with the OPF package format and archived as a zip file. An EPUB should be self-contained within the package so that it can be read without a connection to the Web. For this reason, relative links and references are preferable; the target of an absolute link might not be displayed.

Creating the OPF, spine, NCX, and so on by hand would be laborious, so a dedicated EPUB tool is used. An EPUB tool such as Sigil is often used together with manual work.
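
The zip packaging step can be sketched with Python's standard zipfile module. The function name package_epub and the OEBPS/content.opf path are assumptions made for this example. The sketch follows the EPUB container convention that the mimetype entry comes first and is stored uncompressed, and that META-INF/container.xml points to the OPF file.

```python
import io
import zipfile

# Points the reader at the OPF file; the OEBPS/ path is an assumption.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def package_epub(files):
    """Zip the given files into an EPUB container and return its bytes.

    files maps archive paths (for example "OEBPS/ch1.xhtml") to text.
    Hypothetical helper: it does not validate the content it packages.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype entry goes first and is stored without compression.
        archive.writestr(
            "mimetype", "application/epub+zip", compress_type=zipfile.ZIP_STORED
        )
        archive.writestr(
            "META-INF/container.xml", CONTAINER_XML, compress_type=zipfile.ZIP_DEFLATED
        )
        for path, content in files.items():
            archive.writestr(path, content, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


epub_bytes = package_epub(
    {
        "OEBPS/content.opf": "<package/>",  # placeholder content
        "OEBPS/ch1.xhtml": "<html/>",  # placeholder content
    }
)
```

Because every file travels inside the archive, links between them should be relative paths such as ch1.xhtml rather than absolute URLs.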

Flow of EPUB creation using DTP

Currently, DTP software such as InDesign is often used for book production.

For instance, the book "How to make an e-book" by Uji Sakai (Gijutsu-Hyohron Co., Ltd.) recommends EPUB as an e-book format and introduces the following EPUB creation workflow. (1) Start from book data made with InDesign. (2) Transfer the data from InDesign to Dreamweaver (a Web editing tool) and create well-formed XHTML with Dreamweaver. (3) Create the EPUB by copying and pasting the content of each XHTML file into Sigil.

Creating EPUB from DTP data in this way requires manual data processing. Obviously, the productivity of such a copy-and-paste workflow is not very good.

WYSIWYG program as mainstream for paper book production

WYSIWYG production with DTP is now the mainstream method of creating documents for print. WYSIWYG means that the layout on the screen and the printed result are the same.

In printing, the layout is designed on a fixed-size page and the objects are arranged on it. The same applies to books. The finished dimensions are decided in advance; then the paper size is decided and objects such as text, images, and tables are arranged. The total number of pages is fixed, and the table of contents, the index, and so on navigate by page number. Cross-references within the text often use page numbers as well. Printing thus always presupposes a fixed page size.

EPUB is not a paged medium

The essential difference between the printed book and EPUB is that EPUB does not have fixed pages.

EPUB is read on a reading device. The size of the display area differs depending on the resolution and aspect ratio of the device's screen. In addition, the EPUB reader can change the display font size. Changing the font size changes the number of characters displayed on one screen, and therefore the total number of pages. The number of pages is thus not uniquely determined.
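
The dependence of the page count on the font size can be illustrated with a small calculation. It is only a rough model under stated assumptions: every character is a full-width square as tall and as wide as the font size, there are no margins, and the line height is 1.5 times the font size. The function name pages_needed and the sample numbers are invented for this example.

```python
import math


def pages_needed(char_count, screen_w_px, screen_h_px, font_px, line_height=1.5):
    """Estimate the number of screens needed to show char_count characters.

    Simplified model (an assumption, not how a real reader paginates):
    full-width square characters, no margins, fixed line height.
    """
    chars_per_line = screen_w_px // font_px
    lines_per_page = int(screen_h_px // (font_px * line_height))
    return math.ceil(char_count / (chars_per_line * lines_per_page))


# The same 100,000-character text on the same 600 x 800 pixel screen.
small_font_pages = pages_needed(100_000, 600, 800, font_px=16)  # 82 pages
large_font_pages = pages_needed(100_000, 600, 800, font_px=24)  # 182 pages
```

Under this model the same text takes 82 screens at 16 pixels and 182 screens at 24 pixels, so a reference such as "see page 50" cannot point to a stable location.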

For this reason, even if an exact page layout is made in EPUB, the layout collapses when displayed. Navigation aids that refer to page numbers are meaningless.

In EPUB production, you prepare content with a layout that does not depend on the page size. The EPUB reader then lays it out for the display device and displays it. Because EPUB uses CSS2 for layout, the available layout features depend on CSS2. In practice, only part of the layout defined by CSS2 can be specified, and the result depends on how faithfully the EPUB reader renders it.

Thus the approach to layout must change drastically from that of print layout.

Supporting both print production and EPUB production

As mentioned above, the layout characteristics of a book printed on paper and of EPUB are quite different, and so are the production workflows. However, paper publishing is likely to remain the main source of publishing revenue for a while. A production workflow that targets only EPUB and is independent of paper production will therefore find it hard to be profitable on its own.

It is therefore necessary to build a mechanism that supports both paper and EPUB production and can output both media at the same time at minimal cost.

So what should we do? Conceptually, the problem can be solved by using XML to separate content from layout. This challenge is exactly what XML document production technology specializes in.

However, XML document production systems have not become widespread so far, because they are expensive to develop and the systems tend to grow large. They could be achieved only where large-scale investment was available, as with the manual production systems of big enterprises.

Until now, something as large-scale as an XML document system has been considered unsuitable for the book field. However, I believe EPUB makes it possible to apply XML document technology to books as well.

CAS-UB is a solution for book editing and production that makes good use of XML technology.

What is the universal book?

The source from which both a paper book and an EPUB are produced is called a universal book. A universal book is roughly as follows:

Article authoring

The first barrier to introducing XML is the difficulty of authoring the XML manuscript.

I therefore propose using Wiki notation for content markup in CAS-UB.

Creating a frame structure of the book on the Web

A book has a large-scale structure of front matter, body text, and back matter. The body text in turn has a hierarchical structure such as part, chapter, and paragraph. When creating a book, it is necessary to design and specify this hierarchical structure.

In XML, structure is expressed as a tree. XHTML content, however, is essentially flat: headings and paragraphs follow one another, with no nesting that corresponds to parts and chapters. HTML5 introduces a structural element named "section". By using the section element, the document can be given a tree structure.
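
The step from a flat sequence of headings to a tree of nested section elements can be sketched as follows. This is only an illustration of the idea, not how CAS-UB works; the function names and the (level, title) input format are assumptions made for this example.

```python
from xml.sax.saxutils import escape


def build_section_tree(headings):
    """Turn a flat list of (level, title) pairs into a nested tree.

    Levels start at 1 for the top of the hierarchy (for example a part).
    """
    root = {"title": None, "level": 0, "children": []}
    stack = [root]  # path from the root to the most recent node
    for level, title in headings:
        node = {"title": title, "level": level, "children": []}
        # Climb back up until the top of the stack can be the parent.
        while stack[-1]["level"] >= level:
            stack.pop()
        stack[-1]["children"].append(node)
        stack.append(node)
    return root


def render_sections(node, depth=0):
    """Return indented markup lines with one <section> per tree node."""
    pad = "  " * depth
    lines = []
    for child in node["children"]:
        lines.append(f"{pad}<section>")
        lines.append(f"{pad}  <h1>{escape(child['title'])}</h1>")
        lines.extend(render_sections(child, depth + 1))
        lines.append(f"{pad}</section>")
    return lines


flat = [(1, "Part I"), (2, "Chapter 1"), (2, "Chapter 2"), (1, "Part II")]
tree = build_section_tree(flat)
markup = "\n".join(render_sections(tree))
```

Here the two chapters end up nested inside the section for Part I, while Part II becomes a sibling section at the top level.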

However, it is very difficult for a beginner to understand the tree structure and mark it up in XML.

In CAS-UB, therefore, structural markup is applied by manipulating a tree of chapters and paragraphs in the Web browser. CAS-UB introduces the idea of a publication class as a model of the book's frame structure to support this structural markup.

Actually, I learned this idea from DITA (Darwin Information Typing Architecture). In DITA, an article is expressed as a topic, and topics are assembled by means of a map. EPUB is similar to DITA in this respect: the articles are XHTML and the structure is expressed with the OPF, spine, and NCX.

Specifying the output layout by stylesheet

The third barrier to introducing XML is layout specification.

Because an XML document is content without layout, a layout suited to the output medium is added with a stylesheet.

As for stylesheets, EPUB uses CSS, and PDF for printing uses XSL-FO.

CAS-UB provides ready-made layout themes, which can be customized if necessary.

Collaborative editing environment

In the future, CAS-UB will provide a unique value-added capability: Web-based collaborative editing.

All versions of a document can be managed, multiple people can edit collaboratively, and previous versions and the differences between versions can be retrieved.

The collaborative editing feature is not yet supported. However, a version control system (SVN) is already implemented in the current service, so it is almost ready.

Supporting document processing by computer

Various kinds of information that support reading are added to a book. CAS-UB can add the following information automatically by processing the data by computer.

EPUB and PDF output at once

Currently the following two output formats are supported.

EPUB 2.0 is generated at present; once EPUB 3.0 is published, shifting to 3.0 will be easy.

Comparison with similar systems

We referred to the following services when designing CAS-UB.

The differences between these and CAS-UB are as follows.

Blog publication

The chapter structure needed for a book cannot be defined. Moreover, indexes, ruby, notes, and so on cannot be processed, because a blog only shows sentences in the body text.

IdeoType (Project IdeoType)

Based on TeX and HTML. It is excellent, but TeX comes from an older generation of batch processing and has no concept of separating content from layout.

Word2EPUB

One method is to edit the content with Word and transform it into XML. Because Word is WYSIWYG, it has a problem similar to that of DTP.

DITA for Publishers

It is an open-source project that edits content with DITA and outputs it to PDF, EPUB, Mobipocket, and other formats. However, I do not think DITA is suited to books, because it is too complex.

Potential users and customer type

First of all, I expect the service to be used by departments that create books, at publishers and other enterprises.

Private individuals who want to become authors or editors are also prospective users.

Future plan

The following features are scheduled to be enhanced in the future.


Copyright © 1996 - 2011 Antenna House, Inc. All rights reserved.
Antenna House is a trademark of Antenna House, Inc.