Command-line Interface

The command-line version is a program that runs from the Windows command prompt. It converts an input docx file to an HTML file.

Conversion options

When running the command line, specify the input file name (required), the output file name (required), and any conversion options after the SBHCmd command.

The following table shows the parameters for conversion options. The input and output files are mandatory; the other parameters are specified only when necessary. If no optional parameters are specified, the default operation is used.
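
The ordering described above (program name, input file, output file, then options) can be sketched in Python as follows. The helper name build_sbhcmd_args is hypothetical and not part of the product; only the SBHCmd program name and the parameter spellings come from this manual.

```python
from typing import List, Sequence


def build_sbhcmd_args(input_file: str, output_file: str,
                      options: Sequence[str] = ()) -> List[str]:
    """Return an argument list: program, input file, output file, options.

    Hypothetical helper for illustration only. It arranges strings and
    does not check that the option names are valid.
    """
    if not input_file or not output_file:
        raise ValueError("input and output file names are both required")
    return ["SBHCmd", input_file, output_file, *options]


# No options: the converter's default operation is used.
print(build_sbhcmd_args("NewsRelease.docx", "NewsRelease.html"))

# With options: each parameter and each value is a separate list item.
print(build_sbhcmd_args("NewsRelease.docx", "NewsRelease.html",
                        ["-xhtml", "-lang", "en"]))
```

A list of this shape can be passed to a process launcher without any shell quoting, which is why the sketch returns a list rather than a single string.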

Parameter Functions
<input-file> (Required) Specify the input file name.
<output-file> (Required) Specify the output file name.
-clrsettings Clears option settings that have already been specified, for example in the default settings file.
-settings <settings-file> Reads the conversion option settings file specified in <settings-file>.
-xhtml By default, tags are output in HTML syntax. If -xhtml is specified, tags are output in XML syntax.
-viewport <content>

Outputs a meta tag of the following format to <head>.

<meta name="viewport" content="Content specified in 'content'">

-endl Outputs a line break at the end of the block tag.
-emptyP By default, blank lines (lines with line breaks only) in Word are ignored when outputting HTML. When this option is specified, one empty <p></p> element is output for each blank line.
-nonrefiid While a document is edited in Word, many IDs that are not referenced internally may be created. By default, the converter scans for IDs that are not referenced internally and deletes them when outputting HTML. When this option is specified, unreferenced IDs are not deleted.
-imgwidth Outputs the width of the image.
-hstrong Ignores the emphasis specified in the heading style.
-embedimg When this option is not specified (default), images are output to the image folder.
When this option is specified, the images are embedded in the body HTML with a data URL.
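
For readers unfamiliar with data URLs, the following Python sketch shows the general technique of embedding image bytes in an src attribute. It is not the converter's implementation; the function name to_data_url is hypothetical, and the image formats and MIME types the converter actually emits are not stated in this manual.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str) -> str:
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Placeholder bytes stand in for real image data in this sketch.
url = to_data_url(b"abc", "image/png")
print(f'<img src="{url}">')
```

Embedding keeps the HTML self-contained, at the cost of a larger file than one that links to images in a separate folder.
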
-(x|o)math Specifies the output format for formulas edited in the Word formula editor.
The following four output formats can be specified:
  • Unspecified: Output formulas to <img> tags as files in SVG format.
  • -math: Output formulas to <img> tags as files in MathML format.
  • -xmath: Output formulas in MathML format markup.
  • -omath: Output formulas in Word's own Office Math format.
-throughimg Outputs the image in its original format inserted into Word.
-pstyle Outputs the style name of the paragraph by setting it as the value of the class attribute.
-citation Outputs the tag value of the Citation field as the value of the href attribute of an <a> tag.
-textcolor Outputs the color specified for the text as <span style="color:color value">.
-italic n|t|s Specifies the output method when italics are specified for text:
  • -italic n: Do not output. (default)
  • -italic t: Output as <i> tag.
  • -italic s: Output as <span style="font-style:italic;">.
-underline n|t|s Specifies the output method when underline is specified for text:
  • -underline n: Do not output. (default)
  • -underline t: Output as <u> tag.
  • -underline s: Output as <span style="text-decoration-line:underline;">.
-linethrough n|t|s Specifies the output method when strikethrough is specified for text:
  • -linethrough n: Do not output. (default)
  • -linethrough t: Output as <del> tag.
  • -linethrough s: Output as <span style="text-decoration-line: line-through;">.
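
The n|t|s choice can be pictured with the following Python sketch, which uses -linethrough as the example. The function render_linethrough is hypothetical and only mimics the documented output forms; -italic and -underline follow the same three-way pattern with their own tags.

```python
def render_linethrough(text: str, mode: str = "n") -> str:
    """Wrap struck-through text according to the -linethrough mode."""
    if mode == "n":
        # Default: the strikethrough is not output at all.
        return text
    if mode == "t":
        # Output as a <del> tag.
        return f"<del>{text}</del>"
    if mode == "s":
        # Output as a <span> carrying an inline style.
        return ('<span style="text-decoration-line: line-through;">'
                f"{text}</span>")
    raise ValueError(f"unknown mode: {mode!r}")


for mode in ("n", "t", "s"):
    print(mode, render_linethrough("old price", mode))
```
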
-encoding <encoding> To output HTML files in a character code (encoding method) other than Unicode's UTF-8, specify the encoding method with this parameter.
  • -encoding Shift_JIS: Output in Shift-JIS (see Note 1).
  • -encoding UTF-16: Output in Unicode's UTF-16 encoding.
Note 1: Because Shift-JIS defines fewer characters than Unicode, Unicode characters that cannot be represented in Shift-JIS are output as &#xcharacter_number; (character_number is a hexadecimal number). Note that the old model-dependent characters added by Microsoft to JIS X0208 (e.g., ①, ②) are treated as Shift-JIS characters.
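
The fallback described in Note 1 can be approximated with the Python sketch below. It is an illustration, not the converter's code. Two assumptions are made: Python's cp932 codec is used because it includes the Microsoft additions such as ① (the plain shift_jis codec does not), and the hexadecimal digits are written in upper case, which the manual does not specify. The function name escape_for_shift_jis is hypothetical.

```python
def escape_for_shift_jis(text: str, codec: str = "cp932") -> str:
    """Replace characters the codec cannot encode with &#x...; references."""
    pieces = []
    for ch in text:
        try:
            ch.encode(codec)          # encodable: keep the character as is
            pieces.append(ch)
        except UnicodeEncodeError:    # not encodable: hexadecimal reference
            pieces.append(f"&#x{ord(ch):X};")
    return "".join(pieces)


# ① is kept because cp932 treats it as a Shift-JIS character;
# the emoji has no Shift-JIS code and becomes a character reference.
print(escape_for_shift_jis("Step ① 😀"))
```
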
-defstyle When this option is specified, the <style> element (element specifying the default CSS style) in <head> is not output.
-spaceindent When this option is specified, the indentation is converted to a single full-width space when one or more indentations are specified at the beginning of a paragraph.
-outputbr Instead of enclosing a paragraph in a <p> tag, a <br> tag is output at the end of the paragraph. This option is invalid when the -xhtml parameter is specified.
-fileimages Names the folder that stores image files "destination_file_name.images".
-css cssfile [media] Links the CSS file. Place the CSS file in a folder on Windows and specify its path. An error occurs if the specified CSS file does not exist. You can optionally specify "media".

Outputs a link tag of the following format in <head>.

<link href="xxx.css" rel="stylesheet" type="text/css" media="print">

The specified CSS file is copied to the HTML output destination folder. You can specify multiple pairs of -css and CSS files.
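
Because -css can be repeated and its media value is optional, assembling the arguments programmatically can be error-prone. The following Python sketch shows one way to do it. The helper css_args and the file name print.css are hypothetical; sample-news.css is the file name used in the examples in this manual.

```python
from typing import List, Optional, Sequence, Tuple


def css_args(stylesheets: Sequence[Tuple[str, Optional[str]]]) -> List[str]:
    """Expand (css_file, media) pairs into repeated -css arguments."""
    args: List[str] = []
    for css_file, media in stylesheets:
        args += ["-css", css_file]
        if media:                     # media is optional and may be omitted
            args.append(media)
    return args


print(css_args([("sample-news.css", None), ("print.css", "print")]))
```
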
-js javascript-path Outputs a script tag in <head> with the path (URL) of the JavaScript file as the value of its src attribute. No error occurs even if the specified JavaScript path does not exist.
-savesettings <settings-file> Saves the conversion option values specified on the command line to the file named in <settings-file>.
-savedefault Saves the conversion option values specified on the command line to the default settings file (def-settings.xml).
-split 1|2|3 Splits the output HTML into multiple files. The -tocout and -pagenavi parameters apply to the split HTML files.
-tocout

When this parameter is specified together with the -split parameter, the table of contents inserted by the Word table of contents function is output as a separate HTML file (inc-toc.html).

The inc-toc.html file can be included in the split HTML files using JavaScript. inc-toc.html contains only the tags for the table of contents; tags such as <head> and <body> are not output.

Please refer to the following web page for a sample of how to include a table of contents using JavaScript.

www.antennahouse.com/html-on-word-samples

If this parameter is not specified, the table of contents will be output at the top of all the split HTML files.

-pagenavi language

When this parameter is specified, links to the previous and next pages are output at the top (immediately after the table of contents, if any) and bottom of each HTML file produced by splitting with the -split parameter.

If "ja" is specified in the "language" field following -pagenavi, "前へ" and "次へ" links are output in Japanese.

If you specify anything other than "ja" in the "language" field or omit it, "Prev" and "Next" links will be output in English.

If the previous or next page does not exist, each link is omitted.
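
The rules above (label language, and omission of a link at either end) can be sketched in Python as follows. This only models the documented behavior; the function page_links and the page file names are hypothetical, since the manual does not state how split files are named.

```python
from typing import List, Optional, Sequence, Tuple


def page_links(pages: Sequence[str], index: int,
               language: Optional[str] = None) -> List[Tuple[str, str]]:
    """Return (label, target) pairs for the page at the given index."""
    # "ja" selects Japanese labels; anything else, or nothing, gives English.
    if language == "ja":
        prev_label, next_label = "前へ", "次へ"
    else:
        prev_label, next_label = "Prev", "Next"

    links: List[Tuple[str, str]] = []
    if index > 0:                     # no previous link on the first page
        links.append((prev_label, pages[index - 1]))
    if index < len(pages) - 1:        # no next link on the last page
        links.append((next_label, pages[index + 1]))
    return links


pages = ["part1.html", "part2.html", "part3.html"]  # hypothetical names
print(page_links(pages, 0))
print(page_links(pages, 1, "ja"))
print(page_links(pages, 2, "fr"))
```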

-lang language

With this option, you can specify the language (lang attribute) to be output in the <html> tag of the output HTML file. Specify the language code in the "language" field following -lang. (e.g. "ja" for Japanese, "en" for English.)

If "none" is specified for "language", the lang attribute is not output to the <html> tag.

If this option is not specified, or if values other than single-byte alphanumeric characters or single-byte hyphens are specified, "ja" (Japanese) or "en" (English) is output, inferred from the Word document.

If the "-xhtml" parameter is specified, the language code is output in both the xml:lang attribute and the lang attribute of the <html> tag.

e.g. <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="ja" lang="ja">
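
The decision rules for -lang can be summarized with the Python sketch below. It is a model of the text above, not the converter's code. The function html_lang_attributes is hypothetical, the language inferred from the Word document is passed in by the caller, "single-byte alphanumeric" is taken to mean ASCII letters and digits, and the behavior of "none" combined with -xhtml (no attribute at all) is an assumption.

```python
import re
from typing import Dict, Optional

# ASCII letters, digits, and hyphens only.
_VALID_CODE = re.compile(r"[A-Za-z0-9-]+")


def html_lang_attributes(lang_option: Optional[str], inferred: str,
                         xhtml: bool = False) -> Dict[str, str]:
    """Return the language attributes to place on the <html> tag."""
    if lang_option == "none":
        return {}                     # lang attribute is not output
    if lang_option is None or not _VALID_CODE.fullmatch(lang_option):
        code = inferred               # fall back to "ja" or "en" from Word
    else:
        code = lang_option
    attrs: Dict[str, str] = {}
    if xhtml:
        attrs["xml:lang"] = code      # XHTML output carries both attributes
    attrs["lang"] = code
    return attrs


print(html_lang_attributes("en", inferred="ja"))
print(html_lang_attributes(None, inferred="ja", xhtml=True))
print(html_lang_attributes("none", inferred="en"))
```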

-tablestyle Outputs the background color, border thickness, color, style (only some styles are supported), and table width specified for tables and table cells in a Word document using the style attribute of each HTML tag.
-v Shows the version, copyright and license information. Cannot be used with any other parameter.

Command line operation examples

The following is an example of using the command line with NewsRelease.docx as the original file name, NewsRelease.html as the destination file name, and sample-news.css as the CSS file.

SBHCmd NewsRelease.docx NewsRelease.html -css sample-news.css

If the conversion is successful, the following message is displayed and an HTML file is created.

Converting finished normally.
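
When the converter is called from another program, the success message can be checked automatically. The Python sketch below shows one possible approach. It assumes the message appears on standard output or standard error, which the manual does not specify, and it says nothing about exit codes. The function conversion_succeeded is hypothetical, and the example uses a stand-in command so that it runs without SBHCmd installed.

```python
import subprocess
import sys
from typing import Sequence

SUCCESS_MESSAGE = "Converting finished normally."


def conversion_succeeded(command: Sequence[str]) -> bool:
    """Run a command and report whether the success message was printed."""
    result = subprocess.run(command, capture_output=True, text=True)
    # Assumption: the message may be on either stream, so check both.
    return SUCCESS_MESSAGE in (result.stdout + result.stderr)


# Stand-in for the real call, which would look like:
# ["SBHCmd", "NewsRelease.docx", "NewsRelease.html", "-css", "sample-news.css"]
stand_in = [sys.executable, "-c", "print('Converting finished normally.')"]
print(conversion_succeeded(stand_in))
```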

Running the Command-line Program from a Shell Script

In Docx to HTML Converter of OSDC V10.0 for Linux, the installation program places a shell script file named run.sh in [Install directory]. This is a sample shell script for running the command-line program SBHCmd. The script sets the necessary environment variables in the shell and then runs SBHCmd. To run the command-line program of Docx to HTML Converter of OSDC V10.0 for Linux using this script, enter the following commands in your terminal window:

$ cd [Install directory]
$ ./run.sh NewsRelease.docx NewsRelease.html -css sample-news.css

If the conversion is successful, the following message is displayed and an HTML file is created.

Converting finished normally.

The same parameters in the same formats apply to both SBHCmd and run.sh.