Word API compares two files, a template file and a filled-out file, to obtain the inserted "data" and the corresponding "label (item name)" for each piece of data.
A template file is a file that defines labels (item names) and input fields.
In Word API, input fields are classified into paragraphs and tables.
Tables are classified into three types according to the format.
This is the input field for getting "Label (item name)-Data" and corresponds to the red frame in the above figure. Only the label (item name) is defined.
This is the input field for getting "Label (item name)-Data" and corresponds to the green box in the above figure. It is defined in a format where the label (item name) and the input column are adjacent to each other.
The blue frame in the above figure corresponds to the input field for getting "Label (item name)-Multiple data". Define a label (item name) in the header row of the table.
The purple frame in the above figure defines an input field that combines the two forms above, for getting "Label (item name)-Multiple data". The label name takes the form "column label (item name)_row label (item name)". In the above figure, the labels (item names) are "Destination_Business trip – summary", "Purpose_Business trip – summary", "Departure date_Business trip – summary" and "Return date_Business trip – summary".
As mentioned above, Word API classifies tables into three types according to their format.
This section explains how to identify the table type.
For the example of tables in this section, "with background color" indicates labels and "without background color" indicates data.
Data in "red text" indicates that the cell contents are different between two files.
This type of table corresponds to the green frame in the figure of the template file. The label (item name) and data are adjacent to each other.
The data to be acquired is basically an associative array of "label (item name)-data".
When a label (item name) appears more than once, its entry holds multiple data.
A | A_1 | C | C_1 |
B | B_1 | ||
C | C_2 |
In the case of the table above, the associative array of "label (item name)-data" is as follows:
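As an illustration of the pairing rule only, the following Python sketch reads each row as alternating label and data cells. The function name `pair_adjacent`, the list-of-lists input, and the choice to store every value in a list are assumptions made for the sketch; they are not Word API's actual interface or output format.

```python
def pair_adjacent(rows):
    """Pair each label cell with the data cell to its right (illustrative only)."""
    result = {}
    for row in rows:
        # Cells are assumed to alternate: label, data, label, data, ...
        for label, data in zip(row[0::2], row[1::2]):
            # A duplicated label accumulates multiple data under one key.
            result.setdefault(label, []).append(data)
    return result


table = [
    ["A", "A_1", "C", "C_1"],
    ["B", "B_1"],
    ["C", "C_2"],
]
pairs = pair_adjacent(table)
print(pairs)
```

Here the duplicated label `C` ends up with both `C_1` and `C_2`, while `A` and `B` each hold a single value.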
This is the type of table in which the first row is unchanged and corresponds to the blue or purple frames in the figure in the template file.
This table type is further classified into two types based on whether the first cell of the row where the first change occurs has been changed or not.
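The decision described so far can be sketched in Python as a comparison of the template grid with the filled-out grid. The function `classify_table` and the three type names it returns are hypothetical labels chosen for this sketch, and special cases such as unfilled cells are not handled.

```python
def classify_table(template, filled):
    """Classify a table by where the first change occurs (illustrative only)."""
    # A change in the first row means labels and data sit side by side.
    if template[0] != filled[0]:
        return "adjacent"
    # Otherwise find the first changed row and look at its first cell.
    for t_row, f_row in zip(template[1:], filled[1:]):
        if t_row != f_row:
            if t_row[0] != f_row[0]:
                return "header_row"
            return "header_row_and_column"
    return "unchanged"


kind_adjacent = classify_table([["A", ""]], [["A", "A_1"]])
kind_header = classify_table(
    [["A", "B"], ["", ""]],
    [["A", "B"], ["A_1", "B_1"]],
)
kind_both = classify_table(
    [["A", "B"], ["X", ""]],
    [["A", "B"], ["X", "B_1"]],
)
print(kind_adjacent, kind_header, kind_both)
```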
This type of table corresponds to the blue frame in the figure of the template file.
The data to be acquired is an associative array of "label (item name)-multiple data" in the header row and data rows with the same column number.
A | B | C | D |
A_1 | B_1 | C_1 | D_1 |
A_2 | B_2 | C_2 | D_2 |
A_3 | B_3 | C_3 | D_3 |
In the case of the table above, the associative array of "label (item name)-data" is as follows:
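A minimal Python sketch of this column-wise grouping is shown below. It assumes the table is available as a list of rows with the header row first; `collect_columns` is a name invented for the sketch and is not part of Word API.

```python
def collect_columns(rows):
    """Map each header label to the data in the same column (illustrative only)."""
    header, *data_rows = rows
    return {
        label: [row[i] for row in data_rows]
        for i, label in enumerate(header)
    }


table = [
    ["A", "B", "C", "D"],
    ["A_1", "B_1", "C_1", "D_1"],
    ["A_2", "B_2", "C_2", "D_2"],
    ["A_3", "B_3", "C_3", "D_3"],
]
columns = collect_columns(table)
print(columns["A"])
```

Each label in the header row ends up with one entry per data row, taken from the cell with the same column number.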
This type of table corresponds to the purple frame in the figure of the template file.
The data to be acquired is an associative array of "column label (item name)_row label (item name)-multiple data" in the header row and data row with the same column number.
A | B | C | D |
X | B_1 | C_1 | D_1 |
Y | B_2 | C_2 | D_2 |
Z | B_3 | C_3 | D_3 |
In the case of the table above, the associative array of "label (item name)-data" is as follows:
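The "column label (item name)_row label (item name)" naming can be sketched in Python as follows. The sketch assumes that the first cell of each data row is the row label and that values are stored in lists; `collect_cells` is a hypothetical helper, not a Word API function.

```python
def collect_cells(rows):
    """Build "column label_row label" keys for each data cell (illustrative only)."""
    header, *data_rows = rows
    result = {}
    for row in data_rows:
        row_label = row[0]
        # Skip the first column, which holds the row labels.
        for col_label, value in zip(header[1:], row[1:]):
            key = f"{col_label}_{row_label}"
            result.setdefault(key, []).append(value)
    return result


table = [
    ["A", "B", "C", "D"],
    ["X", "B_1", "C_1", "D_1"],
    ["Y", "B_2", "C_2", "D_2"],
    ["Z", "B_3", "C_3", "D_3"],
]
cells = collect_cells(table)
print(sorted(cells))
```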
So far, simple tables have been used to explain the table types, but actual tables may contain merged cells.
When acquiring data, Word API treats every cell in a merged range as containing the same value.
Take the following table as an example, in which the first and second columns, and the third and fourth columns, are merged in the second and third rows.
A | A_1 | C | C_1 |
B | B_1 | ||
C | C_2 |
In the case of the table above, it is treated as equivalent to the following table:
A | A_1 | C | C_1 |
B | B | B_1 | B_1 |
C | C | C_2 | C_2 |
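The expansion of merged cells can be sketched in Python as below. The representation of each cell as a `(value, column span)` pair and the name `expand_merged` are assumptions made for this sketch; Word API's internal handling is not documented here.

```python
def expand_merged(rows):
    """Repeat each cell's value once per column it spans (illustrative only)."""
    return [
        [value for value, span in row for _ in range(span)]
        for row in rows
    ]


merged = [
    [("A", 1), ("A_1", 1), ("C", 1), ("C_1", 1)],
    [("B", 2), ("B_1", 2)],
    [("C", 2), ("C_2", 2)],
]
expanded = expand_merged(merged)
for row in expanded:
    print(" | ".join(row))
```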
In the case of a table type with changes in the first row, the label (item name) and data are adjacent to each other, so the associative array of "label (item name)-data" is as follows:
Take the following table as an example, in which the first and second columns are merged in the third row, and the third and fourth columns are merged in the fourth row.
A | B | C | D |
A_1 | B_1 | C_1 | D_1 |
A_2 | C_2 | D_2 | |
A_3 | B_3 | C_3 |
In the case of the table above, it is treated as equivalent to the following table:
A | B | C | D |
A_1 | B_1 | C_1 | D_1 |
A_2 | A_2 | C_2 | D_2 |
A_3 | B_3 | C_3 | C_3 |
In the case of the table above, the associative array of "label (item name)-data" is as follows:
The above examples are based on the case where all cells have been filled in and the number of rows and columns are the same.
In reality, there may be cells that are not filled in, or the number of rows and columns may be different.
This section explains how to obtain the "Label (item name)-Data" for these patterns when there is no change in the first row but a change in the first cell.
An unfilled cell is a cell whose content is identical in the two files.
As shown in the table below, if the unfilled cell is the first cell of the row where the first change occurs, the condition matches "no change in the first cell". However, when the cell is merely unfilled, the table is treated the same as when the first cell has changed.
A | B | C | D |
 | B_1 | C_1 | D_1 |
A_2 | A_2 | C_2 | D_2 |
A_3 | B_3 | C_3 | C_3 |
An unfilled cell is treated as empty data so that the column numbers match.
In the case of the table above, the associative array of "label (item name)-data" is as follows:
If there is an unchanged cell in a data row, as in the table below, its content is treated as pre-populated data.
A | B | C | D |
A_1 | B_1 | C_1 | D_1 |
X | A_2 | C_2 | D_2 |
A_3 | B_3 | C_3 | C_3 |
In the case of the table above, the associative array of "label (item name)-data" is as follows:
As shown in the table below, if there is an unchanged cell at the beginning of the data row, the table is treated the same as when there is no change in the first cell, and only the first row is acquired.
A | B | C | D |
X | B_1 | C_1 | D_1 |
A_2 | B_2 | C_2 | D_2 |
A_3 | B_3 | C_3 | C_3 |
In the case of the table above, the associative array of "label (item name)-data" is as follows:
If all cells in a row are unfilled, the row is ignored.
A | B | C | D |
A_1 | B_1 | C_1 | D_1 |
A_3 | B_3 | C_3 | C_3 |
In the case of the table above, the associative array of "label (item name)-data" is as follows:
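Skipping rows in which every cell is unfilled can be sketched in Python as follows. The sketch assumes unfilled cells are represented by empty strings; `drop_unfilled_rows` is a hypothetical helper name.

```python
def drop_unfilled_rows(rows):
    """Keep the header and only data rows with at least one filled cell."""
    header, *data_rows = rows
    kept = [row for row in data_rows if any(cell != "" for cell in row)]
    return [header] + kept


table = [
    ["A", "B", "C", "D"],
    ["A_1", "B_1", "C_1", "D_1"],
    ["", "", "", ""],  # every cell unfilled, so the row is ignored
    ["A_3", "B_3", "C_3", "C_3"],
]
cleaned = drop_unfilled_rows(table)
print(len(cleaned))
```

A row with only some cells unfilled is kept, since those cells are still treated as empty data.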
When the number of cells differs between data rows, the rows are classified into the following two patterns based on the content of the first cell in the target row.
The target row is treated the same as when there is no change in the first cell, and an associative array of "column label (item name)_row label (item name)-multiple data" is acquired from the header row and the data rows with the same column numbers.
A | B | C | D |
A_1 | B_1 | C_1 | D_1 |
A_2 | B_2 | C_2 | D_2 |
X | C_3 | D_3 |
In the case of the table above, the associative array of "label (item name)-data" is as follows:
The target cell is treated as a cell where input is omitted (empty data).
A | B | C | D |
A_1 | B_1 | C_1 | D_1 |
A_2 | B_2 | C_2 | D_2 |
B_3 | C_3 | D_3 |
The above table is treated as equivalent to the following table.
A | B | C | D |
A_1 | B_1 | C_1 | D_1 |
A_2 | B_2 | C_2 | D_2 |
 | B_3 | C_3 | D_3 |
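Treating the omitted cell as empty data can be sketched in Python as below. The sketch assumes, based on the cell names in the example, that the omitted input is at the start of the row, and it represents empty data as an empty string. `pad_short_rows` is a hypothetical helper, and how Word API actually aligns short rows is not confirmed here.

```python
def pad_short_rows(rows):
    """Left-pad short data rows with empty cells to the header width."""
    header, *data_rows = rows
    width = len(header)
    # Rows already at full width are left unchanged.
    padded = [[""] * (width - len(row)) + row for row in data_rows]
    return [header] + padded


table = [
    ["A", "B", "C", "D"],
    ["A_1", "B_1", "C_1", "D_1"],
    ["A_2", "B_2", "C_2", "D_2"],
    ["B_3", "C_3", "D_3"],
]
aligned = pad_short_rows(table)
print(aligned[3])
```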
In the case of the table above, the associative array of "label (item name)-data" is as follows:
In Word API, a pair of "Label (item name)-Data" is called form data.
When Word API retrieves data through any interface other than the command line interface, it returns an array of the following classes.
Class | Descriptions |
---|---|
ParagraphFormData | This class handles paragraph input fields and has an associative array of "Label (item name)-Data" as member variables. |
TableFormData | This class handles input fields in tables and has an associative array of "Label (item name)-Multiple data" as member variables. |
The following is an example of retrieving form data through an interface other than the command line interface.
The following figure shows an image of a filled file (left) and the retrieved form data (right).
The areas with a gray background and red text represent the data written into the template file.
The retrieved form data is returned as the following associative array:
In the command line interface, the retrieved form data is displayed in the following format:
[Install directory]/samples/templates/Getting_data.zip contains sample files for data acquisition.