12-24 -extractImage: Extracting Images

Processing

Extracts the image data contained in the pages of the input PDF.

Note:

  • Only the image data placed on the PDF page will be extracted. This is not a process that converts the entire page into an image.

Example of commands

[Executing example commands]

Extracts the image data in test.pdf and saves it to the specified folder as JPEG files.

[Windows]

AHPDFToolCmd80.exe -extractImage C:\out -format 2 -d C:\test\test.pdf

[Linux]

AHPDFToolCmd80 -extractImage /home/antenna/sav -format 2 -d /home/antenna/test/test.pdf

Folder settings: applied

You can perform batch processing by specifying an input folder for the -d parameter.

If a folder is specified, image data is extracted from each PDF file in the input folder. Specify the output folder with the <outFolderPath> parameter.

A subfolder with the input file name is created in the output folder.

The output image data is saved in each subfolder.
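The folder layout described above can be sketched as follows. This is only an illustration: the helper name `expected_subfolders` is hypothetical, and it assumes the subfolder is named after the input file without its extension, which the text above does not state explicitly.

```python
from pathlib import PurePosixPath
from typing import Dict, List


def expected_subfolders(out_folder: str, pdf_names: List[str]) -> Dict[str, PurePosixPath]:
    """Map each input PDF name to the subfolder expected in the output folder.

    Assumption: the subfolder name is the input file name without ".pdf".
    """
    out = PurePosixPath(out_folder)
    return {name: out / PurePosixPath(name).stem for name in pdf_names}


layout = expected_subfolders("/home/antenna/sav", ["a.pdf", "b.pdf"])
for pdf, folder in layout.items():
    print(pdf, "->", folder)
```

Check the actual subfolder names produced by the tool before relying on this mapping in a script.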

Parameters

Parameter

Content

<outFolderPath>

[required]
Sets the output folder path for image files.

The output file name is "input file name_page number_serial number".
Page numbers start with "1".
The serial number starts from "0001" and is reset for each page.
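As a rough illustration of the naming rule above, the following sketch builds the name without a file extension. The function `output_stem` is hypothetical. It assumes that the page number is written without zero padding and that "input file name" means the name without its extension. Neither point is confirmed by the text, so treat the result as a pattern to match against, not a guaranteed name.

```python
def output_stem(input_name: str, page_number: int, serial: int) -> str:
    """Build "input file name_page number_serial number" (no extension)."""
    if page_number < 1:
        raise ValueError("page numbers in output file names start at 1")
    if serial < 1:
        raise ValueError("serial numbers start at 0001")
    # Assumption: only the serial number is zero-padded (to four digits).
    return f"{input_name}_{page_number}_{serial:04d}"


print(output_stem("test", 1, 1))   # test_1_0001
print(output_stem("test", 3, 12))  # test_3_0012
```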

-pageNo <Val>

Sets the page numbers to extract images from. Can be omitted.

Page numbers are zero-based, so the first page is "0".
If not specified, images are extracted from all pages.

To specify multiple pages, separate them with commas. A range can be given with a hyphen.

Example) -pageNo "0,2-4"
Extracts images from page 1 and pages 3 to 5.
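Because -pageNo is zero-based while the page numbers in output file names start at 1, it can help to expand a value before running the command. The sketch below is a hypothetical helper, not part of the tool. It assumes that only comma-separated numbers and "first-last" ranges occur, as in the example above.

```python
from typing import List


def parse_page_no(spec: str) -> List[int]:
    """Expand a -pageNo value such as "0,2-4" into zero-based page indexes."""
    pages: List[int] = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            first, last = (int(x) for x in part.split("-", 1))
            if first > last:
                raise ValueError(f"invalid range: {part}")
            pages.extend(range(first, last + 1))
        else:
            pages.append(int(part))
    return pages


indexes = parse_page_no("0,2-4")
print(indexes)                   # [0, 2, 3, 4]
print([i + 1 for i in indexes])  # [1, 3, 4, 5], counted from 1
```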

-format {0 | 1 | 2 | 3}

Image save format. Can be omitted.

0 = AUTO
1 = Bitmap
2 = JPEG
3 = PNG

If not specified, "0 = AUTO" is applied.

-morePPI <Val>

Can be omitted.
If this parameter is specified, images with a resolution equal to or higher than the resolution specified by <Val> will be extracted.

It can be specified at the same time as "-lessPPI". In that case, images with a resolution equal to or higher than "-morePPI" and equal to or less than "-lessPPI" will be extracted. If neither is specified, images of all resolutions will be extracted.

-lessPPI <Val>

Can be omitted.
If this parameter is specified, images with a resolution equal to or less than the resolution specified by <Val> will be extracted.

It can be specified at the same time as "-morePPI". In that case, images with a resolution equal to or higher than "-morePPI" and equal to or less than "-lessPPI" will be extracted. If neither is specified, images of all resolutions will be extracted.
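The selection rule for -morePPI and -lessPPI can be written as a small predicate. This is a sketch of the rule as described, not the tool's implementation. The name `ppi_selected` and its arguments are hypothetical, and how the tool measures an image's resolution is not covered here.

```python
from typing import Optional


def ppi_selected(ppi: float,
                 more_ppi: Optional[float] = None,
                 less_ppi: Optional[float] = None) -> bool:
    """Return True if an image with resolution `ppi` would be extracted."""
    # -morePPI: keep images at or above this resolution.
    if more_ppi is not None and ppi < more_ppi:
        return False
    # -lessPPI: keep images at or below this resolution.
    if less_ppi is not None and ppi > less_ppi:
        return False
    # Neither bound given: images of all resolutions are extracted.
    return True


print(ppi_selected(150, more_ppi=150, less_ppi=300))  # True (bounds are inclusive)
print(ppi_selected(72, more_ppi=150))                 # False
print(ppi_selected(72))                               # True
```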

-passThrough {true | false}

Can be omitted.
Specifies whether to extract the images without making any changes [*1]. If omitted, false is assumed.

true: Images are extracted without changes.
false: Images are extracted with changes.

Only valid when the image output format is set to JPEG.
An error will occur when output formats other than JPEG are set.
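A wrapper script can catch the invalid combination before the tool is launched. The sketch below builds an argument list in the same order as the example commands above. The function `build_extract_image_args` and the `FORMAT_CODES` table are hypothetical, and placing -passThrough before -d is an assumption based on where -format appears in those examples.

```python
from typing import List

# Values of -format as listed in the parameter description.
FORMAT_CODES = {"auto": 0, "bitmap": 1, "jpeg": 2, "png": 3}


def build_extract_image_args(out_folder: str,
                             input_path: str,
                             image_format: str = "auto",
                             pass_through: bool = False,
                             exe: str = "AHPDFToolCmd80") -> List[str]:
    """Build the argument list for an -extractImage run."""
    if image_format not in FORMAT_CODES:
        raise ValueError(f"unknown format: {image_format}")
    # -passThrough is only valid when the output format is JPEG.
    if pass_through and image_format != "jpeg":
        raise ValueError("-passThrough true requires -format 2 (JPEG)")
    args = [exe, "-extractImage", out_folder,
            "-format", str(FORMAT_CODES[image_format])]
    if pass_through:
        args += ["-passThrough", "true"]
    args += ["-d", input_path]
    return args


print(build_extract_image_args("/home/antenna/sav",
                               "/home/antenna/test/test.pdf",
                               image_format="jpeg",
                               pass_through=True))
```

The returned list can be passed to `subprocess.run` as is. On Windows, pass `exe="AHPDFToolCmd80.exe"`.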

[*1]:
To extract embedded images unchanged, they must not have been recompressed when they were embedded in the PDF. Specifically, the following conditions must be met at the time of embedding:

* The color space is DeviceRGB, DeviceGray, or unspecified (=PDF_EMPTY_NAME).

* No mask is specified.

* The Decode value is the default.

* The graphic state parameter dictionary representing the transfer function, ‘ExtGState:TR’, is not specified. (Transfer functions are generally used for gamma correction.)
