PDF

The PDF activities are located in Studio Pro under Document Processing > PDF. They include "Read Text", "Convert to Image", "Extract Page Range", and more.

Activities

  • Read Text: Extract the text layer from a PDF file
  • Convert to Image: Convert a PDF file to an image
  • Extract Page Range: Extract selected pages from the PDF file to a new PDF file
  • Get PDF Page Count: Count the number of pages in a PDF file
  • Combine to PDF: Combine multiple files into a single multi-page PDF

Read Text

Description

This activity extracts the text layer from a PDF file and saves it as a string.

🚧

Note

Editable fields in a PDF file cannot be read with this activity.

Parameters

Path

  1. Set a value: enables you to directly write the desired path.
  2. Save the previous step result: uses the result of the previous activity as the path.
  3. Calculate a value: enables you to use available properties and methods to form a path.

Display spaces and linebreaks

When enabled, this parameter keeps the whitespace and line-break characters (\r and \n) in the resulting text variable. Disable the parameter if you do not need them.
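
As a rough illustration of what these characters look like inside a string, the sketch below post-processes text that still contains \r and \n. The sample text and the helper names split_lines and flatten are hypothetical and not part of Studio Pro, and the exact characters the activity emits are an assumption.

```python
def split_lines(text: str) -> list[str]:
    """Split text into lines, treating CRLF, CR and LF line endings alike."""
    # str.splitlines() recognises "\r\n", "\r" and "\n" as line boundaries.
    return text.splitlines()


def flatten(text: str) -> str:
    """Collapse every run of whitespace, line breaks included, to one space."""
    return " ".join(text.split())


# Hypothetical text, standing in for the result of "Read Text".
sample = "Invoice 42\r\nTotal: 100\nThank you"

print(split_lines(sample))
print(flatten(sample))
```

The first helper is closer to keeping the line structure, while the second approximates a result without line breaks.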

Comment

This parameter contains an annotation of the activity. The input text will be displayed above the activity name.

Result

To save the result to a variable, you need to attach a "Save value to variable" activity to this activity, specify the desired variable name, and select the "Save the previous step result" option. The result is saved as a string.

Note that in some cases this activity may not return the text in the same reading order as the original document. To extract specific information from structured documents, consider using the OCR activities instead.


Convert to Image

Description

This activity converts a PDF file to an image, or to a set of images if the PDF file has multiple pages.

🚧

Note

Editable fields in a PDF file cannot be read with this activity.

Parameters

File type

Select the format of the pictures created after the conversion.
Supported output formats:

  • .jpeg
  • .png
  • .tiff

PDF file path

  1. Set a value: enables you to directly write the desired path to the input PDF file.
  2. Save the previous step result: chooses the previous activity result as a path.
  3. Calculate a value: enables you to use available properties and methods to form a path.

Output path

  1. Set a value: enables you to directly write the desired path to the folder where the output images will be placed.
  2. Save the previous step result: chooses the previous activity result as a path.
  3. Calculate a value: enables you to use available properties and methods to form a path.

File prefix

  1. Set a value: enables you to specify a prefix that will be added to the names of the output images.
  2. Save the previous step result: chooses the previous activity result as a prefix.
  3. Calculate a value: enables you to use available properties and methods to form a prefix.

📘

Note

The name of each generated file consists of the value from the "File prefix" field followed by a counter that runs from 1 to the number of pages in the original PDF file. For instance, if the "File prefix" parameter is set to "Test_" and the original PDF file has 2 pages, the files "Test_1" and "Test_2" are generated in the output folder.
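
The naming rule in this note can be sketched as follows. The function name expected_image_names is hypothetical, and the file extension is left out because the note does not state how it is appended.

```python
def expected_image_names(prefix: str, page_count: int) -> list[str]:
    """List the base names described in the note: prefix plus a page counter."""
    # The counter starts at 1 and stops at the page count of the source PDF.
    return [f"{prefix}{page}" for page in range(1, page_count + 1)]


print(expected_image_names("Test_", 2))
```

A list like this can help a later step check that every expected image is present in the output folder.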

Pages

  1. Set a value: allows you to manually specify one or more pages to be converted. For example, you can specify individual pages (1,3) or a range (5-7).
  2. Calculate a value: allows you to use a special formula or a special method to determine the pages you need.
  3. Save the previous step result: takes the result of the previous workflow activity as the page numbers.
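
To make the page notation concrete, here is a minimal sketch that expands a specification such as "1,3,5-7" into individual page numbers. The function parse_pages is hypothetical, and how the activity itself interprets the value (for example, whether it accepts mixed lists and ranges or spaces) is an assumption.

```python
def parse_pages(spec: str) -> list[int]:
    """Expand a page specification like "1,3,5-7" into page numbers."""
    pages: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate a stray comma
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            pages.extend(range(start, end + 1))  # ranges are inclusive
        else:
            pages.append(int(part))
    return pages


print(parse_pages("1,3,5-7"))
```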

Comment

Contains an annotation of the activity. The input text will be displayed above the activity name.


Extract Page Range

Description

This activity extracts selected pages from a PDF file to a new PDF file.

Parameters

PDF file path

  1. Set a value: allows you to manually specify the path to the PDF file from which you want to extract the page range. Click the "PICK" button to open the file explorer and pick the required directory.
  2. Calculate a value: allows you to use a special formula or a special method to determine the path.
  3. Save the previous step result: takes the result of the previous workflow activity as the path.

Output path

  1. Set a value: allows you to manually specify the path where the PDF file is to be created. Click the "PICK" button to open the file explorer and pick the required directory.
  2. Calculate a value: allows you to use a special formula or a special method to determine the path.
  3. Save the previous step result: takes the result of the previous workflow activity as the path.

Pages

  1. Set a value: allows you to manually specify the pages to be extracted from the PDF document. You can specify individual pages (1,3) or a range (5-7).
  2. Calculate a value: allows you to use a special formula or a special method to determine the range of pages.
  3. Save the previous step result: takes the result of the previous workflow activity as the range of pages.

You can also build the page range from variables, which is useful when the number of pages comes from another activity. For example, use the expression "1-"+pdf_page_count, where "1-" sets the first page of the range and pdf_page_count is the name of the variable that contains the number of pages.
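
The same idea, written as a small Python sketch for illustration only. The helper page_range and the sample value of pdf_page_count are hypothetical, and the expression language used inside Studio Pro may differ from Python.

```python
def page_range(first_page: int, last_page: int) -> str:
    """Build a range string such as "1-12" from two page numbers."""
    if first_page < 1 or last_page < first_page:
        raise ValueError("expected 1 <= first_page <= last_page")
    return f"{first_page}-{last_page}"


pdf_page_count = 12  # hypothetical value, e.g. from "Get PDF Page Count"
print(page_range(1, pdf_page_count))
```

Validating the bounds before building the string avoids passing an empty or reversed range to the activity.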

Comment

Allows you to add explanatory text to the block. The text will be displayed inside the block on top of the activity name.


Get PDF Page Count

Description

This activity counts the pages in a PDF file and returns the count to a variable.

Parameters

PDF file path

  1. Set a value: allows you to manually specify the path to the target PDF file. Click the "PICK" button to open the file explorer and pick the required directory.
  2. Calculate a value: allows you to use a special formula or a special method to determine the path.
  3. Save the previous step result: takes the result of the previous workflow activity as the path.

When you add this activity, a "Save value to variable pdf_page_count" block is created automatically, so the number of pages in the PDF file is saved in the variable pdf_page_count.


Combine to PDF

Description

This activity combines multiple files from the same folder into a single multi-page PDF document. Supported file formats: .pdf, .docx, .xlsx, .rtf, .tiff, .bmp, .jpeg, .jpg, .gif, .png, .html, .xps. The files are combined in alphabetical order by file name.
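
Because the merge order depends on file names, it can help to preview that order before running the activity. The sketch below filters a list of names by the supported formats and sorts them alphabetically. The function merge_order and the sample names are hypothetical, and treating the sort as case-insensitive and purely character-based is an assumption about how the activity compares names.

```python
from pathlib import PurePath

# Formats listed in the description above.
SUPPORTED = {
    ".pdf", ".docx", ".xlsx", ".rtf", ".tiff", ".bmp",
    ".jpeg", ".jpg", ".gif", ".png", ".html", ".xps",
}


def merge_order(file_names: list[str]) -> list[str]:
    """Return the supported files in plain alphabetical order."""
    supported = [
        name for name in file_names
        if PurePath(name).suffix.lower() in SUPPORTED
    ]
    # Character-by-character comparison: "page10" sorts before "page2".
    return sorted(supported, key=str.lower)


print(merge_order(["page10.png", "page2.png", "cover.pdf", "notes.txt"]))
```

If the activity orders names this way, zero-padded counters (page02, page10) keep the pages in numeric sequence.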

📘

Note

The activity does not scale down larger images when incorporating them into a PDF.

Parameters

Path to folder

  1. Set a value: allows you to manually specify the path to the folder that contains the files to be combined into a PDF document. Click the "PICK" button to open the file explorer and select the desired folder.
  2. Calculate a value: allows you to use a special formula or a special method to determine the path to folder.
  3. Save the previous step result: takes the result of the previous workflow activity as the path to folder.

PDF file path

  1. Set a value: allows you to manually specify the directory where the PDF file is to be created. Click the "PICK" button to open the file explorer and pick the required directory.
  2. Calculate a value: allows you to use a special formula or a special method to determine the path.
  3. Save the previous step result: takes the result of the previous workflow activity as the path.

Comment

Allows you to add explanatory text to the block. The text is displayed inside the block on top of the activity name.
