Using Microsoft Flow to Split PDFs based on Data Values

In this article, we will explain how to use Microsoft Flow to process a document within a SharePoint Library and split the document into parts based on specific values within the document. 

The document used in this example contains a mixture of single-page and multipage invoices. The aim of the flow is to split these invoices into individual PDF files by Invoice Number. Below is a screenshot of one of the pages from our example document, with the Invoice Number that we want to split by highlighted.

The first step is to define the trigger for our flow. In our example, the flow is triggered when an item is created in SharePoint; we then use the Aquaforest PDF Connector to define the values that we wish to extract.

  1. Create a new Automated Flow, “When an item is created”
  2. Specify the Location

  3. Add a step to get the contents of the file.

  • Specify the Site Address and the “Identifier”.

  4. Add an “Aquaforest – Split PDF by Text” step using the Aquaforest PDF Connector (see https://www.aquaforest.com/en/aquaforest-flow-doc.asp for more information).

  5. Specify the following parameters:

    a. File Content: the file content output of the SharePoint file content step
    b. Filename: the SharePoint filename with extension
    c. Text Select -1: “All Text in line after value”
    d. Expression Item: “INVOICE#”
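
To make the last two parameters more concrete, here is a minimal Python sketch of what a rule such as “All Text in line after value” combined with the expression “INVOICE#” could mean for the text of a single page. It only illustrates the idea and is not the connector's actual implementation; the function name `text_after_value` and the sample page text are hypothetical.

```python
from typing import Optional


def text_after_value(page_text: str, marker: str) -> Optional[str]:
    """Return the text that follows `marker` on the first line containing it.

    Hypothetical helper for illustration; the connector's real matching
    rules (case sensitivity, whitespace handling) may differ.
    """
    for line in page_text.splitlines():
        position = line.find(marker)
        if position != -1:
            # Keep everything after the marker, without surrounding spaces.
            return line[position + len(marker):].strip()
    return None  # the marker does not appear on this page


# Assumed sample text for one invoice page.
sample_page = "ACME Ltd\nINVOICE# 10023\nTotal due: 150.00"
print(text_after_value(sample_page, "INVOICE#"))  # prints 10023
```

In this sketch, the returned value plays the role of the Invoice Number that each output file is named by.
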
  6. The next step is to loop over all the splits to create the output documents. To do this we need a loop counter, so we create an Integer variable named LoopCount.

  7. Add a condition to check whether the Split step was successful before entering the loop that creates the split documents:

    a. Add a Condition control.
    b. In the Value box, select “Success” “is equal to” “True”.

  8. Within the “Yes” branch of the Condition, add an “Apply to each” loop:

    a. Add an “Apply to each” control.
    b. Select an output from previous steps: “Split Out Files – Array of Split Files”.
    c. Add a “SharePoint – Create File” step with the following parameters:
       Site Address: the URL of the site where you want to create the output files
       Folder Path: the library and folder path where you want to create the output files
       Filename: “Split output Files Filename”.pdf (from the Split step)
       File Content: “Split Output Files file Content” (from the Split step)
    d. Increment “LoopCount” by 1.
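
For readers who think in code, the condition and loop described in the steps above behave roughly like the Python sketch below. The shape of `split_result` (a `success` flag plus a list of files, each with a filename and content) is an assumption made for this illustration, loosely modelled on the field names used in the steps; `create_file` is a hypothetical stand-in for the SharePoint Create File action and simply writes to a local folder.

```python
import tempfile
from pathlib import Path
from typing import Dict


def create_file(folder: Path, filename: str, content: bytes) -> Path:
    """Hypothetical stand-in for the SharePoint Create File action."""
    target = folder / filename
    target.write_bytes(content)
    return target


def save_split_files(split_result: Dict, folder: Path) -> int:
    """Write every split file to `folder` and return how many were written."""
    loop_count = 0  # mirrors the LoopCount integer variable
    if split_result.get("success") is True:  # the Condition control
        for split_file in split_result["files"]:  # the Apply to each loop
            name = f"{split_file['filename']}.pdf"
            create_file(folder, name, split_file["content"])
            loop_count += 1  # the increment step
    return loop_count


# Assumed sample data; in the flow this comes from the Split PDF by Text step.
sample_result = {
    "success": True,
    "files": [
        {"filename": "10023", "content": b"%PDF-1.4 first"},
        {"filename": "10024", "content": b"%PDF-1.4 second"},
    ],
}
with tempfile.TemporaryDirectory() as tmp:
    written = save_split_files(sample_result, Path(tmp))
    print(written)  # prints 2
```

If the success flag is not true, the loop is skipped and no files are created, which is the purpose of placing the loop inside the “Yes” branch.
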

  9. Your flow should then look something like the below.

10. To run the flow, simply put a document in the Input folder to trigger it. Below is a screenshot showing the output from our sample invoice document. As you can see, our one input document containing many invoices has been split into 13 individual PDF documents (a mixture of single-page and multipage), each named by the Invoice Number it contains.
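
The way single-page and multipage invoices each end up in their own file can be pictured as grouping pages by the value extracted from them. The Python sketch below is a rough illustration only: it assumes that every page yields an invoice number, which is an assumption of this example rather than documented behaviour of the connector. The function name and the sample values are hypothetical.

```python
from typing import Dict, List


def group_pages_by_value(page_values: List[str]) -> Dict[str, List[int]]:
    """Map each extracted value to the 1-based page numbers that carry it."""
    groups: Dict[str, List[int]] = {}
    for page_number, value in enumerate(page_values, start=1):
        groups.setdefault(value, []).append(page_number)
    return groups


# Hypothetical values extracted from a five-page input document.
values = ["10023", "10024", "10024", "10025", "10025"]
for invoice, pages in group_pages_by_value(values).items():
    print(f"{invoice}.pdf <- pages {pages}")
```

Here the first invoice occupies one page and the other two occupy two pages each, so five input pages would produce three output files.
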
