Auto-Split Record

The Auto-Split Record action allows you to split a single Record into multiple Records.

This action is part of our Indexing Automation toolkit. If multiple documents come in as one, you can use this action to split them. A typical scenario is scanning invoices on a copier: all invoices are scanned into one PDF and imported as one entity. The Auto-Split Record action can split them into individual Records for processing.

IMPORTANT NOTES: When the split happens it is important to remember the following things:

All OCR data from the split pages stays with the pages. This means you do not have to OCR the resulting split documents again.

Each Record that is split is routed by itself. Typically this means each Record will move past the current SPLIT step and continue along. However, remember that in workflow each item always evaluates from the top of the trigger list down. Because of this, it is possible for the split Records to go down another path, although this normally only happens when the workflow is designed on purpose to do that.

Each Record split off will contain an exact copy of the data from the Record that was split. Every data element is copied and carried forward. If you need to change the values you can either do that in a Variable Update action or you can use the Auto-Index action.

Options

Sample Doc ID - You can test your split logic by entering the ID of a document that has been OCR'd.

Commit Split - Check this ON to split the Record and save the changes automatically. With automatic splitting you will most likely NOT want to assign the trigger to any users. If you do, the workflow will stop and wait for a user to complete the step instead of continuing by itself. Automatic splitting also means you must program the trigger to know how to split, either by page count or by OCR indexing value (see below).

Check this OFF and the system will not actually split the documents. Instead, it places split markers into the current Record's documents, and those markers are used by the Indexing Automation Split screen. To use the action without automatic splitting, either assign the trigger to at least one user so they can review and save the splits OR use the Auto-Split Commit action to commit the split automatically.

Split Type - If you are using the Invoice Processing engine you will most likely want to set this to "Invoice Split Processing" which will trigger the internal invoice splitting process. If you are not using the Invoice Processing engine (or if you want to set the split logic manually) set this to "Custom Split Process" so you can set your own split rules with the settings below.

Auto-Break Invoices - When using Machine Learning, you can choose to skip the intelligent invoice splitting logic. If all the invoices are pre-split, use this option to avoid additional splitting.

Split Page Handling - When the system determines that it needs to split you can have it keep the page where the split marker is found or discard it.

Values

You can tell the indexing engine to find splits by OCR, Bar Code or both. Use the proper fields in this section based on your needs. When a match is found then the system knows it needs to split here. You can use REGEX and LIKE values for matching. This allows you to work with patterns instead of exact value matches.

LIKE statements are NOT case sensitive. Here are the special characters you can use with some examples:

? = Any single character. Example: "SM?TH" will match "Smith" or "SMYTH".

* = Zero or more characters. Example: "SM*TH" will match "SMITH" or "SMYTH" or "SMabcdefgTH".

# = Any single digit. Example: "SMITH#" will match "SMITH1" or "SMITH8" but not "SMITH10" unless you do "SMITH##" or "SMITH#*".
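
The three LIKE wildcards above can be illustrated with a short Python sketch. This is not the product's implementation; it only mirrors the documented behavior by translating a LIKE value into a regular expression. The function names `like_to_regex` and `like_match` are hypothetical, and the sketch assumes the pattern must cover the whole word, which is what the "SMITH#" versus "SMITH10" example implies.

```python
import re


def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a LIKE-style value into a compiled, case-insensitive regex."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")            # any single character
        elif ch == "*":
            parts.append(".*")           # zero or more characters
        elif ch == "#":
            parts.append(r"\d")          # any single digit
        else:
            parts.append(re.escape(ch))  # everything else is a literal
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def like_match(pattern: str, word: str) -> bool:
    """Assumption: the pattern must match the entire word, not a substring."""
    return like_to_regex(pattern).fullmatch(word) is not None


print(like_match("SM?TH", "Smith"))      # True (not case sensitive)
print(like_match("SMITH#", "SMITH10"))   # False (only one digit allowed)
print(like_match("SMITH##", "SMITH10"))  # True
```

Trying a value against a few sample words this way is a quick check before entering it in the trigger.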

REGEX means using Regular Expressions for pattern matching. The system utilizes the IsMatch() method to look for strings that match your pattern. To use REGEX in the system you must prefix the pattern with "REGEX:" so the system knows you want to use REGEX logic. For instance:

"REGEX:\d{10}" will match any word that contains a run of at least 10 consecutive digits

"REGEX:^\d{10}$" will match any word that consists of exactly 10 digits

"REGEX:[a-zA-Z0-9]" will match any word that contains at least one alphanumeric character

"REGEX:^[a-zA-Z0-9]{3,6}$" will match any word that is made up only of alphanumeric characters and is from 3 to 6 characters long

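To see how the four REGEX examples behave, the following Python sketch checks them against sample words. It is an illustration only: Python's `re.search` stands in for IsMatch() because both report a match found anywhere in the input, and the helper name `regex_value_matches` and the sample words are hypothetical.

```python
import re

PREFIX = "REGEX:"


def regex_value_matches(value: str, word: str) -> bool:
    """Return True if `word` satisfies a 'REGEX:'-prefixed match value."""
    if not value.startswith(PREFIX):
        raise ValueError("match value must start with 'REGEX:'")
    pattern = value[len(PREFIX):]
    # search() succeeds if the pattern matches anywhere in the word,
    # so anchors (^ and $) are needed to force a whole-word match.
    return re.search(pattern, word) is not None


print(regex_value_matches(r"REGEX:\d{10}", "INV12345678901"))       # True
print(regex_value_matches(r"REGEX:^\d{10}$", "12345678901"))        # False
print(regex_value_matches(r"REGEX:[a-zA-Z0-9]", "---"))             # False
print(regex_value_matches(r"REGEX:^[a-zA-Z0-9]{3,6}$", "PO123"))    # True
```

The second example shows why the anchors matter: without ^ and $, an 11-digit word would still match a 10-digit pattern.
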
Split Behavior - You can configure the splitting to split any time a match is made OR only when the match value changes. It is common to have a PO number, or similar, repeated on each page. Setting this option to "only when value changes" keeps the pages together until the PO number changes.
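
The difference between the two Split Behavior settings can be sketched in Python as follows. This is a simplified model, not the engine's code: it assumes each page yields either one matched value or nothing, and the function name `split_starts` and the sample PO numbers are made up for illustration.

```python
from typing import List, Optional


def split_starts(page_values: List[Optional[str]], only_on_change: bool) -> List[int]:
    """Return the page indexes (0-based) where a new Record would begin.

    page_values[i] is the value matched on page i, or None if nothing matched.
    """
    starts = []
    last_value = None
    for index, value in enumerate(page_values):
        if value is None:
            continue  # no match on this page, so it stays with the current Record
        if not only_on_change or value != last_value:
            starts.append(index)
        last_value = value
    return starts


pages = ["PO100", "PO100", "PO200", None, "PO200", "PO300"]
print(split_starts(pages, only_on_change=False))  # [0, 1, 2, 4, 5]
print(split_starts(pages, only_on_change=True))   # [0, 2, 5]
```

With the PO number repeated on every page, splitting on every match produces five Records, while splitting only on change produces three.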

Full Text / Barcode - Placing your split logic here tells the system to use the OCR data and the Barcode data to find matches.

Full Text Only - Placing your split logic here tells the system to use the OCR data only to find matches.

Barcode Only - Placing your split logic here tells the system to use the Barcode data only to find matches.

Page Count

Page Count - You can tell the system to split based on page count. If every document is 1 page then you can set this to 1 and it will split on every page. If you use this option then you will probably set Split Page Handling to Keep as First Page.
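
As a rough illustration of page-count splitting, the Python sketch below groups a list of pages into fixed-size chunks. The function name `split_by_page_count` is hypothetical, and how the product treats a short final group is an assumption here: the sketch simply keeps the leftover pages as the last Record.

```python
from typing import List, Sequence


def split_by_page_count(pages: Sequence, pages_per_record: int) -> List[list]:
    """Group pages into consecutive chunks of `pages_per_record` pages."""
    if pages_per_record < 1:
        raise ValueError("pages_per_record must be at least 1")
    return [
        list(pages[i:i + pages_per_record])
        for i in range(0, len(pages), pages_per_record)
    ]


print(split_by_page_count([1, 2, 3, 4, 5], 1))  # [[1], [2], [3], [4], [5]]
print(split_by_page_count([1, 2, 3, 4, 5], 2))  # [[1, 2], [3, 4], [5]]
```

No page is discarded in this mode, which is why keeping the split page as the first page of the next Record is the natural setting.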

Blank Pages

Split on Blanks - You can use blank pages to determine splits. If you use this option then you will probably set Split Page Handling to Discard Page.
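
The interaction between blank-page splitting and Split Page Handling can be sketched in Python. This is only a model of the behavior described here: the function name `split_on_blanks` is hypothetical, and a page is treated as blank when its OCR text is empty, which is an assumption about how blanks are detected.

```python
from typing import Callable, List


def split_on_blanks(pages: List[str],
                    is_blank: Callable[[str], bool],
                    discard_blank: bool = True) -> List[List[str]]:
    """Split pages into Records at each blank page.

    discard_blank=True models "Discard Page"; False models "Keep as First Page".
    """
    records: List[List[str]] = []
    current: List[str] = []
    for page in pages:
        if is_blank(page):
            if current:
                records.append(current)  # close the Record before the blank
            current = [] if discard_blank else [page]
        else:
            current.append(page)
    if current:
        records.append(current)
    return records


pages = ["inv1-p1", "inv1-p2", "", "inv2-p1", "", "inv3-p1"]
blank = lambda text: text.strip() == ""
print(split_on_blanks(pages, blank))
# [['inv1-p1', 'inv1-p2'], ['inv2-p1'], ['inv3-p1']]
```

Keeping the blank page instead would leave an empty separator sheet at the front of every Record after the first, which is rarely what you want.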

Regions (Enterprise Only)

You can program the system to look for the values (see above) only in specific regions of the page. This greatly reduces the chance of a false positive match. For instance, if you are looking for "Bill of Sale" anywhere on the page, then a comment in the middle of a document that says "check the bill of sale" could trigger the split. However, if you limit the zone to the top middle area of the page, that comment will not cause a split.
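
The "Bill of Sale" scenario can be sketched in Python to show why a region cuts down on false positives. Everything here is hypothetical: the `Word` and `Region` types, the use of page fractions from 0.0 to 1.0 as coordinates, and the function `phrase_in_region` are illustration only and do not describe how regions are actually configured or stored.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    text: str
    x: float  # horizontal center, 0.0 (left edge) to 1.0 (right edge)
    y: float  # vertical center, 0.0 (top edge) to 1.0 (bottom edge)


@dataclass
class Region:
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, word: Word) -> bool:
        return self.left <= word.x <= self.right and self.top <= word.y <= self.bottom


def phrase_in_region(words: List[Word], phrase: str, region: Region) -> bool:
    """Case-insensitive phrase search over only the words inside the region."""
    text = " ".join(w.text for w in words if region.contains(w))
    return phrase.lower() in text.lower()


top_middle = Region(left=0.25, top=0.0, right=0.75, bottom=0.2)
title_page = [Word("Bill", 0.4, 0.05), Word("of", 0.5, 0.05), Word("Sale", 0.6, 0.05)]
comment_page = [Word("check", 0.3, 0.5), Word("the", 0.4, 0.5), Word("bill", 0.5, 0.5),
                Word("of", 0.55, 0.5), Word("sale", 0.6, 0.5)]

print(phrase_in_region(title_page, "Bill of Sale", top_middle))    # True
print(phrase_in_region(comment_page, "Bill of Sale", top_middle))  # False
```

The comment in the middle of the page contains the same words, but none of them fall inside the top middle zone, so no split is triggered.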

Destinations

Once the split is done, you can have the system move the split documents to a predetermined Record Type, Category, SubCategory and/or Name. This is useful as a way to move documents from an incoming work queue into another Record Type.

Record Type - Name of Record Type to move the record to

Document Category - Category to move the document(s) to

Document SubCategory - SubCategory to move the document(s) to

Document Name - Name to move the document(s) to