Create Custom Workflows

DocuWare IDP's prebuilt workflows cover the most common document types out of the box. When your use case falls outside those types, you can build a custom workflow tailored to your documents and your business rules.

This section walks you through the available workflow types and helps you pick the right one.

At a glance

| Workflow | Output | Annotation | Training | Best for |
| --- | --- | --- | --- | --- |
| Train-your-own Splitting Model | document-splitting | Required | Required | Splitting multi-document files into logical sub-documents tailored to your data. |
| Train-your-own Classifier | classifications | Required | Required | Sorting incoming documents into your own categories (e.g. routing to departments). |
| Fine-tune Invoice Extraction | extractions | Required | Required | Improving our generic invoice extraction for the vendors you frequently receive. |
| Train-your-own Extraction Model | extractions | Required | Required | Extracting custom entities from documents that follow known templates, with full control over the model. |
| GenAI Extraction (beta) | extractions | Optional | Optional | Extracting custom entities with minimal setup: define your fields and start extracting, no annotation required. |

Once you have an extraction workflow, you can enhance it with Master Data Matching, which validates extracted values against your internal master data.

Which workflow do I pick?

A typical IDP pipeline runs in three stages — splitting → classification → extraction — and each stage is optional. The workflow types in this section map onto these stages.
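
The three optional stages can be pictured as a chain in which any missing stage is simply passed over. The sketch below is purely illustrative: `run_pipeline` and the stage callables are hypothetical names, not part of the DocuWare IDP API. In a real integration, each stage would be a call to one of your workflows.

```python
from typing import Callable, Optional

# Hypothetical stage signatures, for illustration only.
Splitter = Callable[[str], list[str]]             # file -> logical documents
Classifier = Callable[[str], str]                 # document -> label
Extractor = Callable[[str, Optional[str]], dict]  # (document, label) -> fields


def run_pipeline(
    file: str,
    split: Optional[Splitter] = None,
    classify: Optional[Classifier] = None,
    extract: Optional[Extractor] = None,
) -> list[dict]:
    """Run splitting -> classification -> extraction, skipping absent stages."""
    # No splitter: treat the file as exactly one logical document.
    documents = split(file) if split else [file]
    results = []
    for doc in documents:
        # No classifier: every document takes the same downstream path.
        label = classify(doc) if classify else None
        # No extractor: return an empty field set.
        fields = extract(doc, label) if extract else {}
        results.append({"document": doc, "label": label, "fields": fields})
    return results
```

Calling `run_pipeline(file, extract=my_extractor)` without a splitter or classifier corresponds to the case where every input file holds one document and all documents share the same downstream processing.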

1. Splitting: turn multi-document files into individual documents

If you receive files that bundle several logical documents (e.g. a scan stack containing an invoice, a delivery note, and a contract), use Train-your-own Splitting Model. It learns from your documents where one logical sub-document ends and the next begins.

Skip this stage if your input files already contain exactly one logical document each.

2. Classification: route each document to the right downstream workflow

If you need to sort documents into your own categories — for example to route incoming mail to the right department, or to pick the right extraction workflow per document type — use Train-your-own Classifier. Define your labels, upload sample documents per label, and train the model.

Skip this stage if every document goes through the same downstream processing.

3. Extraction: pull structured data out of each document

Three custom workflows cover the extraction stage. Pick based on your documents:

  • Fine-tune Invoice Extraction: use this if your documents are invoices and you want to raise accuracy on the vendors you frequently receive. You keep our generic invoice model for everything else and get vendor-specific accuracy on top.
  • Train-your-own Extraction Model: use this if your documents follow consistent templates and you want full control over the model. You must annotate a set of documents and train the model before you can use it. In return, you get a model that is fully specialised to your templates and typically delivers the highest accuracy on consistent layouts.
  • GenAI Extraction (beta): use this for the fastest path to a working extraction. Define your entities with a short description and start processing, with no annotation required. Annotating documents and training the workflow later can substantially improve accuracy, but both steps are optional.
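
The choice among the three extraction workflows can be summarised as a small decision helper. This is a simplified reading of the guidance above, not an official selection rule: the function name, its parameters, and the order of the checks are assumptions made for illustration.

```python
def pick_extraction_workflow(
    is_invoice: bool,
    consistent_templates: bool,
    can_annotate: bool,
) -> str:
    """Suggest an extraction workflow from three yes/no questions (illustrative)."""
    # Without annotated documents, only GenAI Extraction works out of the box.
    if not can_annotate:
        return "GenAI Extraction (beta)"
    # Invoices build on the generic invoice model.
    if is_invoice:
        return "Fine-tune Invoice Extraction"
    # Consistent layouts benefit most from a fully specialised model.
    if consistent_templates:
        return "Train-your-own Extraction Model"
    # Everything else: start with GenAI and annotate later if needed.
    return "GenAI Extraction (beta)"
```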

When in doubt, start with GenAI Extraction

GenAI Extraction lets you validate your use case in minutes. If accuracy is not yet where you need it, you can annotate documents and train the workflow without losing your field definitions.

Prebuilt extraction workflows

If your documents match one of our standard document types (invoices, receipts, delivery notes, vehicle registrations, …), you can also integrate the corresponding prebuilt workflow directly — no workflow creation required. Prebuilt workflows are not customisable or trainable; for any customisation needs, use one of the custom workflows above.

Validate extractions against your master data

Any extraction workflow can be enhanced with Master Data Matching to validate extracted values against your internal master data. Master Data Matching is not a standalone workflow type — it is an optional enhancement layer that plugs into an existing extraction workflow.

Common concepts

All custom workflows share a few common building blocks. The specific workflow pages link to these where relevant:

  • Processing: every custom workflow is called via POST /processing/{your_workflow_identifier} — see Processing API Endpoint.
  • Confidence and Human-in-the-loop: workflows return confidence values that you can use to route uncertain results to a human reviewer — see Verification.
  • Feedback: you can submit corrections via POST /processing/feedback/{processing_id} to monitor quality and (for trainable workflows) feed corrections back into training data.
  • Workflow updates: natif.ai improves model architectures over time. You will be informed in advance and can preview an upcoming update before it migrates — see Previewing Workflow Updates.
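
As a rough sketch of how the processing and feedback endpoints and the confidence-based routing fit together, the helpers below build the two endpoint URLs and pick out fields that may need human review. Only the two paths come from this page. The base URL is a placeholder, the flat field-to-confidence mapping is an assumed simplification of the response format, and the 0.8 threshold is an arbitrary example value. See Processing API Endpoint and Verification for the actual details.

```python
from urllib.parse import quote

# Assumption: placeholder base URL; replace it with the one for your account.
BASE_URL = "https://api.example.invalid"


def processing_url(workflow_identifier: str) -> str:
    """URL for POST /processing/{your_workflow_identifier}."""
    return f"{BASE_URL}/processing/{quote(workflow_identifier, safe='')}"


def feedback_url(processing_id: str) -> str:
    """URL for POST /processing/feedback/{processing_id}."""
    return f"{BASE_URL}/processing/feedback/{quote(processing_id, safe='')}"


def needs_review(confidences: dict[str, float], threshold: float = 0.8) -> list[str]:
    """Return the names of fields whose confidence is below the threshold.

    Assumption: `confidences` is a flat mapping of field name to a value
    between 0 and 1; the real response format may be structured differently.
    """
    return sorted(name for name, value in confidences.items() if value < threshold)
```

A result for which `needs_review` returns a non-empty list could be sent to a human reviewer, and the reviewer's corrections could then be submitted to the URL returned by `feedback_url`.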