Intelligence | March 10, 2026 | 12 min read

Due Diligence Automation: From Manual Investigation to AI Pipeline

Replacing weeks of manual research with a structured intelligence pipeline

Prismatic Engineering

Prismatic Platform

The Manual Due Diligence Problem


Traditional due diligence for M&A transactions, supplier vetting, or investment

decisions involves analysts spending weeks manually searching business registries,

reviewing court records, checking sanctions lists, and cross-referencing findings.

The process is error-prone, inconsistent, and scales poorly when dealing with

dozens of entities across multiple jurisdictions.


The Prismatic DD pipeline replaces this manual process with a structured,

automated intelligence pipeline that produces scored recommendations backed

by traceable evidence.


Two-Phase Client/Loader Architecture


The pipeline operates in two phases. The Client phase fetches raw data from

external sources using OSINT adapters. Each source group (business registry,

sanctions, court records, beneficial ownership) has a dedicated client that

handles API authentication, rate limiting, and retry logic.


The Loader phase normalizes and persists the fetched data. Raw API responses

are transformed into canonical entity schemas and stored in PostgreSQL with

full provenance tracking. Every data point records its source, fetch timestamp,

and confidence level.



```elixir
# Pipeline execution for a single entity.
# (`case` is a reserved word in Elixir, so the variable is named `dd_case`.)
{:ok, dd_case} = PrismaticDD.Cases.create(%{
  name: "Acme Corp Investigation",
  entity_type: :company,
  identifier: "12345678"
})

# Phase 1: Fetch from all source groups
PrismaticDD.Pipeline.fetch_all(dd_case)

# Phase 2: Load and normalize results
PrismaticDD.Pipeline.load_all(dd_case)
```
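Each source-group client can be thought of as an adapter behind a shared behaviour. The following is a minimal sketch of that idea, not the actual Prismatic API: the `PrismaticDD.SourceClient` module, its `fetch/1` callback, and the retry helper are all illustrative names. For brevity the retry uses a short fixed delay; a real adapter would back off exponentially and respect rate limits.

```elixir
defmodule PrismaticDD.SourceClient do
  # Hypothetical behaviour each OSINT adapter (registry, sanctions,
  # court records, beneficial ownership) would implement.
  @callback fetch(identifier :: String.t()) :: {:ok, map()} | {:error, term()}

  # Shared retry wrapper: retries transient failures with a short
  # fixed delay (a real client would use exponential backoff).
  def fetch_with_retry(client, identifier, attempts \\ 3) do
    case client.fetch(identifier) do
      {:ok, data} ->
        {:ok, data}

      {:error, _reason} when attempts > 1 ->
        Process.sleep(100)
        fetch_with_retry(client, identifier, attempts - 1)

      {:error, reason} ->
        {:error, reason}
    end
  end
end
```

Centralizing retry and rate-limit handling in one place keeps the per-source adapters focused purely on request construction and response parsing.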


Entity Schemas


The DD system operates on three core entity types, each with a rich schema:


Person entities capture name variants, date of birth, nationality,

identification documents, PEP (Politically Exposed Person) status, and

known associations. Name matching uses transliteration-aware fuzzy comparison

to handle Czech diacritics and multiple romanization schemes.
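One way to implement such a comparison, sketched here with Elixir's standard library rather than Prismatic's actual matcher: decompose names to NFD, strip combining marks (which removes Czech diacritics), then score with `String.jaro_distance/2`. The `NameMatch` module and its threshold are illustrative.

```elixir
defmodule NameMatch do
  # Strip diacritics by decomposing to NFD and dropping combining marks,
  # so "Dvořák" becomes "Dvorak" before comparison.
  def strip_diacritics(name) do
    name
    |> String.normalize(:nfd)
    |> String.replace(~r/\p{Mn}/u, "")
  end

  # Fuzzy match on the normalized, lowercased forms using Jaro distance
  # from the Elixir standard library. Threshold is an illustrative choice.
  def similar?(a, b, threshold \\ 0.9) do
    norm = fn s -> s |> strip_diacritics() |> String.downcase() end
    String.jaro_distance(norm.(a), norm.(b)) >= threshold
  end
end
```

A full matcher would also map between romanization schemes (e.g. for Cyrillic-origin names), which simple diacritic stripping does not cover.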


Company entities store ICO (company ID), registered address, legal form,

beneficial owners, financial statements, court proceedings, and regulatory

filings. The schema supports tracking ownership chains through multiple

layers of holding structures.
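As a rough illustration of the canonical form such a schema might take (field names here are hypothetical; the real schemas are persisted to PostgreSQL with provenance on every data point):

```elixir
defmodule PrismaticDD.Entity.Company do
  # Illustrative canonical schema for a company entity. Every data point
  # in the real system carries source, fetch timestamp, and confidence.
  defstruct ico: nil,
            name: nil,
            registered_address: nil,
            legal_form: nil,
            beneficial_owners: [],  # nested persons/companies, enabling ownership chains
            court_proceedings: [],
            regulatory_filings: [],
            provenance: []          # e.g. [%{source: ..., fetched_at: ..., confidence: ...}]
end
```

Nesting companies inside `beneficial_owners` is what lets the loader walk ownership chains through multiple layers of holding structures.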


Domain entities record WHOIS data, DNS records, SSL certificate details,

hosting history, and associated IP addresses. These are used primarily for

technical due diligence on digital assets.


Scoring Engine


The ScoringEngine evaluates each entity across multiple risk dimensions.

Each dimension produces a score between 0.0 (no risk) and 1.0 (maximum risk),

weighted by configurable importance factors:


| Dimension  | Weight | Description                            |
|------------|--------|----------------------------------------|
| Sanctions  | 0.30   | Matches against sanctions lists        |
| Litigation | 0.20   | Active or historical court proceedings |
| Financial  | 0.20   | Financial health indicators            |
| Ownership  | 0.15   | Beneficial ownership transparency      |
| Regulatory | 0.15   | Regulatory filings and compliance      |

The composite score drives the traffic-light classification: green (low risk,

score below 0.3), amber (medium risk, 0.3 to 0.6), and red (high risk,

above 0.6).
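In its simplest form, the composite is a weighted sum of the per-dimension scores, classified against the thresholds above. A minimal sketch (the real ScoringEngine is configurable and more nuanced):

```elixir
defmodule Scoring do
  # Default weights from the dimension table; configurable in practice.
  @weights %{sanctions: 0.30, litigation: 0.20, financial: 0.20,
             ownership: 0.15, regulatory: 0.15}

  # Weighted sum of per-dimension scores, each in 0.0..1.0.
  # Missing dimensions contribute zero risk.
  def composite(scores) do
    Enum.reduce(@weights, 0.0, fn {dim, weight}, acc ->
      acc + weight * Map.get(scores, dim, 0.0)
    end)
  end

  # Traffic-light classification: green below 0.3, amber up to 0.6, red above.
  def classify(score) when score < 0.3, do: :green
  def classify(score) when score <= 0.6, do: :amber
  def classify(_score), do: :red
end
```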


Hypothesis Testing


Beyond scoring, the HypothesisEngine tests specific investigative questions.

An analyst can define hypotheses like "Entity X is connected to sanctioned

entity Y through ownership" or "Company financial statements show signs of

revenue manipulation." The engine evaluates each hypothesis against available

evidence and returns a confidence assessment using the Nabla framework's

epistemic/aleatoric uncertainty decomposition.
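To make the decomposition concrete, here is a toy evaluator, not the Nabla framework itself: evidence items are numeric support values, epistemic uncertainty shrinks as more evidence accumulates (it is reducible by gathering data), and aleatoric uncertainty reflects disagreement among the evidence (inherent noise). All names and formulas are illustrative.

```elixir
defmodule Hypothesis do
  defstruct statement: nil, verdict: :undetermined,
            confidence: 0.0, epistemic: 0.0, aleatoric: 0.0

  # Toy aggregation: mean support drives the verdict; epistemic
  # uncertainty falls with evidence count, aleatoric is the variance.
  def evaluate(statement, evidence) when evidence != [] do
    n = length(evidence)
    support = Enum.sum(evidence) / n

    variance =
      Enum.sum(Enum.map(evidence, fn e -> (e - support) * (e - support) end)) / n

    %Hypothesis{
      statement: statement,
      verdict: if(support >= 0.5, do: :supported, else: :not_supported),
      confidence: support,
      epistemic: 1.0 / n,
      aleatoric: variance
    }
  end
end
```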


Recommendation Generation


The RecommendationEngine synthesizes scores and hypothesis results into

structured recommendations. Each recommendation includes the action

(proceed, investigate further, reject), supporting evidence with source

citations, confidence level, and suggested next steps. Recommendations

are versioned so that as new evidence arrives, the system can show how

the assessment evolved over time.
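The versioning can be sketched as an append-only revision: when new evidence changes the assessment, the previous action and confidence are pushed onto a version history instead of being overwritten. Module and field names are illustrative, not Prismatic's actual schema.

```elixir
defmodule Recommendation do
  defstruct action: nil, evidence: [], confidence: nil, versions: []

  # Append-only revision: the prior action/confidence pair is kept in
  # :versions so the assessment's evolution stays auditable.
  def revise(%Recommendation{} = rec, action, new_evidence, confidence) do
    %Recommendation{
      action: action,
      evidence: rec.evidence ++ new_evidence,
      confidence: confidence,
      versions: rec.versions ++ [Map.take(rec, [:action, :confidence])]
    }
  end
end
```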


Tags

due-diligence automation pipeline scoring hypothesis ai