Building an Intelligence Platform with Elixir: Why We Chose the BEAM
The technical rationale behind building Prismatic Platform on Elixir/OTP: fault tolerance for 157 OSINT adapters, concurrency for parallel intelligence gathering, and the BEAM's unique properties for real-time analysis.
Tomas Korcak (korczis)
Prismatic Platform
When we started building Prismatic Platform, the technology choice was not obvious. Intelligence platforms typically use Python (for its ML ecosystem), Java (for enterprise credibility), or Go (for performance). We chose Elixir and the BEAM virtual machine. Two years and 94 OTP applications later, this post explains why -- and what we have learned.
The Intelligence Platform Requirements
An intelligence platform has unusual technical requirements:
High concurrency with heterogeneous workloads: 157 OSINT adapters querying different APIs simultaneously, each with its own rate limits, response times, and failure modes. Python's GIL serializes CPU-bound work, and Go's goroutines share mutable memory; neither provides the per-connection fault isolation that intelligence gathering demands.
Fault tolerance at the connection level: When one OSINT adapter fails (API timeout, rate limit, authentication error), the failure must not affect other adapters. In a traditional architecture, a stuck HTTP connection can consume a thread pool and cascade to unrelated operations.
Real-time data streaming: Due diligence dashboards, error feeds, and OSINT tool execution all require real-time updates. WebSocket connections must remain stable for hours while the server processes concurrent requests.
Long-running processes: Entity resolution, graph analysis, and DD pipeline execution can run for minutes. These processes must be monitorable, cancellable, and resilient to partial failures.
The BEAM virtual machine was designed for exactly these requirements -- it powers telephone switches that handle millions of concurrent connections with 99.9999999% uptime.
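The long-running-process requirement maps directly onto OTP primitives. A minimal sketch of a monitorable, cancellable analysis job running under a `Task.Supervisor` -- the supervisor name `DD.TaskSupervisor` and the job body are illustrative, not the platform's actual modules:

```elixir
# Start a Task.Supervisor to own long-running analysis jobs
{:ok, _} = Task.Supervisor.start_link(name: DD.TaskSupervisor)

task =
  Task.Supervisor.async_nolink(DD.TaskSupervisor, fn ->
    # Stands in for a minutes-long entity-resolution or graph-analysis run
    Process.sleep(:timer.minutes(5))
    :done
  end)

# async_nolink monitors the task instead of linking to it: if the job
# crashes, the caller receives a {:DOWN, ...} message rather than crashing
# with it. Cancelling an investigation is just terminating the child:
:ok = Task.Supervisor.terminate_child(DD.TaskSupervisor, task.pid)
```

The caller stays alive through both failure and cancellation, which is what makes these jobs safe to expose in a dashboard.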
Why the BEAM
Process Isolation
Every BEAM process is fully isolated -- it has its own heap, its own garbage collector, and its own failure boundary. When an OSINT adapter crashes:
```elixir
# This crash affects ONLY this process
defmodule PrismaticOsintSources.Adapters.Czech.ARES do
  def search(query, opts) do
    # If this crashes, no other adapter is affected
    case HTTPClient.get(@ares_url, params: %{query: query}) do
      {:ok, response} -> parse_response(response)
      {:error, reason} -> {:error, reason}
    end
  end
end
```
In Java or Go, an unhandled exception in one thread can corrupt shared state. In Elixir, process crashes are expected and handled by the supervision tree.
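What "handled by the supervision tree" looks like in practice can be sketched with a minimal `:one_for_one` supervisor. `AdapterWorker` and `AdapterSupervisor` are illustrative stand-ins, not the platform's real modules:

```elixir
defmodule AdapterWorker do
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: opts[:name])

  @impl true
  def init(opts), do: {:ok, opts}
end

defmodule AdapterSupervisor do
  use Supervisor

  def start_link(opts), do: Supervisor.start_link(__MODULE__, opts, name: __MODULE__)

  @impl true
  def init(_opts) do
    # Distinct child ids are required when the same module appears twice
    children = [
      Supervisor.child_spec({AdapterWorker, name: :ares}, id: :ares),
      Supervisor.child_spec({AdapterWorker, name: :shodan}, id: :shodan)
    ]

    # :one_for_one -- a crashing adapter is restarted alone;
    # its siblings never notice
    Supervisor.init(children, strategy: :one_for_one)
  end
end
```

With `strategy: :one_for_one`, a stuck or crashing ARES worker is restarted in isolation while the Shodan worker keeps serving requests.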
Lightweight Concurrency
BEAM processes are lightweight -- creating one takes microseconds and uses ~2KB of memory. This enables patterns that would be prohibitively expensive in thread-based languages:
```elixir
# Query 20 OSINT adapters concurrently -- one process per adapter
tasks =
  Enum.map(adapters, fn adapter ->
    Task.async(fn -> adapter.search(query, opts) end)
  end)

# Collect all results, with a single 15-second deadline for the batch
results = Task.await_many(tasks, 15_000)
```
Starting 20 OS threads would be expensive. Starting 20 BEAM processes is trivial. This fundamental property shapes the entire architecture of the intelligence gathering pipeline.
Preemptive Scheduling
The BEAM uses preemptive scheduling -- no single process can monopolize a scheduler. This is critical for intelligence platforms where some queries return in milliseconds (ARES lookup) and others take seconds (Shodan scan). Without preemptive scheduling, a slow adapter would block fast adapters.
Hot Code Upgrades
The BEAM supports hot code loading -- you can update modules without stopping the system. In production, this means deploying security patches and adapter updates without interrupting ongoing DD investigations:
```elixir
# Deploy new adapter version without restarting
:code.load_binary(PrismaticOsintSources.Adapters.Czech.ARES, ~c"ares.beam", new_binary)
```
We use this sparingly (Fly.io deployments with rolling restarts are our primary deployment mechanism), but the capability is valuable for urgent patches.
What Elixir Adds to the BEAM
The BEAM provides the runtime. Elixir adds developer experience:
Pattern Matching for Intelligence Data
Intelligence data is inherently heterogeneous. Pattern matching handles this elegantly:
```elixir
def categorize_finding(finding) do
  case finding do
    %{type: :sanctions_match, list: "OFAC"} -> {:critical, :sanctions}
    %{type: :insolvency, status: :active} -> {:high, :financial}
    %{type: :litigation, amount: amount} when amount > 1_000_000 -> {:high, :legal}
    %{type: :address_change, days_ago: days} when days < 30 -> {:medium, :operational}
    _ -> {:low, :informational}
  end
end
```
Pipe Operator for Data Pipelines
Intelligence analysis is naturally a pipeline -- data flows through transformations:
```elixir
entity
|> normalize_name()
|> query_all_registries()
|> resolve_entities()
|> score_confidence()
|> detect_contradictions()
|> generate_report()
```
Behaviours for Adapter Contracts
Every OSINT adapter implements the same behaviour, ensuring consistency across 157 diverse data sources:
```elixir
@callback search(query :: String.t(), opts :: keyword()) ::
            {:ok, list(map())} | {:error, term()}

@callback metadata() :: %{
            name: String.t(),
            category: atom(),
            rate_limit_rpm: pos_integer(),
            confidence_tier: atom()
          }
```
Macros for Self-Registration
Elixir's metaprogramming enables the self-registration pattern that powers our adapter, agent, and academy systems:
```elixir
defmacro __using__(_opts) do
  quote do
    @after_compile __MODULE__

    def __after_compile__(_env, _bytecode) do
      PrismaticOsintCore.Registry.register(__MODULE__)
    end
  end
end
```
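The registry process itself is out of scope here, but the compile-time hook can be demonstrated end to end with `:persistent_term` standing in for the real `PrismaticOsintCore.Registry`. `SelfRegister` and `DemoAdapter` are illustrative names for this sketch:

```elixir
defmodule SelfRegister do
  defmacro __using__(_opts) do
    quote do
      @after_compile __MODULE__

      def __after_compile__(_env, _bytecode) do
        # Sketch only: accumulate registrations in :persistent_term;
        # the real platform registers with a Registry process instead
        key = {SelfRegister, :modules}
        existing = :persistent_term.get(key, [])
        :persistent_term.put(key, [__MODULE__ | existing])
      end
    end
  end

  def registered, do: :persistent_term.get({SelfRegister, :modules}, [])
end

defmodule DemoAdapter do
  # A single `use` line is the entire registration burden on adapter authors
  use SelfRegister
end
```

As soon as `DemoAdapter` finishes compiling, the hook fires and the module appears in the registry -- no central list to maintain by hand.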
Phoenix LiveView for Real-Time Dashboards
Phoenix LiveView is the reason our dashboards work without a JavaScript framework. With server-rendered HTML and WebSocket-based diff updates, a pipeline progress event becomes a one-line handler:
```elixir
def handle_info({:pipeline_progress, progress}, socket) do
  {:noreply, assign(socket, :progress, progress)}
end
```
No REST polling, no GraphQL subscriptions, no state management library. The server holds the state, and LiveView diffs the HTML.
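The other half of that flow is the publish side. A sketch using the standard `Phoenix.PubSub` API -- the `Prismatic.PubSub` name, topic scheme, and stage payload are assumptions for illustration:

```elixir
# In the DD pipeline, after a stage completes, broadcast progress:
Phoenix.PubSub.broadcast(
  Prismatic.PubSub,
  "pipeline:" <> pipeline_id,
  {:pipeline_progress, %{stage: :entity_resolution, percent: 40}}
)

# The LiveView subscribes on mount, so handle_info/2 above fires on
# every broadcast and LiveView pushes only the changed HTML over the socket
def mount(%{"id" => pipeline_id}, _session, socket) do
  if connected?(socket) do
    Phoenix.PubSub.subscribe(Prismatic.PubSub, "pipeline:" <> pipeline_id)
  end

  {:ok, assign(socket, :progress, %{})}
end
```

The `connected?/1` guard skips the subscription during the initial static render, so each browser session subscribes exactly once.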
Trade-offs and Limitations
Elixir is not perfect for every aspect of intelligence work:
Machine learning: Python dominates ML. We use Elixir for orchestration and Python (via ports or microservices) for NLP and entity resolution models.
Ecosystem size: The Elixir ecosystem is smaller than Python's or JavaScript's. Libraries for niche formats (WHOIS parsing, certificate decoding) sometimes require custom implementation.
Hiring: Finding experienced Elixir developers is harder than finding Python or Java developers. The language's expressiveness partially compensates -- less code means fewer developers needed.
Memory for large datasets: BEAM processes holding large datasets in memory can trigger garbage collection pauses. We mitigate this with ETS tables (outside the process heap) and streaming patterns.
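The ETS mitigation mentioned above takes only a few lines; the `:entity_cache` table name and record shape are illustrative:

```elixir
# Data in an ETS table lives outside every process heap, so a large dataset
# no longer inflates any single process's GC workload; a lookup copies only
# the rows actually read into the caller's heap
table = :ets.new(:entity_cache, [:set, :public, read_concurrency: true])

:ets.insert(table, {"CZ-12345678", %{name: "Example s.r.o.", risk: :low}})

entity =
  case :ets.lookup(table, "CZ-12345678") do
    [{_id, found}] -> found
    [] -> nil
  end
```

`read_concurrency: true` optimizes the table for the read-heavy access pattern of a lookup cache shared by many adapter processes.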
Architecture Decisions Enabled by the BEAM
The BEAM's properties enabled architectural decisions that would be risky or impossible in other runtimes: a process per adapter connection, self-registering adapter and agent modules, hot-patchable code in production, and 30+ real-time dashboards served from a single deployment.
Conclusion
The BEAM was built for telecommunications -- systems that must handle millions of concurrent connections, recover from failures automatically, and never stop. Intelligence platforms have the same requirements: concurrent data gathering from diverse sources, resilience against individual source failures, and real-time delivery of results.
Elixir adds developer experience that makes these capabilities accessible: pattern matching for heterogeneous data, behaviours for consistent interfaces, LiveView for real-time UI, and macros for metaprogramming. The result is a platform that handles 157 concurrent OSINT sources, 552 autonomous agents, and 30+ real-time dashboards -- all on a single deployment.
Explore the [Architecture Documentation](/architecture/) for system design details or start with the [Developer Portal](/developers/) for contribution guidelines.