Concepts & glossary
Every term in plant-floor language. Two to three sentences each. Technical docs link here on first use.
A short glossary of every term you’ll see in Conversational Factory’s docs and UI. Each entry is two to three sentences in plant-floor language. Technical docs link here on first use.
For the full architecture, see the system architecture. For the runbook, see the operator quickstart.
The product
Conversational Factory
The open, local product for talking to a plant you can never reach back into: source-available, every line of the seam auditable, run on-site. Read it as three layers: Industrial Independence (IIA) is the security concept (open spec); the role spec is the shape (ingest data plane, inside historian, one-way seam, outside copy, query plane); Conversational Factory is the coherent product that fills the shape. The concept is canonical; the product is reference; every role is swappable. Coming soon: a hosted query plane (modelpond, SaaS) that can reach into a deployment.
IIA — Industrial Independence Architecture
The architectural foundation Conversational Factory sits on. IIA is a separate, openly-published specification (CC BY-SA 4.0) for how a sovereign industrial system should be structured: one zone per box, identical at every level of the plant hierarchy, with security enforced by the architecture rather than by policy. Conversational Factory is one implementation of IIA — the secure one-way egress. The canonical statement of the principle lives at industrialindependence.org.
What’s running
Pools of data
The operational data a plant already holds but rarely surfaces: Modbus registers on a drive, tag tables in PLC memory, alarm and event logs, batch and recipe context, OPC tags, nameplate and parameter values. It already exists — it is just scattered across devices and protocols, and most of it is never collected anywhere you can ask a question of. Tapping these pools into one standardized record is the job of the ingest data plane.
Ingest data plane
The role that taps the plant’s pools of data into the standardized record upstream of the inside historian, with zero process side effects. Two forms, and a factory may run both: passive-ingest — observe only, never transmit or probe (example: the witness); active-ingest — poll, query, or discover to read values devices don’t volunteer, read-only, never write or control (example: discovery). The specific feed is site-dependent; the role and its read-only invariant are canonical.
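As an illustration of "one standardized record", here is a minimal Python sketch. The `Record` fields, the `from_modbus` and `from_opc` helpers, and the tag naming scheme are all hypothetical; the actual schema is not given in this glossary. The functions only transform values that have already been read, so nothing here writes to a device.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Record:
    """Hypothetical standardized record; field names are illustrative only."""
    source: str    # device or server the value came from
    tag: str       # normalized tag name
    value: float   # value in engineering units
    unit: str
    ts: datetime   # observation time


def from_modbus(device: str, register: int, raw: int,
                scale: float, unit: str, ts: datetime) -> Record:
    # Scale the raw register count into engineering units.
    return Record(device, f"modbus/{register}", raw * scale, unit, ts)


def from_opc(server: str, node_id: str, value: float,
             unit: str, ts: datetime) -> Record:
    # OPC values usually arrive already scaled; only the naming is normalized.
    return Record(server, f"opc/{node_id}", float(value), unit, ts)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(from_modbus("vfd-7", 40001, 1234, 0.1, "Hz", now))
    print(from_opc("line-2", "ns=2;s=Tank.Level", 71.5, "%", now))
```

Both sources end up in the same shape, which is what lets everything downstream of the ingest role ignore the original protocol.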
Inside historian
The system of record on the trusted side of the network, in a standardized schema. It is authoritative and self-sufficient: the plant never depends on anything outside the boundary, and cutting the egress entirely does not affect it. It is fed by the ingest data plane (above). The specific feed is a site detail; the product begins where the data needs to leave.
One-way sync
The transport that mirrors the inside historian outward to the outside copy. It sends datagrams out with no acknowledgement and no return socket — there is no inbound, so there is nothing inbound to exploit. The one-wayness is a property of the transport, not a policy or a firewall rule.
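The send-and-forget pattern can be sketched with a plain UDP socket in Python. This assumes UDP and a JSON payload purely for illustration; the glossary says only "datagrams", and the function name `send_one_way` is hypothetical. The point is the shape of the sender: it transmits and never calls a receive function.

```python
import json
import socket


def send_one_way(record: dict, host: str, port: int) -> int:
    """Send one record as a single UDP datagram and never read a reply."""
    payload = json.dumps(record, separators=(",", ":")).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # sendto() on a datagram socket performs no handshake and waits
        # for no acknowledgement; the sender never calls recv().
        return sock.sendto(payload, (host, port))


if __name__ == "__main__":
    # Hypothetical destination; nothing needs to be listening for this to return.
    sent = send_one_way({"tag": "modbus/40001", "value": 123.4}, "127.0.0.1", 9999)
    print(f"sent {sent} bytes")
```

A sender like this cannot tell whether the datagram arrived, which is the trade-off the one-way design accepts.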
Data diode
An optional hardware device on the one-way sync that makes the one-wayness physical rather than configured. With a diode in place, no software bug on either side can create a return path, because the wire itself only carries one direction. Regulated sites (utilities, defense, nuclear, pharma, water) often mandate one.
Outside historian
The standardized copy of the historian on the outside. It is designed to be lost: compromise it completely and you hold historical data and no route to anything. Because it speaks a standard schema, any client, model, or downstream system can read it without bespoke glue.
Query plane
The role that turns natural-language questions into bounded, read-only queries and composes grounded, audited answers — the front door for AI clients (often as an MCP server, hence “MCP gateway”). It is source-agnostic: it does not know or care where the data sits, and the outside copy is just one source it can be pointed at, alongside MCP servers, time-series databases, SQL historians, or indexes. Example: modelpond — a separate, generic component, not part of CF and not OT-framed.
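What "bounded" might mean in practice can be sketched as follows. The caps (`MAX_WINDOW`, `MAX_ROWS`) and the `BoundedQuery` type are assumptions for illustration, not values or names from the product; the idea is that whatever a model asks for is clamped before any read is issued.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MAX_WINDOW = timedelta(days=7)  # assumed cap on the time range
MAX_ROWS = 1000                 # assumed cap on rows returned


@dataclass(frozen=True)
class BoundedQuery:
    tag: str
    start: datetime
    end: datetime
    limit: int


def bound_query(tag: str, start: datetime, end: datetime,
                limit: int = MAX_ROWS) -> BoundedQuery:
    """Clamp a requested read so it can never be unbounded."""
    if end <= start:
        raise ValueError("end must be after start")
    if end - start > MAX_WINDOW:
        start = end - MAX_WINDOW  # keep only the most recent window
    return BoundedQuery(tag, start, end, max(1, min(limit, MAX_ROWS)))


if __name__ == "__main__":
    q = bound_query("line1/temp", datetime(2024, 1, 1), datetime(2024, 3, 1),
                    limit=10**6)
    print(q)
```

The resulting object carries no verb other than "read", so there is nothing in it that could express a write.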
How it’s secured
One-way by transport
The defining property of the product. Everything that crosses the boundary goes one direction over a transport with no return path; nothing comes back. This is enforced by the wire (and optionally a hardware diode), not by configuration — which is why a misconfiguration or a software bug cannot open a path home.
Built to be lost
The outside copy is built to be expendable. The threat model assumes an attacker fully owns it and asks: what did they get? The answer is a copy of historical data and nothing else — no socket, route, or interface back to the plant.
Standardized schema and read API
The copy exposes its data through a standard, documented schema and a read-only interface. Clients, models, dashboards, and clouds are interchangeable on top of it, and nothing about the boundary has to change to add or swap one. No lock-in on either side of the seam.
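A client can also open its own handle read-only, as a second layer on top of the boundary. The sketch below assumes, purely for illustration, that the copy is reachable as a SQLite file; the glossary does not say what storage the copy uses, and `open_read_only` is a hypothetical helper. `mode=ro` is a standard SQLite URI parameter.

```python
import sqlite3


def open_read_only(path: str) -> sqlite3.Connection:
    """Open an existing SQLite file with no write capability."""
    # With mode=ro the engine itself rejects writes on this connection.
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)


def try_write(conn: sqlite3.Connection, sql: str) -> bool:
    """Return True if the statement succeeded, False if the engine refused it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.OperationalError:
        return False
```

This complements the one-way transport rather than replacing it: the handle is a client-side setting, while the transport is what removes the path back to the plant.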
Audit chain
A line-per-call append-only log written by the gateway. Each line records the natural-language question, the tool dispatched, the parameters, the downstream read, and the result returned to the AI. This is how you answer “why did the AI tell me X?” — by reading the audit line.
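One way to picture an audit line is as a JSON object per call, appended to a file. The key names, the JSON Lines format, and the helper names below are assumptions for illustration; the glossary specifies only which five things each line records.

```python
import json
from datetime import datetime, timezone


def audit_line(question: str, tool: str, params: dict,
               read: str, result: str) -> str:
    """Build one JSON audit line; key names are illustrative only."""
    entry = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "question": question,  # the natural-language question
        "tool": tool,          # the tool dispatched
        "params": params,      # the parameters passed to it
        "read": read,          # the downstream read that was issued
        "result": result,      # what was returned to the AI
    }
    return json.dumps(entry, sort_keys=True)


def append_audit(path: str, line: str) -> None:
    # Mode "a" only adds to the end of the file; earlier lines are not rewritten.
    with open(path, "a", encoding="utf-8") as f:
        f.write(line + "\n")
```

Answering "why did the AI tell me X?" then amounts to finding the line whose `result` matches and reading the `read` field next to it.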
Read-only
The AI has no write path to anything. It cannot order a setpoint change, write a register, or push a configuration — not because a policy forbids it, but because the only thing it can reach is an outside copy fed by a one-way transport. Even if an attacker controlled the LLM, the worst they could do is read a copy.
Inference
MCP — Model Context Protocol
The open standard (spec) AI clients use to talk to tools. Claude Desktop, Cursor, and others all speak it. Conversational Factory exposes the gateway as an MCP server so any MCP-aware client can ask the plant questions without custom integration work.
Model-agnostic inference
The product does not assume a particular model or a particular place to run it. Air-gapped or edge sites run specialized small models on-premises against the copy with no external connectivity; sites that allow it point any frontier or preferred model at the larger external dataset. Same data, same query surface — only the model placement changes.
Edge inference
Running a small, specialized model on-premises, next to the outside copy, with no outbound connectivity at all. This is the air-gapped path: the operator gets natural-language answers without any data or query leaving the site.
Optional MQTT egress
The outside copy can optionally be forwarded to a cloud or off-site server over MQTT in real time, to be evaluated there as well. This is opt-in and additive: the system is fully useful with zero outbound connectivity. Reach is a choice; sovereignty is the default.
Where Conversational Factory fits
Brownfield
Plants with 40-year-old PLCs, 20-year-old HMIs, and brand-new VFDs all on the same wire, behind a strict network boundary. Conversational Factory is brownfield-friendly by design: it does not require device changes, protocol changes, or a flattened network — it only needs a historian to mirror and a one-way path out.
PERA / PERA+
The Purdue Enterprise Reference Architecture — the layered model (Level 0 sensors up to Level 4 enterprise IT) every industrial control person knows. Conversational Factory is fractal across PERA: every zone is a complete inside / one-way / outside unit for its level, identical in shape from the smallest cell to the largest plant.
Sovereign per zone
Each zone appliance is complete for its scope and stands alone. Disconnected from everything upstream it still holds its authoritative record and still answers questions through its own outside copy. The pattern repeats at every level of the hierarchy.