
Semantic Guard: The AI Firewall That Understands Meaning, Not Just Patterns

Engineering Team | Feb 06, 2026 | 7 min read

THE INNOVATION

Traditional data protection relies on pattern matching: regular expressions that look for credit card formats, SSN structures, or API key prefixes. But language is creative. People paraphrase, abbreviate, and lean on context. Semantic Guard is an AI model that runs entirely inside your browser and detects sensitive content by its meaning, without sending any data off your device for analysis.

Consider this prompt: "Check the records for the patient in room 4B who had the knee replacement last Thursday."

There is no Social Security Number. No credit card. No API key. No pattern for a regex to match. But that sentence contains Protected Health Information: a specific patient, a specific procedure, a specific date. Together, those details can be enough to identify an individual.

Now consider: "My social is one two three, four five, six seven eight nine."

A DLP scanner looking for the format XXX-XX-XXXX will miss this entirely. A human would catch it instantly. Semantic Guard catches it too, because it compares what the sentence means rather than which characters it contains.
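To make the gap concrete, here is a minimal Python sketch of a format-based check. The pattern `SSN_PATTERN` and the helper `regex_flags_ssn` are illustrative assumptions and do not come from any particular DLP product.

```python
import re

# Hypothetical DLP-style pattern for the XXX-XX-XXXX format (an assumption
# for illustration, not a rule taken from a real scanner).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def regex_flags_ssn(text: str) -> bool:
    """Return True if the text contains a digit string in SSN format."""
    return SSN_PATTERN.search(text) is not None


formatted = "My social is 123-45-6789."
spelled_out = "My social is one two three, four five, six seven eight nine."

print(regex_flags_ssn(formatted))    # True
print(regex_flags_ssn(spelled_out))  # False: same information, no digits to match
```

Both sentences disclose the same number, but only the first has a shape the pattern can see.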

How It Works: AI Embeddings in the Browser

Semantic Guard uses a technique called embedding-based similarity matching. Here is the non-technical version:

Your Rules Become Vectors

You define security rules in plain English: "Block any reference to internal project codenames" or "Redact salary information." The AI model converts each rule into a mathematical representation (a 384-dimensional vector) that captures its meaning.

Every Prompt Gets Embedded

When a user submits a question, Semantic Guard converts it into the same kind of vector, a mathematical snapshot of what the prompt means, not just what words it contains.

Similarity = Violation

The system compares the prompt vector against each rule vector. If the similarity exceeds your threshold (adjustable from 50% to 95%), the matched rule fires and applies its configured action: block, redact, warn, or log.
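The comparison step can be sketched in a few lines of Python. This is an illustration of the math only, not the in-browser implementation. It assumes cosine similarity as the measure (a common choice for embeddings, though the article does not name one), and it uses made-up 3-dimensional vectors in place of real 384-dimensional embeddings. The names `cosine_similarity` and `best_match` are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def best_match(
    prompt_vec: Sequence[float],
    rule_vecs: Dict[str, Sequence[float]],
    threshold: float,
) -> Optional[str]:
    """Return the rule whose similarity most exceeds the threshold, or None."""
    best_name, best_score = None, threshold
    for name, vec in rule_vecs.items():
        score = cosine_similarity(prompt_vec, vec)
        if score > best_score:  # strictly above the threshold, as in the text
            best_name, best_score = name, score
    return best_name


# Toy vectors standing in for real embeddings (assumed values).
rules = {"salary": [0.9, 0.1, 0.0], "codenames": [0.0, 0.2, 0.9]}
salary_prompt = [0.8, 0.3, 0.1]
unrelated_prompt = [0.1, 0.9, 0.1]

print(best_match(salary_prompt, rules, threshold=0.75))     # salary
print(best_match(unrelated_prompt, rules, threshold=0.50))  # None
```

Returning only the single best match is a design choice made for brevity here; a real system might instead report every rule that clears the threshold.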

The critical detail: all of this happens inside a Web Worker in your browser. The AI model is downloaded once (roughly 23 MB) and then runs locally. During analysis, no prompt, rule, or result is sent to a server.

ZERO DATA EGRESS

Unlike cloud-based AI firewalls that must send your prompts to a server for analysis, creating a new data exposure point, Semantic Guard processes everything locally. The model runs in your browser. The rules stay in your browser. The verdicts stay in your browser.

Custom Rules for Your Industry

Every organization has different definitions of "sensitive." A defense contractor's codenames are not the same as a hospital's patient identifiers. Semantic Guard lets you define rules in natural language. No regex, no programming.

Example Rules

BLOCK: "Block any reference to internal project codenames or unreleased product names"

REDACT: "Redact any mention of employee compensation, salary bands, or bonus structures"

WARN: "Warn when discussion involves unvetted vendor names or pending acquisition targets"

LOG: "Log any query that references specific patient identifiers or diagnosis codes"

Rules are defined per agent in the Agent Studio firewall configuration. You control the sensitivity threshold. Set it low (50%) for broad coverage, or high (95%) for precision when false positives are costly.
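As a rough picture of what a per-agent rule set might look like as data, here is a minimal Python sketch. The class names `Action`, `Rule`, and `FirewallConfig`, and the default threshold of 0.75, are assumptions for illustration; the article does not describe Agent Studio's actual configuration format. Only the four actions and the 50% to 95% range come from the text.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Action(Enum):
    """The four rule actions named in the examples above."""
    BLOCK = "block"
    REDACT = "redact"
    WARN = "warn"
    LOG = "log"


@dataclass(frozen=True)
class Rule:
    action: Action
    text: str  # the plain-English rule, stored verbatim


@dataclass
class FirewallConfig:
    """Hypothetical per-agent configuration: rules plus one threshold."""
    rules: List[Rule] = field(default_factory=list)
    threshold: float = 0.75  # assumed default, not stated in the article

    def __post_init__(self) -> None:
        # The article gives an adjustable range of 50% to 95%.
        if not 0.50 <= self.threshold <= 0.95:
            raise ValueError("threshold must be between 0.50 and 0.95")


config = FirewallConfig(
    rules=[
        Rule(Action.BLOCK, "Block any reference to internal project codenames"),
        Rule(Action.REDACT, "Redact any mention of employee compensation"),
    ],
    threshold=0.50,  # low threshold: broad coverage, more false positives
)
print(len(config.rules), config.threshold)
```

Validating the threshold at construction time means an out-of-range value fails loudly instead of silently disabling or over-triggering a rule.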

Why This Matters for Regulated Industries

HIPAA (Healthcare)

PHI detection that goes beyond pattern matching. Semantic Guard catches contextual identifiers, including room numbers, procedure descriptions, and temporal references, that together can identify a patient, even when no single field is a "traditional" identifier.

SOX (Finance)

Helps prevent leakage of material non-public information. Rules like "Block references to upcoming earnings" or "Redact merger discussions" are enforced semantically, catching paraphrases that keyword filters miss.

ITAR (Defense)

Classified project references, operational details, and technical specifications are caught by meaning, even when users avoid using the "official" terms. Because processing is local, no data touches external infrastructure.

GDPR (EU Privacy)

Because Semantic Guard processes everything locally, the screening step introduces no new third-party data processor. For Semantic Guard itself, that means no additional Data Processing Agreement and no new processor entry in your Records of Processing Activities. Privacy by design.

When Semantic Guard is active on an agent, the cloud-based AI Classifier is automatically bypassed. The system assumes that if you have opted into local AI protection, you do not want your prompts sent to a cloud classifier for a second opinion. This is a deliberate safeguard against misconfiguration.
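The bypass behavior described above amounts to a simple precedence rule, sketched here in Python. The function name `active_checks` and the string labels are hypothetical; the sketch only restates the stated behavior and is not the product's actual routing code.

```python
from typing import List


def active_checks(semantic_guard_enabled: bool,
                  cloud_classifier_enabled: bool) -> List[str]:
    """Return which classifiers screen a prompt for a given agent."""
    if semantic_guard_enabled:
        # Local protection takes precedence: the cloud classifier is
        # bypassed even if it is also switched on.
        return ["semantic_guard"]
    if cloud_classifier_enabled:
        return ["cloud_classifier"]
    return []


print(active_checks(True, True))    # ['semantic_guard']
print(active_checks(False, True))   # ['cloud_classifier']
```

Under this rule, no combination of settings sends a prompt to the cloud classifier once local protection is on.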

Enable Semantic Guard

Define your data security rules in plain English. Lumina enforces them locally, before any prompt leaves your device.

