Building a rules engine that understands plain English
"Anything from an investor is always urgent" is not a SQL query. Building a rules engine that processes natural language required a different approach.
Most rules engines work with structured conditions: IF sender IS IN list AND subject CONTAINS keyword THEN action. That is powerful for developers but a poor fit for the people we are building for.
Founders do not think about email rules in SQL syntax. They think about them the way they would explain them to a new assistant: "If it looks like an investor is following up on something, it is always urgent." "Anything from our law firm needs a reply same day." "Archive anything that looks like a weekly digest unless it mentions our company name."
We wanted to support that kind of rule. The challenge was making it reliable.
The naive approach and why it fails
The obvious first approach: take the plain English rule, send it to the AI alongside the email, and ask it to decide whether the rule applies. Simple, flexible, works immediately.
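To make the failure concrete, the naive version is roughly the sketch below. It is not our production code, and `callModel` is a hypothetical stand-in for whatever chat-completion client you use, not a real SDK call.

```typescript
// Naive approach: hand the rule and the email to the model and ask it
// to decide. Simple to build, hard to make reproducible.
async function naiveRuleCheck(
  rule: string,
  email: { from: string; subject: string; body: string },
  callModel: (prompt: string) => Promise<string> // hypothetical model client
): Promise<boolean> {
  const prompt = [
    `Rule: ${rule}`,
    `From: ${email.from}`,
    `Subject: ${email.subject}`,
    `Body: ${email.body}`,
    `Does the rule apply to this email? Answer YES or NO.`,
  ].join("\n");

  const answer = await callModel(prompt);
  // The weak point: the same rule and email can yield different answers
  // on different runs.
  return answer.trim().toUpperCase().startsWith("YES");
}
```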
The problem is consistency. The same rule applied to the same email might produce different results on different runs. The model's interpretation of "looks like an investor follow-up" is not deterministic. Over thousands of emails and dozens of rules, this variance accumulates into behaviour that users cannot predict or trust.
Trust is the most important property of a rules engine. If users cannot rely on a rule behaving the same way every time, they stop using rules. The feature becomes useless.
The approach that worked
We split the problem into two parts: interpretation and application.
Interpretation happens once when the rule is created. The AI takes the plain English condition and converts it into a structured representation — essentially a set of signals to look for: sender patterns, subject patterns, body keywords, sender domain types, and urgency indicators.
This structured representation is stored in the database. It is deterministic and inspectable. You can read it and understand why the rule fires on a given email.
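For illustration, the stored representation might look something like the sketch below. The field names and the action values are assumptions for this post; the real schema only needs to capture the signal categories listed above.

```typescript
// Illustrative shape for a stored rule. Names are assumptions; the real
// point is that every signal is a concrete, inspectable value.
interface RuleSignals {
  senderPatterns: string[];     // substrings matched against the From: header
  subjectPatterns: string[];    // substrings matched against the subject line
  bodyKeywords: string[];       // keywords searched for in the body
  senderDomainTypes: string[];  // classifications like "law-firm" or "newsletter"
  urgencyIndicators: string[];  // phrases like "following up" or "by end of day"
}

interface StoredRule {
  id: string;
  priority: number;    // position in the user's rules list
  condition: string;   // the original plain English, kept for display
  signals: RuleSignals;
  action: "mark-urgent" | "archive" | "needs-reply"; // hypothetical action set
}

// "Anything from our law firm needs a reply same day" might interpret to:
const lawFirmRule: StoredRule = {
  id: "rule-1",
  priority: 0,
  condition: "Anything from our law firm needs a reply same day",
  signals: {
    senderPatterns: [],
    subjectPatterns: [],
    bodyKeywords: [],
    senderDomainTypes: ["law-firm"],
    urgencyIndicators: [],
  },
  action: "needs-reply",
};
```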
Application happens on every email. The structured representation is evaluated against the email's signals. No AI involved at this stage — just pattern matching against extracted features. Consistent, fast, and cheap.
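Continuing the sketch, evaluation can be a plain predicate over features extracted from the email. `EmailFeatures` is a hypothetical shape for those extracted signals, and the AND-across-signal-groups semantics is an assumption, not necessarily the production behaviour.

```typescript
// Hypothetical shape for the features extracted from an incoming email.
interface EmailFeatures {
  from: string;
  subject: string;
  body: string;
  senderDomainType?: string; // classified once at ingestion, e.g. "law-firm"
}

// Deterministic evaluation: no model call, just pattern matching.
// Empty signal groups are treated as "no constraint"; non-empty groups
// must all match.
function ruleMatches(rule: StoredRule, email: EmailFeatures): boolean {
  const { signals } = rule;
  const contains = (haystack: string, patterns: string[]) =>
    patterns.some((p) => haystack.toLowerCase().includes(p.toLowerCase()));

  if (signals.senderPatterns.length && !contains(email.from, signals.senderPatterns)) {
    return false;
  }
  if (signals.subjectPatterns.length && !contains(email.subject, signals.subjectPatterns)) {
    return false;
  }
  if (signals.bodyKeywords.length && !contains(email.body, signals.bodyKeywords)) {
    return false;
  }
  if (
    signals.senderDomainTypes.length &&
    (!email.senderDomainType || !signals.senderDomainTypes.includes(email.senderDomainType))
  ) {
    return false;
  }
  if (
    signals.urgencyIndicators.length &&
    !contains(`${email.subject} ${email.body}`, signals.urgencyIndicators)
  ) {
    return false;
  }
  return true;
}
```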
The AI is used for interpretation, not evaluation. This gives us the expressiveness of natural language input with the consistency of structured rules.
Edge cases we had to handle
Ambiguous rules. "Important emails" is not a useful condition because importance is what the triage system already determines. When a rule is too vague to be converted into specific signals, we ask the user to be more specific rather than guessing.
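One way to implement that refusal is to validate the interpreter's output before saving the rule. The shape below is a sketch, and the zero-signal threshold is an assumption about where the "too vague" line sits.

```typescript
// Interpretation either produces concrete signals or bounces the rule
// back to the user with a question.
type InterpretationResult =
  | { kind: "ok"; signals: RuleSignals }
  | { kind: "needs-clarification"; question: string };

function validateInterpretation(signals: RuleSignals): InterpretationResult {
  const specificSignalCount =
    signals.senderPatterns.length +
    signals.subjectPatterns.length +
    signals.bodyKeywords.length +
    signals.senderDomainTypes.length +
    signals.urgencyIndicators.length;

  // If the model could not extract a single concrete signal, the rule
  // is too vague ("important emails") to convert, so we ask rather than guess.
  if (specificSignalCount === 0) {
    return {
      kind: "needs-clarification",
      question: "What should this rule look for? A sender, a domain, a keyword?",
    };
  }
  return { kind: "ok", signals };
}
```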
Conflicting rules. If two rules apply to the same email with different actions, which wins? We evaluate rules in priority order (the order they appear in the user's rules list) and first match wins. This is explicit and predictable.
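In code, first-match-wins is a simple ordered scan, reusing `ruleMatches` from the sketch above:

```typescript
// Evaluate rules in the user's list order; the first rule that fires
// decides the action, and later rules never run for that email.
function applyRules(rules: StoredRule[], email: EmailFeatures): StoredRule | null {
  const ordered = [...rules].sort((a, b) => a.priority - b.priority);
  for (const rule of ordered) {
    if (ruleMatches(rule, email)) {
      return rule; // first match wins
    }
  }
  return null; // no rule fired; fall through to default triage
}
```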
Rules that reference external knowledge. "Emails from our investors" requires knowing who the user's investors are. We handle this by letting users specify domain lists and sender names that the rule should match against.
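For example, "emails from our investors" can resolve against a user-maintained list at evaluation time. The list shape and the example values below are made up for illustration.

```typescript
// Hypothetical user-maintained lists that rules can reference.
interface UserLists {
  investorDomains: string[]; // e.g. ["examplevc.com", "seedfund.example"]
  investorNames: string[];   // e.g. ["Jane Smith"]
}

// Does the From: address match the user's investor list?
function matchesInvestorList(from: string, lists: UserLists): boolean {
  const domain = from.split("@")[1]?.toLowerCase() ?? "";
  return (
    lists.investorDomains.some((d) => domain === d.toLowerCase()) ||
    lists.investorNames.some((n) => from.toLowerCase().includes(n.toLowerCase()))
  );
}
```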
What we are building next
The current rules engine handles the most common cases well. The next version will add:
- Rule suggestions based on observed patterns — the agent proactively tells you "you seem to always archive emails from this sender, want to make it a rule?"
- Rule testing — apply a rule to your last 50 emails and see which ones it would have matched
- Rule analytics — which rules fire most, which are never triggered and might be misconfigured