Architecture overview
IMAP mailbox
│
▼
PARSE → extract body, segment text, read attachments (PDF/Excel/Word)
│
▼
EXTRACT → find amounts, dates, document numbers, obligations
│
▼
SCORE → deterministic priority engine + optional LLM enhancement
│
▼
DECIDE → action, confidence, document identity, context type
│
▼
DELIVER → single Telegram message with priority + buttons
│
▼
STORE → events_v1 (source of truth) + SQLite metadata
│
▼
WEB UI → read-only cockpit at localhost:8787
Step 1: IMAP polling
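Letterbot connects to your mail server over IMAP (SSL, port 993) in read-only mode, so polling does not modify the mailbox. It polls every 2 minutes by default. To prevent duplicates, it tracks each mailbox's UIDVALIDITY value together with the UIDs it has already processed, because IMAP UIDs are only stable while UIDVALIDITY stays the same. Each account maintains its own independent state.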
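The UIDVALIDITY bookkeeping can be sketched as follows. This is a minimal illustration using Python's standard imaplib, not Letterbot's actual code: MailboxState, reset_if_renumbered, take_new and poll_once are hypothetical names, and starting over from UID 0 after a renumbering is an assumption (a real system would likely also deduplicate on Message-ID).

```python
import imaplib
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MailboxState:
    """Per-account polling cursor (hypothetical shape, not Letterbot's schema)."""
    uidvalidity: Optional[int] = None
    last_uid: int = 0


def reset_if_renumbered(state: MailboxState, uidvalidity: int) -> bool:
    """Drop the cursor when the server reports a different UIDVALIDITY."""
    if state.uidvalidity == uidvalidity:
        return False
    # Old UIDs no longer identify the same messages, so start over.
    state.uidvalidity = uidvalidity
    state.last_uid = 0
    return True


def take_new(state: MailboxState, uids: List[int]) -> List[int]:
    """Return unseen UIDs in ascending order and advance the cursor."""
    # "UID n:*" always includes the highest UID, even when it is below n,
    # so the result is filtered again on the client side.
    fresh = sorted(u for u in uids if u > state.last_uid)
    if fresh:
        state.last_uid = fresh[-1]
    return fresh


def poll_once(host: str, user: str, password: str, state: MailboxState) -> List[int]:
    """One polling pass. Needs a real server, so it is not exercised here."""
    conn = imaplib.IMAP4_SSL(host, 993)
    try:
        conn.login(user, password)
        conn.select("INBOX", readonly=True)  # read-only: mailbox is left untouched
        _, data = conn.response("UIDVALIDITY")
        reset_if_renumbered(state, int(data[0]))
        _, data = conn.uid("SEARCH", None, f"UID {state.last_uid + 1}:*")
        return take_new(state, [int(x) for x in data[0].split()])
    finally:
        conn.logout()
```

Keeping one MailboxState per account mirrors the independent per-account state described above; a scheduler would call poll_once for each account on the polling interval.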
Step 2: Extraction pipeline
Five deterministic stages process each email:
- Body segmentation — separates main text from forwarded chains, signatures, disclaimers
- Fact collection — finds amounts, dates, document numbers using pattern matching with evidence windows
- Validation — filters false positives (phone numbers, table row numbers, invalid date ranges)
- Scoring — ranks candidate facts by contextual weight ("total payable" near a number scores higher)
- Consistency check — cross-validates (due date after invoice date, amount has currency context)
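A rough sketch of the fact collection, validation, and scoring stages for amounts is shown below. The regular expressions, window sizes, and weights are illustrative assumptions, not Letterbot's real rules; Candidate, collect, validate and score are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import List

# Illustrative patterns only; real rules would cover many more formats.
NUMBER = re.compile(r"\d+(?:\.\d{2})?")
PHONE_CUE = re.compile(r"(?:\+|\b(?:tel|phone|call)\b\.?:?)[\s\d+()-]*$", re.I)
CURRENCY = re.compile(r"\b(?:EUR|USD|GBP)\b|[€$£]")


@dataclass
class Candidate:
    text: str      # the matched number
    before: str    # evidence window preceding the match
    after: str     # evidence window following the match
    score: float = 0.0


def collect(body: str, before_len: int = 25, after_len: int = 8) -> List[Candidate]:
    """Fact collection: every number, together with its evidence window."""
    return [
        Candidate(m.group(), body[max(0, m.start() - before_len):m.start()],
                  body[m.end():m.end() + after_len])
        for m in NUMBER.finditer(body)
    ]


def validate(cands: List[Candidate]) -> List[Candidate]:
    """Validation: drop numbers whose preceding text looks like a phone number."""
    return [c for c in cands if not PHONE_CUE.search(c.before)]


def score(cands: List[Candidate]) -> List[Candidate]:
    """Scoring: rank candidates by contextual cues found in the window."""
    for c in cands:
        c.score = 1.0
        if "total payable" in c.before.lower():
            c.score += 2.0   # strong cue directly before the number
        if CURRENCY.search(c.after):
            c.score += 0.5   # currency context right after the number
    return sorted(cands, key=lambda c: c.score, reverse=True)
```

For the body "Subtotal 1200.00 EUR. Total payable: 1450.00 EUR. Call +49 30 1234567 for questions.", validation discards the three phone fragments, and scoring ranks 1450.00 above 1200.00 because "Total payable" appears in its evidence window.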
Step 3: Decision layer
The decision layer builds a MessageDecision with doc_kind, priority, action, confidence, conversation context (new/reply/forward), and document identity. This is the canonical interpretation used by all downstream systems.
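One way to picture MessageDecision is as an immutable record. The field names below follow the list above, but the types, the ContextType enum, the document_id name, and the 0 to 1 confidence range are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum
from typing import Optional


class ContextType(str, Enum):
    """Conversation context of the email (values taken from the text above)."""
    NEW = "new"
    REPLY = "reply"
    FORWARD = "forward"


@dataclass(frozen=True)
class MessageDecision:
    """Hypothetical shape of the canonical decision record."""
    doc_kind: str                 # e.g. "invoice" (example value, assumed)
    priority: str                 # e.g. "high" (example value, assumed)
    action: str                   # suggested next step
    confidence: float             # assumed to lie in [0, 1]
    context: ContextType
    document_id: Optional[str]    # document identity, if one was found

    def __post_init__(self) -> None:
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")

    def to_json(self) -> str:
        """Serialize for downstream consumers such as the event store."""
        return json.dumps(asdict(self))
```

Freezing the dataclass fits the idea of a single canonical interpretation: downstream consumers can read the decision but cannot silently alter it.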
Step 4: Telegram delivery
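Each email produces exactly one Telegram message containing a priority emoji, summary, suggested action, attachment insight, and inline buttons. Priority changes edit the existing message in place instead of sending a new one. Delivery is bound by an SLA, with a fallback if enrichment takes too long.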
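The SLA-with-fallback behavior and the message shape could look roughly like this. The payload keys (chat_id, text, reply_markup, inline_keyboard, callback_data) are Telegram Bot API sendMessage fields; the emoji mapping, button labels, callback values, and function names are hypothetical, and treating the fallback as a plain summary is an assumption.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical mapping; the actual emoji set is not specified in the text.
PRIORITY_EMOJI = {"high": "🔴", "medium": "🟡", "low": "🔵"}


def build_payload(chat_id: int, priority: str, summary: str, action: str) -> dict:
    """Build a sendMessage payload with one row of inline buttons."""
    return {
        "chat_id": chat_id,
        "text": f"{PRIORITY_EMOJI[priority]} {summary}\nSuggested action: {action}",
        "reply_markup": {"inline_keyboard": [[
            {"text": "Raise priority", "callback_data": "prio:up"},
            {"text": "Lower priority", "callback_data": "prio:down"},
        ]]},
    }


async def summary_within_sla(
    enrich: Callable[[], Awaitable[str]],
    fallback_summary: str,
    sla_seconds: float,
) -> str:
    """Wait for enrichment up to the SLA, then fall back to the basic summary."""
    try:
        return await asyncio.wait_for(enrich(), timeout=sla_seconds)
    except asyncio.TimeoutError:
        return fallback_summary
```

A later priority correction would reuse the same text builder and send the result through the Bot API's editMessageText method with the original message_id, which keeps delivery at one message per email.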
Step 5: Event store and learning
Every decision is recorded in events_v1, the permanent, auditable source of truth. Priority corrections feed back into sender-scoped adaptive learning. The weekly digest and the dashboard read exclusively from events_v1.
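As an illustration of an append-only event log with sender-scoped feedback, consider the sketch below. Only the table name events_v1 comes from the text; the columns, the event kinds, the "delta" payload field, and storing the log in SQLite are assumptions.

```python
import json
import sqlite3
import time


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the store and create the (hypothetical) events_v1 schema."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS events_v1 (
               id      INTEGER PRIMARY KEY AUTOINCREMENT,
               ts      REAL NOT NULL,
               sender  TEXT NOT NULL,
               kind    TEXT NOT NULL,
               payload TEXT NOT NULL
           )"""
    )
    return db


def record(db: sqlite3.Connection, sender: str, kind: str, payload: dict) -> None:
    """Append one event; rows are never updated or deleted."""
    with db:  # commits on success, rolls back on error
        db.execute(
            "INSERT INTO events_v1 (ts, sender, kind, payload) VALUES (?, ?, ?, ?)",
            (time.time(), sender, kind, json.dumps(payload)),
        )


def sender_adjustment(db: sqlite3.Connection, sender: str) -> int:
    """Net priority correction for one sender, derived purely from events."""
    rows = db.execute(
        "SELECT payload FROM events_v1 WHERE sender = ? AND kind = 'priority_corrected'",
        (sender,),
    )
    return sum(json.loads(payload)["delta"] for (payload,) in rows)
```

Because the adjustment is recomputed from the log rather than stored separately, corrections stay auditable, and a digest or dashboard can be rebuilt from the same rows.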