A real record of system changes — infrastructure, features, and hardening. No marketing, no invented milestones.
Model Router — complexity tiers + timeout hardening
Architecture
soli-model-router routes requests to local model tiers (fast / default / heavy) based on complexity classification. The timeout policy is now aligned with real cold-load times on local hardware.
soli-model-router: routes by complexity tier, no cloud dependency in alpha
Per-lane timeouts: fast 30s, default 120s, heavy 240s
_classify_complexity_tier() is now the single source of truth in soli-core
Fallback chain: router failure falls back to direct Ollama, never blocks the user
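A minimal sketch of the routing and fallback behavior listed above. The per-lane timeouts come from this entry. The word-count classifier and the two callables (`call_router`, `call_ollama`) are hypothetical stand-ins, since the real `_classify_complexity_tier()` logic and service clients are not documented here.

```python
from typing import Callable

# Per-lane timeouts in seconds, as listed in this entry.
LANE_TIMEOUTS = {"fast": 30, "default": 120, "heavy": 240}

# A backend takes (prompt, tier, timeout_seconds) and returns a reply.
Backend = Callable[[str, str, int], object]


def classify_complexity_tier(prompt: str) -> str:
    """Hypothetical heuristic: classify by word count only."""
    words = len(prompt.split())
    if words < 20:
        return "fast"
    if words < 200:
        return "default"
    return "heavy"


def route(prompt: str, call_router: Backend, call_ollama: Backend) -> object:
    """Try the router first; on any failure, go straight to Ollama."""
    tier = classify_complexity_tier(prompt)
    timeout = LANE_TIMEOUTS[tier]
    try:
        return call_router(prompt, tier, timeout)
    except Exception:
        # Router failure must not block the user: fall back to direct Ollama.
        return call_ollama(prompt, tier, timeout)
```

Keeping the tier-to-timeout mapping in one table means the router path and the fallback path use the same limits.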
Trial lead conversion — 409 conflict flow fixed
Hardening
Duplicate-email conversions now return a structured 409 with the existing user's ID, allowing the admin console to auto-fill the link_user form instead of dead-ending.
identity-access: pre-check before INSERT to avoid Postgres constraint race
Structured 409 body: code, message, and existing_user with id and email
control-plane: re-raises the 409 verbatim instead of swallowing it and returning 200
admin-console: auto-switches to Link User mode with user_id pre-filled
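The conflict flow above can be sketched as follows, using an in-memory dict in place of the Postgres users table. The body fields (`code`, `message`, `existing_user` with `id` and `email`) are from this entry. The function name, the `user_exists` code value, the message text, and the 201 success shape are assumptions for illustration.

```python
import uuid


def convert_lead(email: str, users_by_email: dict) -> tuple:
    """Pre-check for an existing user before inserting a new one.

    Returns (status_code, body). users_by_email maps email -> user id
    and stands in for the real users table.
    """
    key = email.strip().lower()
    existing_id = users_by_email.get(key)
    if existing_id is not None:
        # Structured 409: the caller gets enough detail to offer link_user.
        return 409, {
            "code": "user_exists",  # hypothetical code value
            "message": "A user with this email already exists.",
            "existing_user": {"id": existing_id, "email": key},
        }
    new_id = str(uuid.uuid4())
    users_by_email[key] = new_id
    return 201, {"id": new_id, "email": key}
```

A console receiving the 409 can read `existing_user.id` and pre-fill the link form with it. A real service would presumably keep the database unique constraint as a backstop behind the pre-check.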
Privacy consent layer
Security
soli-privacy-consent manages user data-sharing preferences. Privacy class signals are propagated through the routing layer.
soli-privacy-consent service: store and query per-user consent state
privacy_class field wired into model routing contract (local_only in alpha)
Consent signals accessible via control-plane admin layer
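One way the `privacy_class` field might sit in a routing contract, sketched under assumptions: only `privacy_class` and its alpha value `local_only` come from this entry. The `RoutingRequest` shape, the other field names, and the backend name are hypothetical.

```python
from dataclasses import dataclass

# Alpha accepts only local_only, per this entry.
ALLOWED_PRIVACY_CLASSES = {"local_only"}


@dataclass(frozen=True)
class RoutingRequest:
    """Hypothetical routing contract carrying the privacy signal."""
    user_id: str
    tier: str
    privacy_class: str = "local_only"


def allowed_backends(request: RoutingRequest) -> list:
    """Reject unknown privacy classes; local_only maps to local inference."""
    if request.privacy_class not in ALLOWED_PRIVACY_CLASSES:
        raise ValueError(f"unsupported privacy_class: {request.privacy_class!r}")
    return ["ollama"]
```

Rejecting unknown values instead of defaulting them keeps a mistyped or future class from silently widening where data can go.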
Admin layer — operator console + audit
Feature
soli-admin-console provides operator access to system state: users, trial leads, entitlements, audit events, and security actions. All mutating operations are audit-gated.
soli-control-plane: orchestrates writes across domain services
soli-audit: append-only governance log, fail-closed on convert operations (a conversion is refused if its audit write fails)
Trial lead conversion flow: create_user and link_user actions
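A sketch of what fail-closed audit gating can look like for the two convert actions above. The action names `create_user` and `link_user` are from this entry. `audited_convert`, `AuditUnavailable`, and the callables are hypothetical names, not the actual soli-control-plane API.

```python
class AuditUnavailable(Exception):
    """Raised when the audit record cannot be written."""


CONVERT_ACTIONS = {"create_user", "link_user"}


def audited_convert(action, payload, write_audit, perform):
    """Write the audit event first; run the mutation only if that succeeds."""
    if action not in CONVERT_ACTIONS:
        raise ValueError(f"unknown convert action: {action!r}")
    try:
        write_audit({"action": action, "payload": payload})
    except Exception as exc:
        # Fail closed: no audit record means no mutation.
        raise AuditUnavailable("audit write failed; conversion refused") from exc
    return perform(payload)
```

The ordering is the point of the design: because the audit write happens before the mutation, an outage in the audit service blocks conversions instead of letting them pass unlogged.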
Entitlements + metering layer
Architecture
Per-user feature access is managed by soli-entitlements. Usage events are recorded by soli-metering. Admin overrides are available through the control plane.
soli-entitlements: feature flag store per user, admin override support
soli-metering: usage event recording, token and request tracking
Extension access controlled by entitlement tier, not hardcoded
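A minimal sketch of tier-based access with an admin override, as described above. The tier names, feature names, and function signature are hypothetical. Only the ideas of per-user feature access, entitlement tiers, and admin overrides come from this entry.

```python
# Hypothetical tier -> feature mapping; real tiers are not listed in this entry.
TIER_FEATURES = {
    "free": {"chat"},
    "plus": {"chat", "health_extension"},
}


def has_feature(tier, feature, overrides=None):
    """Admin overrides win; otherwise the entitlement tier decides."""
    overrides = overrides or {}
    if feature in overrides:
        return overrides[feature]
    return feature in TIER_FEATURES.get(tier, set())
```

For example, `has_feature("free", "health_extension", {"health_extension": True})` grants access through an override even though the tier alone would not, and an unknown tier gets no features.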
Self-hosted deployment on local hardware
Infrastructure
SOLI runs on a Mac Mini (Apple Silicon). All AI inference stays on local hardware via Ollama. No data leaves the host in the alpha configuration.
19 services orchestrated with Docker Compose
Ollama for local inference: llama3.2 (fast), qwen3 (default/heavy)
Cloudflare Tunnel for secure remote access without port exposure
Health dashboard for live service status
Identity, auth, and user management
Foundation
soli-identity-access handles user accounts, JWT auth, password reset, and trial lead intake from the public landing page.
soli-identity-access: user CRUD, JWT, bcrypt password hashing
Trial lead intake: public form → admin review → convert to user
Password reset flow with time-limited tokens
Anti-spam: disposable domain blocklist on lead intake
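Two of the items above, time-limited reset tokens and the disposable-domain check, can be sketched with the standard library alone. The one-hour TTL, the in-memory store, the function names, and the blocklist entries are all assumptions for illustration. The actual soli-identity-access values are not given in this entry.

```python
import hashlib
import secrets
import time

TOKEN_TTL_SECONDS = 3600  # hypothetical lifetime
DISPOSABLE_DOMAINS = {"mailinator.com", "disposable.example"}  # illustrative entries


def _digest(token):
    return hashlib.sha256(token.encode()).hexdigest()


def issue_reset_token(store, user_id, now=None):
    """Create a random token; keep only its hash and an expiry time."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    store[_digest(token)] = (user_id, now + TOKEN_TTL_SECONDS)
    return token


def redeem_reset_token(store, token, now=None):
    """Return the user id for a valid, unexpired token, else None. Single use."""
    now = time.time() if now is None else now
    entry = store.pop(_digest(token), None)
    if entry is None:
        return None
    user_id, expires_at = entry
    return user_id if now < expires_at else None


def is_disposable(email):
    """True if the address's domain is on the blocklist (case-insensitive)."""
    domain = email.rpartition("@")[2].strip().lower()
    return domain in DISPOSABLE_DOMAINS
```

Storing only the hash means a leaked store does not expose usable tokens, and popping the entry on redemption makes each token single-use.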
soli-core — conversation engine
Foundation
soli-core is the central conversation engine. It handles chat requests, memory, extension context, and Ollama model execution.
Chat endpoint with domain routing and model selection
Memory layer: user-specific context stored and recalled per session
Extension context injected into prompts for health and other domains
Audit logging for all chat interactions
Project initiated — late 2025
Transparent development. Major changes documented here as they ship.