Architecture¶
Owlette uses a serverless, event-driven architecture where all communication flows through Cloud Firestore. There is no direct connection between agents and the dashboard — Firestore acts as the message bus.
System Overview¶
┌─────────────────┐ ┌─────────────────┐
│ Agent │ ┌──────────────────────┐ │ Web Dashboard │
│ (Machine A) │────▶│ │◀─────│ (Next.js) │
│ │ │ Cloud Firestore │ │ │
│ Agent │────▶│ (Real-time NoSQL) │─────▶│ Users connect │
│ (Machine B) │ │ │ │ via browser │
│ │◀────│ │ │ │
│ Agent │ └──────────────────────┘ └─────────────────┘
│ (Machine C) │ │
└─────────────────┘ │
┌──────────────┐
│ Firebase │
│ Auth │
└──────────────┘
Components¶
Python Agent (Windows Service)¶
The agent runs as a Windows service managed by NSSM (Non-Sucking Service Manager). It:
- Monitors processes every 10 seconds — detects crashes, stalls, and exits
- Auto-restarts crashed applications using Task Scheduler or CreateProcessAsUser
- Sends heartbeats every 30 seconds to mark the machine as online
- Reports metrics every 60 seconds — CPU, memory, disk, GPU usage
- Executes commands from the dashboard — restart process, install software, reboot, etc.
- Syncs configuration bidirectionally between local GUI and cloud
- Runs offline — continues monitoring even without internet, syncs when reconnected
The agent uses a custom Firestore REST API client (not the Firebase Admin SDK) with an OAuth two-token authentication system.
Key directories
- Installation:
C:\ProgramData\Owlette\ - Agent code:
C:\ProgramData\Owlette\agent\src\ - Logs:
C:\ProgramData\Owlette\logs\ - Config:
C:\ProgramData\Owlette\agent\config\config.json
Web Dashboard (Next.js)¶
The dashboard is a Next.js 16 application deployed to Railway. It:
- Displays real-time data using Firestore
onSnapshotlisteners - Manages processes — add, edit, remove, start, stop, kill
- Deploys software — push installers to machines with progress tracking
- Distributes projects — sync ZIP files across the fleet
- Manages users — role-based access control with site-level permissions
- Provides Cortex — AI chat interface for machine interaction, plus autonomous cluster management (auto-investigates crashes)
Firebase Backend¶
Firebase provides two services:
- Cloud Firestore — Real-time NoSQL database for all data sync
- Firebase Authentication — User auth (email/password, Google OAuth) and agent auth (custom tokens)
There are no Cloud Functions or custom backend servers — the web dashboard's Next.js API routes handle server-side operations like token generation and email sending.
Data Flow¶
Heartbeat & Metrics¶
Agent Firestore Dashboard
│ │ │
│── presence (every 30s) ──────▶│ │
│ {online: true, │── onSnapshot ────────────────▶│
│ lastHeartbeat: now} │ Machine goes green │
│ │ │
│── status (every 60s) ────────▶│ │
│ {cpu: 45, memory: 60, │── onSnapshot ────────────────▶│
│ disk: 30, gpu: 15, │ Metrics update live │
│ processes: {...}} │ │
Command Execution¶
Dashboard Firestore Agent
│ │ │
│── write to pending ──────────▶│ │
│ {type: "restart_process", │── listener detects ──────────▶│
│ process_name: "TD"} │ │
│ │ │── execute
│ │ │
│ │◀── write to completed ────────│
│◀── onSnapshot ────────────────│ {result: "success"} │
│ UI updates │ │
Configuration Sync¶
Configuration changes from any source propagate to all others within ~1-2 seconds:
GUI ──▶ config.json ──▶ Firestore ──▶ Dashboard (onSnapshot)
◀── Dashboard (write)
──▶ Agent (listener) ──▶ config.json ──▶ GUI
Two Firebase Clients¶
This is the most important architectural distinction:
| Web Dashboard | Python Agent | |
|---|---|---|
| SDK | Firebase Client SDK (firebase/firestore) |
Custom REST client (firestore_rest_client.py) |
| Auth | Firebase Auth (email/password, Google OAuth) | OAuth two-token system (custom token + refresh token) |
| Real-time | onSnapshot listeners |
Polling + Firestore listener thread |
| Timestamps | serverTimestamp() |
REST API timestampValue format |
The agent does not use the Firebase Admin SDK or any official Firebase Python library. It uses a hand-built REST client that communicates with the Firestore v1 REST API directly. This keeps the agent lightweight and avoids the heavy google-cloud-firestore dependency.
Authentication Architecture¶
User Authentication¶
Browser ──▶ Firebase Auth (client SDK)
│
├── Email/Password
└── Google OAuth
│
▼
ID Token
│
POST /api/auth/session
│
▼
iron-session (HTTPOnly cookie)
│
Subsequent requests use cookie
Agent Authentication (OAuth)¶
Installer ──▶ Registration Code (24h expiry)
│
POST /api/agent/auth/exchange
│
├── Custom Firebase Token (1h)
└── Refresh Token (long-lived, hashed in Firestore)
│
Agent stores encrypted tokens locally
│
On token expiry:
POST /api/agent/auth/refresh
│
└── New Custom Firebase Token (1h)
More details
See Authentication Reference for the complete flow including MFA.
Security Model¶
Site-Based Access Control¶
All data is scoped to sites. Users can only access sites they are assigned to.
| Role | Access |
|---|---|
| User | Sites listed in their users/{uid}/sites array |
| Admin | All sites |
| Agent | Single site + single machine (from custom token claims) |
Firestore Security Rules¶
Rules enforce access at the database level — no client-side bypass is possible:
- Users must be authenticated
- Site access checked via
hasSiteAccess(siteId) - Agents scoped by
site_idandmachine_idclaims in their custom token - Token collections (
agent_tokens,agent_refresh_tokens) are server-side only
Offline Resilience¶
The agent is designed to operate without internet:
- No connection — Agent uses last cached
config.json, continues monitoring processes locally - Connection restored — Agent reconnects automatically via
ConnectionManagerwith exponential backoff - Metrics buffered — Heartbeats and metrics resume immediately on reconnection
- Commands queued — Pending commands in Firestore are picked up when the agent comes back online
The ConnectionManager implements a state machine:
With circuit breaker logic: after repeated failures, the agent backs off exponentially (up to 5 minutes) before retrying.
Technology Stack¶
| Layer | Technology |
|---|---|
| Agent | Python 3.9+, pywin32, psutil, CustomTkinter, NSSM |
| Web | Next.js 16, React 19, TypeScript 5, Tailwind CSS 4 |
| UI Components | shadcn/ui (Radix UI), Recharts, Lucide React |
| Database | Cloud Firestore (real-time NoSQL) |
| Auth | Firebase Authentication, iron-session, TOTP (2FA) |
| Resend API | |
| Hosting | Railway (web), Windows Service via NSSM (agent) |
| Build | Inno Setup (agent installer), Nixpacks (web) |
| AI | Vercel AI SDK, Anthropic Claude / OpenAI |