SSE streaming (GET /chat/{char_name}/stream):
- Tokens arrive as Ollama generates them, so the client no longer waits
~10s for the full reply. The frontend receives data: {"token": "..."}
events, then a final data: {"done": true, "dashboard": {...}} event on
completion.
- OllamaClient.stream_generate() is an async generator that calls Ollama's
streaming API directly and parses its newline-delimited JSON chunks.
- All post-chat side effects (memory recording, relationship update,
physiology tick, planner, cache invalidation) fire as asyncio background
tasks after the stream closes, so they never block the response.
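The chunk-to-SSE translation above can be sketched as follows. This is an illustrative shape, not the actual OllamaClient code: the function name sse_events is invented, and the chunk fields ({"response": ..., "done": ...}) are assumed from Ollama's documented NDJSON streaming format.

```python
import json


def sse_events(ollama_lines, dashboard):
    """Translate Ollama's newline-delimited JSON chunks into SSE frames:
    one data: {"token": ...} frame per chunk, then a final
    data: {"done": true, "dashboard": {...}} frame."""
    for line in ollama_lines:
        if not line.strip():
            continue  # skip blank keep-alive lines
        chunk = json.loads(line)
        if chunk.get("done"):
            break  # Ollama's terminal chunk carries no token
        yield f'data: {json.dumps({"token": chunk.get("response", "")})}\n\n'
    # emitted once the token stream closes, before background tasks fire
    yield f'data: {json.dumps({"done": True, "dashboard": dashboard})}\n\n'
```

In the real endpoint this generator would be wrapped in a StreamingResponse with media_type="text/event-stream", and the post-chat side effects scheduled via asyncio.create_task after it finishes.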
Redis context cache (CacheService):
- Per-key TTLs: character profile 5 min, physiology 2 min, relationships
3 min, world state 5 min, recent memories 1 min.
- invalidate_character() called after every chat turn so stale data
never persists across interactions.
- Degrades gracefully: a cache miss or a Redis outage falls through to
the DB silently, and a cache failure never crashes a request.
- Wired into main.py lifespan, dependencies.py, characters GET endpoint,
and both chat endpoints.
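The fall-through pattern above can be sketched like this. It is a minimal illustration, not the actual CacheService: the key scheme and method names are invented, and a synchronous redis-py-style client (get/setex/delete) is assumed where the real service is presumably async.

```python
import json

# TTLs mirror the commit message: profile 5 min, physiology 2 min,
# relationships 3 min, world state 5 min, recent memories 1 min.
TTLS = {
    "profile": 300, "physiology": 120, "relationships": 180,
    "world": 300, "memories": 60,
}


class CacheService:
    def __init__(self, redis_client):
        self._redis = redis_client

    def get_or_load(self, kind, char_name, loader):
        key = f"{kind}:{char_name}"
        try:
            cached = self._redis.get(key)
            if cached is not None:
                return json.loads(cached)
        except Exception:
            pass  # Redis down: fall through to the DB silently
        value = loader()  # DB hit
        try:
            self._redis.setex(key, TTLS[kind], json.dumps(value))
        except Exception:
            pass  # cache write failures are also non-fatal
        return value

    def invalidate_character(self, char_name):
        # called after every chat turn so stale data never persists
        for kind in TTLS:
            try:
                self._redis.delete(f"{kind}:{char_name}")
            except Exception:
                pass
```

The point of the try/except wrapping is that every cache path, read or write, degrades to the loader (the DB) rather than raising into the request handler.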
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Full backend migrated: 8 engines, 7 API routers, 6-tier memory model,
behavior system, mood dashboard, God Mode, intro onboarding
- APScheduler replaced with Celery Beat — 7 simulation tasks now run in
dedicated worker containers, isolated from the API process
- 25 character YAML seed files with canonical schema (physiology, memory
tiers, psychology, relationships)
- Cinematic Next.js landing page with Tailwind CSS (dark theme, Cormorant
Garamond, 5 sections, character dossier cards)
- Observability stack: Prometheus, Grafana, Loki, Promtail, Jaeger, Flower
added to docker-compose with full provisioned configs
- requirements.txt updated with all deps (asyncpg, httpx, pydantic-settings,
OpenTelemetry, pyyaml, etc.)
- .env cleaned up with all required vars (DATABASE_URL, OLLAMA_URL,
CREATOR_PASSCODE, OTEL endpoint, SEED_DIR)
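The Celery Beat migration above could take a shape like the following configuration sketch. Everything here beyond the Celery API itself is an assumption: the commit only states that 7 simulation tasks moved to worker containers, so the broker URL, module path, task names, and intervals are invented for illustration.

```python
from celery import Celery
from celery.schedules import crontab

# Hypothetical app name and broker URL; match the real compose setup.
app = Celery("sim", broker="redis://redis:6379/0")

app.conf.beat_schedule = {
    # Illustrative entries only; the real app registers 7 simulation tasks.
    "physiology-tick": {
        "task": "sim.tasks.physiology_tick",
        "schedule": 60.0,               # seconds
    },
    "memory-consolidation": {
        "task": "sim.tasks.consolidate_memories",
        "schedule": crontab(minute=0),  # top of every hour
    },
    # ... remaining simulation tasks follow the same pattern
}
```

Beat then runs in its own container (celery -A sim beat) next to one or more worker containers (celery -A sim worker), which is what keeps scheduled work isolated from the API process.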
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>