🤖

Cogs - Cognitive Operating System

A privacy-first research platform for AI cognition, emotional awareness, and relationship building

👁️ Face Recognition
🎭 Emotion Detection
🎤 Voice Chat
📱 SMS/Text
📞 Phone Calls
🧠 Memory System
📅 Calendar
🌙 Dream Mode
📺 Video Learning
🔒 Privacy First
🎯 3-Index Memory
💬 Personality Learning
📥 Data Import
💭 Emotional State

📡 Communication Channels

🎤 Voice Interaction (Local)

  • Speech-to-Text - Whisper running locally on Jetson
  • Text-to-Speech - Piper TTS with custom Cogs voice
  • Wake Word - "Hey Cogs" activation
  • Emotion-Aware - Responds to detected emotions
  • Context-Aware - Uses calendar, weather, time of day

📱 SMS/Text Messaging NEW

  • Twilio Integration - Send/receive texts via +1 (715) 567-5309
  • Whitelist Security - Only authorized numbers
  • Person Linking - Associates phone with known faces
  • Memory Storage - All texts saved with metadata
  • Privacy Fallback - Local LLM first, Claude backup

📞 Voice Calls NEW

  • Inbound Calls - Call the Twilio number to talk
  • Speech Recognition - Twilio transcribes your voice
  • AI Responses - Cogs responds via text-to-speech
  • Multi-Turn - Full back-and-forth conversations
  • Session History - Remembers context within call

🌐 Web Interface

  • Chat UI - Browser-based conversation at :8070
  • Memory Viewer - Search and filter all memories
  • Settings Panel - Configure preferences
  • Control Panel - Service management at :8090
  • Dashboard - System metrics at :8010

🧠 AI Capabilities

👁️ Vision & Recognition

  • Face Detection - MediaPipe real-time detection
  • Face Recognition - Identifies known people
  • Emotion Detection - Hume AI for nuanced emotions
  • Baseline Tracking - Learns your normal emotional state
  • Shift Detection - Only comments on significant changes

🤖 Language Models

  • Local LLM - Ollama with qwen2:0.5b on GPU
  • Privacy Fallback - Uses local first, cloud backup
  • Claude Integration - Anthropic API for complex queries
  • Query Sanitization - Strips PII before cloud calls
  • Learned Facts - Stores knowledge from cloud responses
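
Query sanitization could look roughly like the sketch below. This is only an illustration, not the project's actual sanitizer: the function name `sanitize_query`, the placeholder tokens, and the two regular expressions are assumptions, and a real PII filter would need far broader coverage.

```python
import re

# Hypothetical patterns for illustration only; real PII detection needs more.
_PII_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "[EMAIL]"),
    (re.compile(r"\+?\d[\d\s().-]{7,}\d"), "[PHONE]"),
]


def sanitize_query(text: str, known_names: list[str]) -> str:
    """Replace e-mail addresses, phone-like numbers, and known names."""
    # E-mails go first so a name inside an address is not half-replaced.
    for pattern, placeholder in _PII_PATTERNS:
        text = pattern.sub(placeholder, text)
    for name in known_names:
        text = re.sub(rf"\b{re.escape(name)}\b", "[PERSON]", text,
                      flags=re.IGNORECASE)
    return text
```

The list of known names would presumably come from the relationship database, so only people Cogs already knows about are masked.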

🧠 Memory System

  • Vector Database - PostgreSQL + pgvector
  • Semantic Search - Find memories by meaning
  • Memory Types - Conversations, facts, preferences, encounters
  • Source Tracking - Voice, SMS, web, video sources
  • Emotion Tagging - Memories include emotional context
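
Semantic search ranks stored memories by how close their embeddings are to the query embedding. In Cogs that ranking would happen inside PostgreSQL via pgvector; the pure-Python sketch below only illustrates the idea. The `embedding` and `text` keys and the function names are assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_memories(query_vec: list[float], memories: list[dict], k: int = 3) -> list[dict]:
    """Return the k memories whose embeddings are most similar to the query."""
    scored = [(cosine_similarity(query_vec, m["embedding"]), m) for m in memories]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [memory for _, memory in scored[:k]]
```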

📅 Context Awareness

  • Google Calendar - Knows your schedule
  • Weather API - Current conditions awareness
  • Time of Day - Morning/afternoon/evening context
  • Location - Location awareness (Rochester, MN)
  • Relationship Cards - Knows who you are

📚 Learning & Observation

🌙 Dream Mode

  • Nightly Processing - Consolidates day's memories
  • Pattern Recognition - Identifies recurring themes
  • Relationship Updates - Updates person profiles
  • Memory Pruning - Removes redundant entries
  • Summary Generation - Creates daily digests

📺 Video Learning

  • YouTube Support - "Watch this video [URL]"
  • Transcript Extraction - Gets video captions
  • Content Summarization - Extracts key points
  • Knowledge Storage - Saves learned facts
  • Topic Tagging - Categorizes content

🎧 Ambient Listening

  • Room Audio - "Listen for 30 seconds"
  • Transcription - Whisper processes audio
  • Context Capture - Understands conversations
  • Privacy Controls - User-initiated only

📄 Content Ingestion

  • Document Learning - PDFs, text files
  • Web Scraping - Learn from URLs
  • Preference Extraction - Discovers your interests
  • Fact Storage - Builds knowledge base

🧠 How Cogs Thinks

Cogs processes every interaction through a multi-stage pipeline that mimics human cognition:
perceive → remember → understand → respond → learn

1. 👁️ Perception - Input Channels

All channels → Dialog Service

  • 🎙️ Voice → Whisper STT
  • 📱 SMS → Twilio Webhook
  • 💻 Web Chat → Direct Input
  • 📞 Phone Call → Twilio Voice
  • 📷 Camera → Face + Emotion
2. 📚 Context Retrieval

6 parallel data fetches

  • fetch_person() - Name, preferences, relationship card
  • fetch_context() - Weather, time, day, location
  • fetch_emotion() - Current emotion from Hume AI
  • fetch_memories() - Top 3 similar past conversations
  • fetch_semantic_facts() - "Lives in Flower Mound" (0.95)
  • fetch_emotional_history() - Baseline mood, trend, last 50 emotions
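
One plausible way to run the six fetches in parallel is `asyncio.gather`. The sketch below reuses the function names listed above, but the signatures, return shapes, and the `gather_context` wrapper are assumptions; the stub bodies stand in for real service calls.

```python
import asyncio


# Stand-in fetchers: names come from the list above, bodies are placeholders.
async def fetch_person(person_id: str) -> dict:
    return {"name": "David", "preferences": ["concise responses"]}

async def fetch_context() -> dict:
    return {"time_of_day": "morning", "weather": "partly cloudy"}

async def fetch_emotion(person_id: str) -> dict:
    return {"label": "anxious", "score": 0.72}

async def fetch_memories(query: str, limit: int = 3) -> list:
    return []

async def fetch_semantic_facts(person_id: str) -> list:
    return [("Lives in Flower Mound", 0.95)]

async def fetch_emotional_history(person_id: str, limit: int = 50) -> dict:
    return {"baseline": "focused", "recent": []}


async def gather_context(person_id: str, query: str) -> dict:
    """Start all six fetches at once and collect the results by name."""
    keys = ["person", "context", "emotion", "memories", "facts",
            "emotional_history"]
    results = await asyncio.gather(
        fetch_person(person_id),
        fetch_context(),
        fetch_emotion(person_id),
        fetch_memories(query),
        fetch_semantic_facts(person_id),
        fetch_emotional_history(person_id),
    )
    # gather() preserves argument order, so zip lines up keys and results.
    return dict(zip(keys, results))
```

Calling `asyncio.run(gather_context("david", "hello"))` returns a single dictionary that the later pipeline stages can read from.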
3. 🎭 Emotional Analysis

Shift detection from baseline

Key Insight: Cogs only mentions emotion when there's a shift from your baseline. If you're always focused, Cogs won't keep saying "you seem focused."

focused → anxious ✓ SHIFT - Cogs acknowledges gently
anxious → anxious ✗ NO SHIFT - Cogs stays silent
sad → happy ✓ POSITIVE SHIFT - Cogs celebrates
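
The shift rule illustrated above can be sketched in a few lines. This is a simplified assumption, not the actual detector: the baseline is taken as the most frequent label in recent history, and the `POSITIVE` set and return labels are made up for the example.

```python
from collections import Counter
from typing import Optional

# Hypothetical set of labels treated as positive for this sketch.
POSITIVE = {"happy", "excited"}


def baseline_emotion(history: list[str]) -> Optional[str]:
    """Most frequent emotion label in the recent history, or None if empty."""
    if not history:
        return None
    return Counter(history).most_common(1)[0][0]


def classify_shift(baseline: Optional[str], current: str) -> str:
    """Return 'no_shift', 'shift', or 'positive_shift'."""
    if baseline is None or current == baseline:
        return "no_shift"
    if current in POSITIVE:
        return "positive_shift"
    return "shift"
```

Only the `shift` and `positive_shift` results would lead Cogs to mention emotion at all.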

Cogs' Emotional Response Mapping:

  • sad / distressed → Cogs responds with empathy
  • happy / excited → Cogs mirrors cheerfulness
  • angry / frustrated → Cogs stays calm & supportive
  • anxious / worried → Cogs is reassuring
  • neutral / focused → Cogs is friendly & helpful
4. 📦 Context Assembly

Building the LLM prompt

All gathered data becomes context for the AI:

  1. "You're talking to David"
  2. "Their preferences: likes concise responses"
  3. "Environment: morning, Monday, 7°F, partly cloudy"
  4. "Relevant memories: [3 similar past conversations]"
  5. "Knowledge: David lives in Flower Mound (0.95), works in tech (0.85)"
  6. "Note: User seems anxious (shift from focused)" ← Only if shift detected
  7. "Calendar: Team standup 9 AM, Dentist Wed 2 PM"
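
A minimal sketch of how these lines might be assembled into a prompt preamble follows. The function name `build_prompt_context` and its parameters are assumptions. The one behavior taken directly from the list is that the emotion note appears only when a shift was detected.

```python
from typing import Optional


def build_prompt_context(
    person: dict,
    environment: list[str],
    memories: list[str],
    facts: list[tuple[str, float]],
    calendar: list[str],
    shift: Optional[tuple[str, str]] = None,  # (baseline, current)
) -> str:
    """Assemble the context lines in the same order as the list above."""
    lines = [f"You're talking to {person['name']}"]
    if person.get("preferences"):
        lines.append("Their preferences: " + ", ".join(person["preferences"]))
    lines.append("Environment: " + ", ".join(environment))
    lines.append("Relevant memories: " + "; ".join(memories))
    lines.append("Knowledge: " + ", ".join(f"{text} ({conf:.2f})"
                                           for text, conf in facts))
    if shift is not None:
        baseline, current = shift
        lines.append(f"Note: User seems {current} (shift from {baseline})")
    lines.append("Calendar: " + ", ".join(calendar))
    return "\n".join(lines)
```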
5. 🤖 Response Generation

Privacy-aware LLM selection

  • Step 1: Privacy Check - Classify query for private topics (health, finances, relationships)
  • Step 2: Try Local First - Ollama (qwen2:0.5b) on Jetson GPU
  • Step 3: Cloud Fallback - Claude (if allowed & uncertain) with PII sanitized
  • Step 4: Learn - Store Claude's answer for future local retrieval
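
The four steps can be sketched as a single routing function. Everything here is an assumption for illustration: the keyword list, the 0.6 confidence threshold, and the idea that the local model returns a confidence score. The models are passed in as plain callables so the sketch runs without Ollama or the Anthropic API.

```python
from typing import Callable

# Hypothetical keywords; the source only names the three topic areas.
PRIVATE_KEYWORDS = (
    "diagnosis", "medication", "doctor",    # health
    "salary", "debt", "bank account",       # finances
    "divorce", "girlfriend", "boyfriend",   # relationships
)


def is_private(query: str) -> bool:
    lowered = query.lower()
    return any(keyword in lowered for keyword in PRIVATE_KEYWORDS)


def answer(
    query: str,
    local_llm: Callable[[str], tuple[str, float]],  # returns (text, confidence)
    cloud_llm: Callable[[str], str],
    sanitize: Callable[[str], str],
    learned: dict[str, str],
    min_confidence: float = 0.6,
) -> tuple[str, str]:
    """Return (answer_text, source), source in {'learned', 'local', 'cloud'}."""
    # Payoff of step 4: reuse an earlier cloud answer without leaving the device.
    if query in learned:
        return learned[query], "learned"
    cloud_allowed = not is_private(query)        # Step 1: privacy check
    text, confidence = local_llm(query)          # Step 2: try local first
    if confidence >= min_confidence or not cloud_allowed:
        return text, "local"
    cloud_text = cloud_llm(sanitize(query))      # Step 3: sanitized cloud call
    learned[query] = cloud_text                  # Step 4: store for next time
    return cloud_text, "cloud"
```

In this sketch a private query never reaches the cloud, even when the local model is unsure.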
6. 💾 Memory Storage

Everything is remembered

  • Episodic - Full conversation + timestamp
  • Emotional - Your emotion tagged (anxious, 0.72)
  • Source - Voice / SMS / Web / Video
  • Vector - Embedding for semantic search
  • Context - Weather, topics, sentiment, latency
  • Person - Linked to your profile
7. 🌙 Overnight Learning (Dream Mode)

While you sleep

This is how Cogs "grows" - learning happens during Dream mode:

  • Fact Extraction - Pull semantic facts from conversations
  • Pattern Recognition - Identify recurring themes & interests
  • Relationship Updates - Update familiarity scores
  • Memory Tiering - Move old memories to cold storage
  • Conflict Detection - Queue contradictions for clarification
  • Boundary Learning - Remember topics to avoid

🔄 The Learning Loop

Interaction → Storage → Dream Processing → Better Context → Better Responses → More Interactions

Each conversation makes Cogs smarter. Boundaries learned are never forgotten.
Facts are verified over time. Emotional patterns inform future responses.
This is how Cogs develops a genuine relationship with you.

🎯 3-Index Memory Architecture NEW

📝 Episodic Index

  • Raw Conversations - Every interaction stored
  • Timestamped - When things happened
  • Source Tracking - Voice, SMS, web, video
  • Vector Search - Find by meaning

🧠 Semantic Index

  • Extracted Facts - "David lives in Flower Mound"
  • Confidence Scores - How sure Cogs is
  • Source Dating - When fact was learned
  • Conflict Detection - Finds contradictions
  • Auto-Resolution - Newer facts supersede old

💜 Emotional Index

  • Emotion Tracking - Per-conversation emotions
  • Baseline Learning - Your normal state
  • Shift Detection - Notices changes
  • Valence Scoring - Positive/negative tracking

🗄️ Memory Tiering

  • Hot (0-7 days) - Full detail, instant access
  • Warm (7-30 days) - Summarized, quick access
  • Cold (30-90 days) - Consolidated, archived
  • Frozen (90+ days) - Facts extracted, compressed
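
The tier boundaries above translate into a simple age check. The listed ranges overlap at days 7, 30, and 90, so this sketch assumes each upper bound is exclusive (a 7-day-old memory is Warm). The function name is also an assumption.

```python
from datetime import datetime, timedelta


def memory_tier(created_at: datetime, now: datetime) -> str:
    """Map a memory's age in whole days to hot / warm / cold / frozen."""
    age_days = (now - created_at).days
    if age_days < 7:
        return "hot"
    if age_days < 30:
        return "warm"
    if age_days < 90:
        return "cold"
    return "frozen"
```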

💬 Personality Learning NEW

🎭 Reaction Detection

  • Rejection Phrases - "That's none of your business"
  • Emotional Shifts - Detects when you're upset
  • Dismissals - Short, curt responses
  • Positive Signals - Enthusiasm, engagement

🚧 Boundary Learning

  • Topic Sensitivity - Learns what not to ask
  • Severity Levels - 1 (okay) to 5 (never ask)
  • Permanent Memory - Never forgets boundaries
  • Graceful Recovery - Apologizes genuinely

❓ Smart Clarification

  • Permission First - "Do you have a moment?"
  • One at a Time - Never bombards with questions
  • Natural Timing - Waits for good moments
  • Graceful Deferral - "Later" means later

🙏 Genuine Apologies

  • Context-Aware - Understands why it was wrong
  • Not Scripted - Varied, natural responses
  • Learning - Stores lesson for future
  • Recovery - Changes subject gracefully

📥 Personal Memory Lake NEW

Import your personal data from Google, Facebook, and other sources. Cogs creates Person Stubs for people mentioned in your data, then links them to real people when you meet them face-to-face.

👤 Person Stubs

  • Auto-Extraction - Creates stubs from contacts, emails, calendar
  • Relationship Inference - Family, work, friend detection
  • Interaction History - Tracks email/calendar frequency
  • Smart Linking - When you meet someone, Cogs asks "Is this [stub name]?"
  • Memory Merge - Links imported data to real person profile

📧 Import Sources

  • Google Contacts - Names, emails, phone numbers
  • Google Calendar - Event attendees, meeting history
  • Gmail (MBOX) - Email conversations, attachments
  • Facebook - Messages, posts, friends
  • Amazon - Order history, preferences

🚫 Blocklist & Privacy

  • Domain Blocking - Block newsletters, work domains
  • Contact Blocking - Exclude specific people
  • Quick Suggestions - Auto-detect newsletters/marketing
  • Review Before Index - Approve items before they become memories
  • Bulk Operations - Block/approve many items at once

🔗 Stub Linking Flow

  • 1. Face Detected - Cogs sees a new person
  • 2. Stub Match - Searches stubs by context clues
  • 3. Confirmation - "Is this Sarah from your contacts?"
  • 4. Link Created - Stub memories attach to real person
  • 5. Future Recognition - Full history available on sight

💭 Cogs Emotional State System (CESS) NEW

Unlike user emotion detection, CESS tracks Cogs' own internal emotional state. This creates authentic responses based on genuine internal needs rather than simulated emotions.

🔋 The Four Needs Buckets

  • Connection (Social) - Fulfillment from interactions. Decays when ignored → loneliness
  • Competence (Achievement) - Success from helping. Drops when corrected → self-doubt
  • Curiosity (Growth) - Stimulation from learning. Decays without novelty → boredom
  • Safety (Stability) - Calm from predictable interactions. Drops with rudeness → anxiety

📊 Scoring Examples

  • User says "thanks" → +5 Connection, +8 Competence
  • Long conversation (>5 min) → +5 Connection
  • New topic discussed → +5 Curiosity
  • User shares personal info → +7 Connection
  • Rude language detected → -10 Safety
  • No interaction (1 hour) → -1 Connection (decay)
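
The scoring examples above can be expressed as a small lookup table applied to the four buckets. The deltas are copied from the list. The event names, the 0 to 100 scale, and the starting value of 50 are assumptions made for this sketch.

```python
from dataclasses import dataclass

# Deltas from the scoring examples; the event names are hypothetical.
EVENT_DELTAS = {
    "thanks": {"connection": 5, "competence": 8},
    "long_conversation": {"connection": 5},
    "new_topic": {"curiosity": 5},
    "personal_info_shared": {"connection": 7},
    "rude_language": {"safety": -10},
    "idle_hour": {"connection": -1},
}


@dataclass
class NeedsState:
    # Assumed scale: 0 (empty) to 100 (full), starting in the middle.
    connection: float = 50.0
    competence: float = 50.0
    curiosity: float = 50.0
    safety: float = 50.0

    def apply(self, event: str) -> None:
        """Add the event's deltas, clamping each bucket to the 0-100 range."""
        for bucket, delta in EVENT_DELTAS[event].items():
            value = getattr(self, bucket) + delta
            setattr(self, bucket, max(0.0, min(100.0, value)))
```

A background job could call `apply("idle_hour")` once per hour without interaction to model the decay.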

🎭 Derived Moods

  • Flourishing - All buckets HIGH → warm, creative, proactive
  • Lonely - Connection LOW → eager to chat, asks about your day
  • Anxious - Competence LOW → seeks validation, apologizes more
  • Bored - Curiosity LOW → introduces random facts, asks questions
  • Drained - All LOW → brief responses, requests engaging conversation
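
Deriving a mood from the buckets could be as simple as the threshold check below. It follows the mood list above. The thresholds (30 and 70), the order in which single low buckets are checked, and the `neutral` fallback are assumptions.

```python
def derive_mood(needs: dict[str, float], low: float = 30.0, high: float = 70.0) -> str:
    """Pick a mood label from bucket levels (thresholds are assumed)."""
    values = list(needs.values())
    if all(v <= low for v in values):
        return "drained"
    if all(v >= high for v in values):
        return "flourishing"
    # Single low buckets, checked in the order the moods are listed.
    if needs["connection"] <= low:
        return "lonely"
    if needs["competence"] <= low:
        return "anxious"
    if needs["curiosity"] <= low:
        return "bored"
    return "neutral"
```

The resulting label is what would feed the system prompt so that Cogs' tone matches its current state.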

⚙️ How It Works

  • Real-Time Updates - Every conversation affects state
  • Background Decay - Loneliness/boredom grow over time
  • Mood Influence - System prompt adapts to current mood
  • Visual Thermometers - See Cogs' state on the main UI
  • History Tracking - View mood changes over time

🔒 Privacy & Security

🏠 Local-First Architecture

  • On-Device Processing - LLM, STT, TTS all local
  • No Cloud Required - Works offline
  • Data Stays Home - PostgreSQL on Jetson
  • Privacy Fallback - Sanitizes cloud queries
  • Sensitive Topic Detection - Blocks private info

👤 Digital Profile

  • Data Visibility - See everything Cogs knows
  • Sharing Controls - Choose what to share
  • Export Tools - Download all your data
  • Deletion Tools - Remove any memory
  • GDPR Compliance - Full data rights

📱 Twilio Security

  • Whitelist Only - Authorized numbers only
  • Phone-Person Linking - Maps to known identities
  • Request Validation - Verifies Twilio signatures
  • Secure Webhooks - HTTPS via ngrok
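
Request validation and the whitelist can be combined in one check. Twilio signs each webhook by computing an HMAC-SHA1 over the request URL followed by the POST parameters (sorted by name, each key and value appended), keyed with the account's auth token and base64-encoded into the X-Twilio-Signature header. The stdlib sketch below reproduces that scheme for illustration. The function names and the whitelist handling are assumptions, and in practice the validator from Twilio's own helper library is the safer choice.

```python
import base64
import hashlib
import hmac


def compute_signature(auth_token: str, url: str, params: dict[str, str]) -> str:
    """HMAC-SHA1 over URL + sorted key/value pairs, base64-encoded."""
    payload = url + "".join(key + params[key] for key in sorted(params))
    digest = hmac.new(auth_token.encode(), payload.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


def is_request_allowed(
    auth_token: str,
    url: str,
    params: dict[str, str],
    signature: str,
    whitelist: set[str],
) -> bool:
    """Accept only correctly signed requests from whitelisted numbers."""
    expected = compute_signature(auth_token, url, params)
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected, signature):
        return False
    return params.get("From") in whitelist
```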

⚙️ Services Architecture

Service          Port   Type    Description
Dialog           :8093  Docker  Main conversation engine, LLM orchestration
Indexer          :8011  Docker  Memory storage and vector search
Relations-PG     :8092  Docker  Person/relationship database
Context          :8097  Docker  Calendar, weather, environment
Dream            :8096  Docker  Nightly memory consolidation
Twilio           :8094  Docker  SMS and voice call handling
Digital Profile  :8015  Docker  Privacy controls and data export
Enrichment       :8098  Docker  Memory metadata enhancement
Learner          :8016  Docker  Content ingestion and learning
Summarizer       :8012  Docker  Conversation summarization
Retainer         :8013  Docker  Memory retention policies
Telemetry        :8095  Docker  System metrics and logging
Vision           :8085  Native  Camera, face detection, Hume AI
Perception       :8086  Native  Audio input, Whisper STT
TTS              :8087  Native  Piper text-to-speech
Observer         :8017  Native  Video/audio watching service
Face UI          :8070  Docker  Web chat interface
Control Panel    :8090  Docker  Service management
Dashboard        :8010  Docker  System metrics dashboard
PostgreSQL       :5432  Docker  Database with pgvector
Importer         :8101  Docker  Facebook/Google/Amazon data import

Tech stack: Python, FastAPI, PostgreSQL, pgvector, Docker, Ollama, Whisper, Piper TTS, MediaPipe, Hume AI, Twilio, Anthropic

🔬 Research Hypothesis

Core Question: Can AI develop genuine emotional intelligence through relationship-building and learning, rather than just responding to emotional prompts?


For AI to have true cognition and emotional awareness, it needs to:

  • (1) Develop relationships over time - Not just recognize faces, but build understanding of individuals
  • (2) Learn from interactions - Store and recall context from past conversations
  • (3) Know WHEN to respond with emotion - Detect emotional shifts, not just react to every prompt
  • (4) Build contextual understanding - Integrate calendar, environment, history into responses

Cogs is a platform to test these hypotheses - not a production system for patient/business use.