Async state persistence eliminates blocking I/O on every registry write. WebSocket broadcasts now send delta patches instead of full state — massive bandwidth reduction for large teams. Registry I/O debounced, stream broadcasts throttled, stdout buffers capped. Race condition in async timeline writes fixed. Stream throttle properly cleaned up on agent exit.
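To illustrate the idea of sending delta patches instead of full state, here is a minimal sketch of a shallow diff between two state snapshots. The patch shape (`set` / `remove`) and both function names are assumptions made for illustration; the note above does not describe GROOVE's actual wire format.

```python
_MISSING = object()  # sentinel so a stored None is not confused with "absent"


def make_delta(previous, current):
    """Shallow diff: keys whose values changed or were added, plus removed keys."""
    changed = {k: v for k, v in current.items() if previous.get(k, _MISSING) != v}
    removed = [k for k in previous if k not in current]
    return {"set": changed, "remove": removed}


def apply_delta(state, delta):
    """Rebuild the new state on the receiving side from the old state and a delta."""
    merged = {**state, **delta["set"]}
    for key in delta["remove"]:
        merged.pop(key, None)
    return merged
```

Only the changed keys travel over the socket, so the payload grows with the size of the change rather than the size of the team.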
Desktop app crashed on Windows due to platform-specific path handling. Fixed path resolution across macOS, Windows, and Linux.
OAuth redirect URIs were malformed after the desktop app refactor. Fixed redirect handling and a regression in daemon-bridge argument passing.
OAuth now opens in the system browser instead of Electron's embedded webview — fixes Google's embedded browser block. Google sessions persist across app restarts. GUI auto-refreshes when OAuth completes in the external browser. Electron memory pressure reduced.
Fixed electron-updater CJS/ESM import error. Moved electron-updater from devDependencies to dependencies (was missing at runtime). Removed stale Three.js references from Vite config that broke GUI builds. Added test:local script for fast local DMG builds without codesigning.
GROOVE is now a native desktop app. Electron wrapper with single-instance lock, proper error handling, macOS traffic light positioning. Two editions: Community (free, full orchestration) and Pro (paid, advanced features). Runtime edition gating with server-side enforcement. Subscription system with auth flow.
Desktop app opens to a home screen showing your recent projects. Click to open any project directly. No more navigating to directories manually.
Electron app supports multiple project workspaces. Switch between projects without restarting. Each workspace gets its own daemon state. Onboarding includes a folder selection step.
Full daemon-to-daemon federation: whitelist-based pairing, knock protocol for connection establishment, persistent WebSocket connections between daemons, ambassador agents that represent remote teams. QC review for federated operations.
New Toys feature: an interactive API playground with a curated catalog of fun and useful API experiments. Test endpoints, explore responses, learn the API hands-on.
Import GitHub repos directly from the GUI via POST /api/repos/import. Specify a repo URL and target path — GROOVE clones it and optionally creates a team. Path resolution and nuke flow improvements included.
Agents can now have configurable personalities that influence their chat style and role behavior. Chat role support for more natural agent interactions. Dashboard refinements included.
Create custom files that get injected into agent context. Define project-specific instructions, coding standards, or architectural constraints that every agent receives.
Approved actions now retry automatically (agents don't need a manual restart). New approval modal in the GUI with clear approve/reject actions. Better approval messaging shows what the agent wants to do and why.
Logo moved to sidebar. Project badge is now clickable. macOS traffic light buttons properly positioned. Activity bar icons resized to 18px. Command palette placeholder text refined.
Discoveries and handoffs were firing even when agents had no task assigned, creating noise. Now gated on hasTask. Constraint auto-extraction disabled (was too aggressive). Discovery relatedness validation added.
XSS vulnerability fixed in agent output rendering. Provider detection improved. Onboarding wizard bugs resolved. Agent tree rendering improvements. Edition system hardening with proper auth callback handling.
The runaway_velocity safety trigger added in v0.27.0 produced false positives on legitimate heavy exploration. A planner reading a real codebase could burn 2M tokens in 5 minutes doing its job — not a runaway, just normal work. v0.27.1 fixed it for planners via role multipliers; v0.27.2 removes the trigger entirely rather than keep papering over the root issue. Velocity alone is a bad stuck-loop signal. If real usage shows a need for earlier warning, it'll come back gated on quality signals (repetitions, errors, file churn) — not velocity alone.
Per-instance ceiling (default 5M tokens) with role multipliers — planner 10× (50M), analyst 5× (25M), fullstack/security 4× (20M), everyone else 1× (5M). Catches genuinely runaway agents without tripping on heavy legitimate work. Override per-role via safety.roleMultipliers.
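The ceiling arithmetic above can be sketched as follows. The multipliers and the 5M default come from the note; the function names and the `overrides` parameter (standing in for safety.roleMultipliers) are hypothetical.

```python
DEFAULT_CEILING = 5_000_000  # default per-instance ceiling from the note above

# Multipliers as listed in the note; roles not listed fall back to 1x.
ROLE_MULTIPLIERS = {"planner": 10, "analyst": 5, "fullstack": 4, "security": 4}


def token_ceiling(role, base=DEFAULT_CEILING, overrides=None):
    """Return the effective token ceiling for a role (overrides is hypothetical)."""
    multipliers = {**ROLE_MULTIPLIERS, **(overrides or {})}
    return base * multipliers.get(role, 1)


def exceeds_ceiling(role, tokens_used, **kwargs):
    """True when an instance has used more tokens than its role allows."""
    return tokens_used > token_ceiling(role, **kwargs)
```

For example, `token_ceiling("planner")` gives 50,000,000 and an unlisted role such as `"backend"` gives 5,000,000.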
safety.velocityWindowSeconds and safety.velocityTokenThreshold no longer have any effect. Remove them from .groove/config.json.
Handoff briefs instruct agents to continue the conversation naturally (no "resuming" announcements). User's last 5 messages carry forward so new instances know what was being asked. Chat history migrates unconditionally on rotation (not just when the old agent's panel was open). Rotation toasts silenced — sessions feel truly seamless.
New MemoryStore module persists what agents learn across sessions, rotations, and restarts. Four file types in .groove/memory/: project-constraints.md (discovered project rules, capped at 50), handoff-chain/<role>.md (last 10 rotation briefs per role, newest first), agent-discoveries.jsonl (success-only error→fix pairs, deduped, 1000 cap), and agent-specializations.json (per-agent + per-role quality profiles). The Introducer injects accumulated memory into every new agent spawn — agent #50 knows what agent #1 learned. 9 new API endpoints at /api/memory/*.
Two new rotation reasons catch pathological agent behavior. token_limit_exceeded fires when an agent uses more than safety.tokenCeilingPerAgent (default 5M) in a single instance. runaway_velocity fires when token burn exceeds safety.velocityTokenThreshold (default 1.5M) inside a rolling safety.velocityWindowSeconds window (default 300s). Both trigger automatic kill + respawn with a journalist handoff brief — no user intervention. Triggers are scoped to the agent's spawn, so the new instance starts fresh and won't re-fire on inherited cumulative tokens. Safety bypasses the 5-min cooldown that regular rotations respect.
The knock protocol is no longer advisory. Agents now POST /api/coordination/declare with {agentId, operation, resources[]} to acquire an exclusive lock on shared resources (e.g., package.json, node_modules). Conflicts return 423 Locked with the owner's identity, operation name, and expiry time. POST /api/coordination/complete releases the lock. Operations auto-expire after 10 minutes to prevent deadlock if an agent crashes mid-work.
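A minimal in-memory sketch of the locking behavior described above, assuming a declare call either acquires every requested resource or none of them. The class, the conflict payload keys, and the release-by-agent behavior of `complete` are illustrative assumptions; only the 423 status and the 10-minute expiry come from the note.

```python
import time

LOCK_TTL_SECONDS = 10 * 60  # operations auto-expire after 10 minutes


class CoordinationLocks:
    """Exclusive resource locks with expiry (illustrative, not GROOVE's code)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._locks = {}  # resource -> (agent_id, operation, expires_at)

    def declare(self, agent_id, operation, resources):
        """Return (200, None) on success or (423, conflict_info) on conflict."""
        now = self._clock()
        # Check every resource first so a conflict leaves nothing half-acquired.
        for res in resources:
            held = self._locks.get(res)
            if held and held[2] > now and held[0] != agent_id:
                owner, op, expires_at = held
                return 423, {"owner": owner, "operation": op, "expiresAt": expires_at}
        for res in resources:
            self._locks[res] = (agent_id, operation, now + LOCK_TTL_SECONDS)
        return 200, None

    def complete(self, agent_id):
        """Release every lock held by the agent."""
        self._locks = {r: h for r, h in self._locks.items() if h[0] != agent_id}
```

Expired entries are simply treated as free on the next declare, which is what prevents a crashed agent from holding a resource indefinitely.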
Previously invisible coordination overhead now tracked. Headless claude -p calls from the Journalist, Project Manager, task negotiator, planner, gateway Q&A, and agent Q&A are recorded under reserved agent IDs (__journalist__, __pm__, __negotiator__, __planner__, __gateway__, __agent_qa__). Exposed as internalOverhead in /api/tokens/summary, visible on the dashboard Intel panel's new GROOVE Overhead section with per-component breakdown.
New GET /api/tokens/by-team returns ranked team burn (teamId, teamName, agentCount, totalTokens, totalCostUsd, avgTokensPerAgent). Dashboard now has a Team Burn panel showing per-team bars — answers "which team used the most tokens?" at a glance.
New GET /api/agents/:id/routing/suggestion returns a lighter-model recommendation when the classifier has 40+ events and the agent's current tier exceeds the detected task complexity. Only suggests downshift — never silent escalation. User must accept via UI click; the heavy-defaults design principle is preserved. Returns 204 when no strong suggestion is available.
Classifier now emits classifier:update events every 20 events once it has 40+ data points, so the GUI can surface real-time task-tier updates and downshift opportunities during a session (previously classification only ran at spawn).
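The emission cadence can be sketched as a simple predicate. Aligning updates to 40, 60, 80, and so on is an assumption; the note only states "every 20 events once it has 40+ data points". The function name is hypothetical.

```python
MIN_DATA_POINTS = 40  # classifier needs 40+ events before emitting
EMIT_EVERY = 20       # then emits once per 20 events


def should_emit_update(event_count):
    """True on the events where a classifier:update would be emitted."""
    if event_count < MIN_DATA_POINTS:
        return False
    return (event_count - MIN_DATA_POINTS) % EMIT_EVERY == 0
```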
Dashboard's Intel Panel gains a fourth tab alongside Health/Adaptive/Journalist: Memory. Hero stats show constraints / discoveries / role chains / specializations counts. Below: full constraint list, last 8 discoveries with trigger + fix + role + timestamp, active role-chain tags, and per-role quality profiles.
QUALITY_THRESHOLD raised 40 → 55 (was hair-trigger on early sessions). MIN_EVENTS raised 10 → 30 (need ~100 turns of stable signal before scoring). New 5-minute cooldown prevents back-to-back churn on persistently low-quality tasks. Converged adaptive profiles get a deeper quality floor (threshold − 15): if the threshold has settled, GROOVE trusts it.
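One way to read the rules above as a single decision function is sketched below. It assumes that rotation fires when the quality score drops below the floor, and that "deeper floor" means converged profiles only rotate below threshold − 15. The function and parameter names are hypothetical.

```python
QUALITY_THRESHOLD = 55
MIN_EVENTS = 30
COOLDOWN_SECONDS = 5 * 60
CONVERGED_FLOOR_OFFSET = 15


def should_rotate(score, event_count, seconds_since_last_rotation,
                  converged=False, threshold=QUALITY_THRESHOLD):
    """Decide whether a low quality score should trigger a rotation."""
    if event_count < MIN_EVENTS:
        return False  # not enough signal to score yet
    if seconds_since_last_rotation < COOLDOWN_SECONDS:
        return False  # still inside the 5-minute cooldown
    floor = threshold - CONVERGED_FLOOR_OFFSET if converged else threshold
    return score < floor
```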
Every rotation now captures preRotationVelocity at trigger time. After the new agent has 10 minutes of data, postRotationVelocity and velocityDelta are computed and stored in rotation history — foundation for replacing the hardcoded 30% savings assumption with real measurements over time.
The cache hit rate denominator was mathematically wrong: it included fresh input tokens that were never cache-eligible, artificially depressing the reported rate. The rate is now computed as reads / (reads + creation), which matches what the dashboard, API, and cache ring all report. Affects /api/tokens/summary.cacheHitRate and per-agent cache rates in /api/dashboard.
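A small worked example of the difference. The new formula is taken from the note; the exact shape of the old formula (reads divided by reads + creation + fresh input) is an assumption inferred from the description, and the function names are illustrative.

```python
def cache_hit_rate(cache_reads, cache_creation):
    """Current formula: reads / (reads + creation)."""
    eligible = cache_reads + cache_creation
    return cache_reads / eligible if eligible else 0.0


def old_cache_hit_rate(cache_reads, cache_creation, fresh_input):
    """Assumed old formula: fresh input tokens inflated the denominator."""
    total = cache_reads + cache_creation + fresh_input
    return cache_reads / total if total else 0.0
```

With 800 cache reads, 200 cache-creation tokens, and 1,000 fresh input tokens, the assumed old formula reports 0.4 while the current one reports 0.8.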
MCP integrations (GitHub, Stripe, Gmail, etc.) now work on every provider — not just Claude Code. The daemon acts as an MCP client, proxying JSON-RPC calls over stdio and exposing results as simple HTTP endpoints. Agents on Gemini, Codex, Ollama, or local models get playbook instructions injected at spawn time, teaching them to call integrations via POST /api/integrations/:id/exec. MCP servers are pooled (lazy start, 10-min idle timeout, crash recovery with 3 retries). Claude Code agents still use native .mcp.json — zero change for them.
New GET /api/integrations/:id/tools endpoint returns the full list of tools an integration provides, including names, descriptions, and input schemas. Agents can discover available capabilities at runtime.
High-impact tools now require human approval before executing. GitHub (create_issue, create_pull_request, delete_file), Stripe (create_charge, create_invoice), Gmail (send_email), Calendar (create_event), Linear, Notion, and Home Assistant all have gated tools. The exec endpoint returns 202 with an approvalId — approve in the GUI, then the agent retries. Read-only integrations have no gates.
Every integration exec call is logged to the append-only audit log with integration ID, tool name, parameters (truncated to 200 chars), and agent ID. Blocked and rate-limited calls get their own audit events (integration.exec.blocked, integration.exec.rate_limited).
Sliding-window rate limiter prevents runaway agents from flooding external APIs. Each integration gets a 30 calls/min budget shared across all agents. Exceeding the limit returns 429.
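A minimal sketch of a sliding-window limiter with the budget described above (30 calls per minute per integration, shared by all agents). The class and method names are hypothetical, and the injectable clock exists only to make the example testable.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Per-integration sliding-window call budget (illustrative sketch)."""

    def __init__(self, limit=30, window_seconds=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self._clock = clock
        self._calls = defaultdict(deque)  # integration id -> call timestamps

    def check(self, integration_id):
        """Return 200 and record the call, or 429 when the budget is spent."""
        now = self._clock()
        calls = self._calls[integration_id]
        # Drop timestamps that have slid out of the window.
        while calls and now - calls[0] >= self.window:
            calls.popleft()
        if len(calls) >= self.limit:
            return 429
        calls.append(now)
        return 200
```

Because the key is the integration ID rather than the agent ID, one runaway agent exhausts the budget for everyone, which matches the "shared across all agents" wording.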
A complete agent loop that runs inside the daemon, talks to any OpenAI-compatible API (Ollama, llama-server, vLLM), and plugs into all GROOVE orchestration. Local models get 7 tools (read, write, edit, run commands, search files, grep, list dirs), automatic context rotation, journalist synthesis, team coordination, and token tracking. Zero cloud tokens — fully offline capable.
New Models tab in the activity bar. The Recommended tab detects your hardware and shows models that fit your RAM, sorted by quality. One-click Pull downloads via Ollama (auto-starts the server if needed). Installed models show green badges. Search tab browses HuggingFace for GGUF files with color-coded RAM indicators (green/yellow/red) and per-quantization download buttons.
Send messages to running local agents and get responses in real-time. Pulsing dot thinking indicator while the agent works. Full multi-turn conversation with maintained history. Same chat UX as cloud agents.
When you save an OpenAI API key in Settings, GROOVE automatically runs codex login --with-api-key behind the scenes. No manual terminal commands needed.
The store only rendered chat bubbles for Claude Code's event format. Gemini, Codex, and Ollama agents consumed tokens but text was silently dropped. Fixed — all providers now display output correctly.
The GUI sends an empty string for the 'Auto' model selection, which was not triggering auto-routing. Fixed: a null or empty model now routes correctly for all providers.
Switching to a model from a different provider in the config panel only changed the model, not the provider. Fixed — provider added to registry safe fields and sent in the PATCH request.
The old Ollama CLI provider is hidden from the UI (kept for backward compat). The spawn wizard only shows models you actually have installed, not a catalog of everything that would fail. Custom role input auto-sanitizes spaces to dashes.
Each team now gets its own auto-managed directory. Create a team, it creates the folder. Rename it, the folder renames. Delete it, the folder and all contents are cleaned up. Agents automatically inherit their team's working directory. Zero cross-contamination between teams.
New Files tab in the agent panel lists all markdown files from the agent's working directory — CLAUDE.md, AGENTS_REGISTRY.md, GROOVE_PROJECT_MAP.md, and anything in .groove/. Click to open an inline editor, edit, and save. Full customization of agent context without leaving the GUI.
First-time users get a guided setup: system check (Node, npm, git), interactive provider selector with arrow keys and space to toggle, API key prompts with masked input, and Claude Code auth guidance. Version number displayed in the welcome banner.
When an agent spawns, the chat shows a staged startup animation — lines appear sequentially (role, provider, scanning, active) with a spinning loader. Replaces the blank chat body during initialization.
Agent messages use a clean 1px teal left bar instead of bubbles. Tool calls cycle through an animated loader bar instead of cluttering the chat. Text streams progressively by appending to the last message. User messages in blue-tinted bubbles for clear visual separation.
Command Center now shows live KPIs (tokens, cost, cache rate, efficiency), token burn chart from timeline snapshots, agent fleet panel, savings breakdown (rotation, conflict prevention, cold-start), adaptive thresholds, and activity feed. Previously showed empty modules.
Agents spawned via recommended team launch were missing teamId — filtered out of the team-scoped view. Fixed in launch endpoint, gateway launch, phase 2 auto-spawn, and validateAgentConfig (was stripping teamId).
GUI was reading agent.contextUsed instead of agent.contextUsage. Fixed in agent-node, agent-panel, agent-telemetry, and fleet-panel. Also added real-time contextUsage updates from agent:output events.
Claude Code output parser was dropping text content when usage events came in the same chunk. Fixed merge priority: content-bearing events (result > activity) now take precedence over usage/session events.
Sending a message to a running agent was doing kill+respawn (instruct). Now uses non-destructive query for running agents. Only completed agents get resumed via instruct.
Agent nodes flew in from origin (0,0) on every render. Fixed by preserving existing node positions on data updates — only new nodes get calculated positions.
Team tabs redesigned as flat IDE-style tabs with 0.5px accent top line, agent count badges, and double-click to rename any team including Default. Pencil icon on hover as secondary rename entry point.
CodeMirror editor now scrolls properly (height:100% + overflow:auto). File tree has right-click context menu for New File, New Folder, Rename, and Delete. Toolbar buttons for quick file/folder creation.
Gateways now use a team-first interaction model. /instruct, /query, and /log accept team names, and GROOVE routes your message to the team lead automatically using role priority (QC > fullstack > senior > PM). No need to remember agent IDs or names.
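The role-priority routing can be sketched as picking the lowest-ranked role in a fixed order. The lowercase role identifiers, the agent dict shape, and the fallback to the first agent when no prioritized role is present are all assumptions.

```python
# Priority order from the note: QC > fullstack > senior > PM.
ROLE_PRIORITY = ["qc", "fullstack", "senior", "pm"]


def pick_team_lead(agents):
    """agents: list of dicts with 'name' and 'role'. Returns the lead or None."""
    def rank(agent):
        role = agent["role"].lower()
        if role in ROLE_PRIORITY:
            return ROLE_PRIORITY.index(role)
        return len(ROLE_PRIORITY)  # unlisted roles sort after every listed one

    # min() keeps the first agent on ties, so list order breaks ties.
    return min(agents, key=rank) if agents else None
```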
New /plan command spawns a planner agent that reads your codebase, designs a full team (roles, scopes, models, prompts), and sends the plan back to your chat for approval. Approve to launch the entire team with phased execution and auto QC. All from a single Slack/Telegram/Discord message.
/instruct <team> sends to team lead, /query <team> synthesizes status via journalist, /log <team> shows team logs, /plan <description> starts a new project, /brief shows journalist summary, /tokens shows usage and savings breakdown.
All agent-targeting commands now accept team names, agent names, name prefixes, and roles. Ambiguous matches prompt you to clarify. Kill and rotate use agent names, talk-to commands use team names.
Docs button in the status bar links to docs.groovedev.ai. Every gateway setup guide sheet now links to the full documentation.
Manage your agents from anywhere. Native, first-class gateway integrations built into the daemon — not bolt-on MCP servers. Telegram uses zero external dependencies (raw Bot API via fetch). Discord uses discord.js with rich embeds and ActionRow buttons. Slack uses @slack/bolt in Socket Mode with Block Kit formatting. All connections are outbound-only — no firewall changes needed.
10 commands available from any gateway: /status, /agents, /spawn, /kill, /approve, /reject, /rotate, /teams, /schedules, /help. Commands call daemon methods directly (in-process, not HTTP). Inline approve/reject buttons on all three platforms for instant approval actions.
Gateways intercept daemon broadcasts and push notifications to your chat. Three presets: Critical (approvals + crashes), Lifecycle (all agent events), All (everything except raw output). 3-second event coalescing batches rapid events into single messages. Approval requests are never batched — each gets its own message with action buttons.
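The coalescing rule can be sketched offline over a list of timestamped events. This assumes a batch window opens at its first event and closes 3 seconds later; the event tuple shape and the `"approval"` kind label are hypothetical.

```python
COALESCE_SECONDS = 3.0


def coalesce(events):
    """Group events into outgoing messages.

    events: list of (timestamp, kind, text) tuples sorted by timestamp.
    Returns a list of messages, each a list of event texts, in send order.
    """
    messages = []
    batch, batch_start = [], None
    for ts, kind, text in events:
        if kind == "approval":
            messages.append([text])  # approvals are never batched
            continue
        if batch and ts - batch_start >= COALESCE_SECONDS:
            messages.append(batch)   # window elapsed: flush the pending batch
            batch = []
        if not batch:
            batch_start = ts
        batch.append(text)
    if batch:
        messages.append(batch)
    return messages
```

Note that an approval can be sent ahead of older events still waiting in an open batch, since it bypasses the window entirely.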
New Gateways section in Settings between Providers and Configuration. Add Telegram, Discord, or Slack with one click. Each gateway card shows connection status, bot token input (masked), notification preset selector (segmented control), command permission toggle (Full Access / Read Only), test button, and enable/disable toggle.
Each gateway can be set to Full Access (spawn, kill, approve, reject, rotate) or Read Only (status, agents, teams, schedules only). User allowlists restrict who can send commands — empty allowlist means open access for personal bots.
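A minimal sketch of the permission check described above. The command sets come from the note; the mode strings (`"full"`, `"read_only"`) and the function signature are assumptions.

```python
READ_ONLY_COMMANDS = {"status", "agents", "teams", "schedules"}
FULL_ACCESS_COMMANDS = READ_ONLY_COMMANDS | {
    "spawn", "kill", "approve", "reject", "rotate",
}


def is_allowed(command, user_id, mode, allowlist):
    """Check a gateway command against the allowlist and permission mode."""
    # An empty allowlist means open access (intended for personal bots).
    if allowlist and user_id not in allowlist:
        return False
    allowed = FULL_ACCESS_COMMANDS if mode == "full" else READ_ONLY_COMMANDS
    return command in allowed
```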
Full CRUD for gateways: GET/POST /api/gateways, GET/PATCH/DELETE /api/gateways/:id, plus /test, /connect, /disconnect, and credential management. Gateway status broadcasts to GUI via WebSocket.
New Settings view accessible from the activity bar. Provider cards show connection status (Ready/No key/Not installed), API key management with masked input and show/hide toggle, and model lists. Configuration cards for Default Provider, Working Directory (with folder browser), Rotation Threshold, QC Threshold, Max Agents, and Journalist Interval — all with segmented controls that auto-save.
Dedicated Ollama card with hardware detection (GPU/CPU, VRAM), model catalog with recommended models for your hardware, one-click pull with real-time progress streaming via WebSocket, and installed model management. Pull progress shows download percentage inline.
Sign in to the Skills Marketplace from the Settings header. Avatar, display name, and sign out button. Authentication state persisted across sessions. Marketplace auth wired to the skills API for ratings and future paid skills.
The entire GUI has been rebuilt with a VS Code-inspired layout. Tailwind CSS v4 replaces all inline styles. Radix UI provides accessible primitives. 48px activity bar with 5 navigation icons (Agents, Editor, Dashboard, Marketplace, Teams) plus 2 utility icons (Settings, Terminal). Resizable panels via drag handles. Breadcrumb bar with Cmd+K command palette and pill-style spawn button. Status bar with connection indicator and agent count.
Agent nodes slimmed to 200px with tighter padding. React Flow minimap added for large teams. Welcome onboarding state with illustration when no agents exist. Animated edges flow while agents are running. Node context menu on right-click.
Agent detail panel redesigned with bubble-style chat UI and mode pills (Instruct/Query/Continue). Stats tab uses canvas-rendered charts for performance. Controls include model hot-swap, rotate, and kill. Agent telemetry tab shows detailed session metrics.
Dashboard rebuilt with KPI strip featuring mini sparkline charts. Token/cost area chart with time axis. Fleet panel showing all agents with status. Savings breakdown with rotation, cold-start, and conflict prevention. Activity feed with real-time events.
Skills marketplace redesigned with card-based layout, category bar, search, star ratings, and integrations tab. MCP integration cards for Slack, Gmail, GitHub, Stripe, and more.
Rebuilt the entire GUI from scratch. Replaced the single-header tabbed layout with a VS Code-inspired architecture: 48px activity bar for navigation, resizable detail panel, bottom terminal panel, breadcrumb bar with Cmd+K command palette, and status bar. Tailwind CSS v4 with zero inline styles. Radix UI for accessible components. Zustand 5 for state management with WebSocket sync.
The GUI now has six distinct views: Agents (React Flow tree), Editor (CodeMirror 6 + file tree + terminal), Dashboard (KPI sparklines + charts), Marketplace (skills + integrations), Teams (management + approvals + schedules), and Settings (providers + config). Each view is a full page, not a tab.
New component library: Button, Badge, Input, Select, Dialog, Sheet, Tabs, Tooltip, DropdownMenu, ScrollArea, Card, Skeleton, StatusDot, Avatar, ContextMenu, Toast, Collapsible. All built on Radix UI with consistent styling.
Major backend cleanup: timeline tracker for historical metrics (30s intervals, max 2K snapshots), lifecycle events (spawn/complete/crash/kill/rotate, max 500), file watcher for editor integration, terminal PTY management, codebase indexer improvements. 59 GUI components, 50+ API endpoints.
Added hydration gate — UI waits for first WebSocket state message before rendering agent tree. Prevents flash of empty state on page load.
Complete redesign of agent spawning. Full-screen overlay with two-panel layout: configuration on the left (role, directory, permissions, effort, integrations, skills, schedule), and a full-height prompt editor or AI planning chat on the right. Plan with AI lets you discuss and refine tasks before spawning — responses are Groove-aware with system context. "Generate Prompt" synthesizes the conversation into a clean, actionable agent prompt.
Planners now generate teams with phase 1 (builders) and phase 2 (QC/finisher). Builder agents spawn immediately and work in parallel. When all builders complete, the QC Senior Dev agent auto-spawns to audit, integrate, test, commit, and launch. Safety net: if the planner omits the QC agent, one is auto-added. Nudge for manual users when agents finish without QC.
Cleaner header: navigation aligned left, Integrations/Skills/Schedules/Journalist collapsed into a dropdown menu. Agent count removed. Active state uses text color only — no bottom borders.
Agent nodes now support smooth drag and drop with positions persisted to localStorage. Nodes stay where you put them across refreshes. Positions keyed by agent name so they survive respawns. Smart edge routing: connections snap to the nearest side (top/bottom/left/right) as you move nodes. Default zoom pulled back for breathing room.
Optional name field when spawning agents (e.g., 'Skills Developer' instead of 'fullstack-3'). Rename agents anytime from the Actions tab. Effort level selector: Low, Medium, High, Max — maps directly to Claude's --effort flag.
Toggle 'Recurring schedule' when spawning to auto-respawn agents on a cron. Frequency presets from every 30 minutes to monthly. Agent spawns once immediately and creates the schedule.
Set a default working directory that applies to all new agents. System-level directory browser lets you navigate anywhere on disk. 'Set as default for all agents' button. Per-agent override still available.
Enter an Anthropic API key in the spawn panel for sub-second plan chat responses via direct API call. Falls back to CLI for subscription auth. Key encrypted and stored locally.
Agent chat history and activity logs persist to localStorage across page refreshes. Chat carries over when agents are resumed or rotated. Clear button to reset conversations.
Fixed dead 'Connect with Google' button. Error handling on all integration operations. Bundled Google OAuth credential support. Removed redundant Filesystem and SQLite integrations.
Directory bar at the top of the file editor — click to type any absolute path and the file tree reloads to that directory. Browse any project without restarting the daemon.
Automatic cleanup runs daily: deletes log files older than 7 days, prunes stale plan artifacts after 24 hours, trims audit log to last 1000 entries. Stale recommended-team.json files auto-cleaned when a new planner spawns.
Port retry on startup — waits up to 5 seconds for the port to release before falling back to another. Fixes port-hopping after groove stop && groove start.
VS Code-style file editor embedded in the GUI. CodeMirror 6 with One Dark theme, language detection for 20+ languages, line numbers, bracket matching, search, and auto-indent. Cmd+S saves to disk. New Editor tab in the header (second position after Agents). File tree sidebar (240px, expandable, lazy-loaded) with colored dots by language.
Right-click any file or folder in the tree for a context menu: New File, New Folder, Rename, Delete. Inline text input for naming — press Enter to confirm, Escape to cancel. Toolbar with SVG file and folder icons for quick creation at root level. All operations audit-logged.
Click an image (PNG, JPG, GIF, SVG, WebP, AVIF) to see it rendered on a checkerboard background. Click a video (MP4, WebM, MOV) to play with native controls. "Open in new tab" link for full-size viewing. Served via GET /api/files/raw with correct Content-Type headers (50MB limit).
Click the Terminal bar at the bottom of the editor to open an interactive shell. Uses Python's pty module to allocate a real pseudo-terminal — full prompt, colors, line editing, tab completion, Ctrl+C. xterm.js frontend with One Dark colors. Drag the resize handle to adjust height. Sessions auto-clean on disconnect. Requires Python 3.
When you have a file open in the editor and an agent modifies it, an amber banner appears: "File modified externally" with Reload and Dismiss buttons. Uses Node.js fs.watch with 300ms debounce. Self-saves are suppressed (no false alerts when you Cmd+S).
Open multiple files in tabs. Active tab has a 1px accent border and dark background. Dirty files show an amber dot. Close button on each tab. Tooltip shows full path. Tab state preserved while switching between files.
Backend endpoints for the editor: GET /api/files/tree, GET /api/files/read, POST /api/files/write, POST /api/files/create, POST /api/files/mkdir, DELETE /api/files/delete, POST /api/files/rename, GET /api/files/raw. All reuse path validation (no traversal, no absolute paths, project-scoped). Binary detection on read. 5MB limit on text, 50MB on raw.
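The path validation rules (no traversal, no absolute paths, project-scoped) can be sketched with the standard library as below. The function name and error messages are illustrative; GROOVE's actual validator is not shown in the note.

```python
from pathlib import Path


def resolve_in_project(project_root, relative_path):
    """Return the resolved path, or raise ValueError if it leaves the project."""
    candidate = Path(relative_path)
    if candidate.is_absolute():
        raise ValueError("absolute paths are not allowed")
    root = Path(project_root).resolve()
    # resolve() collapses ".." segments, so traversal shows up as a path
    # that is no longer underneath the project root.
    resolved = (root / candidate).resolve()
    if resolved != root and root not in resolved.parents:
        raise ValueError("path escapes the project root")
    return resolved
```

Comparing resolved paths rather than scanning the string for ".." also catches inputs such as `src/../../x`.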
Complete redesign of the skills marketplace into an app-store experience. Featured banner with curated skills, 3-column card grid, full detail modal with author profiles, ratings, download counts, and install/uninstall. Browse and My Skills tabs. Sort by popularity, rating, newest, or A-Z. Search across names, descriptions, and tags. Category filters with icons.
Installed skills can be attached to agents. Select skills in the spawn panel before launching, or attach/detach from the Actions tab on running agents. Attached skills are injected into the agent's context via the introducer — the agent receives the full SKILL.md instructions automatically. Clone and restart preserve attached skills.
Rate skills 1-5 stars directly from the detail modal. Interactive hover effects on stars, immediate UI feedback. Ratings are proxied through the daemon to the skills server. Aggregate scores and rating counts displayed on cards and in the detail view.
Three verification tiers: Anthropic (gold), Groove (teal), and Verified (green) for community creators who complete verification. Badges displayed on skill cards, featured cards, and prominently in the detail modal. Builds trust in the marketplace.
The daemon now fetches skills from the live API at docs.groovedev.ai/api/v1/skills instead of a static JSON file. Skill content downloaded from /api/v1/skills/:id/content. Install tracking via POST /api/v1/skills/:id/install. Offline fallback to bundled registry. No API key needed — the daemon is a read-only consumer.
Skills now include downloads, rating, ratingCount, price, featured, verified, and authorProfile (avatar, website, GitHub, Twitter/X). Ready for paid skills and community submissions.
groove connect user@host creates an encrypted SSH tunnel to a remote GROOVE daemon and opens the GUI in your browser. groove disconnect closes it. SSH config aliases work (groove connect my-vps). Preflight checks verify the daemon is running. Port collision detection auto-picks an open port. PID verification prevents killing unrelated processes.
groove start --host tailscale auto-detects your Tailscale IP and binds the daemon to it. --host <ip> for explicit IPs. --host 0.0.0.0 is rejected by design — direct internet exposure is not supported. CORS and WebSocket origins updated to allow the bound interface.
Append-only .groove/audit.log with 0600 permissions. 14 audited actions: agent spawn/kill/rotate/instruct, team save/load/delete/import/launch, credential set/delete, config changes, approval decisions. groove audit CLI with color-coded output. GET /api/audit endpoint. Auto-rotates at 5MB.
Connect GROOVE daemons with groove federation pair <host>. Each daemon generates an Ed25519 keypair. Cross-server contracts are signed and verified. Replay protection (5-minute window). Invalid keys rejected at pairing time. Peer IDs restricted to hex (path traversal prevention). Full audit trail on both sides.
GROOVE detects VS Code Remote, plain SSH, and headless servers. VS Code users see "VS Code forwards this port automatically." SSH users see the exact groove connect command with their server IP. Local users see the GUI URL. No more confusing localhost links on remote servers.
SSH tunnels use different ports (31416, 31417). The previous exact-port check rejected WebSocket connections through tunnels, showing "offline" in the GUI. Now allows any localhost origin regardless of port.
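A sketch of an origin check that accepts any localhost port, as described above. The set of hostnames treated as local is an assumption, and this sketch does not cover the additional bound-interface origins allowed by --host.

```python
from urllib.parse import urlsplit

LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def is_allowed_origin(origin):
    """Accept any http(s) origin on a local host, regardless of port."""
    parts = urlsplit(origin)
    return parts.scheme in {"http", "https"} and parts.hostname in LOCAL_HOSTS
```

Matching on the parsed hostname rather than a string prefix keeps lookalike hosts such as `localhost.evil.example` out.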
The bottom-left status pill now detects when the GUI is accessed through a tunnel (port mismatch with daemon) and shows the connection type. Host badge appears in the header for Tailscale/LAN connections.
When you send a message to any agent, GROOVE now uses --resume SESSION_ID instead of killing and respawning with a handoff brief. Full conversation history preserved, zero token waste. Rotation with handoff brief only happens on degradation signals (the intended behavior).
All three providers now spawn, execute, and complete correctly. Codex uses codex exec --dangerously-bypass-approvals-and-sandbox. Gemini uses --yolo for tool auto-approval. API keys injected from credential store into agent environment. Smoke tested end-to-end: Gemini planner → Codex backend → Claude Code fullstack.
Codex and Gemini output plain text (not stream-json), so token counts were always zero. Now estimates tokens from output length (~4 chars/token). Dashboard, node heartbeats, and telemetry charts show data for all providers.
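The estimate amounts to a one-line heuristic. Rounding up is an assumption; the note only gives the ~4 characters per token ratio.

```python
import math

CHARS_PER_TOKEN = 4  # rough ratio from the note above


def estimate_tokens(output_text):
    """Estimate a token count from plain-text output length."""
    return math.ceil(len(output_text) / CHARS_PER_TOKEN)
```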
If port 31415 is taken (VS Code, other apps), the daemon tries the next 10 ports automatically. Writes .groove/daemon.port so CLI commands find the right port. Users never see port errors.
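A minimal sketch of the fallback scan, assuming "the next 10 ports" means 31416 through 31425. The function name is hypothetical, and probing by bind is only one way to test availability.

```python
import socket

DEFAULT_PORT = 31415
MAX_FALLBACKS = 10


def find_open_port(start=DEFAULT_PORT, fallbacks=MAX_FALLBACKS, host="127.0.0.1"):
    """Return the first bindable port in [start, start + fallbacks]."""
    for port in range(start, start + fallbacks + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                probe.bind((host, port))
            except OSError:
                continue  # port is taken, try the next one
            return port
    raise RuntimeError("no open port found")
```

The probe socket is closed before the caller binds for real, so another process could still grab the port in between; a real server would bind once and keep that socket.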
Gemini: 3.1 Pro, 3 Flash, 3.1 Flash Lite, 2.5 Pro, 2.5 Flash. Codex: o3, o4-mini, GPT-4.1 family. Removed non-existent models.
Running agent nodes show a live mini line chart of token usage (last 40 data points) instead of the old neural flow animation. The chart reflects real token activity rather than a decorative effect.
Connected stat bar (5 cards, no gaps), large area chart + donut ring for model routing, horizontal bars for savings and agent fleet. Savings methodology shown so users understand how estimates are calculated.
Gemini CLI in headless mode needed --yolo to auto-approve tool calls. Without it, agents would output code as text instead of creating files.
The Codex invocation changed from --approval-mode full-auto to codex exec --dangerously-bypass-approvals-and-sandbox. The exec subcommand is required for headless operation.
Aider wraps the same APIs that Claude Code, Codex, Gemini, and Ollama access directly. Removed to simplify the provider set. Four providers remain.
Agent messages now stream in character by character with a pulsing cursor, like a live AI chat interface. Historical messages render instantly. Only the latest message animates.
The TokenTracker's record(), recordRotation(), and recordColdStartSkipped() methods are now called from the process manager and rotator. Savings data (rotation, conflict prevention, cold-start) is real, persisted to .groove/tokens.json, and displayed in the Command Center gauges.
Telemetry data moved from React component state to the Zustand store. Switching between Agents and Stats tabs no longer wipes the chart. Historical data for completed agents is preserved.
The Avg Context gauge showed 0 when all agents finished (only measured running agents). Replaced with Peak Context — shows the highest context usage reached across all agents, giving a meaningful metric even after a session ends.
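A sketch of the peak calculation, assuming each agent record keeps the highest context usage it reached as a fraction between 0 and 1. The AgentSnapshot shape and the peakUsage field are hypothetical.

```typescript
interface AgentSnapshot {
  status: "running" | "completed";
  peakUsage: number; // highest context usage this agent reached, 0..1 (assumed field)
}

// Considers every agent, running or not, so the gauge stays meaningful after a session ends.
function peakContext(agents: AgentSnapshot[]): number {
  return agents.reduce((peak, agent) => Math.max(peak, agent.peakUsage), 0);
}
```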
The planner's recommended fullstack agent now includes instructions to audit all agents' work, fix issues, build the project, launch the dev server, and output the localhost URL. No more manually asking where the app is.
The WebSocket rotation:complete event was switching the panel to the new agent before the HTTP response could copy chat history. Now chat and timeline are copied atomically in the WebSocket handler before the panel switches.
The Command Center values were previously all zeros because the TokenTracker was never fed. With token tracking wired, the Efficiency gauge, Savings breakdown, Estimated Savings dollar amount, and Routing data all display actual values.
After the planner finishes, it writes a structured team config to .groove/recommended-team.json. A Launch Team button appears in the chat showing each recommended agent with its task. One click spawns the entire team with proper roles, scopes, and prompts. Non-technical users just describe what they want, and the planner handles the rest.
Nodes redesigned with rounded corners, colored role badges (planner=purple, backend=teal, frontend=amber, fullstack=green), circular connection ports, and minimal clean layout. Edges are now smooth Bezier curves (splines) instead of angular steps — the spline is the brand. Matching the visual style of Unity and n8n node editors.
The hero row now features SVG semicircle arc gauges for Efficiency %, Average Context, Agents, and Rotations — inspired by Grafana dashboards. Gauges animate smoothly on value changes. All dashboard panels have rounded corners.
Agent messages in the chat now render markdown properly — bold text, inline code, headers, lists, numbered lists, and horizontal rules. No more raw ** and ### in the output.
The planner prompt now instructs it to suggest specific agents with roles, scopes, and tasks. It notes that a fullstack agent is needed for root-level setup in Auto permission mode. Output uses clean plain text formatting.
Planning was routed to Haiku (light tier) which produced shallow one-line responses. Now routes to the heavy tier since planning is foundational work that needs deep reasoning and thorough codebase research.
The telemetry chart now builds its timeline from dashboard API polls (every 4s) instead of the WebSocket store. Per-agent area chart with thin flat lines, subtle gradient fills, time axis, and y-axis labels. Agent legend shows real-time token counts.
Removed all glow effects, text shadows, and neon styling across the entire UI. All lines 1px thin, colors flat, gradients subtle. Matches the dark terminal aesthetic of the GROOVE website.
Chat messages and token timeline now carry from the old agent to the new agent when rotating or continuing. No more cleared chat when sending a message to a completed agent.
The validateAgentConfig() function stripped the permission field from the config, so the PM review instructions never reached agents. Now preserved through validation, registry, and rotation.
The TokenTracker was never fed data from the process manager. The dashboard now uses the registry's actual tokensUsed values as the source of truth. TOKENS USED, TOKENS SAVED, and dollar amounts now display correctly.
When a stdout chunk contained multiple JSON events (assistant + usage), only the last event was returned, losing token counts. Now merges token totals across all events in a chunk.
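The merge can be sketched as follows, assuming events arrive as newline-delimited JSON and carry an optional usage object. The field names input_tokens and output_tokens and both function names are assumptions, not confirmed by this changelog.

```typescript
interface UsageEvent {
  type?: string;
  usage?: { input_tokens?: number; output_tokens?: number }; // assumed field names
}

// Parse one line; anything that is not a JSON object is ignored.
function parseEvent(line: string): UsageEvent | null {
  try {
    const parsed: unknown = JSON.parse(line);
    return typeof parsed === "object" && parsed !== null ? (parsed as UsageEvent) : null;
  } catch {
    return null;
  }
}

// Sum token counts over every event in the chunk instead of keeping only the last one.
function sumChunkTokens(chunk: string): number {
  let total = 0;
  for (const line of chunk.split("\n")) {
    const event = parseEvent(line.trim());
    if (event === null) continue;
    total += (event.usage?.input_tokens ?? 0) + (event.usage?.output_tokens ?? 0);
  }
  return total;
}
```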
The welcome message now displays on every groove start, not just first run. Shows provider scan, GUI URL, docs link, and GitHub. Box alignment fixed. CLI quick start commands removed (GUI is primary).
GROOVE now has a built-in AI Project Manager that reviews risky agent operations before they happen. When agents run in Auto mode, they call the PM before creating new files, deleting files, or modifying config. The PM checks scope, plan alignment, and team conflicts — then approves or rejects with reasoning. All reviews are logged in the new PM Review Log (formerly Approvals tab). This turns permission levels from decorative to functional.
Replaced the three-tier permission system (Always Ask / Auto / Full Access) with two practical modes. Auto: AI PM reviews risky operations (~10-15s overhead per review). Full Send: no reviews, maximum speed. No more false promises of interactive approval in a headless system.
The stats dashboard is completely rebuilt. Thin vertical bar charts for token burn rate and savings accumulation. Hero row with glowing dollar savings display. Agent fleet cards with context gauges. Adaptive threshold sparklines with session quality scores and signal breakdown (errors, repetitions, file churn). Journalist panel shows live AI synthesis summary. Routing viz now shows cost-per-tier. Feels like mission control, not a spreadsheet.
Pressing Enter in the chat input fired both the keydown handler and an implicit form submission, so the send action ran twice. Fixed with type="button" on the Send button and stopPropagation on the Enter handler.
Previously, when an agent was rotated or continued, the new agent started with tokensUsed: 0. Now the rotator carries cumulative token stats to the new agent, so the dashboard shows lifetime totals.
The Gemini provider was showing Claude Code's install command. Fixed to show the correct npm i -g @google/gemini-cli.
Agents that have completed their task can now be continued from the chat. Type a reply, hit Send, and GROOVE generates a handoff brief from the agent's work, appends your message, and spawns a fresh session with full context. No more "Respawn" button creating disconnected agents. The conversation is seamless.
When you spawn a backend agent after a planner finishes, the backend now sees: what files the planner created, the planner's final output summary, and an explicit instruction to read relevant files before starting work. Completed agents are no longer invisible — they stay in the team context with their files and results.
Spawning a second backend? GROOVE now queries the existing backend's work via a headless Claude call, then gives the new agent a clear task assignment: what to work on, what files to avoid. The Task Assignment section in the intro context keeps duplicate-role agents from stepping on each other.
Before restarting the server, running npm install, or modifying shared config, agents now check .groove/coordination.md for active operations. They write their intent, check for conflicts, then proceed. File-based coordination that works with any CLI tool.
Claude Code's auto-memory system can bleed context from previous sessions into GROOVE agents. Now every agent gets explicit instructions to ignore auto-memory and rely only on GROOVE-provided context (handoff briefs, project map, registry). Two-layer enforcement: agent intro context + CLAUDE.md injection.
The planner's "do NOT implement code" constraint was only applied in the GUI spawn panel. Now ROLE_PROMPTS are applied in the process manager during spawn — consistent regardless of whether agents are spawned from GUI, CLI, API, or chat continuation.
The Journalist now triggers a synthesis cycle immediately when any agent completes — no more waiting up to 120 seconds for a stale project map. Synthesis input budget doubled to 40K chars. Text truncation increased from 500 to 2,000 chars. Result preservation increased from 500 to 3,000 chars. Handoff brief project map limit increased from 3K to 10K chars. Explicit Haiku model for predictable cost.
The rotation scoring system now detects file churn (same file written 3+ times = circular refactoring), error trends (increasing errors in the second half of a session = degradation signal), and tool failures (previously tracked but never extracted). Classifier window increased from 10 to 50 events for better signal quality.
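Two of these signals can be sketched with simple counting. The event shape and function names are hypothetical, and the real scoring logic may weigh signals differently. Only the "3+ writes" threshold and the first-half versus second-half comparison come from the entry above.

```typescript
type SessionEvent =
  | { kind: "write"; file: string }
  | { kind: "error" }
  | { kind: "other" };

const CHURN_THRESHOLD = 3; // same file written 3+ times

// Files written often enough to suggest circular refactoring.
function churnedFiles(events: SessionEvent[]): string[] {
  const writes: Record<string, number> = {};
  for (const e of events) {
    if (e.kind === "write") writes[e.file] = (writes[e.file] ?? 0) + 1;
  }
  return Object.keys(writes).filter((f) => (writes[f] ?? 0) >= CHURN_THRESHOLD);
}

// True when the second half of the session has more errors than the first half.
function errorsIncreasing(events: SessionEvent[]): boolean {
  const mid = Math.floor(events.length / 2);
  const countErrors = (slice: SessionEvent[]) =>
    slice.filter((e) => e.kind === "error").length;
  return countErrors(events.slice(mid)) > countErrors(events.slice(0, mid));
}
```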
Context usage was hardcoded to 200K for all models. Now uses actual model metadata: Opus has a 1M context window, Sonnet and Haiku have 200K. Rotation thresholds are now meaningful across different models.
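A sketch of the per-model calculation, using the window sizes stated above. The lookup keys, the fallback for unknown models, and the function name are assumptions.

```typescript
// Window sizes from the entry above; the keys are hypothetical identifiers.
const CONTEXT_WINDOWS: Record<string, number> = {
  opus: 1000000,
  sonnet: 200000,
  haiku: 200000,
};
const FALLBACK_WINDOW = 200000; // assumed default for unknown models

// Fraction of the model's context window currently in use.
function contextUsage(model: string, tokensInContext: number): number {
  const windowSize = CONTEXT_WINDOWS[model] ?? FALLBACK_WINDOW;
  return tokensInContext / windowSize;
}
```

With 150,000 tokens in context, this gives 0.15 for Opus but 0.75 for Sonnet, which is why a single hardcoded window made rotation thresholds misleading.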
Port 3141 conflicted with VS Code dev servers and other common tools. Moved to 31415: still pi, but in a higher range that common dev tools rarely claim.
A full-screen analytics dashboard showing token usage, savings breakdown, rotation timeline, model routing distribution, adaptive threshold profiles, and journalist status. The stats tab in the GUI now shows real-time operational data.
The model routing system is now fully connected. Agents spawned with model: 'auto' get routed to the right tier (Opus for heavy tasks, Sonnet for medium, Haiku for light) based on role hints and classifier signals. Cost tracking logs every routing decision.
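A simplified sketch of role-based routing, leaving out classifier signals and cost logging. The role-to-tier table is illustrative: apart from planning going to the heavy tier, the hints, the medium default, and the model keys are assumptions.

```typescript
type Tier = "heavy" | "medium" | "light";

const TIER_MODEL: Record<Tier, string> = {
  heavy: "opus",
  medium: "sonnet",
  light: "haiku",
};

// Hypothetical role hints; only the planner mapping is stated in this changelog.
const ROLE_HINTS: Record<string, Tier> = {
  planner: "heavy",
  docs: "light",
};

// An explicit model is respected; "auto" is resolved through the role hint.
function resolveModel(requested: string, role: string): string {
  if (requested !== "auto") return requested;
  const tier: Tier = ROLE_HINTS[role] ?? "medium";
  return TIER_MODEL[tier];
}
```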
Major cleanup pass across the codebase. All 14 test suites (137 tests) passing. Security audit complete: AES-256-GCM credentials, CORS localhost-only, input validation on all endpoints, prototype pollution protection.
Complete GUI redesign with One Dark Pro theme, JetBrains Mono font, and a terminal-native aesthetic. The GUI is served by the daemon — open http://localhost:31415 in any browser.
Click any agent to open a 45% sidebar with three tabs. Chat: instruct agents or query them without disrupting their work. Stats: live token heartbeat chart, context usage gauge, activity sparkline. Actions: rotate, stop, delete, clone, change model, modify prompt.
The spawn panel offers role presets (Backend, Frontend, Fullstack, Planner, Testing, DevOps, Docs), permission levels, provider selection with inline API key entry, and model selector. One-click spawning with intelligent defaults.
Send instructions to running agents via POST /api/agents/:id/instruct (rotates with new context) or ask questions via POST /api/agents/:id/query (headless one-shot, agent keeps running). Chat tab in the GUI uses both.
Agents are displayed as nodes in a React Flow graph with animated edges for running agents. Click to select, drag to rearrange. NASA-style terminal readouts with status bars, heartbeat indicators, and compact metrics.
GROOVE ships with a persistent daemon that manages all agent coordination. It holds the registry, broadcasts state over WebSocket, and exposes a REST API. State persists to disk, so if the daemon restarts, your agents don't vanish.
The full command set: groove start, groove stop, groove spawn, groove kill, groove agents, groove status, groove nuke, groove rotate, team management (save/load/list/delete/export/import), groove providers, groove set-key, and config commands.
When you spawn agent #2, the daemon writes AGENTS_REGISTRY.md and gives every agent context about its peers — who they are, what files they own, what they're working on. No more two agents silently editing the same file.
Spawn with a scope pattern and that agent owns those files. Other agents are told to stay out. Advisory locks for v0.1 — agents are informed, not blocked.
A background synthesis engine that watches all agent activity, filters noise, and generates GROOVE_PROJECT_MAP.md — a living snapshot of the entire project. New agents read one file and know everything. No more wasting thousands of tokens re-explaining your codebase.
Day 1 support for Claude Code (subscription), Codex (API key), Gemini CLI (API key), and Ollama (local). The provider abstraction makes adding new tools straightforward.