Changelog

What changed, what was fixed, what's new — newest at the top

v0.20.0-alpha.16 — May 2026

Headline

Lumiere milestone. Free plan, metered voice, and a deep-reasoning Max tier. Signed and Apple-notarized. Carries the alpha.15 quiet-when-busy improvements and the floating-promise storm fix.

New

Free plan, for real. Start using Clippy with just your email — no card, no trial. Chat + screen-aware tips, free forever with a monthly usage meter in Settings.
In-character upgrade nudge. When you hit the free monthly limit or request a paid feature, Clippy offers to unlock it — one friendly nudge, never nagging.
Premium voice (Max). Metered high-quality text-to-speech on the Max plan. Free and Power retain the built-in system voice.
Deep-reasoning Max brain. Max plan runs at higher OpenAI reasoning effort — more thorough responses on complex multi-step tasks.
Signed + Apple-notarized. macOS builds pass Gatekeeper without any right-click dance. Windows builds continue to use Azure Trusted Signing.

Fixed

Floating-promise storm. Un-caught void tick() in the window-follow loop + setBounds NaN produced up to 18.5K rejection events over 29 hours. The handler now logs stacks and rate-limits to one warn per minute.
"Start trial" button now opens the correct Power plan (7-day free trial) — it previously pointed at a retired plan.
Settings now shows the real plans (Free / Power / Max) with correct limits.

Plans

Free $0 — chat + screen-aware tips, forever.
Power $19.99/month — full desktop & browser automation, 5M tokens, 7-day free trial.
Max $39.99/month — everything in Power + premium voice, deep-reasoning brain, 20M tokens. No trial.

v0.11.27 — May 7, 2026

Headline

Audit + safety-net release. Six parallel subagent reviews (logs, architecture, code quality, UX, test coverage, perf) drove the work — strong convergence on three priorities: stop the model from retrying known-broken paths, make the existing fixes verifiable in CI, and clean up legacy cruft.

Fixed

CDP do-not-retry guidance. Most common failure pattern in recent reports (4 of last 5): model called cdp_connect, got ECONNREFUSED 127.0.0.1:9223, retried 3× in a row, runaway guard fired, task aborted. The error message now explicitly tells the model NOT to retry CDP and lists alternatives in priority order — use Outlook COM if available, fall back to UIA + OCR via smart_click, then navigate_browser, last-resort ask the user to relaunch their browser with debug-port enabled. Stops the runaway-guard burn-out on a known-failed path.
mouseDrag missing input sanitization. The rest of the mouse_* family used sanitizeNumber; mouseDrag was the holdout, raw-casting Number(params.startX) which silently passed NaN to PowerShell as text. Now consistent.
Two as any casts in brain.ts killed by adding provider to the TurnSuccess type. Better diagnostic logs because the field is now type-checked.
Memoized getScriptsDir. Called on every tool invocation; on a 40-step task that meant ~80 redundant fs.existsSync calls. Now computed once and cached.
Stale Gemini comment in brain.ts refreshed to reflect the current OpenAI-only provider.

Smoke harness expanded 14 → 24 tests

Most v0.11.22-26 fixes had zero automated coverage. Now every recent invariant is checked at smoke time:

The hallucination-guard regex (positive + negative cases including future-tense, "I'll send", "going to delete")
Structural invariants: every member of DESTRUCTIVE_TOOLS and UI_MODIFYING_TOOLS exists in TOOL_MAP (catches silent breakage on tool renames)
Screenshot downscale constants still in source (1024 / 1280)
All 5 OCR-failure-mode log strings present
abortAllInFlightTools still calls .abort() AND .clear() (sleep would silently break otherwise)
runComScript stdout-JSON extraction (pure-function test of the v0.11.26 fix)
No live Gemini code paths remain in brain.ts
TOOL_MAP tool-count sanity (currently 58)

Deferred

Two big-but-risky items from the audits, deliberately held until v0.11.28+: (a) tool-set rationalization (50 → ~28 tools + response templates for top-15 intents) — high-value but high-blast-radius in one ship; (b) extracting executeToolCalls + applyHallucinationGuard as private methods from handleUserMessage — readability win but no behavior change.

v0.11.26 — May 7, 2026

Headline

Two bugs that snuck into v0.11.25, both surfaced by a fresh user log report. Sleep now actually kills in-flight tools (the user's main complaint), and the buggy Outlook detection probe is gone — replaced with the real fix that should have been there all along.

Fixed

Sleep now hard-kills in-flight PowerShell children. v0.11.25 added a between-turn cancel check, but a tool already mid-flight (a 30 s outlook_send_email or 60 s word_to_pdf) would happily run to completion — the agent loop was awaiting the tool's promise. New: tools.ts maintains an AbortController registry of currently-running child processes; setMode('sleep') fires ac.abort() on every one, sending SIGKILL and rejecting the awaited promise immediately. Wired through every COM script + screenshot + OCR call.
Removed the buggy Outlook registry probe. v0.11.25 added Test-Path 'Registry::HKEY_CLASSES_ROOT\Outlook.Application' as a pre-check before attempting the COM call. The probe returned false on a user machine that DOES have classic Outlook installed — Clippy false-claimed Outlook wasn't available. The probe is gone. The .ps1 script's own New-Object -ComObject Outlook.Application + catch { Fail } already handles unavailable installs cleanly.
runComScript now reads JSON errors from stdout. The actual root cause of the original "empty stderr" issue: every Tier-2 COM script's Fail() helper writes {ok:false, error:"..."} to stdout then exit 1. execFileAsync rejects on exit-1, but the rejection's err.stdout field was being ignored — only err.stderr was checked. Now: parse JSON from err.stdout first, surface as Error: <reason> if ok=false. This is the real fix v0.11.25's registry probe was trying to paper over.

Smoke gate 14/14 PASS again — no regressions to the pre-ship integration tests. outlook-send-email still succeeds with edge-case bodies, create-reminder still creates + cleans up safely.

v0.11.25 — May 7, 2026

Headline

The hallucinated-success fix. Triggered by a fresh log report where v0.11.24 claimed "Email sent!" when nothing actually got sent. Plus 4 parallel deep audits (script security, execution path, logging, smoke testing) — every finding above P2 landed in this build. First release shipped through a smoke-test gate (14/14 pass).

Fixed (the user-reported bug)

Hallucinated success guard. Previously, if every destructive action (send/post/submit/create) failed, Clippy could still emit a confident success message ("Email sent!"). Now: a per-task ledger tracks every destructive-tool outcome; at task-end, if the model's spoken text matches success language ("sent", "posted", "submitted", "created", etc., past tense) AND no destructive attempt actually succeeded, Clippy's reply is overridden with an honest "I tried but couldn't confirm it worked." Lying to the user is worse than failing visibly.
Outlook (new) detection. Windows 11 ships "Outlook (new)" (process: olk.exe) by default — a WebView2 app with no COM/MAPI surface. Calling Outlook.Application against it crashed the script with exit 1 and empty stderr (invisible failure). outlook_send_email now probes the registry for classic Outlook before the COM call. If absent, returns a clear error directing the agent to CDP/web instead — and explicitly tells it NOT to claim success without verification.
OCR pipeline diagnostic logging. captureAndOcr previously returned null silently across 5 different failure modes. Now logs each — script missing / screenshot crash / temp PNG never written / OCR-script exit / non-JSON stdout / no elements array — so future failures are debuggable from the log alone.

Fixed (subagent-audit findings)

Cancel-during-tool — cancelRequested was only checked between turns. A 30-second outlook_send_email ignored sleep mode and user-overrides. Now: every tool dispatch checks first, pushes a synthetic "cancelled" response, breaks the loop.
v0.11.21-class raw-arg bug patched in 5 more scripts: com-outlook-create-event (subjectB64, locationB64, bodyB64), com-write-file (contentB64), com-http-request (headersB64, bodyB64), com-speak-text (textB64), com-run-powershell (scriptB64). All accept new -*B64 args; raw -* kept for back-compat.
P0 cmd-injection vulnerability in com-create-reminder. Previous impl interpolated user-supplied $title and $notes into a single-quote PowerShell string passed via -Command; the '' doubling escape was insufficient and a notes string of ''); calc.exe; # would close the quote and execute the trailing command. Now: writes title + notes to a JSON sidecar in temp; schedules the task to invoke a new show-reminder.ps1 helper that reads the JSON. Task arguments contain only paths we control. No user input is re-parsed.

Added

Smoke-test harness — new npm run smoke command with three layers: pure-function unit tests (~1 s), real PowerShell shell-out integration tests (~10 s), and a printed manual checklist. Every future ship runs this gate before tagging.

v0.11.24 — May 7, 2026

Headline

Architectural root-cause fix for auto-update + a modern speech bubble. After twelve patches across v0.11.6 → v0.11.23 trying to programmatically launch the installer, we accepted that the failure mode is structural and switched to a one-click manual launch that works 100% of the time.

Fixed (auto-update — finally)

Auto-update no longer fails silently after download. Three different launch mechanisms — quitAndInstall(silent), quitAndInstall(non-silent), shell.openPath, child_process.spawn(detached) — all hit different walls (SmartScreen kills oneClick installers with low publisher reputation, Electron's invisible-window flag, parent-process file locks, AV heuristics). Replaced with: Unblock-File the cached installer (strip Mark-of-the-Web), then shell.showItemInFolder() opens Explorer with the installer selected, user double-clicks, NSIS runAfterFinish auto-relaunches Clippy on the new version. One extra click, zero race conditions.

Modernized

Speech bubble redesign. Replaces the gray-bordered, pale-yellow, hard-shadow look from v0.11.x with a modern messenger card: frosted near-white background (with backdrop-filter: blur), soft 3-layer drop shadow (close + ambient + depth), 14px radius, hairline ink-tinted border, system font stack with proper antialiasing, refined chat-history scrollbar, gradient send button with focus-ring + press feedback. Tail is now a small rotated rounded square with an "echo" dot — iMessage style — instead of the old CSS-triangle hack. Defensive dark-mode variant included.

v0.11.23 — May 6, 2026

Polish & speed pass

Five surgical changes (~75 LOC) on top of the v0.11.22 foundation — three of them are response-to-user-report fixes, two are forward improvements.

Added

Screenshot downscale to 1024-wide. Captures wider than 1280px now downscale (bicubic) before being sent to the model. Tool result text explicitly states the scale factor and instructs the model to multiply when clicking from a screenshot pixel; read_screen / ocr_read_screen / smart_click coords stay native and don't get rescaled. LLMs are measurably more accurate at coordinate prediction on ~1024-wide images than 2560+. Side benefits: ~3× token savings per screenshot, faster upload, smaller log footprint.
Selective post-tool verification. The set of tools that triggered an automatic read_screen after firing was shrunk from 13 → 8 (kept: open_app / focus_window / smart_click / mouse_click family / navigate_browser; dropped: type_text / smart_type / key_press / mouse_scroll / write_clipboard). Saves 2-3 seconds end-to-end on a typical 12-step task. The model can still call read_screen voluntarily.

Fixed

Wake-greeting / name-prompt repetition. The proactive loop fired a generic "Hi! Click me to chat..." right before the post-onboarding "What should I call you?" — back-to-back chatter the user noticed as Clippy "tending to repeat at the beginning." Wake greeting is now suppressed if the user profile has no Name yet (the name prompt is about to fire and is what the user actually needs). Bonus: when Name IS known, the greeting personalizes — "Hi Amr! Click me to chat — I can help with whatever you're working on."
Duplicate updater registration. Both launchWithOnboarding (no license) AND the post-onboarding IPC handler called initUpdater + startPeriodicUpdateChecks, which registered electron-updater event listeners twice (every download-progress / update-downloaded fired 2×) and scheduled two 24h timers (every periodic check ran 2×). Both functions are now idempotent.
type_text truncated return string misled the model. Old return was Typed: ${text.substring(0, 50)} — no length, no ellipsis. Clippy typed all 84 chars correctly via clipboard-paste (Ctrl+V is byte-exact), but the truncated tool result ended at "...joy an" and the model misread that as evidence the typing failed mid-sentence and reported it to the user. Now returns Typed N chars: "first 77 chars..." with explicit length.

v0.11.22 — May 6, 2026

Headline

Root-cause fix for screenshot-coordinate failures. Two-tier smart_click: UIA first, then local Windows OCR fallback when UIA misses. No more wrong-by-100s-of-pixels clicks from LLM screenshot-guessing.

Added

Two-tier smart_click — UIA → Windows.Media.Ocr fallback. Fuzzy-matches the target text against on-screen elements and clicks the matched element's pre-computed center. Works on WebView/Electron/canvas content where UIA returns empty (Edge web pages, modern Outlook, custom-rendered apps). No LLM round-trip for coordinates.
App guides bundled — 11 ClawdCursor JSON guides ship in the installer (Outlook, Excel, Slack, Discord, Edge, Paint, Notepad, Spotify, Figma, more). Brain injects the matching guide into the screen-context block whenever the active window's processName matches. Includes hand-crafted workflows, keyboard shortcuts, and tips ("Send is BLUE; press Ctrl+Enter to send, do NOT click").
Per-app learned-workflow memory — local-only via electron-store. After a successful task, the action sequence is saved under the active app's process name. The next time a similar request arrives in the same app, the steps are injected as a hint. Capped at 20 workflows per app, FIFO eviction. Default ON.
ClawdCursor 0.8.7 → 0.8.8 (vendored). Foreground-window OCR ranking trick cherry-picked into Clippy's fuzzy matcher.

Fixed

outlook_send_email crashed silently on multi-line bodies with em-dashes / smart quotes. Root cause: PowerShell's command-line tokenizer split on embedded newlines and chopped the body. Now base64-encodes subject + body before passing as args; .ps1 decodes back to UTF-8.
runComScript now captures stderr in the error path so future failures show the actual PowerShell error instead of just "Command failed: powershell.exe ...".
Latent HiDPI double-scale bug killed across mouse_click / mouse_drag / mouse_scroll / mouse_double_click / mouse_right_click / mouse_hover. Old code multiplied by Math.round(screenScale) — a no-op at scale=1, but landed clicks 2× off on HiDPI laptops (scale=1.5 / 2.0). All mouse_* now treat (x,y) as physical pixels — same space as OCR / UIA / screenshots.

v0.11.21 — May 5, 2026

Fixed

Auto-update relaunch: swapped shell.openPath for child_process.spawn(detached:true). The previous path inherited Electron's invisible-console flag, so NSIS never received a foreground window and silently exited. Users on v0.11.20 and earlier who hit the no-installer-after-restart bug should be auto-upgraded on next launch.

v0.11.20 — May 4, 2026

Changed

Auto-updater install path retried with non-silent NSIS launch (later superseded by v0.11.21's spawn fix).

Known issues

Restart-and-Install would quit Clippy without showing the installer UI — root-caused and fixed in v0.11.21.

v0.11.19 — Late April 2026

Added

Code signing via Azure Trusted Signing. All installers and shipped .exe files are now signed end-to-end. Publisher: Amro Dabbas. RFC 3161 timestamped so signatures remain valid after the certificate's natural rotation.
SmartScreen reputation accumulation begins from this build forward.

v0.11.16 — Mid April 2026

Added

Outlook native skill (read/draft/send via accessibility-tree control of the Outlook desktop app).
Excel read/write skill — "Add this expense to my budget sheet" writes the row, saves the file, reads cells back on demand.
Calendar booking + summarization workflow.
Browser automation via Chrome DevTools Protocol — direct DOM access, no fragile UI clicking.

v0.9.5 — Earlier 2026

Added

Boot log at %APPDATA%\ClippyAI\boot.log — captures each startup stage so installer-vs-runtime crashes can be diagnosed from a single attached file.

Older releases

Need pre-0.9 release history or an older build? Email hello@clippyai.app and we'll point you to it.

Update mechanism

ClippyAI auto-updates on launch via electron-updater. Updates are signed end-to-end and verified against latest.yml + per-file blockmaps before they install.

To check manually: Settings → About → Search for Updates. To roll back, email hello@clippyai.app for a direct link to an earlier build.