OpenAI Codex Goes 'For Almost Everything': Mac Computer Use, Browser Comment Mode, and Thread Automations Explained
OpenAI shipped a major Codex update on April 16, 2026 that pushes the product past coding into general work. Three changes matter: Codex can now drive your Mac apps directly, an in-app browser captures both screenshots and DOM elements through 'comment mode', and Codex threads can run continuously to watch Slack, email, and PRs. Here is how each works, the workflows that justify each one, and where Codex now sits relative to Perplexity Personal Computer and Claude Code Routines.
OpenAI shipped what it called 'Codex for (almost) everything' on April 16, 2026 -- the largest Codex update since the app's original launch. It is the moment Codex stops being a coding tool that occasionally reaches outside the IDE and becomes a general-purpose work agent that happens to be best at coding. Three pieces of the launch carry most of the practical weight: Mac app computer use, an in-app browser with comment mode, and thread automations. The OpenAI tweet announcing it cleared 2.1M views in 24 hours, James Sun's browser comment-mode demo cleared 248K, and Nick Baumann's thread-automations writeup cleared another 33K. The coverage maps to genuine product weight.
This is the deep-dive on what shipped, how it actually works, and where each piece fits into the workflows readers use today.
Key Takeaways
- Mac app control lets Codex drive any Mac app the way a human would: cursor, click, type, read.
- In-app browser with comment mode captures both screenshot AND DOM element when you point at something on a rendered page.
- Thread automations turn a Codex thread into an always-on watcher (Slack, email, PRs, anything wired in via MCP).
- Memory carries preferences and conventions across sessions, so Codex learns how you like to work.
- Image generation is now a first-class output; threads can produce diagrams, screenshots, and visuals as part of their reply.
- All of this lands inside the existing Codex App with no separate subscription change.
What Actually Shipped on April 16
The launch is best understood as seven additions that compound rather than seven independent features:
| Addition | What it does | Where it lives |
|---|---|---|
| Mac app computer use | Drive native macOS apps via cursor/click/type | Codex Mac app |
| In-app browser | Built-in browser pane inside Codex | Codex Mac app, web |
| Comment mode | Point at a rendered element, capture screenshot + DOM as context | Inside the in-app browser |
| Thread automations | A thread runs continuously, reacts to external triggers | Any Codex thread |
| Memory | Codex remembers preferences across sessions | All Codex surfaces |
| Image generation | Codex produces images as outputs | All Codex surfaces |
| More tool integrations | More native MCP-style integrations with third-party apps | Codex tool registry |
Together they reposition Codex from "coding agent" to what OpenAI's announcement calls a work agent for "almost everything". The actual ceiling is still software-development-shaped (the strongest workflows touch code or web pages), but the surface is wider than it was a week ago.
Mac App Computer Use: How It Works
The new computer-use feature lets Codex control any Mac app -- not just code editors. The interaction model is straightforward: ask Codex to do something that requires another app, and Codex opens it, navigates, clicks, types, and reports back. Under the hood it uses macOS's accessibility APIs (the same layer tools like Raycast and Alfred sit on) plus a vision model that sees the screen.
Three workflow patterns this enables that were not possible before:
- Cross-app debugging. "There is a layout bug on the pricing page when the viewport is below 768px." Codex opens the page in its in-app browser, identifies the broken CSS, navigates to the React component in the editor, fixes the rule, runs the build, refreshes the in-app browser, verifies the fix visually, and commits.
- Mid-session context pulls. Halfway through implementing a feature, you need a number from a Linear ticket. Without leaving the Codex thread, Codex opens Linear in the in-app browser, finds the ticket, extracts the number, and continues coding.
- Operator-style errands during a coding session. "Open the Apple Mail draft I started yesterday and send it." Codex opens Mail.app, finds the draft, sends it. The line between "coding session" and "general work session" has been erased.
The trust model is the same as macOS's Accessibility permission system. Codex prompts you the first time it tries to control any app; you approve once per app and can revoke later in System Settings. There is no kernel-level sandbox -- Mac app control runs at the same trust level as Codex itself.
For readers thinking about the broader computer-use landscape, our three-way comparison of Perplexity Personal Computer, Codex Computer Use, and Claude Computer Use covers when each shape wins.
In-App Browser + Comment Mode: The Underrated Feature
The in-app browser feature is where Codex now meaningfully differs from a generic computer-use product. It is a real browser, embedded inside the Codex Mac app, that the agent can drive -- but the killer feature is comment mode.
Comment mode works like this: while Codex has a page open in the in-app browser, you can point at any element with your cursor and click. Codex captures three things from that single point-and-click:
- A screenshot of the rendered page at the moment you clicked.
- The DOM element you pointed at (its tag, class, id, attributes).
- The DOM path from the root to that element.
All three feed into the next agent turn as precise context. The contrast with vision-only computer use is sharp: instead of asking Codex to "look at the button on the right side of the page", you literally point at the button and Codex knows the exact element and what it looks like.
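To make that concrete, the context captured by a single comment-mode click might look something like the sketch below. This is a hypothetical shape for illustration only -- OpenAI has not published the exact payload format, and every field name here is an assumption:

```json
{
  "screenshot": "pricing-page-capture.png",
  "element": {
    "tag": "button",
    "id": "cta-upgrade",
    "class": "btn btn-primary",
    "attributes": { "data-plan": "pro" }
  },
  "dom_path": "html > body > main > section.pricing > div.card > button#cta-upgrade"
}
```

The point is what the agent receives: not just pixels, but an addressable element it can map straight back to source code.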
Why this matters for real work
A few patterns this unlocks:
- UI bug repro. Point at the broken element. Codex now knows the exact CSS selector, the rendered visual, and where it sits in the DOM. The next prompt can be "fix this" instead of "find the button that looks like..."
- Precise design feedback. Point at the button you want changed, type "make this 8px larger and align right". The fix lands in the right component on the first try.
- Scraping with structure. Point at the data table you want extracted. Codex captures the DOM, knows the structure, and can produce parsed output without the usual selector-guessing.
The non-obvious effect is that comment mode collapses the "describe what you mean" step that usually eats two or three turns at the start of a UI task. You skip directly from "I see the issue" to "fix it".
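As a rough illustration of the "scraping with structure" pattern: once comment mode has identified the exact element (say, a `<table class="pricing">`), extraction becomes a direct parse instead of selector guessing. A minimal stdlib-only Python sketch, where the HTML and the captured class name are invented for the example:

```python
# Minimal sketch: extract rows from the table comment mode pointed at.
# The target class ("pricing") stands in for the captured DOM context.
from html.parser import HTMLParser

class TableExtractor(HTMLParser):
    """Collect cell text from the table whose class matches the target."""
    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.in_table = False
        self.in_cell = False
        self.rows = []

    def handle_starttag(self, tag, attrs):
        if tag == "table" and dict(attrs).get("class") == self.target_class:
            self.in_table = True
        elif self.in_table and tag == "tr":
            self.rows.append([])  # start a new row
        elif self.in_table and tag in ("td", "th"):
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag == "table":
            self.in_table = False
        elif tag in ("td", "th"):
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell:
            self.rows[-1].append(data.strip())

html = """
<table class="pricing">
  <tr><th>Plan</th><th>Price</th></tr>
  <tr><td>Plus</td><td>$20</td></tr>
</table>
"""
extractor = TableExtractor("pricing")
extractor.feed(html)
print(extractor.rows)  # → [['Plan', 'Price'], ['Plus', '$20']]
```

The agent presumably does something far more robust internally; the sketch just shows why knowing the element up front removes the guesswork.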
For broader context on how Codex compares to Cursor and Claude Code architecturally -- including how Cursor's interactive canvases solve a different version of the same "richer than text" problem -- see our Codex vs Claude Code vs Cursor architecture comparison.
Thread Automations: The Always-On Coworker Pattern
The third headline change is that a Codex thread can now run continuously. You set the thread up once with a prompt and connections, and it polls or watches the things you told it to monitor. When something interesting happens, it does the work and surfaces the result back into the thread.
The example Nick Baumann gave on launch day is the canonical one:
> Every hour, check my Slack, my Gmail, and any open PRs I wrote. Highlight anything that needs my attention. Skip the noise.
That single setup runs forever. The thread sits in your Codex sidebar; new findings appear in it as Codex picks them up. You glance at it once or twice a day and only act on what made it past the agent's filter.
The pattern generalizes well:
- PR shepherd. "Watch the open PRs in this repo. If a PR has been ready-for-review for more than 4 hours and not picked up, post in #engineering. If a PR has merge conflicts, comment with what conflicts and suggest a resolution."
- Deploy watcher. "Watch the deploy queue. If a deploy fails, fetch the logs, summarize the error, and DM me with the relevant excerpt and a likely cause."
- Customer triage. "Every 30 minutes, check the support inbox. Classify any new tickets, draft a response for the easy ones, and surface the hard ones to me with proposed next steps."
- Personal focus filter. "Every hour, scan my Slack DMs and email. Drop anything spammy. For everything else, write a one-line summary and queue it for my next focus break."
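Conceptually, each run of a watcher automation is a poll-filter-summarize pass. A hedged Python sketch of the deploy-watcher variant -- `fetch`-side details, event fields, and the triage logic are all illustrative assumptions, not a real Codex API:

```python
# Hypothetical sketch of one run of a "deploy watcher" automation:
# take the latest events, drop the noise, surface one-line alerts.

def triage(events):
    """Keep only failed deploys and turn each into a one-line summary."""
    alerts = []
    for event in events:
        if event.get("status") != "failed":
            continue  # successful deploys are the noise we skip
        alerts.append(
            f"[{event['service']}] deploy {event['id']} failed: "
            f"{event.get('error', 'no error captured')}"
        )
    return alerts

# Sample run: two deploys, one failure worth surfacing.
events = [
    {"id": "d-101", "service": "api", "status": "ok"},
    {"id": "d-102", "service": "web", "status": "failed",
     "error": "missing env var STRIPE_KEY"},
]
print(triage(events))
```

In the real product this filtering happens inside the thread on the model's judgment rather than hard-coded rules; the sketch just pins down the shape of the loop.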
The contrast with Claude Code Routines is worth noting:
| | Codex thread automations | Claude Code Routines |
|---|---|---|
| Surface | A live Codex thread | A separate Routines dashboard |
| Iteration model | Iterate on the prompt in the thread | Update the saved Routine config |
| Best for | Personal long-running threads | Org-shared scheduled tasks |
| Context model | Thread carries context across runs | Each Routine run is stateless |
| Trigger types | Schedule + thread-internal logic | Schedule, HTTP API, GitHub events |
The decision pattern: personal continuous workflows → Codex thread automations. Team-shared scheduled jobs → Claude Code Routines. They are not redundant; they sit in different parts of the same problem space.
For the full five-platform comparison of scheduled and triggered AI agents, see our existing scheduled AI coding agents guide.
Memory: Codex Learns How You Work
The launch added persistent memory across Codex sessions. The model now carries preferences -- the language you prefer for new files, the testing framework your team uses, the conventions you have asked it to follow before -- across sessions, so you stop repeating yourself.
The practical impact is the same as a good CLAUDE.md or AGENTS.md file, but with a different shape: instead of you encoding the conventions explicitly, Codex learns them from how you use it. After a few sessions, it knows you prefer Vitest over Jest, that your team writes commit messages in conventional format, that your Python projects use uv instead of pip.
The trade-off is that you have less control over what gets remembered. If you want explicit, version-controlled conventions, an AGENTS.md is still the right answer. Memory is for the implicit things -- preferences and styles you would not bother to write down but that you want consistency on.
A reasonable workflow: use AGENTS.md for the explicit things ("never use jQuery, always use Tailwind, run lint before declaring a task done"), let memory handle the implicit things ("Zac prefers tabs over spaces, always asks for tests after a refactor"). Together they cover both axes.
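For illustration, the explicit half of that split might look like this in an AGENTS.md (contents are an example, not a recommended standard):

```markdown
# AGENTS.md

## Conventions
- Never use jQuery; all new UI code uses Tailwind.
- Run `npm run lint` before declaring a task done.
- Commit messages follow Conventional Commits (`feat:`, `fix:`, ...).

## Testing
- Use Vitest for unit tests; new features need at least one test.
```

Everything softer than this -- tone, formatting tastes, habits you never wrote down -- is what memory picks up on its own.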
Image Generation as a First-Class Output
Codex can now produce images as part of a thread reply. Not just describe images -- actually produce them, inline in the conversation. The use cases that matter for the developer audience:
- Architecture diagrams. "Draw a system diagram of the auth flow." Codex produces an actual diagram, not a Mermaid string for you to render later.
- UI mockups during a feature spec. "What would this dashboard look like with three tiles?" Codex generates a rendered mockup you can iterate on.
- Visual explainers in docs. Generated illustrations for technical posts, READMEs, or onboarding material.
The integration matters more than the model itself. Image generation has been available in ChatGPT for years, but pulling an image out of a coding thread used to require a context switch. Now the image is just another turn in the same thread.
A Five-Workflow Setup for Your First Week
Pick one or two of these as your first concrete experiments:
- The morning thread. Open a new Codex thread Monday morning, give it the "every hour, scan Slack/Gmail/PRs and surface what matters" prompt, and let it run all week. By Friday you will know whether the always-on thread model fits your day.
- The UI bug repro pattern. Next time you have a frontend bug, open the page in Codex's in-app browser, point at the broken element with comment mode, and see how few turns it takes to get a fix. If you came from cursor-paste-screenshot workflows, this should feel meaningfully better.
- The cross-app errand. Let Codex drive Mail or Calendar for a real task you would normally do by hand. The "send the draft I started yesterday" pattern is a good first test -- low stakes, easy to verify.
- The deploy watcher. Set up a thread automation that watches your deploy pipeline. Even if your team has Datadog or Sentry, a Codex thread that summarizes failed deploys in plain English is an underrated layer on top.
- The Linear/Notion sidebar pull. During your next coding session, when you need a number from a ticket, ask Codex to pull it instead of switching tabs. Notice how the workflow changes when "context from another app" stops costing you a tab switch.
Where Codex Now Sits in the Stack
After the April 16 update, Codex's positioning has clarified meaningfully:
- It is the strongest single-thread agent for work that mixes coding with cross-app context. Comment mode plus Mac app control is a unique combination.
- It is competitive for general operator work but not the best at it -- Perplexity Personal Computer is more polished as a Mac-native always-on agent for non-developer workflows.
- It is excellent for personal long-running threads but Claude Code Routines remains the cleaner choice for team-shared scheduled jobs.
- For pure coding work without computer-use, the existing Codex CLI vs Claude Code vs Cursor architecture comparison is the right starting point -- the choice between the three depends more on architecture than on this update.
The biggest practical recommendation for readers: if you already pay for ChatGPT Plus or Pro, you have all of this bundled in. The cheapest path to a real-world test of computer use, comment mode, and thread automations is to spend an hour Monday morning trying them on tasks you actually have. The learning curve is short and the workflows that stick will reveal themselves quickly.
The agent-as-coworker thesis -- that the next phase of AI tooling is agents that run alongside you all day, not chatbots you call when you need help -- got materially closer this week. Codex's April 16 update is the second major proof point of the month, after Perplexity's Personal Computer launch. The pattern is unmistakable: 2026 is the year the agent stops being a tool you reach for and becomes a coworker that is already there.
Frequently Asked Questions
What does 'Codex for almost everything' actually mean?
It means OpenAI repositioned Codex from a coding tool into a general-purpose work agent. The April 16, 2026 update adds Mac app control, an in-app browser with point-and-click context capture, image generation, persistent memory across sessions, integrations with more third-party tools, and continuously-running thread automations. Coding remains the strongest use case, but Codex is now competitive for general operator work too.
Is Codex computer use available outside macOS?
Not at launch. The April 16 release ships Mac app control as the headline computer-use surface; Linux and Windows support is on the roadmap. The in-app browser and thread automations are platform-agnostic and work everywhere Codex runs, including the web app and the CLI.
How is Codex thread automations different from Claude Code Routines?
Both let an agent run continuously and react to triggers. Claude Code Routines runs as a separate cloud product with its own dashboard; Codex thread automations runs inside an existing Codex thread and surfaces results into the same conversation. The Routines model is better for org-shared scheduled tasks; the thread automations model is better for personal long-running threads that you keep iterating on.
Do I need to upgrade my ChatGPT plan to use the new Codex features?
If you already have ChatGPT Plus, Pro, Business, or Enterprise, the new Codex features are bundled in. There is no separate Codex subscription. Rate limits and concurrency vary by tier -- Pro gets significantly more daily Codex runs than Plus, and Business/Enterprise add admin controls and audit logs.