Verifiable RL environments for computer-use agents.
Turning real desktop workflows into verifier-backed data and RL environments for computer-use agents.
Desktop is a Windows app for turning local workflows into verified training, evaluation, and RL data for computer-use agents.
For frontier labs that need private and operational desktop workflow data.
workflow evidence pipeline: real_workflow_to_verified_task_package
Real workflow (human run) -> Task package (trajectory + outcome) -> Verifier audit (pass/fail + reward) -> Eval report (pass@k + failures)
From local workflows to computer-use datasets.
Desktop is an end-to-end platform for turning real desktop workflows into verifiable RL environments and training datasets for computer-use agents.
Teams can capture workflows, normalize traces, label outcomes with verifiers, train models, evaluate performance, and run computer-use agents from one UI.
Computer-use agents need real workflow data.
Synthetic tasks do not teach agents how real work breaks.
The strongest learning signal for computer-use agents comes from real workflows, where people solve real problems across messy tools and changing screens, make mistakes, correct them, and reach a final outcome.
That is the data Desktop is built to produce.
synthetic tasks vs. verified workflow packages
- source: PDFs / Excel / portals
- traces: screens + actions
- failures: model weak points
- reward: verified outcomes
Verified data makes agents better.
We started with tasks the base model failed. After rollouts in our verifier-backed RL environments, the improved model solved new variants of the same workflows.
Base model to workflow-trained model (same workflow family)
Pricing.
Individual
- ✓ Windows local runtime
- ✓ Personal workflow capture
- ✓ Agent runs on local workflows
- ✓ Exportable workflow packages
Team
- ✓ Team workflow capture workspace
- ✓ Non-developer workflow collection
- ✓ Verifier-backed task packages
- ✓ Training and eval exports
Labs & Enterprise
- ✓ Verified workflow dataset supply
- ✓ RL task packages and verifiers
- ✓ Failure traces and reward signals
- ✓ Model comparison reports
Want to inspect the package shape first? Explore examples.
Frequently asked questions.
How do teams use Desktop?
A domain worker completes the task in the Windows app. Desktop captures the workflow, turns it into a verified package, and exports it for training, evaluation, or RL rollouts.
What does Desktop produce?
Desktop produces verified computer-use datasets and RL task packages with the trajectories, verifiers, evals, and reward signals needed for post-training.
Why not just use synthetic tasks?
Synthetic tasks are usually short, clean, and known-state. Real desktop work is long-horizon, messy, and spread across PDFs, Excel, portals, files, and legacy apps.
What makes a package verified?
Every package ties the task goal to a trajectory, final outcome, verifier, failure cases, and scoring signal so model attempts can be evaluated instead of eyeballed.
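As a rough illustration of that answer, a package could be modeled as a single record that carries its own verifier. This is only a sketch under stated assumptions: the class names, field names, and the invoice example are hypothetical and do not describe Desktop's actual export format.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Step:
    """One recorded action in a trajectory (hypothetical shape)."""
    action: str   # e.g. "open", "click", "type"
    target: str   # human-readable description of the UI element or file


@dataclass
class VerifiedPackage:
    """Hypothetical package tying a goal to everything needed to score attempts."""
    goal: str
    trajectory: list[Step]
    final_outcome: dict[str, str]
    verifier: Callable[[dict[str, str]], bool]
    failure_cases: list[str] = field(default_factory=list)

    def score(self, attempt_outcome: dict[str, str]) -> float:
        """Binary scoring signal: 1.0 if the verifier accepts the outcome, else 0.0."""
        return 1.0 if self.verifier(attempt_outcome) else 0.0


def invoice_total_matches(outcome: dict[str, str]) -> bool:
    """Example verifier (assumed): checks the value written to the ledger."""
    return outcome.get("invoice_total") == "1,240.00"


package = VerifiedPackage(
    goal="Copy the invoice total from the PDF into the Excel ledger",
    trajectory=[Step("open", "invoice.pdf"), Step("type", "ledger.xlsx, cell B7")],
    final_outcome={"invoice_total": "1,240.00"},
    verifier=invoice_total_matches,
    failure_cases=["wrong cell", "total copied without decimals"],
)
```

Because the verifier travels with the package, a model attempt can be scored with `package.score(attempt_outcome)` rather than reviewed by hand.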
How do you prove the data is good? +
We measure solvability, ambiguity, verifier false positives and negatives, pass@k across models, failure modes, and contamination risk.
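Two of these metrics can be sketched in a few lines. The page does not say how Desktop computes them, so the following assumes the commonly used unbiased pass@k estimator (from n sampled attempts with c passes) and assumes verifier error rates are measured against human-checked labels. Function names are illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the chance that at least one of k attempts passes,
    given c passing attempts out of n sampled attempts."""
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failures exist, so any k attempts include a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def verifier_error_rates(verdicts: list[bool], labels: list[bool]) -> tuple[float, float]:
    """Return (false positive rate, false negative rate) of verifier verdicts
    against human-checked labels (True means the attempt really succeeded)."""
    if len(verdicts) != len(labels):
        raise ValueError("verdicts and labels must have the same length")
    false_pos = sum(v and not l for v, l in zip(verdicts, labels))
    false_neg = sum(l and not v for v, l in zip(verdicts, labels))
    negatives = sum(not l for l in labels)
    positives = sum(labels)
    fpr = false_pos / negatives if negatives else 0.0
    fnr = false_neg / positives if positives else 0.0
    return fpr, fnr
```

For example, `pass_at_k(10, 3, 1)` reduces to the plain pass rate of 3 out of 10, while larger k values give credit for any success within k tries.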