Infrastructure for verified CUA datasets.
Turning real desktop workflows into verified CUA training data.
Desktop is a Windows app where signed-in users capture local workflows, run computer-use agents (CUAs), and manage workflow data for training, evaluation, and reinforcement learning (RL).
For frontier labs that need private and operational desktop workflow data.
Workflow evidence pipeline (real_workflow_to_verified_task_package):
- Real workflow: human run
- Task package: trajectory + outcome
- Verifier audit: pass/fail + reward
- Eval report: pass@k + failures
From local workflows to CUA-ready datasets.
Desktop is an end-to-end platform for turning desktop workflows into verified CUA training data.
Teams can capture workflows, normalize traces, label outcomes with verifiers, train models, evaluate performance, and run computer-use agents from one UI.
Computer-use agents need real workflow data.
Synthetic tasks do not teach agents how real work breaks.
The strongest signal does not come from generic synthetic data or benchmark-shaped tasks. It comes from real workflows: people solving real problems across messy tools, changing screens, mistakes, corrections, and final outcomes.
That is the data Desktop is built to produce.
Synthetic tasks vs. verified workflow packages:
- Source: PDFs / Excel / portals
- Traces: screens + actions
- Failures: model weak points
- Reward: verified outcomes
Better workflow data. Better CUA models.
Desktop turns operational workflows into verified training and evaluation data, so a base model can improve on the same kind of task.
The hard part is not recording workflows. It is verifying the unverifiable: proving what happened, whether it worked, and whether the model improved.
Comparison: base model to workflow-trained model, same workflow family.
Pricing.
Individual
- ✓ Windows local runtime
- ✓ Routine and skill capture
- ✓ Exportable workflow packages
Enterprise
- ✓ Private workflow data supply
- ✓ Custom verifier authoring
- ✓ Eval and RL-ready datasets
Want to inspect the package shape first? Explore examples.
Frequently asked questions.
Do I need Desktop?
Yes, if real workflow data matters. Individuals can train agents on personal routines or package valuable workflow data. Teams can let non-developers keep working while Desktop turns their work into clean CUA trajectories and verifier-ready task packages. Enterprises can use Desktop when they need recurring, verified workflow data for post-training, evals, and RL.
How do customers work with Desktop?
There are two paths. First, teams can use Desktop as end-to-end workflow data infrastructure: employees keep doing real work while Desktop captures, normalizes, labels, verifies, and packages it for CUA post-training. Second, UseDesktop can supply high-quality verified workflow datasets along with proof of quality: active testing, verifier audits, model comparisons, failure traces, and eval/RL signals.
What does Desktop produce?
Verified CUA workflow packages: task definitions, human trajectories, UI states, outcomes, verifiers, eval reports, failure traces, and reward signals.
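As a rough illustration of how those components could fit together, here is a minimal Python sketch of a package as a data structure. The class and field names (`Step`, `WorkflowPackage`, `verifier_name`, and so on) and the example task are hypothetical and are not Desktop's actual export schema; they only mirror the components listed above.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Step:
    """One recorded step: the UI state that was observed and the action taken."""
    ui_state: dict[str, Any]
    action: dict[str, Any]


@dataclass
class WorkflowPackage:
    """Hypothetical container mirroring the components listed above."""
    task_definition: str
    human_trajectory: list[Step]
    outcome: dict[str, Any]
    verifier_name: str
    eval_report: dict[str, float] = field(default_factory=dict)
    failure_traces: list[list[Step]] = field(default_factory=list)
    reward: Optional[float] = None


# Illustrative package for a made-up invoice-entry workflow.
package = WorkflowPackage(
    task_definition="Copy the invoice total from the PDF into the ledger spreadsheet.",
    human_trajectory=[
        Step({"window": "invoice.pdf"}, {"type": "copy", "target": "total"}),
        Step({"window": "ledger.xlsx"}, {"type": "paste", "target": "B2"}),
    ],
    outcome={"ledger.B2": "1250.00"},
    verifier_name="ledger_total_matches_invoice",
    reward=1.0,
)
print(len(package.human_trajectory), package.reward)
```

Keeping failure traces and the eval report on the same object as the human trajectory is one possible design choice; it keeps everything needed to score a new model attempt in one place.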
Why not just use synthetic tasks?
Synthetic tasks are usually short, clean, and start from a known state. Real desktop work is long-horizon, messy, and spread across PDFs, Excel, portals, files, and legacy apps.
What makes a package verified?
Every package ties the task goal to a trajectory, final outcome, verifier, failure cases, and scoring signal so model attempts can be evaluated instead of eyeballed.
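To make "evaluated instead of eyeballed" concrete, the sketch below shows one way a verifier could turn a final outcome into a pass/fail verdict plus a reward. It assumes outcomes can be compared field by field against the task goal; the function name, the `Verdict` type, the invoice fields, and the partial-credit reward are illustrative assumptions, not Desktop's verifier API.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    passed: bool
    reward: float
    mismatched: list[str]


def verify_outcome(final_state: dict, expected: dict) -> Verdict:
    """Compare the final outcome of an attempt with the fields the task goal requires."""
    if not expected:
        raise ValueError("a verifier needs at least one expected field")
    # A field counts as mismatched if it is missing or has a different value.
    mismatched = [key for key, value in expected.items()
                  if final_state.get(key) != value]
    # Reward is the fraction of expected fields that match (assumed scoring rule).
    reward = (len(expected) - len(mismatched)) / len(expected)
    return Verdict(passed=not mismatched, reward=reward, mismatched=mismatched)


expected = {"invoice_id": "INV-001", "total": "1250.00", "status": "submitted"}
attempt = {"invoice_id": "INV-001", "total": "1250.00", "status": "draft"}

verdict = verify_outcome(attempt, expected)
print(verdict.passed, verdict.mismatched)
```

In this example the attempt fails because the invoice was left in draft, and the mismatched field names double as a simple failure trace.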
How do you prove the data is good?
We measure solvability, ambiguity, verifier false positives and negatives, pass@k across models, failure modes, and contamination risk.
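Two of these measurements are easy to sketch in Python. The first function is the commonly used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), for n sampled attempts of which c pass; the source does not say which estimator Desktop uses, so treat this as an assumption. The second compares verifier verdicts against human-audited labels to get false positive and false negative rates; the function names and sample data are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate from n attempts, c of which passed."""
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failures exist, so any k attempts include a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def verifier_error_rates(verifier_passed: list[bool],
                         human_passed: list[bool]) -> tuple[float, float]:
    """Return (false positive rate, false negative rate) against human labels."""
    pairs = list(zip(verifier_passed, human_passed))
    false_pos = sum(1 for v, h in pairs if v and not h)
    false_neg = sum(1 for v, h in pairs if not v and h)
    negatives = sum(1 for _, h in pairs if not h)
    positives = sum(1 for _, h in pairs if h)
    fpr = false_pos / negatives if negatives else 0.0
    fnr = false_neg / positives if positives else 0.0
    return fpr, fnr


print(pass_at_k(n=10, c=3, k=1))
print(verifier_error_rates([True, True, False, False, True],
                           [True, False, False, True, True]))
```

A verifier with a high false positive rate inflates pass@k, so the two numbers are best read together.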
Is this a one-off dataset?
No. The goal is recurring workflow data capacity: as apps and workflows change, the task packages, verifiers, eval sets, and training signals keep improving.