Join the waitlist for Windows

Verifiable RL environments for computer-use agents.

Turning real desktop workflows into verifier-backed data and RL environments for computer-use agents.

Desktop is a Windows app for turning local workflows into verified training, evaluation, and RL data for computer-use agents.

For frontier labs that need private and operational desktop workflow data.

workflow evidence pipeline

real_workflow_to_verified_task_package

  • Real workflow: human run
  • Task package: trajectory + outcome
  • Verifier audit: pass/fail + reward
  • Eval report: pass@k + failures

what is desktop?

From local workflows to computer-use datasets.

Desktop is an end-to-end platform for turning real desktop workflows into verifiable RL environments and training datasets for computer-use agents.

Teams can capture workflows, normalize traces, label outcomes with verifiers, train models, evaluate performance, and run computer-use agents from one UI.

what makes data good?

Computer-use agents need real workflow data.

Synthetic tasks do not teach agents how real work breaks.

That is why the strongest learning signal for computer-use agents comes from real workflows, where people solve real problems across messy tools, changing screens, mistakes, corrections, and final outcomes.

That is the data Desktop is built to produce.

synthetic tasks
  • clean DOM
  • known state
  • toy workflow

vs

verified workflow packages
  • task goal
  • action trajectory
  • verifier
  • eval / RL signal

  • source: PDFs / Excel / portals
  • traces: screens + actions
  • failures: model weak points
  • reward: verified outcomes

Does the model improve?

Verified data makes agents better.

We started with tasks the base model failed. After rollouts in our verifier-backed RL environments, the improved model solved new variants of the same workflows.

same workflow family

Base model to workflow-trained: +34 pts

[Line chart: measured improvement from verified workflow data. x-axis: workflow data; y-axis: task success (0% to 100%). Base: 52%. Desktop data: 86%.]

Capture / Verify / Improve
engagement

Pricing.

Individual

20 USD per month
Get started
  • Windows local runtime
  • Personal workflow capture
  • Agent runs on local workflows
  • Exportable workflow packages

Team

299 USD per month
Contact us
  • Team workflow capture workspace
  • Non-developer workflow collection
  • Verifier-backed task packages
  • Training and eval exports

Labs & Enterprise

Custom contract
Contact us
  • Verified workflow dataset supply
  • RL task packages and verifiers
  • Failure traces and reward signals
  • Model comparison reports

Want to inspect the package shape first? Explore examples.

faq

Frequently asked questions.

How do teams use Desktop?

A domain worker completes the task in the Windows app. Desktop captures the workflow, turns it into a verified package, and exports it for training, evaluation, or RL rollouts.

What does Desktop produce?

Desktop produces verified computer-use datasets and RL task packages with the trajectories, verifiers, evals, and reward signals needed for post-training.

Why not just use synthetic tasks?

Synthetic tasks are usually short, clean, and known-state. Real desktop work is long-horizon, messy, and spread across PDFs, Excel, portals, files, and legacy apps.

What makes a package verified?

Every package ties the task goal to a trajectory, final outcome, verifier, failure cases, and scoring signal so model attempts can be evaluated instead of eyeballed.
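
To make the package shape concrete, here is a minimal Python sketch of how a goal, trajectory, final outcome, verifier, failure cases, and scoring signal might sit together. It is an illustration only: the class names, field names, and the invoice example are all hypothetical and are not Desktop's actual schema or export format.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical names throughout: this is not Desktop's real package schema.

@dataclass
class Step:
    action: str        # e.g. "open", "click", "type"
    target: str        # description of the UI element or file acted on
    value: str = ""    # text entered, if any


@dataclass
class TaskPackage:
    goal: str
    trajectory: list[Step]
    final_outcome: dict[str, str]
    verifier: Callable[[dict[str, str]], bool]
    failure_cases: list[dict[str, str]] = field(default_factory=list)

    def score(self, attempt_outcome: dict[str, str]) -> float:
        """Binary scoring signal: 1.0 if the verifier accepts the outcome, else 0.0."""
        return 1.0 if self.verifier(attempt_outcome) else 0.0


def invoice_total_matches(outcome: dict[str, str]) -> bool:
    # Toy verifier with made-up values: passes only when cell B7 holds the expected total.
    return outcome.get("B7") == "1240.00"


package = TaskPackage(
    goal="Copy the invoice total from the PDF into cell B7",
    trajectory=[
        Step("open", "invoice.pdf"),
        Step("type", "Excel cell B7", "1240.00"),
    ],
    final_outcome={"B7": "1240.00"},
    verifier=invoice_total_matches,
    failure_cases=[{"B7": "124.00"}],  # a recorded wrong answer the verifier must reject
)
```

In this sketch, `package.score(package.final_outcome)` returns the pass reward, while each recorded failure case scores zero, which is the property that lets model attempts be evaluated rather than eyeballed.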

How do you prove the data is good?

We measure solvability, ambiguity, verifier false positives and negatives, pass@k across models, failure modes, and contamination risk.
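
Two of these measurements are easy to sketch. The page does not say which estimator Desktop uses, so the code below assumes the widely used unbiased pass@k estimator (n sampled attempts, c of them passing) and assumes verifier error rates are computed against human-audited labels. The function names are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate from n sampled attempts, c of which passed."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) / comb(n, k) is the chance that a random size-k subset
    # of the n attempts contains no passing attempt.
    return 1.0 - comb(n - c, k) / comb(n, k)


def verifier_error_rates(verdicts: list[bool], labels: list[bool]) -> tuple[float, float]:
    """Return (false positive rate, false negative rate) of a verifier.

    verdicts: the verifier's pass/fail decisions.
    labels:   audited ground truth for the same attempts (assumed available).
    """
    pairs = list(zip(verdicts, labels))
    false_pos = sum(v and not t for v, t in pairs)
    false_neg = sum(t and not v for v, t in pairs)
    negatives = sum(not t for _, t in pairs)
    positives = sum(t for _, t in pairs)
    fpr = false_pos / negatives if negatives else 0.0
    fnr = false_neg / positives if positives else 0.0
    return fpr, fnr
```

For example, with 10 attempts of which 3 pass, `pass_at_k(10, 3, 1)` is approximately 0.3, the plain success rate, and the estimate rises as k grows.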