🔗 https://www.oneusefulthing.org/p/real-ai-agents-and-real-work
By Ethan Mollick
1. Introduction
Mollick opens by pointing out that AI has quietly crossed a threshold: it can now complete tasks with real economic value. He cites a recent OpenAI evaluation in which expert practitioners in fields such as law, finance, and retail created tasks that would typically take a human 4–7 hours to finish. In blind judging, AI came very close to matching expert work, falling short mainly on formatting, instruction following, and polish. The implication: AI is not far from being able to perform meaningful work, at least at the task level.
However, Mollick cautions that doing tasks is not the same as capturing a job. Jobs are bundles of tasks, some of them deeply social, contextual, or requiring judgment over long time spans. Even if AI takes over many task types, human roles may shift rather than disappear. The art is in identifying which segments of work are automatable and which remain inherently human.
2. A Very Valuable Task
To illustrate, Mollick describes an experiment he ran: giving Claude (a recent frontier model from Anthropic) a complex economics paper and its dataset, and asking it to replicate the paper’s findings. Without step-by-step guidance, Claude translated the replication code, worked through the statistics, and generated results that passed human spot checks. What once demanded hours of domain expertise can now be delegated to AI, opening a path to automating labor-intensive academic tasks like replication and verification.
He notes that replication work—often under-resourced and tedious—has long been a weak point in many scientific disciplines. If AI can take it on reliably, it could transform how research is validated, audited, and even extended. While the results are not perfect, they signal that domains we thought were too niche or complex for AI may now be within reach.
3. Agents at the Heart of It All
Mollick dives into what enables this step change: AI agents. Unlike simple prompt-based systems that require constant human steering, agents can plan, chain steps, and use tools (search, code, external APIs). Because an agent must succeed at every step of a chain for the whole task to succeed, reliability compounds: small improvements in per-step error rates yield exponential gains in what agents can reliably do.
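To make the compounding concrete, here is a minimal sketch in plain Python (the step counts and per-step reliabilities are illustrative numbers, not figures from the post):

```python
# Probability that an n-step agent task succeeds end to end,
# assuming each step succeeds independently with probability p.
def chain_success(p: float, n_steps: int) -> float:
    return p ** n_steps

# A small bump in per-step reliability produces a large jump
# in whole-task reliability as the chain gets longer.
for p in (0.90, 0.95, 0.99):
    for n in (5, 10, 20):
        print(f"per-step {p:.2f}, {n:2d} steps -> {chain_success(p, n):6.1%}")
```

On these toy numbers, a 20-step task goes from roughly 12% end-to-end success at 90% per-step reliability to about 82% at 99%: that is the exponential payoff Mollick is pointing at.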
He also points to a benchmark that tracks the length of task an AI can complete autonomously with at least a 50% success rate; measured from GPT-3 onward, that horizon has kept growing. Agents are no longer fringe. They can tackle multi-step pipelines with minimal oversight, serving as the backbone of real, productive AI systems.
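To ground what "plan, chain steps, and use tools" looks like mechanically, here is a minimal plan-act-observe loop. This is a sketch under stated assumptions, not any specific framework's API: fake_model is a scripted stand-in for a real LLM call, and the tools are stubs.

```python
# Minimal plan-act-observe agent loop; a sketch, not a specific framework.
TOOLS = {
    "search": lambda query: f"(stub) top results for {query!r}",
    "calculate": lambda expr: str(eval(expr, {"__builtins__": {}})),  # toy only
}

def fake_model(history):
    """Stand-in for an LLM call. A real agent would send `history`
    to a model API and parse out a tool request or a final answer."""
    if not any("calculate" in h for h in history):
        return ("calculate", "40 * 1.07")   # the "model" decides to use a tool
    return ("final", "Projected revenue: " + history[-1].split("-> ")[-1])

def run_agent(task, max_steps=10):
    history = [f"TASK: {task}"]
    for _ in range(max_steps):              # step budget so a confused agent halts
        action, arg = fake_model(history)
        if action == "final":               # model says it is done
            return arg
        observation = TOOLS[action](arg)    # act, then observe
        history.append(f"{action}({arg!r}) -> {observation}")
    return "step budget exhausted"

print(run_agent("Project revenue growth at 7%"))
```

The stubs aside, the design point survives: the model chooses actions, the loop executes them and feeds observations back, and the step cap keeps the agent from running forever.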
4. How to Use AI to Do Economically Valuable Things
Here Mollick warns against naive automation: letting agents churn out work with no regard for purpose risks flooding workplaces with useless or redundant output (e.g., dozens of variant PowerPoints). Instead, he proposes a hybrid workflow (sketched below): humans delegate tasks with clear instructions, review the outputs, and correct or re-prompt as needed; if the AI fails, they revert to doing the work manually. This mix can boost speed (estimates: 40% faster, 60% cheaper) while maintaining human oversight.
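A minimal sketch of that delegate-review-fallback loop, assuming hypothetical generate_draft and review helpers (neither comes from the post):

```python
# Hybrid workflow: delegate to the agent, review, re-prompt on failure,
# and fall back to manual work after too many rejected attempts.
def generate_draft(task: str, feedback: str | None = None) -> str:
    """Hypothetical agent call; returns a draft for the task."""
    suffix = f" (revised per: {feedback})" if feedback else ""
    return f"draft for {task!r}{suffix}"

def review(draft: str) -> str | None:
    """Human review stand-in: return None to accept, or feedback to retry."""
    return None  # accept everything in this toy example

def do_manually(task: str) -> str:
    return f"human-produced result for {task!r}"

def hybrid(task: str, max_attempts: int = 3) -> str:
    feedback = None
    for _ in range(max_attempts):
        draft = generate_draft(task, feedback)   # delegate with instructions
        feedback = review(draft)                 # human stays in the loop
        if feedback is None:
            return draft                         # accepted: ship it
    return do_manually(task)                     # AI kept failing: human takes over

print(hybrid("Q3 board deck outline"))
```

The retry cap encodes Mollick's fallback rule: the human corrects or re-prompts a bounded number of times, then takes the task back rather than letting the agent flood the workflow with bad drafts.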
He stresses that the future of work with AI depends not just on capability but on judgment: deciding which tasks to automate, when to supervise, and how to integrate agents in a way that amplifies human purpose rather than drowning it in low-value output.