Let's say you're using GPT-4o-mini for agentic work.
Your agent breaks the task of extracting financial information for a company into smaller subtasks. Let's say the probability it does each subtask correct is 90%.
With this, the errors compound. If a task is even moderately difficult with four subtasks, the probability of the final output being good is extremely low.