Last Week in AI: Anthropic, more evals, and usage revenue
Hey there, I started doing a weekly review of my Reflect notes and thought I'd publish a lookback at what actually mattered. Here's what made the cut:
Two updates from me:

1. I published "Which models know sales?" this week. I built salesevals.com, a benchmark where models read synthetic B2B sales calls and write coaching notes. It's a needle-in-a-haystack approach: we generate call transcripts with known strengths and weaknesses to see if the coach catches them. The live leaderboard now has 25 call transcripts and 575 judged outputs. GPT-5.4 high is still leading. Surprisingly, the new Opus 4.8 runs landed in the middle: high scored 89.4. My read: labs (especially Anthropic) are focused on coding, and sales is still its own game. A model can sound like a sales manager pretty easily. Telling the difference between "good sales" and "bad sales" is still hard.

2. I also published "Why the SaaSpocalypse is fake news." tl;dr: agents create a ton of usage, which increases revenue. A human asks one question. The agent pulls from Snowflake, Salesforce, Slack, Notion, Gong, docs, email, and whatever else has the context. It retries. It asks follow-ups. It might even run on a recurring schedule. One user action becomes many billable activities. The SaaS doomer story starts with seat compression. Usage-based pricing can flip that math: 80% of the seats using 10x more will mean MORE revenue, not less.

Dan Shipper has an interesting corollary to this on Lenny’s podcast: if the user brings the tokens, the app isn’t paying for all the intelligence it routes. OpenAI is trying to go this route, as they're becoming more open about where you can spend your subscription usage.

As for next week, rumor has it we might be getting GPT-5.6, and Apple is hosting WWDC, although I'm not entirely sure Apple belongs in an update about AI?

LFG,
