How I Learn from Top Performers Before Building AI Agents
Ask an SDR what data they use to qualify inbound leads, and they'll give you a thorough answer: firmographic data, intent signals, technographic info, call recordings, support tickets. These are genuinely useful inputs, and good SDRs know it. But what people say they use and what actually drives their best decisions aren't always the same thing. That gap is where the real insights hide.
Shadowing is the key to finding revealed preference. Stated preference gives you the dream. Revealed preference shows you what on that wish list is critical to ship. (I wrote more about this mental model in Stated vs Revealed Preference.)
| | Stated Preference | Revealed Preference |
|---|---|---|
| Source | Interviews, surveys, wishlists | Observation, behavior data, shadowing |
| Strength | Easy to collect, surfaces aspirations | Shows actual priorities |
| Weakness | May reflect what they think matters | Requires access and time |
| Best for | Generating hypotheses | Validating what to build |
I learned this building Lead Agent at Vercel. We started by shadowing the data: pulling historic contact sales submissions from Snowflake and checking how closely AI output matched what humans had actually decided. Then we made it easy for SDRs to flag negative feedback. Anyone who flagged something got shadowed. We sat down with them, watched them work, and discovered friction points we never would have found by asking.
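For concreteness, here's a minimal TypeScript sketch of that kind of comparison. Everything in it is a hypothetical stand-in (the `Lead` shape, the two-way decision, the `qualifyWithAgent` stub); the point is just scoring how often the agent agrees with humans on historical decisions.

```ts
// Hypothetical sketch: score agent output against historical human decisions.
type Decision = "qualified" | "rejected";

interface Lead {
  company: string;
  formSubmission: string;
  humanDecision: Decision; // what the SDR actually decided, pulled from Snowflake
}

// Stand-in for the real agent call; swap in your own.
declare function qualifyWithAgent(lead: Lead): Promise<Decision>;

async function agreementRate(leads: Lead[]): Promise<number> {
  let matches = 0;
  for (const lead of leads) {
    const agentDecision = await qualifyWithAgent(lead);
    if (agentDecision === lead.humanDecision) matches++;
  }
  return matches / leads.length;
}
```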
Why This Matters for Agents
Here's the thing about agent context: not every piece of information helps equally. More context doesn't translate linearly into better performance. A small slice of it does 95% of the heavy lifting. It's a power law distribution. (Think 80/20, but often more extreme. A handful of signals carry most of the predictive weight.)
If you equal-weight everything on someone's wish list, you aren't respecting that power law. Worse, you muddy up the context window with noise that actively hurts performance. Revealed preference gets you closer to what actually matters. You want to find the handful of signals that drive real decisions, not the long tail of "nice to haves."
Finding Who to Shadow
Sales is easy because you know the top performers. Revenue attainment, meetings booked, highest response rate on cold outbound — follow the data.
Other functions are harder. You do have to rely on taste. Ask around. Look for people who consistently ship, who others ask for help, who seem to have figured something out. Then watch them work.
The Shadowing Method
When you sit down with someone (in person or over Zoom), ask if you can record. Tell them to assume you can't see their screen and to talk through everything out loud, like they're dictating. This sounds awkward but produces gold.
First pass: stay silent. Let them run through the full task without interruption. You're building a baseline understanding of the workflow end to end.
Second pass: ask why at each step. Now you probe. Why did you click there? Why did you skip that field? What made you decide this was worth pursuing? You're trying to separate what's critical from what's habit. You're also looking for friction they overcome anyway — that's the most important signal. If there's friction to doing something but they do it anyway, that's revealed preference in action. That thing matters.
Reverse shadow: ask to drive. This is the part most people skip. Try doing their job while they watch. You want an exact mental model that matches theirs. If they're a top performer, they've used their brains to arrive at a great solution. Make sure you actually understand it before you try to apply AI to it.
I call the gate here the State It Back test: you don't understand someone's position well enough until you can state it back to them and they say yes. If you can't pass this test, you're not ready to build.
(UX researchers call this general approach "contextual inquiry": observing users in their natural environment while they work, then asking clarifying questions in context. The key insight is the same: people can explain what they're doing while they do it, but struggle to accurately recall it afterward. Nielsen Norman Group has a good primer if you want the formal framework.)
A Real Example
Nicole, our main SDR partner on Lead Agent, had a friction point we never anticipated. There was a 10 to 15 minute delay because of how lead routing worked with LeanData. She literally sat around waiting for the agent to work. In our metrics, this looked like we weren't saving as much time as we should have been.
But here's the thing: all AMER leads go to her anyway. We could skip the routing step entirely. We never would have found this by asking what she wanted. We found it by watching what actually happened.
After Shadowing: The Prototyping Loop
You'll leave a shadowing session with intuitions about where you can help. Don't build a whole system. Prototype immediately.
The loop looks like this:
Shadow data first → Shadow power users → Prototype → Ship → Repeat
I create a quick agent using the AI SDK, write a basic prompt, and hardcode the context I think matters. Then I run it a few times just to see outputs. You don't even need code for this. Open Claude, write the task, manually paste in context, see if you're directionally correct.
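Here's a minimal sketch of what that throwaway prototype can look like with the AI SDK's `generateText`. The model ID, the system prompt, and the hardcoded lead are all placeholders; the point is just to get real outputs in front of your eyes quickly.

```ts
import { generateText } from "ai";
import { anthropic } from "@ai-sdk/anthropic";

// Hardcode the context you think matters and eyeball the output a few times.
const lead = `
Company: Acme Corp (placeholder)
Form submission: "Looking to migrate our docs site, ~2M monthly visits."
Firmographics: 500 employees, Series C.
`;

const { text } = await generateText({
  model: anthropic("claude-sonnet-4-5"), // placeholder; any capable model works for a first pass
  system:
    "You are an SDR qualifying inbound leads. Decide qualified or not, and explain why in two sentences.",
  prompt: lead,
});

console.log(text);
```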
The PRD trick: After a discovery session, I do a big voice note rambling through my thoughts. I transcribe it, dump it into a Claude project along with the recording and my notes, and ask it to draft a product requirements document. Then I read the PRD and find the gaps. Because I have all the context fresh, the output is dramatically better than starting from scratch. The PRD becomes a forcing function for clarity.
Then let it marinate. This is the part that feels unproductive but isn't. Let these insights sit in your brain. Maybe shadow again after you write the PRD. You're trying to ensure you have a great mental model of the problem before you commit to a solution.
Once you've got that mental model — once you can pass the State It Back test — don't waste time. Get it into code so you can start seeing real outputs.
Three Traps to Avoid
Don't just ask what they want. The Henry Ford quote about faster horses is probably apocryphal (the earliest known attribution is from 1999, and Ford died in 1947), but the underlying point holds in the age of AI. Most people are still early in understanding AI capabilities. Even people who are great with ChatGPT think in chat sessions, not workflows. They don't have muscle memory for applying AI to their work.
Don't rush it. Take time to sit with the problem. Shadow a couple times. Read the PRD. Let your brain process before you commit to an approach.
Don't automate a broken process. This is the biggest trap. Shadowing captures the process as it exists today, not as it should be. Be cautious about building software around things you already know aren't ideal.
We had this problem with Lead Agent. The team structure wasn't optimal, and we didn't want to bake that structure into the automation. So we consolidated the work onto one person instead of spreading it across several. It's like migrating a WordPress site to Next.js: copy everything pixel for pixel and you also carry over the dated design you should have updated while you were in there.
When to Keep Humans in the Loop
While you're doing discovery, you can usually spot intuitively where human judgment is necessary. What you're looking for are one-way doors. In Jeff Bezos's framing from his 2015 Amazon shareholder letter:
"Type 2 decisions can and should be made quickly by high judgment individuals or small groups... As organizations get larger, there seems to be a tendency to use the heavyweight Type 1 decision-making process on most decisions, including many Type 2 decisions. The end result of this is slowness, unthoughtful risk aversion, failure to experiment sufficiently, and consequently diminished invention."
If something is a one-way door and you have a human currently doing that task, you should probably keep a human in the loop. Your business has already shown its preference by employing someone to make that call. In tech, they're paying at least $50k a year for that judgment. That heuristic is a bit reductive, but it's a useful starting point.
The example I always use is email. There are only 500 companies in the Fortune 500. If you're sending automated outreach at some crazy rate, you will eventually exhaust all of them. People remember, and spam filters do too. The best approach: ping humans in Slack for go/no-go/edit decisions on anything that matters.
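As a sketch, that go/no-go ping can be as simple as a Slack incoming webhook. The function name and message format below are made up, and a real version would wait on the human's reply (via an interactive message or an approvals table) before sending anything.

```ts
// Sketch: ask a human in Slack before the agent emails anyone.
// SLACK_WEBHOOK_URL is a standard Slack incoming webhook; the rest is hypothetical.
async function requestApproval(draftEmail: string, recipient: string): Promise<void> {
  await fetch(process.env.SLACK_WEBHOOK_URL!, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      text: `Agent wants to email ${recipient}. Reply go / no-go / edit.\n\n${draftEmail}`,
    }),
  });
}
```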
The Books I Stole From
This method borrows heavily from two sources:
The Mom Test by Rob Fitzpatrick covers discovery principles for customer conversations. The core idea: stop asking if your idea is good. Ask about their life, their problems, and how they solve them today.
Continuous Discovery Habits by Teresa Torres covers user research questions for product work. The core idea: weekly customer touchpoints, not periodic research projects.
The quickest way to learn is to do. Shadow someone this week.