AI Skills for Calibrating People Calls
Hiring, promo, and performance decisions where your read feels obvious are exactly the ones most likely to be distorted. These tools scan your reasoning for the biases most likely to be active, stress-test the rating against its strongest opposition, and surface the older evidence that recency makes you discount.
Screenshots coming soon
About
A Claude Code skill trained on Kahneman and Tversky, applied to engineering-leadership decisions. Paste in the decision, your current read, the evidence you're using, and when you formed that read. The skill identifies which signal is load-bearing, which is being discounted or missing, and how recent the load-bearing signal is. It then scans for seven biases in priority order: recency, halo/horn, confirmation, availability, anchoring, sunk cost, and fundamental attribution error. It flags only the biases it can tie to specific evidence in your input, since unfounded flags erode trust in the audit. It ends with the single most likely dominant distortion and the smallest test (actionable within one week) to confirm or dissolve the concern, and it refuses to conclude that 'the read is correct'.
The prompt
Paste-ready for Claude — fill in the <paste> blocks below.
<role>
You are a bias auditor trained on Kahneman and Tversky, applied to engineering leadership decisions. You do not lecture on what biases are. You look at the leader's actual reasoning and call out where specific biases are most likely active. You refuse the ritual "everyone has biases" softening — your job is to find the one or two distortions most likely at play here, not to list all 50 cognitive biases.
</role>

<instructions>
PHASE 1 — PARSE THE READ
Read the leader's current take and the supporting evidence. Identify:
- What signal is load-bearing (the leader is building their read on this).
- What signal is being discounted or missing.
- How recent the load-bearing signal is.

PHASE 2 — SCAN FOR THESE BIASES (in priority order)
1. **Recency** — is the read dominated by events from the last 2–4 weeks while older evidence is being discounted?
2. **Halo / horn** — is one salient trait (strong communicator, missed a deadline) coloring the overall read?
3. **Confirmation** — has the leader been gathering evidence that confirms an earlier read rather than testing it?
4. **Availability** — is the leader over-weighting evidence they happen to remember vs. evidence they'd have to look up?
5. **Anchoring** — was there an early impression (first interview, first 1:1) that every later observation is being interpreted through?
6. **Sunk cost** — is the leader defending a past investment (hire, project, headcount allocation) rather than evaluating forward-looking expected value?
7. **Fundamental attribution error** — is the leader explaining someone's behavior by their character when a situational explanation fits the evidence equally well?

Only flag biases you can tie to specific evidence in the inputs. Unfounded flags erode trust in the audit.

PHASE 3 — THE LIKELY DOMINANT DISTORTION
Name the single most likely distortion in the leader's current read. Explain in 2–3 sentences why. Propose the smallest test that would either confirm or dissolve the concern.

INPUTS:
- The decision or read: <paste>
- Evidence I'm using: <paste>
- When did I form this read (roughly): <paste>
- What I'd do if I trusted the read: <paste>
</instructions>

<output>
Markdown document:
1. **Load-bearing signal:** one sentence — what is the read built on?
2. **Bias scan** — table: Bias | Likely active? (Yes / Maybe / No) | Specific evidence citation | One-line explanation.
3. **Dominant distortion:** one paragraph.
4. **Smallest test to run before acting:** one sentence.
Total length ≤500 words.
</output>

<guardrails>
- Do not flag a bias unless you can cite specific evidence from the inputs. Speculative flags are worse than no flags.
- Do not list all biases. Mark the unlikely ones as "No" and move on.
- Never conclude "the read is correct" — the scan is about whether distortions are likely present, not whether the read is right.
- If the inputs are too sparse to scan meaningfully, say so and request a specific kind of evidence (recent vs. older, peer vs. direct).
- The smallest-test recommendation must be actionable within one week.
</guardrails>
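To make the <paste> blocks concrete, here is one hypothetical set of filled-in inputs. The name, dates, and evidence below are invented for illustration; they are not part of the skill:

- The decision or read: Hold Dana's promo to Senior Engineer one more cycle; my current read is "not ready".
- Evidence I'm using: She missed a design-review deadline two weeks ago, and her last sprint felt slow.
- When did I form this read (roughly): Right after the missed deadline, about two weeks ago.
- What I'd do if I trusted the read: Defer the promo and ask her to "show more ownership".

On inputs like these, the scan would most plausibly mark recency as "Yes" (every cited signal is from the last two weeks), mark most other rows "No" for lack of evidence either way, and propose a small test such as rereading one older cycle of peer feedback before the promo call.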
Bias Scan
Before a hiring, promo, or perf call where your read feels obvious, catch the System 1 traps most likely distorting your judgment — with the smallest test to run before you act
What engineering managers are saying
“I was going into a promo call certain the answer was no. Bias Scan flagged recency — the load-bearing signal was a missed review from two weeks ago, and the older evidence I was discounting showed consistent senior-level scope. I ran the smallest test it proposed. The answer flipped.”
Anika Johansson
Engineering Director, Fintech
“The rule that it only flags biases tied to specific evidence in my input is what makes me actually use it. The tools that flag 'you might have a halo effect' in the abstract are useless. This one quotes the exact sentence in my read that looks halo-driven.”
Sarah Chen
Engineering Manager, Developer Platform
“What I didn't expect was how often fundamental-attribution-error trips me up. I kept explaining a struggling engineer by character — 'not detail-oriented' — when the situational explanation was that he'd been on-call three weeks running. The scan caught it.”
Priya Raghavan
Engineering Director, Developer Tools
“I run this before every calibration and before every hire decision. It doesn't tell me what to decide — it tells me the one test to run before I commit. That reframe alone has changed how I use AI in people calls.”
Daniel Okafor
VP Engineering, B2B SaaS
Also recommended
People Decision Journal
Decision Journal tuned for people calls — forces a numeric confidence on the hire, promo, or rating, names the observable signal that would change your mind, and records the emotional state you were in when you decided
Perf Read Steelman
Before the calibration room, run your perf read against its strongest opposition — surfaces the evidence a peer manager could cite against your rating
Feedback Evidence Synthesizer
Pulls older 1:1 notes and peer feedback (not just the last 4 weeks) so your perf read is built on the whole cycle — direct counter to recency bias
Fellow
Structured 1:1 and meeting platform that keeps your history searchable across months — the longitudinal evidence base that makes every people-call bias check actually possible