METHODOLOGY
WinsAbove Methodology
Specification for the Alpha Score. Where the data comes from, how the number is computed, what gets penalized, and what we have not yet built.
Last updated Algorithm v2
1. What the Alpha Score measures
The Alpha Score is a normalized, market-relative measure of sales rep performance computed from CRM data and grouped by role, segment, and tenure cohort. It is a composite of six metrics, each scored against an empirical percentile distribution and converted to a normal-z equivalent, then weighted and transformed onto a 0–5 display scale where 2.5 represents the cohort median. A score of 4.0 means roughly one standard-deviation equivalent above peers in the same segment. A score of 1.0 means roughly one standard-deviation equivalent below.
2. Data sources
Every connection is OAuth, read-only. We never write to the rep's CRM and we never store credentials with write scopes. The 4 surfaces we read:
- Salesforce via the Merge.dev unified CRM integration. We read opportunities, accounts, deal stages, and owner attribution. No contacts, no notes, no pipeline forecasts.
- HubSpot via the same Merge.dev shape. Same surface, same scopes.
- User-uploaded artifacts (W-2s, signed comp plans, commission statements) processed by the workers/scorer/ service using Workers AI vision OCR. These artifacts are used for verification badges only. They do not currently feed back into the Alpha Score itself.
- User profile for role (AE / SDR / AM), segment (SMB / Mid-Market / Enterprise), and quota, self-reported on onboarding. We have a planned migration to derive role and segment from LinkedIn instead of self-report.
3. Eligibility and filters
Only deals that pass these gates contribute to the score:
- Closed-won only. No committed, no best-case, no forecast. Pipeline does not count toward Alpha.
- Trailing 12 months from the sync timestamp. Older deals roll off the window.
- Parseable close dates. Records with malformed or missing close-date fields are dropped, not imputed.
- Tenure-aware. Reps with under 90 days of CRM history, or with an unknown firstDealDate, are flagged provisional and excluded from public leaderboards. Their score still computes. It just ships with a label.
- Role and segment classification comes from the user profile today. Mis-classified profiles produce mis-cohorted scores (see Section 9 on disputes).
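The gates above can be sketched as a filter plus a provisional flag. This is a minimal illustration, not the code in lib/scoring/alphaV2.ts: the RawDeal shape, the "closed_won" stage value, and the function names are all assumptions made for the example.

```typescript
// Hypothetical deal shape; the field names and the "closed_won" value are assumptions.
interface RawDeal {
  stage: string;
  closeDate: string | null;
}

const MS_PER_DAY = 86400000;

// Keep only closed-won deals with a parseable close date inside the
// trailing 12 months, measured back from the sync timestamp.
function eligibleDeals(deals: RawDeal[], syncedAt: Date): RawDeal[] {
  const windowStart = new Date(syncedAt.getTime());
  windowStart.setUTCFullYear(windowStart.getUTCFullYear() - 1);
  return deals.filter((deal) => {
    if (deal.stage !== "closed_won") return false; // no pipeline, no forecast
    if (!deal.closeDate) return false; // missing date: dropped, not imputed
    const closed = new Date(deal.closeDate).getTime();
    if (Number.isNaN(closed)) return false; // malformed date: dropped
    return closed >= windowStart.getTime() && closed <= syncedAt.getTime();
  });
}

// Provisional when the first deal date is unknown or under 90 days old.
function isProvisional(firstDealDate: Date | null, syncedAt: Date): boolean {
  if (firstDealDate === null) return true;
  const days = (syncedAt.getTime() - firstDealDate.getTime()) / MS_PER_DAY;
  return days < 90;
}
```

A provisional rep still goes through eligibleDeals and gets a score; the flag only controls the label and leaderboard visibility.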
4. The math (Alpha v2)
Plain English: for each of six metrics we look up where the rep falls in the WinsAbove benchmark distribution (an empirical percentile), convert that percentile to a normal-z equivalent, weight the six z-values, sum them, scale onto a 0–5 line, and subtract any sandbagging penalty. The exact mechanism, pulled from lib/scoring/alphaV2.ts:
Per-metric percentile rank
p = empirical_percentile(value, BENCHMARK_DISTRIBUTIONS[metric])
z = inverse_normal_CDF(p) // probit transform
The benchmark distribution is a set of percentile breakpoints (p10, p25, p50, p75, p90, p95, p99) sourced from public sales-industry data and reconciled against WinsAbove's reference cohort. Linear interpolation between breakpoints gives the rep's performance percentile; the probit transform converts that percentile to a normal-z equivalent so the weighted-sum machinery downstream stays interpretable in standard-deviation terms.
Velocity (average deal cycle in days) is inverted before lookup because a shorter cycle is better: a 21-day cycle sits at raw p10 of the cycle-time distribution, but represents top-10% performance, so the function returns performance percentile 0.90.
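A minimal sketch of the lookup described above, under stated assumptions. The cycle-time table is a placeholder (only p10 = 21 days comes from the text; every other breakpoint is invented), values outside the table are clamped to the nearest breakpoint, and the probit is computed by bisection over a standard erf approximation rather than whatever the production engine uses. Function names are illustrative.

```typescript
// Breakpoints as [percentile, value] pairs, ascending in both columns.
type Breakpoints = ReadonlyArray<readonly [number, number]>;

// Placeholder cycle-time table in days. Only p10 = 21 is from the text;
// the remaining values are invented for illustration.
const CYCLE_DAYS: Breakpoints = [
  [0.10, 21], [0.25, 35], [0.50, 60], [0.75, 95],
  [0.90, 140], [0.95, 180], [0.99, 270],
];

// Linear interpolation between breakpoints. Out-of-table values are clamped
// to the first or last breakpoint percentile (an assumption).
function empiricalPercentile(value: number, table: Breakpoints, lowerIsBetter = false): number {
  const first = table[0];
  const last = table[table.length - 1];
  let p = last[0];
  if (value <= first[1]) {
    p = first[0];
  } else if (value < last[1]) {
    for (let i = 1; i < table.length; i++) {
      const [p0, v0] = table[i - 1];
      const [p1, v1] = table[i];
      if (value <= v1) {
        p = p0 + ((value - v0) / (v1 - v0)) * (p1 - p0);
        break;
      }
    }
  }
  // Inverted metrics (shorter cycle is better) flip the raw percentile.
  return lowerIsBetter ? 1 - p : p;
}

// Abramowitz and Stegun 7.1.26 approximation of erf (max error about 1.5e-7).
function erf(x: number): number {
  const sign = x < 0 ? -1 : 1;
  const ax = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * ax);
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t - 0.284496736) * t + 0.254829592) * t;
  return sign * (1 - poly * Math.exp(-ax * ax));
}

function normalCdf(z: number): number {
  return 0.5 * (1 + erf(z / Math.SQRT2));
}

// Probit (inverse normal CDF) by bisection on [-8, 8].
function probit(p: number): number {
  let lo = -8;
  let hi = 8;
  for (let i = 0; i < 60; i++) {
    const mid = (lo + hi) / 2;
    if (normalCdf(mid) < p) lo = mid;
    else hi = mid;
  }
  return (lo + hi) / 2;
}
```

With this table, a 21-day cycle yields a performance percentile of 0.90 and a z of about 1.28; a percentile of 0.99 maps to about 2.33.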
Composite weights
The 6 components and their current weights in v2:
- Revenue (trailing 12 months): 35%
- Win rate: 20% (gated by volume multiplier)
- Velocity (avg deal cycle, inverted): 15%
- Average deal size: 15%
- ICP alignment: 10%
- Hunting ratio (new-business revenue share): 5%
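As a rough sketch, the weighted sum over these six components might look like the following. The metric keys are invented names, and treating the win-rate gate as a plain multiplication of that one term by the volume multiplier is an assumption; the text only says win rate is "gated".

```typescript
// Hypothetical metric keys; the weights are the v2 values listed above.
type Metric = "revenue" | "winRate" | "velocity" | "dealSize" | "icp" | "hunting";

const WEIGHTS: Record<Metric, number> = {
  revenue: 0.35,
  winRate: 0.20,
  velocity: 0.15,
  dealSize: 0.15,
  icp: 0.10,
  hunting: 0.05,
};

// Weighted sum of per-metric z-values. Assumption: the volume multiplier
// scales only the win-rate term.
function weightedZ(z: Record<Metric, number>, volumeMultiplier = 1): number {
  let total = 0;
  for (const metric of Object.keys(WEIGHTS) as Metric[]) {
    const gate = metric === "winRate" ? volumeMultiplier : 1;
    total += WEIGHTS[metric] * z[metric] * gate;
  }
  return total;
}
```

Because the weights sum to 1.0, a rep sitting at z = 1 on every metric with full volume credit gets a weighted z of 1.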
Volume multiplier
Thin sample sizes get penalized so a rep with 2 lucky deals does not outscore a rep with 40 grinding ones. The threshold for full credit scales with tenure: 5 deals if under 6 months, 8 deals from 6–12 months, 10 deals after 12. The multiplier is clamped at a 0.3 floor so a ramping rep is not zeroed out entirely.
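One way to read the rule above, as a sketch: the text gives the thresholds and the 0.3 floor but not the shape of the ramp, so the linear dealCount / threshold ratio here is an assumption, as are the function names and the choice to treat exactly 12 months as part of the 6–12 band.

```typescript
// Deals needed for full credit, by tenure in months (thresholds from the text).
function fullCreditThreshold(tenureMonths: number): number {
  if (tenureMonths < 6) return 5;
  if (tenureMonths <= 12) return 8;
  return 10;
}

// Assumption: credit ramps linearly with deal count up to the threshold,
// then is clamped to the [0.3, 1] range.
function volumeMultiplier(dealCount: number, tenureMonths: number): number {
  const ratio = dealCount / fullCreditThreshold(tenureMonths);
  return Math.min(1, Math.max(0.3, ratio));
}
```

Under this reading, a 3-month rep with 2 deals gets 0.4, a rep with no deals bottoms out at 0.3, and a 2-year rep with 40 deals gets full credit.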
Sandbagging penalty
If the user-entered close date sits in an earlier quarter than the system-stamped close date (for example, a deal that actually closed in Q1 but was backdated to Q4 of the prior year to count toward quota), the deal is flagged. Each flagged deal subtracts 0.1 from the final score, capped at a 0.3 total penalty.
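The quarter comparison can be sketched as below. It assumes calendar quarters in UTC (a company's fiscal quarters may differ) and uses hypothetical field names for the two dates.

```typescript
// Hypothetical field names for the two close dates on a deal.
interface DealDates {
  userCloseDate: Date;
  systemCloseDate: Date;
}

// Monotonic quarter index: year * 4 + quarter (0..3). Calendar quarters, UTC.
function quarterIndex(d: Date): number {
  return d.getUTCFullYear() * 4 + Math.floor(d.getUTCMonth() / 3);
}

// 0.1 per deal whose user-entered date falls in an earlier quarter than the
// system-stamped date, capped at 0.3 in total.
function sandbaggingPenalty(deals: DealDates[]): number {
  const flagged = deals.filter(
    (d) => quarterIndex(d.userCloseDate) < quarterIndex(d.systemCloseDate)
  ).length;
  return Math.min(0.3, flagged * 0.1);
}
```

A deal entered as 28 December but stamped 5 January is flagged; a deal moved a few days within the same quarter is not.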
Final scaling
alphaScore = 2.5 + (weighted_z × 1.5) − sandbagging_penalty
alphaScore = clamp(alphaScore, 0, 5)
2.5 is the cohort median. Each 1.0 of weighted z moves the score by 1.5 Alpha points, so a weighted z of +1.0 lands at 4.0 and −1.0 lands at 1.0. The percentile shown on the profile uses the same weighted z and a linear approximation of the normal curve: percentile = 50 + (weighted_z × 34), clamped 0–100.
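The two formulas above translate directly into code. The function names here are illustrative, not taken from lib/scoring/alphaV2.ts.

```typescript
function clamp(x: number, lo: number, hi: number): number {
  return Math.min(hi, Math.max(lo, x));
}

// alphaScore = 2.5 + (weighted_z × 1.5) − sandbagging_penalty, clamped to 0..5.
function alphaScore(weightedZ: number, sandbaggingPenalty = 0): number {
  return clamp(2.5 + weightedZ * 1.5 - sandbaggingPenalty, 0, 5);
}

// percentile = 50 + (weighted_z × 34), clamped to 0..100.
function displayPercentile(weightedZ: number): number {
  return clamp(50 + weightedZ * 34, 0, 100);
}
```

A weighted z of 1 gives a score of 4.0 and a displayed percentile of 84. Any weighted z above roughly 1.67 saturates the score at 5, and anything above roughly 1.47 saturates the percentile at 100.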
Why empirical percentile rank, not parametric Z-score
Sales rep performance is heavily right-tailed (Pareto-shaped). A parametric Z-score against a fixed mean and standard deviation reports a $5M-revenue rep as roughly +15 standard deviations above the mean, which is mathematically nonsensical for non-normal data. Under empirical percentile rank, the same rep correctly lands at the 99th percentile (≈ +2.33 standard-deviation equivalent under a normal curve). The downstream weighting and the 0–5 final scale stay calibrated, but the long tail no longer breaks the math. The benchmark distribution is updated quarterly against the WinsAbove reference cohort; the next refresh will incorporate segment-specific breakpoints (SMB / Mid-Market / Enterprise) rather than a pooled distribution.
5. What gets penalized (anti-gaming)
Every shipped penalty corresponds to a specific CRM gaming pattern. Things we have not yet shipped are listed honestly so nobody assumes coverage we do not have.
- Hunter substring exploit (shipped). v1 used type.includes('new') to count new-business deals, which silently matched the string "Renewal" and inflated Hunter scores for AMs. v2 uses an explicit allowlist ("new business", "new logo", "new", "acquisition", "initial sale") and hard-rejects any string containing "renew".
- Sandbagged close dates (shipped). We compare the user-entered close date against the system-stamped close date. If the system date sits in a later quarter than the user date, the deal is flagged and the rep takes a 0.1-per-deal penalty (capped at 0.3).
- ICP misalignment (shipped). Each segment has a deal-size band (SMB: $0–50k, Mid-Market: $25k–150k, Enterprise: $100k+). Closing exclusively below your band looks like cherry-picking and trims up to 0.2 off the ICP factor. Closing above your band looks like selling up and adds up to 0.2. Final ICP factor is clamped 0.8–1.2.
- Volume inflation via deal-splitting (planned, not shipped). One large deal logged as 8 sub-deals to inflate count is not yet detected. Shipping target: Q3 2026.
- Connection-shopping (planned, not shipped). Reps who sync only their cherry-picked CRM org and quietly omit others are not yet caught. The plan is owner-email cross-match against undisclosed CRM tenants. Honest disclosure: no ETA.
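To make the Hunter fix from the list above concrete, here is a sketch contrasting the v1 substring check with an allowlist check. The allowlist entries and the "renew" rejection come from the text; the trim and lowercase normalization, the exact-match rule, and the function names are assumptions.

```typescript
// Allowlist entries as listed in the text.
const NEW_BUSINESS_TYPES = new Set([
  "new business",
  "new logo",
  "new",
  "acquisition",
  "initial sale",
]);

// v1 behaviour as described: a bare substring test.
function isNewBusinessV1(dealType: string): boolean {
  return dealType.includes("new");
}

// v2-style check. Assumption: types are trimmed and lowercased, then must
// match an allowlist entry exactly; anything containing "renew" is rejected first.
function isNewBusinessV2(dealType: string): boolean {
  const normalized = dealType.trim().toLowerCase();
  if (normalized.includes("renew")) return false;
  return NEW_BUSINESS_TYPES.has(normalized);
}
```

The v1 check returns true for "Renewal" because the letters n-e-w sit inside the word; the allowlist version rejects it while still accepting "New Logo".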
6. Refresh cadence
Today, scores recompute on user-initiated CRM sync. A rep clicks "Sync now" and the score reflects every closed-won deal up to that moment. The planned upgrade is a nightly cron recompute that stamps each result with a scoring_version and a computed_at timestamp, so two reps viewed on the same day are guaranteed to have been scored against the same algorithm and the same benchmark snapshot. The nightly cron is not yet shipped. If you see a stale score, a manual sync will refresh it.
7. Versioning policy
Algorithm changes follow 4 rules:
- Any change to weights, benchmarks, or filter logic that materially shifts published scores triggers a major version bump (v2 → v3).
- Old scores remain stamped with the version that generated them. We do not silently rewrite history.
- This methodology page is updated within 7 days of any version change. A dated changelog of every revision is appended to the bottom of this page so the audit trail is public.
- Score deltas larger than 0.5σ trigger an in-app notification to every affected user explaining what changed and why.
8. Limitations and known biases
The honest section. Five things the Alpha Score does not yet do well:
- SDR roles are under-represented. v2 benchmarks were tuned on AE / Mid-Market data. SDR scores ship, but the cohort is thin and the score should be read as directional until SDR sample size crosses N=200.
- Renewal / CSM scoring is intentionally excluded. A renewal deal has a different value structure than a new-logo deal (NRR vs. ARR, retention vs. acquisition). Forcing both into one number would penalize the discipline that does the work. Coverage for CSM-specific metrics is on the roadmap as a separate score, not a patch to Alpha.
- Mid-period comp-plan changes are not handled. If a rep's OTE or quota shifted halfway through the trailing 12-month window, the score treats the period as homogeneous. This penalizes reps who took on a harder territory mid-year.
- Mid-period role changes are not time-segmented. A rep who was promoted from SMB AE to Mid-Market AE 6 months ago is currently scored as one cohort or the other, not as a weighted blend.
- Benchmark distribution is bootstrapped, not live. The current percentile breakpoints (e.g. revenue p50 = $480k, p99 = $4.5M) are seeded from public sources (RepVue, the Bridge Group SDR/AE reports, and Pavilion benchmark surveys). The breakpoint table is refreshed quarterly. A live-cohort overlay (per-segment SMB / Mid-Market / Enterprise breakpoints, computed from verified-rep data) ships once any segment crosses N=200 verified reps. Until then, every score should be read as "relative to published industry benchmarks" rather than "relative to verified peers on the platform."
9. How to dispute a score
Email hello@winsabove.com with your user ID and the specific deal IDs you believe are misclassified (wrong type, wrong close date, wrong owner, wrong segment). The founder reads this inbox; expect a reply the same week. Manual reviews shape the next version of the algorithm. Most of what shipped in v2 came from disputes filed against v1.
10. Audit trail
Every score recomputation writes a row to the alpha_scores table with a computed_at timestamp. A scoring_version column ships with the v3 algorithm migration so old scores stay attributable to the math that produced them. Until then, the version is implicit in the deploy history. Want your full score history, every component score, every flag that fired? Email hello@winsabove.com with your user ID. We mail back a JSON within a business day.
Questions on the methodology? Email hello@winsabove.com. Source for the canonical scoring engine: lib/scoring/alphaV2.ts.
Related: /benchmarks for the live cohort report, /glossary for term definitions.