How does WinsAbove calculate sales performance scores?

WinsAbove computes empirical percentile rank for each performance metric (revenue, win rate, deal velocity, pipeline coverage) against an industry benchmark distribution, then applies a probit transform to convert each percentile to a z-score. The weighted sum of z-scores is mapped to a 0 to 5 Alpha Score. The benchmark distribution is currently aggregated from public 2024-2025 industry sources including Bridge Group SaaS AE Metrics, Pavilion B2B SaaS Performance Benchmarks, Gong State of Sales, ICONIQ Growth State of Go-to-Market, and Optifai 2025 SaaS Benchmarks. Once a segment crosses 200 connected reps, scoring switches to the connected-rep distribution.

What CRM systems does WinsAbove integrate with?

Salesforce and HubSpot via read-only OAuth. WinsAbove pulls Opportunity records, OpportunityHistory (or HubSpot deal stage history), and engagement signals (meetings, emails) for trust verification. Access is read-only — the platform never writes back to your CRM.

Why does WinsAbove use percentile rank instead of raw quota attainment?

Quota is set internally and varies wildly by territory, comp plan, and management decisions: a 110% attainer in an SF enterprise patch is not comparable to a 110% attainer in an SMB greenfield territory. Percentile rank against a peer cohort (matched on role, deal size, industry, geography) gives a market-relative number that is comparable across companies.

What is the minimum data required for an Alpha Score?

Six or more closed-won deals and twelve or more months of CRM history. Below those thresholds, individual deals carry too much weight and the percentile rank becomes unstable. Reps with insufficient data see a "score pending" state instead of a published number.

How does WinsAbove prevent rep performance scores from being inflated or gamed?

Three layers. CRM access is read-only so reps cannot edit their way to a higher score. Dedupe logic detects and removes house accounts, marketing pass-throughs, and split deals. Anomaly detection flags outlier deals (unusually large, unusually fast, missing stage history) for review before they enter the cohort.

How does cohort normalization work?

Each rep is matched to a peer cohort along four axes: role (SDR, AE, AM, Enterprise AE, Sales Engineer), deal size band (ACV $5k to $5M+), industry vertical (SaaS, FinTech, HealthTech, DevTools, Hardware, Services), and geography (NA, EMEA, APAC, LATAM). Percentile rank is computed within the cohort, so an enterprise SaaS AE in NA is compared only to other enterprise SaaS AEs in NA.

WinsAbove Methodology

Specification for the Alpha Score. Where the data comes from, how the number is computed, what gets penalized, and what we have not yet built.

Last updated April 25, 2026 · Algorithm v2

1. What the Alpha Score measures

The Alpha Score is a normalized, market-relative measure of sales rep performance computed from CRM data, segmented by role, segment, and tenure cohort. It is a composite of six metrics, each scored against an empirical percentile distribution and converted to a normal-z equivalent, then weighted and transformed onto a 0–5 display scale where 2.5 represents the cohort median. A score of 4.0 means roughly one standard-deviation equivalent above peers in the same segment. A score of 1.0 means roughly one standard-deviation equivalent below.

2. Data sources

Every connection is OAuth, read-only. We never write to the rep's CRM and we never store credentials with write scopes. The 4 surfaces we read:

Salesforce via the Merge.dev unified CRM integration. We read opportunities, accounts, deal stages, and owner attribution. No contacts, no notes, no pipeline forecasts.
HubSpot via the same Merge.dev shape. Same surface, same scopes.
User-uploaded artifacts (W-2s, signed comp plans, commission statements) processed by the workers/scorer/ service using Workers AI vision OCR. These artifacts are used for verification badges only. They do not currently feed back into the Alpha Score itself.
User profile for role (AE / SDR / AM), segment (SMB / Mid-Market / Enterprise), and quota, self-reported on onboarding. We have a planned migration to derive role and segment from LinkedIn instead of self-report.

3. Eligibility and filters

Only deals that pass these gates contribute to the score:

Closed-won only. No committed, no best-case, no forecast. Pipeline does not count toward Alpha.
Trailing 12 months from the sync timestamp. Older deals roll off the window.
Parseable close dates. Records with malformed or missing close-date fields are dropped, not imputed.
Tenure-aware. Reps with under 90 days of CRM history, or with an unknown firstDealDate, are flagged provisional and excluded from public leaderboards. Their score still computes. It just ships with a label.
Role and segment classification comes from the user profile today. Mis-classified profiles produce mis-cohorted scores (see Section 9 on disputes).
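
The gates above can be sketched as a filter plus a provisional flag. This is a minimal illustration, not the shipped code: the Deal shape, the "closed_won" stage label, and the function names are assumptions made for the example, and the 12-month window is taken as one calendar year back from the sync timestamp.

```typescript
// Hypothetical deal shape; field names are illustrative, not the real schema.
interface Deal {
  id: string;
  stage: string;            // e.g. "closed_won", "commit", "best_case"
  closeDate: string | null; // raw CRM value, may be malformed or missing
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Keep only closed-won deals with a parseable close date inside the
// trailing 12 months measured from the sync timestamp.
function eligibleDeals(deals: Deal[], syncedAt: Date): Deal[] {
  const windowStart = new Date(syncedAt.getTime());
  windowStart.setUTCFullYear(windowStart.getUTCFullYear() - 1);
  return deals.filter((deal) => {
    if (deal.stage !== "closed_won") return false; // no commit, best-case, or forecast
    if (deal.closeDate === null) return false;     // missing date: dropped
    const closedAt = Date.parse(deal.closeDate);
    if (Number.isNaN(closedAt)) return false;      // malformed date: dropped, not imputed
    return closedAt > windowStart.getTime() && closedAt <= syncedAt.getTime();
  });
}

// Provisional when the first deal date is unknown or CRM history is under 90 days.
// The score still computes; this flag only controls the label and leaderboard listing.
function isProvisional(firstDealDate: Date | null, syncedAt: Date): boolean {
  if (firstDealDate === null) return true;
  return (syncedAt.getTime() - firstDealDate.getTime()) / MS_PER_DAY < 90;
}
```

Dropping unparseable records, rather than guessing a date, keeps the trailing window honest at the cost of a slightly smaller sample.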

4. The math (Alpha v2)

Plain English: for each of six metrics we look up where the rep falls in the WinsAbove benchmark distribution (an empirical percentile), convert that percentile to a normal-z equivalent, weight the six z-values, sum them, scale onto a 0–5 line, and subtract any sandbagging penalty. The exact mechanism, pulled from lib/scoring/alphaV2.ts:

Per-metric percentile rank

p = empirical_percentile(value, BENCHMARK_DISTRIBUTIONS[metric])
z = inverse_normal_CDF(p)   // probit transform

The benchmark distribution is a set of percentile breakpoints (p10, p25, p50, p75, p90, p95, p99) sourced from public sales-industry data and reconciled against WinsAbove's reference cohort. Linear interpolation between breakpoints gives the rep's performance percentile; the probit transform converts that percentile to a normal-z equivalent so the weighted-sum machinery downstream stays interpretable in standard-deviation terms.
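
As a rough sketch of the lookup and transform described above, under several assumptions: only the revenue p50 ($480k) and p99 ($4.5M) breakpoints come from this page and the other values are placeholders; values outside the table are clamped to the nearest breakpoint percentile; and the inverse normal CDF is found by bisection over a standard error-function approximation, which may differ from what lib/scoring/alphaV2.ts does.

```typescript
// `value` is the metric value at cumulative probability `p`.
interface Breakpoint { p: number; value: number; }

// Placeholder revenue table: only p50 and p99 are quoted in the methodology.
const REVENUE_BREAKPOINTS: Breakpoint[] = [
  { p: 0.10, value: 150000 },
  { p: 0.25, value: 280000 },
  { p: 0.50, value: 480000 },
  { p: 0.75, value: 800000 },
  { p: 0.90, value: 1400000 },
  { p: 0.95, value: 2100000 },
  { p: 0.99, value: 4500000 },
];

// Linear interpolation between breakpoints. Tail handling (clamping to the
// first/last breakpoint percentile) is an assumption of this sketch.
function empiricalPercentile(value: number, bps: Breakpoint[]): number {
  if (bps.length === 0) throw new RangeError("empty breakpoint table");
  const first = bps[0];
  const last = bps[bps.length - 1];
  if (value <= first.value) return first.p;
  if (value >= last.value) return last.p;
  for (let i = 1; i < bps.length; i++) {
    const lo = bps[i - 1];
    const hi = bps[i];
    if (value <= hi.value) {
      const t = (value - lo.value) / (hi.value - lo.value);
      return lo.p + t * (hi.p - lo.p);
    }
  }
  return last.p;
}

// Standard normal CDF via the Abramowitz-Stegun 7.1.26 erf approximation.
function normalCdf(z: number): number {
  const x = Math.abs(z) / Math.SQRT2;
  const t = 1 / (1 + 0.3275911 * x);
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t - 0.284496736) * t +
      0.254829592) * t;
  const erf = 1 - poly * Math.exp(-x * x);
  return z >= 0 ? 0.5 * (1 + erf) : 0.5 * (1 - erf);
}

// Probit (inverse normal CDF) by bisection on [-8, 8].
function probit(p: number): number {
  if (!(p > 0 && p < 1)) throw new RangeError("p must be strictly between 0 and 1");
  let lo = -8;
  let hi = 8;
  for (let i = 0; i < 60; i++) {
    const mid = (lo + hi) / 2;
    if (normalCdf(mid) < p) lo = mid;
    else hi = mid;
  }
  return (lo + hi) / 2;
}
```

With this table, a rep at the $480k median gets p = 0.50 and z close to 0, while anything at or above $4.5M is capped at p = 0.99, about 2.33 standard-deviation equivalents.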

Velocity (average deal cycle in days) is inverted before lookup because a shorter cycle is better: a 21-day cycle sits at raw p10 of the cycle-time distribution, but represents top-10% performance, so the function returns performance percentile 0.90.
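
The inversion itself is a single step. A small sketch, assuming the raw percentile has already been looked up in the cycle-time distribution (the function name is hypothetical):

```typescript
// Convert a raw distribution percentile into a performance percentile.
// For "lower is better" metrics such as deal cycle time, the rank is flipped.
function toPerformancePercentile(rawPercentile: number, lowerIsBetter: boolean): number {
  if (rawPercentile < 0 || rawPercentile > 1) {
    throw new RangeError("rawPercentile must be in [0, 1]");
  }
  return lowerIsBetter ? 1 - rawPercentile : rawPercentile;
}

// The 21-day example: raw p10 of cycle time becomes performance percentile 0.90.
const velocityPerformance = toPerformancePercentile(0.10, true);
```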

Composite weights

The 6 components and their current weights in v2:

Revenue (trailing 12 months): 35%
Win rate: 20% (gated by volume multiplier)
Velocity (avg deal cycle, inverted): 15%
Average deal size: 15%
ICP alignment: 10%
Hunting ratio (new-business revenue share): 5%
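
A minimal sketch of the weighted sum using the weights listed above. The metric key names are invented for the example, and the sketch assumes each z-value has already had its own adjustments (such as the volume multiplier on win rate) applied upstream.

```typescript
type Metric =
  | "revenue"
  | "winRate"
  | "velocity"
  | "avgDealSize"
  | "icpAlignment"
  | "huntingRatio";

// v2 weights as published in this section; they sum to 1.
const WEIGHTS: Record<Metric, number> = {
  revenue: 0.35,
  winRate: 0.20,
  velocity: 0.15,
  avgDealSize: 0.15,
  icpAlignment: 0.10,
  huntingRatio: 0.05,
};

// Weighted sum of per-metric z-values.
function weightedZ(z: Record<Metric, number>): number {
  return (Object.keys(WEIGHTS) as Metric[]).reduce(
    (sum, metric) => sum + WEIGHTS[metric] * z[metric],
    0,
  );
}
```

Because the weights sum to 1, a rep who is exactly one standard-deviation equivalent above the benchmark on every metric ends up with a weighted z of 1.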

Volume multiplier

Thin sample sizes get penalized so a rep with 2 lucky deals does not outscore a rep with 40 grinding ones. The threshold for full credit scales with tenure: 5 deals if under 6 months, 8 deals from 6–12 months, 10 deals after 12. The multiplier is clamped at a 0.3 floor so a ramping rep is not zeroed out entirely.
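
One way the multiplier could be computed, shown as a sketch. The thresholds and the 0.3 floor come from the paragraph above; the linear ramp (deal count divided by threshold) and the treatment of exactly 12 months as part of the 6–12 month band are assumptions, since the page does not spell them out.

```typescript
// Deal count needed for full credit, by tenure in months.
function fullCreditThreshold(tenureMonths: number): number {
  if (tenureMonths < 6) return 5;
  if (tenureMonths <= 12) return 8; // boundary handling is an assumption
  return 10;
}

// Assumed linear ramp up to the threshold, clamped to [0.3, 1].
function volumeMultiplier(dealCount: number, tenureMonths: number): number {
  const ratio = dealCount / fullCreditThreshold(tenureMonths);
  return Math.min(1, Math.max(0.3, ratio));
}
```

Under these assumptions a two-year rep with 2 deals sits at the 0.3 floor, while the same rep with 40 deals gets full credit.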

Sandbagging penalty

If the user-entered close date sits in an earlier quarter than the system-stamped close date (i.e. the deal actually closed in Q1 but was backdated to Q4 of the prior year for quota), each flagged deal subtracts 0.1 from the final score, capped at a 0.3 total penalty.
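
The quarter comparison and the cap can be sketched as follows. Field names are hypothetical, and quarters are taken as calendar quarters in UTC, which is an assumption (a fiscal-year offset would change the bucketing).

```typescript
// Hypothetical pair of dates per deal.
interface DealDates {
  userCloseDate: Date;   // close date entered by the rep
  systemCloseDate: Date; // close date stamped by the CRM
}

// Monotonic quarter index, so quarters compare correctly across years.
function quarterIndex(d: Date): number {
  return d.getUTCFullYear() * 4 + Math.floor(d.getUTCMonth() / 3);
}

// 0.1 per flagged deal, capped at 0.3 in total.
function sandbaggingPenalty(deals: DealDates[]): number {
  const flagged = deals.filter(
    (d) => quarterIndex(d.userCloseDate) < quarterIndex(d.systemCloseDate),
  ).length;
  return Math.min(0.3, flagged * 0.1);
}
```

Comparing quarter indices rather than raw dates means a deal backdated by a few days inside the same quarter is not flagged.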

Final scaling

alphaScore = 2.5 + (weighted_z × 1.5) − sandbagging_penalty
alphaScore = clamp(alphaScore, 0, 5)

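2.5 is the cohort median. Each 1.0 of weighted Z adds 1.5 points of Alpha, so one standard-deviation equivalent above the median maps to 4.0 and one below maps to 1.0. The percentile shown on the profile uses the same weighted Z and a linear approximation of the normal curve, in which one standard deviation is worth about 34 percentile points: percentile = 50 + (weighted_z × 34), clamped 0–100.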
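
The two formulas in this subsection translate directly into code. A short sketch, with function names chosen for the example:

```typescript
function clamp(x: number, lo: number, hi: number): number {
  return Math.min(hi, Math.max(lo, x));
}

// alphaScore = clamp(2.5 + weighted_z × 1.5 − sandbagging_penalty, 0, 5)
function alphaScore(weightedZ: number, sandbaggingPenalty: number): number {
  return clamp(2.5 + weightedZ * 1.5 - sandbaggingPenalty, 0, 5);
}

// percentile = clamp(50 + weighted_z × 34, 0, 100)
function displayPercentile(weightedZ: number): number {
  return clamp(50 + weightedZ * 34, 0, 100);
}
```

A weighted z of 1 with no penalty gives an Alpha Score of 4.0 and a displayed percentile of 84. A weighted z of 2 would compute to 5.5 and 118, so both values are clamped, to 5 and 100.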

Why empirical percentile rank, not parametric Z-score

Sales rep performance is heavily right-tailed (Pareto-shaped). A parametric Z-score against a fixed mean and standard deviation reports a $5M-revenue rep as roughly +15 standard deviations above the mean, which is mathematically nonsensical for non-normal data. Under empirical percentile rank, the same rep correctly lands at the 99th percentile (≈ +2.33 standard-deviation equivalent under a normal curve). The downstream weighting and the 0–5 final scale stay calibrated, but the long tail no longer breaks the math. The benchmark distribution is updated quarterly against the WinsAbove reference cohort; the next refresh will incorporate segment-specific breakpoints (SMB / Mid-Market / Enterprise) rather than a pooled distribution.

5. What gets penalized (anti-gaming)

Every shipped penalty corresponds to a specific CRM gaming pattern. Things we have not yet shipped are listed honestly so nobody assumes coverage we do not have.

Hunter substring exploit (shipped). v1 used type.includes('new') to count new-business deals, which silently matched the string "Renewal" and inflated Hunter scores for AMs. v2 uses an explicit allowlist ("new business", "new logo", "new", "acquisition", "initial sale") and hard-rejects any string containing "renew".
Sandbagged close dates (shipped). We compare the user-entered close date against the system-stamped close date. If the system date sits in a later quarter than the user date, the deal is flagged and the rep takes a 0.1-per-deal penalty (capped at 0.3).
ICP misalignment (shipped). Each segment has a deal-size band (SMB: $0–50k, Mid-Market: $25k–150k, Enterprise: $100k+). Closing exclusively below your band looks like cherry-picking and trims up to 0.2 off the ICP factor. Closing above your band looks like selling up and adds up to 0.2. Final ICP factor is clamped 0.8–1.2.
Volume inflation via deal-splitting (planned, not shipped). One large deal logged as 8 sub-deals to inflate count is not yet detected. Shipping target: Q3 2026.
Connection-shopping (planned, not shipped). Reps who sync only their cherry-picked CRM org and quietly omit others are not yet caught. The plan is owner-email cross-match against undisclosed CRM tenants. Honest disclosure: no ETA.
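
Two of the shipped checks above lend themselves to a short sketch. The allowlist strings and the deal-size bands are taken from this section. Everything else is an assumption for illustration: exact matching after trimming and lowercasing, and an ICP factor computed as 1 plus 0.2 times the difference between the share of deals above and below the band.

```typescript
// v2-style classification: explicit allowlist, hard reject on "renew".
const NEW_BUSINESS_TYPES = new Set([
  "new business",
  "new logo",
  "new",
  "acquisition",
  "initial sale",
]);

function isNewBusiness(dealType: string): boolean {
  const normalized = dealType.trim().toLowerCase();
  if (normalized.includes("renew")) return false; // the v1 bug matched "reNEWal"
  return NEW_BUSINESS_TYPES.has(normalized);      // exact match is an assumption
}

type Segment = "SMB" | "Mid-Market" | "Enterprise";

// Deal-size bands in USD, as listed in this section.
const BANDS: Record<Segment, { min: number; max: number }> = {
  SMB: { min: 0, max: 50000 },
  "Mid-Market": { min: 25000, max: 150000 },
  Enterprise: { min: 100000, max: Infinity },
};

// Assumed formula: all deals below band gives 0.8, all above gives 1.2.
function icpFactor(dealSizes: number[], segment: Segment): number {
  if (dealSizes.length === 0) return 1;
  const band = BANDS[segment];
  const below = dealSizes.filter((s) => s < band.min).length / dealSizes.length;
  const above = dealSizes.filter((s) => s > band.max).length / dealSizes.length;
  return Math.min(1.2, Math.max(0.8, 1 + 0.2 * (above - below)));
}
```

Checking for "renew" before consulting the allowlist means a type such as "Renewal - new term" is rejected even though it contains "new".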

6. Refresh cadence

Today, scores recompute on user-initiated CRM sync. A rep clicks "Sync now" and the score reflects every closed-won deal up to that moment. The planned upgrade is a nightly cron recompute that stamps each result with a scoring_version and a computed_at timestamp, so two reps viewed on the same day are guaranteed to have been scored against the same algorithm and the same benchmark snapshot. The nightly cron is not yet shipped. If you see a stale score, a manual sync will refresh it.

7. Versioning policy

Algorithm changes follow 4 rules:

Any change to weights, benchmarks, or filter logic that materially shifts published scores triggers a major version bump (v2 → v3).
Old scores remain stamped with the version that generated them. We do not silently rewrite history.
This methodology page is updated within 7 days of any version change. A dated changelog of every revision is appended to the bottom of this page so the audit trail is public.
Score deltas larger than 0.5σ trigger an in-app notification to every affected user explaining what changed and why.

8. Limitations and known biases

The honest section. Five things the Alpha Score does not yet do well:

SDR roles are under-represented. v2 benchmarks were tuned on AE / Mid-Market data. SDR scores ship, but the cohort is thin and the score should be read as directional until SDR sample size crosses N=200.
Renewal / CSM scoring is intentionally excluded. A renewal deal has a different value structure than a new-logo deal (NRR vs. ARR, retention vs. acquisition). Forcing both into one number would penalize the discipline that does the work. Coverage for CSM-specific metrics is on the roadmap as a separate score, not a patch to Alpha.
Mid-period comp-plan changes are not handled. If a rep's OTE or quota shifted halfway through the trailing 12-month window, the score treats the period as homogeneous. This penalizes reps who took on a harder territory mid-year.
Mid-period role changes are not time-segmented. A rep who was promoted from SMB AE to Mid-Market AE 6 months ago is currently scored as one cohort or the other, not as a weighted blend.
Benchmark distribution is bootstrapped, not live. The current percentile breakpoints (e.g. revenue p50 = $480k, p99 = $4.5M) are seeded from public sources (RepVue, the Bridge Group SDR/AE reports, and Pavilion benchmark surveys). The breakpoint table is refreshed quarterly. A live-cohort overlay (per-segment SMB / Mid-Market / Enterprise breakpoints, computed from verified-rep data) ships once any segment crosses N=200 verified reps. Until then, every score should be read as "relative to published industry benchmarks" rather than "relative to verified peers on the platform."

9. How to dispute a score

Email hello@winsabove.com with your user ID and the specific deal IDs you believe are misclassified (wrong type, wrong close date, wrong owner, wrong segment). The founder reads this inbox; expect a reply the same week. Manual reviews shape the next version of the algorithm. Most of what shipped in v2 came from disputes filed against v1.

10. Audit trail

Every score recomputation writes a row to the alpha_scores table with a computed_at timestamp. A scoring_version column ships with the v3 algorithm migration so old scores stay attributable to the math that produced them. Until then, the version is implicit in the deploy history. Want your full score history, every component score, every flag that fired? Email hello@winsabove.com with your user ID. We send back a JSON export within one business day.
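
For illustration, a row in that table might look like the sketch below. Only the table name, computed_at, and the planned scoring_version column come from this page. The user_id and alpha_score fields and the builder function are hypothetical.

```typescript
// Hypothetical row shape for the alpha_scores table.
interface AlphaScoreRow {
  user_id: string;          // assumed column
  alpha_score: number;      // assumed column
  computed_at: string;      // ISO 8601 timestamp of the recomputation
  scoring_version?: string; // planned for the v3 migration; absent until then
}

// Build a row; the version stamp is only attached when one is supplied.
function buildScoreRow(
  userId: string,
  alphaScore: number,
  computedAt: Date,
  scoringVersion?: string,
): AlphaScoreRow {
  const row: AlphaScoreRow = {
    user_id: userId,
    alpha_score: alphaScore,
    computed_at: computedAt.toISOString(),
  };
  if (scoringVersion !== undefined) row.scoring_version = scoringVersion;
  return row;
}
```

Appending a new row on each recompute, rather than overwriting, is what lets older scores stay attributable once the version column exists.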

Questions on the methodology? Email hello@winsabove.com. Source for the canonical scoring engine: lib/scoring/alphaV2.ts.

Related: /benchmarks for the live cohort report, /glossary for term definitions.