Elo scoring works from the bottom up: individual responses generate Behavior ratings, which aggregate into Skill scores, which in turn roll up into a Capability score. Here’s how:
Step 1: Your Responses → Item-Level Ratings
- Each task is tied to one or more specific Behaviors, the observable indicators Elo uses as evidence. Your response is scored by AI against rubrics created by domain experts.
- Each response receives an item-level rating from 1 to 4:
  - 4/4: Well Done (fully correct)
  - 3/4: Nearly There (mostly correct, minor gaps)
  - 2/4: Getting There (partially correct, important gaps)
  - 1/4: Not Quite (incorrect or insufficient evidence)
✨ Elo uses partial credit, so responses that are mostly correct still contribute proportionally to your Skill score.
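As a rough illustration of partial credit, the sketch below treats each item-level rating as a fraction of full credit. The `partial_credit` function and the `rating / 4` mapping are assumptions made for this example; Elo's actual weighting is not specified here.

```python
def partial_credit(rating: int) -> float:
    """Hypothetical mapping: an item-level rating as a fraction of full credit.

    Assumes credit is simply rating / 4; the real weighting may differ.
    """
    if rating not in (1, 2, 3, 4):
        raise ValueError(f"rating must be an integer from 1 to 4, got {rating!r}")
    return rating / 4


# Print the assumed credit for each rating on the 1-4 scale.
for rating in (4, 3, 2, 1):
    print(f"{rating}/4 -> {partial_credit(rating):.2f} of full credit")
```

Under this assumption, a 3/4 response still counts for three quarters of a fully correct one rather than being treated as a miss.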
Step 2: Skill Profile → Capability Score (1–300)
Item-level ratings tied to the same Skill aggregate into a Skill score. Each Skill is displayed as a 1–4 rating, backed by a continuous underlying score that can move within a rating band.
Every Skill score also includes an AI-generated rationale explaining what drove the result, citing specific Behaviors as evidence where relevant.
| Score Range | Band | What It Means |
| --- | --- | --- |
| 0–99 | Beginner | You’re building foundational knowledge |
| 100–199 | Developing | You show emerging capability |
| 200–300 | Accomplished | You demonstrate consistent proficiency |
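The band table can be read as a simple lookup. The sketch below uses the ranges from the table; the function name `capability_band` and the handling of fractional scores (a score between 99 and 100 falls in the lower band) are assumptions for illustration.

```python
def capability_band(score: float) -> str:
    """Return the band name for a Capability Score, using the table's ranges.

    Assumption: fractional scores stay in the lower band until the next
    threshold (100 or 200) is reached.
    """
    if not 0 <= score <= 300:
        raise ValueError(f"score must be between 0 and 300, got {score!r}")
    if score < 100:
        return "Beginner"
    if score < 200:
        return "Developing"
    return "Accomplished"


print(capability_band(150))
```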
📊 How it’s calculated: Your Capability Score is calculated from your underlying Skill scores, each of which aggregates evidence from multiple Behaviors. Skills contribute proportionally, so partial-credit evidence flows through the full scoring chain.
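To make the scoring chain concrete, here is a minimal sketch of one way item-level ratings could flow into Skill scores and then into a Capability Score. Everything in it is an assumption for illustration: the equal weighting of Skills, the use of a plain average, the linear mapping from the 1–4 scale onto 0–300, and the skill names. Elo's actual formula is not described in this article.

```python
from statistics import mean


def skill_score(item_ratings: list[int]) -> float:
    """Assumed: a Skill's continuous score is the mean of its 1-4 item ratings."""
    if not item_ratings:
        raise ValueError("a Skill needs at least one item-level rating")
    return float(mean(item_ratings))


def capability_score(ratings_by_skill: dict[str, list[int]]) -> float:
    """Assumed: average the Skill scores with equal weight, then map
    the 1-4 scale linearly onto 0-300 (1 -> 0, 4 -> 300)."""
    if not ratings_by_skill:
        raise ValueError("at least one Skill is required")
    overall = mean(skill_score(r) for r in ratings_by_skill.values())
    return (overall - 1) * 100


# Hypothetical skills and ratings, for illustration only.
example = {
    "Skill A": [4, 4],  # fully correct responses
    "Skill B": [2, 2],  # partially correct responses
}
print(capability_score(example))
```

The point of the sketch is the shape of the chain: a partially correct response raises its Skill score, and that Skill score in turn raises the overall result, so no evidence is discarded along the way.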
--