The most straightforward way to measure position in a dataset is simple ranking. Sort the elements by the value of some variable of interest; each element's position in the sorted order is its simple rank.
Example: 2020 Tour de France
In races, we typically rank participants by time. In the 2020 Tour de France, the top ten riders had the following ranks and times (each time after the winner's is shown as a gap behind the winner):
| Rank | Rider | Team | Time |
|---|---|---|---|
| 1 | Tadej Pogačar (SLO) | UAE Team Emirates | 87h 20' 05" |
| 2 | Primož Roglič (SLO) | Team Jumbo–Visma | + 59" |
| 3 | Richie Porte (AUS) | Trek–Segafredo | + 3' 30" |
| 4 | Mikel Landa (ESP) | Bahrain–McLaren | + 5' 58" |
| 5 | Enric Mas (ESP) | Movistar Team | + 6' 07" |
| 6 | Miguel Ángel López (COL) | Astana | + 6' 47" |
| 7 | Tom Dumoulin (NED) | Team Jumbo–Visma | + 7' 48" |
| 8 | Rigoberto Urán (COL) | EF Pro Cycling | + 8' 02" |
| 9 | Adam Yates (GBR) | Mitchelton–Scott | + 9' 25" |
| 10 | Damiano Caruso (ITA) | Bahrain–McLaren | + 14' 03" |
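As a rough illustration of sorting and then reading off positions, here is a minimal Python sketch that ranks the top five riders from the table. The function name `simple_ranking` and the choice to store times in seconds are assumptions made for this example, and the sketch assumes there are no ties.

```python
# Winner's time from the table: 87h 20' 05", converted to seconds.
WINNER_SECONDS = 87 * 3600 + 20 * 60 + 5

# Total times in seconds (winner's time plus each gap), deliberately unsorted.
times = {
    "Enric Mas": WINNER_SECONDS + 6 * 60 + 7,
    "Tadej Pogačar": WINNER_SECONDS,
    "Richie Porte": WINNER_SECONDS + 3 * 60 + 30,
    "Primož Roglič": WINNER_SECONDS + 59,
    "Mikel Landa": WINNER_SECONDS + 5 * 60 + 58,
}


def simple_ranking(values):
    """Map each name to its rank, where the lowest value gets rank 1.

    Assumes all values are distinct (no ties).
    """
    ordered = sorted(values, key=values.get)  # names sorted by their value
    return {name: position for position, name in enumerate(ordered, start=1)}


ranks = simple_ranking(times)
for name, rank in sorted(ranks.items(), key=lambda item: item[1]):
    print(rank, name)
```

For a variable where larger is better (money raised, say), the same idea applies with the sort order reversed.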
Ties introduce a slight complication, and there are several strategies for dealing with them. (You can read more about them on Wikipedia.) Here we'll assume standard competition ranking, also known as "1224 ranking". Under this scheme, all tied elements receive the "higher" rank (i.e., the rank with the lower number), and the ranks they would otherwise have occupied are skipped, leaving a gap between the tied elements and the next element or elements.
Example: Dealing with ties in a charity fundraiser
In a charity fundraiser, the top five teams raised the following dollar amounts:
- Team Gorilla: $3,750
- Team Harlequin: $3,390
- Team Lighthouse: $4,300
- Team Viper: $4,175
- Team Zzyzx: $4,175
Teams Viper and Zzyzx tied. Using standard competition ranking, the ranks are:
- 1st place: Team Lighthouse ($4,300)
- 2nd place: Team Viper, Team Zzyzx (tie) ($4,175)
- 4th place: Team Gorilla ($3,750)
- 5th place: Team Harlequin ($3,390)
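The tie handling above can be sketched in a few lines of Python. This is one possible implementation of standard competition ranking, not the only one; the function name `competition_ranking` and its `descending` parameter are assumptions made for this example.

```python
# Dollar amounts raised, as listed in the example above.
raised = {
    "Team Gorilla": 3750,
    "Team Harlequin": 3390,
    "Team Lighthouse": 4300,
    "Team Viper": 4175,
    "Team Zzyzx": 4175,
}


def competition_ranking(scores, descending=True):
    """Standard competition ("1224") ranking.

    Tied elements share the best available rank, and the ranks they
    would otherwise have occupied are skipped.
    """
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=descending)
    ranks = {}
    previous_value = None
    previous_rank = 0
    for position, (name, value) in enumerate(ordered, start=1):
        # A tie reuses the previous rank; otherwise the rank is the
        # 1-based position, which automatically leaves the gap.
        rank = previous_rank if value == previous_value else position
        ranks[name] = rank
        previous_value, previous_rank = value, rank
    return ranks


ranks = competition_ranking(raised)
for name, rank in sorted(ranks.items(), key=lambda item: (item[1], item[0])):
    print(rank, name, raised[name])
```

Because the rank of a non-tied element is simply its position in the sorted list, no separate bookkeeping is needed to skip 3rd place after the two-way tie for 2nd.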
Strengths of simple ranking
Easy to understand. Everybody understands simple ranking. We use it all the time in everyday life.
Weaknesses of simple ranking
Harder to calculate when the dataset is large. Ranking requires collecting and sorting every value, so it's one thing to apply simple ranking to bicyclists in a race, but quite another to apply it to, say, height across all people in the United States.
Hard to compare simple ranks across datasets. Knowing that one person was 2nd place in a race and another was 5th place in a different race isn't really enough information to make a comparison. Getting 2nd place in a race with four competitors is probably far less impressive than getting 5th place in a race with 2,500 competitors, for example.
Exercise 1. Apply a simple ranking to the members of your family based on a variable of your choosing.
Exercise 2. Choose a different variable, and re-rank your family members according to this new variable.