Nonathlon Scoring Equations?

teampbandj · Post by **teampbandj** » May 2nd, 2007, 9:57 pm

PaulH wrote: teampbandj - I'm unsure why the fastest times are unreliable - they are, so far as is publicly known, the fastest times that a person of that age/weight/gender has ever done.

As I said, I'm a geek. Using the tail of a distribution is not the usual way of doing a polynomial trend analysis. Parametric statistics generally rely on the assumption that the data are normally distributed. If the data will never approach a normal distribution, this method shouldn't even be tried; if they will, then the more data the better, and if you cut anything, you cut the tails. I could see doing it this way if you could generate a lot of elite rower data, but with one data point per year of age, that won't happen.

In contrast, a collection of times logged for each event would include times from people who didn't 'try'.

Yes, it will include "not trying," but that will actually not matter if most people are trying. That's the good thing about lots of data--if behaviors are rare, they wash out. If they are common, they will matter, but they will also reflect what the population is "really" doing.
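A minimal sketch of the "rare behaviors wash out" point, using made-up numbers rather than any real C2 data. It assumes 2k times for people who are trying are roughly normal (mean 450 s, SD 30 s) and that anyone "not trying" logs a time 120 s slower; the function name `median_shift` and all of those figures are illustrative assumptions only.

```python
import random
import statistics


def median_shift(slack_fraction, n=2000, seed=0):
    """How far the median moves when a fraction of rowers don't try.

    All numbers are synthetic assumptions, not Concept2 data.
    """
    rng = random.Random(seed)
    # Times (seconds) everyone would log if they were all trying.
    honest = [rng.gauss(450.0, 30.0) for _ in range(n)]
    n_slack = int(n * slack_fraction)
    # The first n_slack rowers "don't try" and log a time 120 s slower.
    logged = [t + 120.0 if i < n_slack else t for i, t in enumerate(honest)]
    return statistics.median(logged) - statistics.median(honest)


rare = median_shift(0.05)    # 5% of entries are not real efforts
common = median_shift(0.50)  # half of the entries are not real efforts
print(f"median shift with 5% not trying:  {rare:.1f} s")
print(f"median shift with 50% not trying: {common:.1f} s")
```

Under these assumptions the 5% case barely moves the middle of the distribution, while the 50% case moves it a lot, which is also the case where "not trying" is genuinely what the population is doing.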

Having said all that, the bigger objection to using all the data is simple - C2 (very reasonably) don't release it in any bulk form, so collecting it would involve a screen-scraping application that I lack the knowledge to create.

Which is quite reasonable. Perhaps C2 would consider this? Wouldn't cost them anything but a bit of bandwidth... Hello, C2?

All - It's true I'm not a stats wiz, and neither are most people, so I'm trying for a method that's relatively easy to understand for all. "Take all the best times from the rankings, and draw a line through them" serves that purpose well, though I could make a much better job of explaining it on the site if real life didn't insist on so much of my time.

I am not a stats wiz either. I have taken two or three classes and have to run analyses at work on a more or less regular basis, but I am not a statistician and could be off. Still, I think my reasoning goes through.

And I agree, basically, that drawing a line (well, a curve, if you're doing second order) makes a lot of sense. It's just that drawing the line through as much data as possible is generally the best thing, and if you are going to limit the data, you don't really want to keep the tails (the fastest and the slowest); you want to keep the middle.

Note that this would still reward the fastest people, the upper tail, the most. They would lose nothing; but the rest of the people would get a more reliable measure of their own relative ability.
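To make the "one point per age versus all the data" argument concrete, here is a rough sketch, not the actual Nonathlon method. It fits a second-order curve by ordinary least squares twice: once through every logged time, and once through only the fastest time at each age. Everything in it is an assumption for illustration: the "true" curve, the 50 rowers per age, the 20 s normal scatter, and the helper names `fit_quadratic` and `simulate`.

```python
import random
import statistics


def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x**2; returns (a, b, c)."""
    # Augmented 3x3 normal equations: (X^T X) beta = X^T y.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for k in range(col, 4):
                m[r][k] -= f * m[col][k]
    # Back substitution.
    beta = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        tail = sum(m[i][j] * beta[j] for j in range(i + 1, 3))
        beta[i] = (m[i][3] - tail) / m[i][i]
    return tuple(beta)


def simulate(seed, per_age=50, noise_sd=20.0):
    """Return (curvature from all data, curvature from best-per-age only)."""
    rng = random.Random(seed)
    xs, ys, best = [], [], {}
    for age in range(20, 61):
        u = age - 40  # centred age keeps the normal equations well conditioned
        for _ in range(per_age):
            # Assumed "true" curve plus normal scatter (synthetic, not C2 data).
            t = 420.0 + 0.5 * u + 0.05 * u * u + rng.gauss(0.0, noise_sd)
            xs.append(u)
            ys.append(t)
            best[u] = min(t, best.get(u, t))  # fastest time seen at this age
    c_all = fit_quadratic(xs, ys)[2]
    c_best = fit_quadratic(list(best), list(best.values()))[2]
    return c_all, c_best


# Repeat the experiment to see how much each estimate wobbles between samples.
runs = [simulate(seed) for seed in range(30)]
spread_all = statistics.stdev([c for c, _ in runs])
spread_best = statistics.stdev([c for _, c in runs])
print(f"spread of curvature, all data:      {spread_all:.4f}")
print(f"spread of curvature, best-only fit: {spread_best:.4f}")
```

The quantity compared is how much the fitted curvature changes from one synthetic sample to the next. With these assumptions the best-only fit rests on 41 points and the all-data fit on 2050, so the comparison is about stability of the curve's shape, not about who gets the highest score.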

That said, I am thrilled that anyone actually is interested in doing this at all, much less doing it for the benefit of rowers everywhere. I don't want to come off as negative; I am just geeking out that someone would even do this, and think a bulk data dump from C2 would be way cool.

PaulH · Post by **PaulH** » May 3rd, 2007, 1:23 am

C2 have been asked, and gracefully declined.

Carl Henrik · Post by **Carl Henrik** » May 5th, 2007, 4:37 pm

teampbandj wrote: It's just that drawing the line through as much data as possible is generally the best thing

I think your idea with the multiple regression is interesting enough to raise thoughts, and some skepticism, that I won't go into here.

In the end, or perhaps in the beginning, though, it's all about which goal of the analysis suits which people and which culture. The current setup is consistent with a goal of allowing comparison to the best, not the average, and not of predicting what people will do. It's just a comparison to the previous best. For some group of people this will be the most motivating setup, and to me it sounds like a very rower-minded setup that should find a good base of people enjoying it.

I think your take on analysing this stems from goals that are common in other areas but are not necessarily the best ones here.

johnlvs2run · Post by **johnlvs2run** » September 8th, 2008, 9:32 pm

I've completely revised and updated the Perathlon tables, based on the 2k world records as usual, and this time additionally on the Concept2 2008 final rankings for the other 9 events.

The result of this is that the 9 events on either side of the 2k will get significantly higher scores than before.

If there are any questions, I'm completely open to sharing how the various formulas are derived.

http://johnlvs2run.wordpress.com/perathlon/