PaulH wrote: teampbandj - I'm unsure why the fastest times are unreliable - they are, so far as is publicly known, the fastest times that a person of that age/weight/gender has ever done.
As I said, I'm a geek. Using the tail of a distribution is not the usual way of doing a polynomial trend analysis. Parametric statistics rely on the assumption that your data are normally distributed. Assuming the data will ever reach a normal distribution (if they won't, this method shouldn't even be tried), the more data the better, and if you cut anything, you cut the tails. I could see fitting to the fastest times if you could generate a lot of elite rower data, but with one data point per year of age, that won't happen.
PaulH wrote: In contrast a collection of times logged for each event would include times from people who didn't 'try'
Yes, it will include "not trying," but that will actually not matter if most people are trying. That's the good thing about lots of data--if behaviors are rare, they wash out. If they are common, they will matter, but they will also reflect what the population is "really" doing.
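To put a rough number on "wash out," here is a minimal sketch in Python. Everything in it is made up for illustration: the evenly spread 2k times, the 700-second "not trying" time, and the helper name `median_shift`. It only shows how far the middle of the data moves when a few, or a lot of, half-hearted rows get logged.

```python
from statistics import median

def median_shift(honest_times, n_slackers, slack_time):
    """How far the median moves when n_slackers half-hearted rows are added."""
    padded = honest_times + [slack_time] * n_slackers
    return median(padded) - median(honest_times)

# Made-up data: 1001 honest 2k times spread evenly from 400.0 s to 500.0 s.
honest = [400 + i / 10 for i in range(1001)]

rare_shift = median_shift(honest, 20, 700.0)     # about 2% not trying
common_shift = median_shift(honest, 500, 700.0)  # about a third not trying

print(f"rare: {rare_shift:.1f} s, common: {common_shift:.1f} s")
```

With these invented numbers the rare case moves the median by about one second and the common case by about 25 seconds, which is the "rare washes out, common matters" pattern.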
PaulH wrote: Having said all that, the bigger objection to using all the data is simple - C2 (very reasonably) don't release it in any bulk form, so collecting it would involve a screen-scraping application that I lack the knowledge to create.
Which is quite reasonable. Perhaps C2 would consider this? Wouldn't cost them anything but a bit of bandwidth... Hello, C2?
PaulH wrote: All - It's true I'm not a stats wiz, and neither are most people, so I'm trying for a method that's relatively easy to understand for all. "Take all the best times from the rankings, and draw a line through them" serves that purpose well, though I could make a much better job of explaining it on the site if real life didn't insist on so much of my time.
I am not a stats wiz either. I have taken two or three classes and have to run analyses at work on a more or less regular basis, but I am not a statistician and could be off. Still, I think my reasoning goes through.
And I agree, basically, that drawing a line (well, a curve, if you're doing second order) makes a lot of sense. It's just that drawing the line through as much data as possible is generally the best thing, and if you are going to limit the data, you don't really want to keep the tails (the fastest and the slowest); you want to keep the middle.
Note that this would still reward the fastest people, the upper tail, the most. They would lose nothing, but the rest of the people would get a more reliable measure of their own relative ability.
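Here is a rough sketch of why I'd rather fit through everything than through the records alone. It is not PaulH's method or anything C2 publishes: the "true" trend, the spread of times, and the one freak row are all invented, and `fit_quadratic`, `build`, and the other names are hypothetical helpers. It fits a second-order curve by ordinary least squares twice, once to all the times and once to only the best time per age, and checks how far the fitted curve moves at age 55 when a single exceptional row is added there.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x**2 via the normal equations."""
    # Power sums S[k] = sum(x**k) and moments T[k] = sum(y * x**k).
    S = [sum(x ** k for x in xs) for k in range(5)]
    T = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented 3x3 system for (a, b, c).
    M = [[S[i + j] for j in range(3)] + [T[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(3):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [v - factor * p for v, p in zip(M[r], M[col])]
    return tuple(M[i][3] / M[i][i] for i in range(3))

def predict(coeffs, x):
    a, b, c = coeffs
    return a + b * x + c * x * x

def true_time(x):
    # Invented "real" trend: x is age minus 50, result is a 2k time in seconds.
    return 420 + 0.5 * x + 0.02 * x * x

def build(freak=False):
    """Return (all_points, record_points) as lists of (x, time) pairs."""
    everyone = [(x, true_time(x) + d) for x in range(-20, 21)
                for d in (-30, -10, 0, 10, 30)]
    if freak:
        everyone.append((5, true_time(5) - 80))  # one freak row at age 55
    best = {}
    for x, t in everyone:
        best[x] = min(t, best.get(x, t))  # keep only the fastest time per age
    return everyone, sorted(best.items())

def fit(points):
    xs, ys = zip(*points)
    return fit_quadratic(xs, ys)

all_before, rec_before = build()
all_after, rec_after = build(freak=True)

# How far does the fitted curve move at age 55 (x = 5) because of one row?
all_data_move = abs(predict(fit(all_after), 5) - predict(fit(all_before), 5))
records_move = abs(predict(fit(rec_after), 5) - predict(fit(rec_before), 5))
print(f"all data: {all_data_move:.2f} s, records only: {records_move:.2f} s")
```

In this toy setup the records-only curve moves more than the all-data curve, because each age contributes a single point to it and that point is, by construction, the most extreme one.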
That said, I am thrilled that anyone actually is interested in doing this at all, much less doing it for the benefit of rowers anywhere. I don't want to come off as negative; I am just geeking out that someone would even do this, and think a core dump of C2 would be way cool.