Saturday, December 15, 2012

Are you measuring the right stuff?

(heavy geek alert!)

We have a saying in analytics: "The easier it is to measure, the less valuable it probably is." This isn't always the case, but a large chunk of the time it proves true. Counts especially are among my least favorite measures (as any and all of my co-workers will attest!). Case in point: we were building a new datamart at the bank. One group reported that they had successfully loaded data for all but 3 customers. In other words, out of maybe 80k - 100k, only 3 had issues. The ratio was so good, hell, probably not even worth tracking down the others, right? It would take more time than it was worth. Our set was good enough for management reporting... or was it? As the duly appointed namer of elephants in the room, I asked which 3 had not loaded. Two were insignificant, but one was the bank's 4th largest customer and by law had to be reported to Federal agencies. Uh, hmm, maybe this issue is a show stopper after all.

The moral here is that the easy metrics (the count, the loaded ratio) were worthless on their own. Matched against the actual requirement, missing even 1 customer, if it was the wrong one, was a fatal error. Conversely, you could probably drop 30k of the smallest and still be fine. The measure needed to be based on something else; in this case, customer ranking in terms of the bank's total credit exposure.

So how does this carry over to you, the athlete? Pace, distance, number of workouts (a count), even HR, taken in isolation, tell you very little about what you really want to know: am I getting better? They need to be taken in combination, cleansed of outlying activity, and so on. My Garmin 305 tells me the average pace for my run. Today it told me 7:30/mi. A week ago it told me 7:06/mi. So I'm getting worse? Obviously we need to know more. Last week that 7:06 was over 7 miles. This week the 7:30 was over 15 miles. So was it better or worse? Uh, maybe, but if I'm honest I have no idea. Sort of apples and oranges. So we standardize on workouts of similar duration and compare again. A month ago I went 7:57 for 1:25. Today, 7:30 for 1:52. Okay, looks like I'm doing better. But honestly, when building a metric I still like more relevance.
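To make the apples-and-oranges point concrete, here's a minimal sketch of the idea: compare average pace only across runs of similar duration. The run data and the 25% duration tolerance are made-up assumptions, not anything my Garmin produces.

```python
# Sketch: only compare average pace across runs of similar duration.
# Run data and the 25% tolerance below are invented for illustration.

def pace_to_sec(p):
    """Convert an 'M:SS' per-mile pace string to seconds per mile."""
    m, s = p.split(":")
    return int(m) * 60 + int(s)

runs = [
    {"date": "2012-11-15", "pace": "7:57", "duration_min": 85},   # 1:25
    {"date": "2012-12-15", "pace": "7:30", "duration_min": 112},  # 1:52
    {"date": "2012-12-08", "pace": "7:06", "duration_min": 50},   # ~7 mi
]

def comparable(a, b, tol=0.25):
    """Two runs are comparable if their durations are within tol of each other."""
    longer = max(a["duration_min"], b["duration_min"])
    return abs(a["duration_min"] - b["duration_min"]) / longer <= tol

today = runs[1]
for r in runs:
    if r is today:
        continue
    if comparable(today, r):
        delta = pace_to_sec(r["pace"]) - pace_to_sec(today["pace"])
        word = "faster" if delta > 0 else "slower"
        print(f"vs {r['date']}: {delta:+d} sec/mi {word} today")
    else:
        print(f"vs {r['date']}: durations too different to compare")
```

Run it and the 1:25 effort is the only fair comparison; the 7-miler gets thrown out, exactly the "standardize on similar duration" step above.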

Personally, I sometimes have a hard time getting going on Saturday morning (it's that pushing-47-years-old thing!), so I'll take the first 30 min of the run and just go easy, whatever easy is for that day. This creates an outlier in the paces; in other words, that first 30 min is not indicative of where I am overall, just how I'm feeling that day. So for longer runs I tend to focus on average effort (HR) and paces AFTER the first 30 min, in one-mile segments. My run score measure is based on the relationship of pace, HR and time. And rather than focusing on the actual score, again something heavily influenced by things like a rough week at work, I'll focus on the slope of the score for the miles after the first 30 min (Garmin autolap at 1 mile).

So put more simply, the measure for long/endurance runs is as follows:
The change in run score after the first 30 min. The training scorecard is the trending of the average change in the run score measure over time.
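I haven't given the actual score formula here, so treat this as a hedged sketch rather than my real spreadsheet: as a stand-in run score I use speed divided by heart rate (output per beat, higher is better), and the measure is the least-squares slope of that score across the one-mile autolaps after the warm-up. The mile splits are invented.

```python
# Sketch of the "slope of the run score after the first 30 min" measure.
# The score formula (speed per heartbeat) and the splits are stand-ins.

def run_score(pace_sec_per_mi, hr):
    """Stand-in score: speed (mph) per heartbeat; higher is better."""
    speed_mph = 3600 / pace_sec_per_mi
    return speed_mph / hr

def slope(ys):
    """Least-squares slope of ys against mile index 0..n-1."""
    n = len(ys)
    xs = range(n)
    mx = sum(xs) / n
    my = sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# One long run: (pace sec/mi, avg HR) per 1-mile autolap.
splits = [(480, 128), (470, 132), (455, 138), (450, 141),
          (448, 144), (452, 147), (458, 151), (466, 155)]

WARMUP_MILES = 4  # roughly the first 30 min -- dropped as outliers
scores = [run_score(p, hr) for p, hr in splits[WARMUP_MILES:]]
print(f"run-score slope after warm-up: {slope(scores):+.5f} per mile")
```

Trend that slope across long runs over weeks and months and you have the training scorecard: a slope drifting toward zero or positive means you're holding form deeper into the run.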

I get that this is not in everyone's wheelhouse, so to speak. Hopefully it is something your coach would understand. I know many who do. And if yours doesn't? Well, hmm, do you know any good data analysts? :)

PS: I also love trending average score for the final 3 miles of a long run over time (weeks/months). Correlates to the ability to finish out a long event!
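And a tiny sketch of that PS, with invented final-3-mile averages; the sign of the week-over-week change is the point, not the values.

```python
# Sketch: trend the average run score of the final 3 miles of each
# week's long run. Scores are invented for illustration.

def trend(weekly_scores):
    """Week-over-week change in the final-3-mile average score."""
    return [round(b - a, 4) for a, b in zip(weekly_scores, weekly_scores[1:])]

final3_avg = [0.0498, 0.0503, 0.0511, 0.0509, 0.0520]  # one per long run
print(trend(final3_avg))  # mostly positive: finishing strength improving
```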