I'm wondering how it handles the partial minutes in the data...
First of all, everything that's posted uses only instances of 48 minutes, so partial minutes can't be skewing the data.
As for the effect of minutes, yes, showing it like age is the goal. The problem is controlling for everything. E.g., you could show it for 18yo's getting Pressure PG training, but that's so specific that we won't have enough data. You can't really group too many things, because the basic rates change a lot as a function of age, skill and training type.
The real solution, which I've started playing with, is to create a single model that can explain all the data, then do parameter fitting to figure out exactly how much each thing matters. The problem with this is that you need a decent guess for what the equation should be. I mean, it's easy if things are linear, but minutes almost certainly aren't, so you need a way to figure out whether they're best modeled as a decaying exponential, a power law, or something else you haven't even thought of. That's why I've been trying to start with plots like the age one: they give a really good sense of what that component should look like.
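To make the "which equation?" question concrete, here's a rough sketch in Python of comparing two candidate shapes on the same data. Big caveats: the numbers are made up (not real training data), the names fit_line and compare_forms are just for illustration, and I'm assuming the simplest versions of each form, gain = a*exp(b*minutes) and gain = a*minutes^b, because both turn into straight lines once you take logs. The real minutes curve may well need something else.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def sse(actual, predicted):
    """Sum of squared errors between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))

def compare_forms(minutes, gains):
    """Fit both candidate forms in log space, score them on the original scale."""
    log_gains = [math.log(g) for g in gains]

    # Exponential: log(gain) is linear in minutes.
    c_exp, b_exp = fit_line(minutes, log_gains)
    exp_pred = [math.exp(c_exp + b_exp * m) for m in minutes]

    # Power law: log(gain) is linear in log(minutes).
    log_minutes = [math.log(m) for m in minutes]
    c_pow, b_pow = fit_line(log_minutes, log_gains)
    pow_pred = [math.exp(c_pow + b_pow * lm) for lm in log_minutes]

    return {"exponential": sse(gains, exp_pred),
            "power_law": sse(gains, pow_pred)}

# Hypothetical data, generated from a power law on purpose.
minutes = [6, 12, 18, 24, 30, 36, 42, 48]
gains = [0.5 * m ** 0.6 for m in minutes]

errors = compare_forms(minutes, gains)
best = min(errors, key=errors.get)
print(errors, "-> best fit:", best)
```

Whichever form has the smaller error wins, but that only ranks the candidates you thought to try, which is exactly why the plots matter for picking candidates in the first place.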
My current idea is to guess something close, do an analysis of residuals (looking at the difference between the actual and expected data), and see if that points to a better model. Unfortunately, this is a fairly computationally intensive process, so I haven't found a way to do it "live" in PHP yet (and the method is still *very* rough). And it's tough to combine across a lot of training types. But that's a sneak peek at the long-term aim.
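For anyone curious what the residual idea looks like in practice, here's a toy version in Python. Again, the data is invented (I'm pretending the true effect of minutes is a square root), the function names are only illustrative, and this is nowhere near the full multi-factor model. It just shows how a wrong first guess, a straight line here, leaves a visible pattern in the residuals.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def residuals(xs, ys):
    """Actual minus expected, using a straight line as the guessed model."""
    intercept, slope = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Hypothetical data: the "true" effect curves, but we guess a line.
minutes = [6, 12, 18, 24, 30, 36, 42, 48]
gains = [m ** 0.5 for m in minutes]

res = residuals(minutes, gains)
for m, r in zip(minutes, res):
    print(f"{m:2d} min  residual {r:+.3f}")
```

If the guess were right, the residuals would scatter around zero with no shape to them. Here they come out negative at both ends and positive in the middle, which says the real curve bends more than a line does and the next guess should flatten out at high minutes.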
Some other good stuff in there that I'll think about, wolph.