Applied Nate Silver – modeling a True 450 treadmill

This exercise took far longer than I had expected, but after three days of data gathering and iterative analysis I’m finally willing to call it quits and present the results.

In researching how I should be exercising for weight loss I learned two things about the various formulas:

  1. the daily caloric requirements (under various scenarios of exercise level and weight loss targets, from various online calculators) depend on current weight
  2. the number of calories burned in a workout depends on current weight

Thus one is screwed twice over as weight declines: a) you need fewer calories, b) you burn fewer calories in exercise. I’ve noticed this before: instead of a linear decline (given a relatively constant diet), loss is higher at the beginning and then tapers off. I haven’t seen this in my “money charts” yet (previous posts, e.g. the most recent of this series), but I have the feeling I’m about to start seeing that tapering-off effect.

So, reading various sources, it appears that what I should really concentrate on as a measurement of workouts is the metabolic equivalent (MET):

MET is used as a means of expressing the intensity and energy expenditure of activities in a way comparable among persons of different weight. Actual energy expenditure (e.g., in calories or joules) during an activity depends on the person’s body mass; therefore, the energy cost of the same activity will be different for persons of different weight. However, since the RMR is also dependent on body mass in a similar way, it is assumed that the ratio of this energy cost to the RMR of each person will remain more or less stable for the specific activity and thus independent of each person’s weight.
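To make that weight dependence concrete, here is a minimal Python sketch. It assumes the common convention that 1 MET corresponds to roughly 1 kcal per kilogram of body weight per hour. The function name and the example weights are made up for illustration, and this is not the formula inside the True 450.

```python
def calories_burned(met: float, weight_kg: float, hours: float) -> float:
    """Energy in kcal, assuming the convention 1 MET = 1 kcal per kg per hour."""
    return met * weight_kg * hours


# Same one-hour workout at 6 METs, at three hypothetical body weights.
for weight_kg in (100.0, 90.0, 80.0):
    print(weight_kg, "kg ->", calories_burned(6.0, weight_kg, 1.0), "kcal")
```

The MET value stays at 6 in every row while the calories fall with weight, which is why METs are the more stable thing to track.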

So how can I get the METs of my workouts? On the True 450 treadmill I use, the display panel shows the instantaneous MET value, based on current incline and speed. But most of my workouts involve varying both incline and speed (usually the recommended “warmup” and “cooldown”, plus some profile, often “sprints”), so the MET value changes during the workout, and the treadmill doesn’t report cumulative MET-hours. So what to do?

The treadmill does, however, report its notion of calories burned for the entire workout. Now the formulas for computing this depend on weight, age, and gender, but the True only wants input of weight (it must assume default values for the others). In my 30 months of stats I had already noticed strange variation in calories burned and realized that the weight programmed into the treadmill had been changed (other people use the machine). So I already had a crudely derived “fudge factor”: the treadmill could be left at a constant weight setting, and my spreadsheet would then generate calories burned for my weight (which was relatively constant over most of the 30-month history, but recently has been changing).

Anyway, I thought it would be relatively easy to channel the spirit of Nate Silver and do a more accurate data gathering and analysis to get a formula for conversion of calories burned (adjusted to hourly amount, thus independent of actual workout time) to MET-hours. But it turned out to not be so easy.

The simple approach would be to do some workouts at various fixed settings (speed and incline, and thus METs) for relatively brief times and plot these to remove the measurement error (my assumption is that the formula in the treadmill is totally deterministic). This results in the graph below (which has more data than when I first did it and realized the problem):

[Figure: true1]

As you can see for a particular value of METs (x-axis) there is considerable variation in calories/hour (y-axis), as shown in more detail below:

[Figure: true1A]

Here we’re showing the calories/hour (y-axis) for various tests, where the duration of the test (in hours) is shown along the x-axis. I fitted a logarithmic curve because that actually makes fundamental sense. The issue is this: the treadmill does not start running at the indicated speed and slope. In fact, I have to push its control buttons (which takes time) to get the desired settings, and the machine has to come up to speed (which also takes time). That means the first bit of the sample is distorted because the machine is rising from zero METs to the test value (this takes around 20-40 seconds, depending on settings). One could assume this rise is “linear”, so the average during startup is about half the final value. So, for instance, if the test is 5 minutes and startup is 30 seconds, the average would be ((4.5*1) + (0.5*0.5))/5.0 = 0.95 of the final value, but for a thirty-minute test it would be ((29.5*1) + (0.5*0.5))/30 = 0.9917, and it eventually approaches 1.0 asymptotically.
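The startup arithmetic can be written as a small Python function. This is only a sketch of the linear-ramp assumption described above. The function name is hypothetical, and the real machine may ramp up differently.

```python
def average_fraction(test_minutes: float, startup_minutes: float) -> float:
    """Average intensity as a fraction of the target value.

    Assumes the machine ramps linearly from zero to the target during
    startup, so the ramp contributes half the target value on average.
    """
    steady_minutes = test_minutes - startup_minutes
    return (steady_minutes * 1.0 + startup_minutes * 0.5) / test_minutes


# A 30-second startup in tests of increasing length.
for minutes in (5, 30, 60, 120):
    print(minutes, "min ->", round(average_fraction(minutes, 0.5), 4))
```

The shortfall from 1.0 is half the startup time divided by the test length, so it shrinks as the test gets longer but never quite disappears.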

Now I don’t want to run the test for many hours (so the startup effect is mostly eliminated), so I reasoned there was another approach. For each setting (speed and incline, thus a particular METs value) I could do multiple tests, then plot time and calories, and compute Δcalories/Δtime, which of course is the slope as would be found with a linear regression fit. My first shot at this produced the following charts:
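Here is a rough Python sketch of why the slope approach should sidestep the startup problem. The data are synthetic, not my measurements: I assume a true rate of 480 calories/hour and a 30-second linear ramp, both made-up numbers, and the helper `fit_line` is just a plain least-squares fit written out by hand.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


TRUE_RATE = 480.0          # calories/hour at the test setting (hypothetical)
STARTUP_HOURS = 30 / 3600  # assumed 30-second linear ramp

# Tests of 5, 10, 15 and 20 minutes; the ramp costs half of its duration.
durations = [5 / 60, 10 / 60, 15 / 60, 20 / 60]
calories = [TRUE_RATE * (t - 0.5 * STARTUP_HOURS) for t in durations]

slope, intercept = fit_line(durations, calories)
naive_rates = [c / t for c, t in zip(calories, durations)]

print("slope (calories/hour):", round(slope, 3))
print("intercept (calories):", round(intercept, 3))
print("naive calories/hour per test:", [round(r, 1) for r in naive_rates])
```

In this idealized setup the startup loss is the same fixed number of calories in every test, so it ends up in the intercept and the slope recovers the true rate, while each test's own calories/hour comes out too low.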

[Figure: true2]

The raw data from various settings with regression line shown for series 6 (5.5 METs) and

[Figure: true2A]

The graph (above) is the summary of all the tests where the slope (Δcalories/Δtime, normalized just to calories/hour) is plotted for various MET values. Thus the regression line is the formula I wanted.

OR IS IT?

need some help here, Nate!

Note that the r^2 is not 1.0 and you can notice some obvious deviation from a pure straight line just by eyeballing the graph. Ooops! This means I don’t have a completely accurate answer.

So what to do?

So I had another thought (plus I collected another day’s worth of data to see if things got better, they didn’t). If I use just the maximum value (calories/hour) of each test and get a regression of that it should be very close (I assumed) to the chart above and maybe I could just “average” the two results to get my formula. So here’s that chart:

[Figure: true3]

As expected, the regression line through all points (where a varying fraction of each test is “startup”) and the one through just the highest values (which, btw, always came from the longest test time) are different. If you used these two formulas to convert a typical workout’s measured calories you’d get a difference in METs of around 0.3, which is non-trivial. Furthermore, each of these putative “formulas” is quite different from the formula based on the Δcalories/Δtime approach, so somehow “blending” them doesn’t seem to make much sense.

So, when in doubt, get more data – right, Nate? And perhaps get the data over a wider range – right, Nate? So that’s what added yet another day to my testing and here’s the result of that:

[Figure: true4]

Series 1 is my data from the first day; Series 2 is data from the second day, where I got more results for the Series 1 MET values plus one additional MET value (9.7). That new value looked “suspicious” (can I say such things, Nate, or am I deceiving myself?) and since it was the largest value I thought it might distort the curve. So Series 3 is my final test (sick of this project) with an even more expanded range and more data.

The -5.5483 y-intercept of the Series 3 result is sticking in my mind, since somewhere on the net, while looking for a formula, I saw one that had a -5 in it. So naturally (despite having no fundamental reason to believe the y-intercept should be anything except zero) I’m inclined to go with it. So here’s my final result:

  1. take the measured total calories and adjust it by total workout time to get calories/hour, CPH,
  2. plug that into: (CPH+5.5483)/87.435 to get METs

So, for instance, yesterday’s workout (various intervals at various settings) was 77 minutes 23 seconds and 678 calories, or 525.7 calories/hour, and therefore 6.076 METs on average for the workout. Or, IOW, 7.836 MET-hours. Hurrah, I have a value.
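The two steps and the worked example translate directly into Python. The constants are the Series 3 regression values quoted above; the function and variable names are mine and purely illustrative.

```python
SLOPE = 87.435       # calories/hour per MET, from the Series 3 regression
INTERCEPT = -5.5483  # calories/hour, y-intercept of the same regression


def workout_mets(total_calories: float, minutes: float, seconds: float = 0.0):
    """Return (calories/hour, average METs, MET-hours) for one workout."""
    hours = (minutes + seconds / 60) / 60
    cph = total_calories / hours
    mets = (cph - INTERCEPT) / SLOPE  # same as (CPH + 5.5483) / 87.435
    return cph, mets, mets * hours


cph, mets, met_hours = workout_mets(678, 77, 23)
print(round(cph, 1), round(mets, 3), round(met_hours, 3))
```

Keeping the slope and intercept as named constants makes it easy to swap in a different regression later without touching the rest of the spreadsheet-style calculation.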

Whether the formula is quite right or not may be secondary, since often my purpose in getting all these statistics is to look at trends, so relative values are just fine even if absolute values are off a bit. Plus I can always use a bit of a fudge factor to make sure I’m above any threshold (for instance, the indicated 10 MET-hours/week needed for basic fitness).

Now in doing net searches I found a calculator for this (no formula, just an interactive web page, and not for my specific treadmill). Presumably this calculator is just a formula and therefore has no measurement error, so if I “simulate” my actual experiment with just this calculator I should confirm my methodology, right?

So, here’s that result:

[Figure: true5]

Whoa, doggies! This result isn’t “perfect” either. The r^2 is not 1.0 and you can see that points deviate from the line significantly, which means the “formula” derived by this approach isn’t quite right either (how “right” it is I don’t know how to determine unless I could find the formula underlying this website).

So why can’t I get an “exact” result?

During my testing on my actual treadmill I just had this feeling I was seeing some effects of roundoff error, even though I don’t have any data to prove it.

The calculation of METs from speed and incline probably produces a result with many more digits than the single 1/10th shown on the machine’s display (or on the webpage). I don’t know this but I believe it’s likely to be true.

And while watching calories tick by, one at a time while on the machine, I often had the feeling the Δt between each tick was not the same. Now quite probably that’s true. The machine is probably doing its calculations based on some internal clock or possibly on revolutions of the belt or ??? Either way this might be different from when it updates the display, and that also might have some roundoff. So sometimes, it’s plausible at least, I see one tick on the calorie display a bit quicker than at some other time.
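A toy simulation shows how uneven ticks could arise even from a perfectly steady burn rate. Everything about the mechanism here is an assumption: I pretend the machine accumulates calories exactly but refreshes a whole-number display once per second, which may not be how the True 450 works at all.

```python
def tick_intervals(tenths_of_calories_per_hour: int, total_seconds: int):
    """Seconds between successive one-calorie increments of the display.

    Hypothetical model: exact internal accumulation, integer display,
    refreshed once per second. Integer arithmetic avoids float drift.
    """
    tick_times = []
    shown = 0
    for second in range(1, total_seconds + 1):
        # Whole calories accumulated after this many seconds.
        whole = (second * tenths_of_calories_per_hour) // 36000
        if whole > shown:
            tick_times.append(second)
            shown = whole
    return [later - earlier for earlier, later in zip(tick_times, tick_times[1:])]


# 525.7 calories/hour (5257 tenths) for ten minutes.
intervals = tick_intervals(5257, 600)
print(sorted(set(intervals)), intervals[:12])
```

At this rate a calorie takes about 6.85 seconds, so under these assumptions the display has to alternate unevenly between 6-second and 7-second gaps to keep up on average.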

So, all of this means that the various data I’m getting from the machine are not quite precise, and that is the basis of the uncertainty in my formula. Given this testing has already taken a bunch of time, not to mention all the time spent in analysis and writing up these results, I don’t think I’m going to do what Nate would do (or at least like) and get more data.

So, again, I’m completely on the wrong track here, because what Nate would do is work with probabilities, from his trusty Bayes formula. So I could probably apply that approach and put some probability on whether my measured formula actually matches the real formula in the machine. Fine, that’s his basic message, but in this case, my reaction is, who cares. The trouble with viewing things as probabilities is that this doesn’t apply to a world with no (or few) repetitions, unlike baseball statistics or poker. There was only one 2012 election, so what does it really mean to say Obama would have won 90% of the time (he either won or didn’t in the one time it mattered)? And my need for a formula requires an actual formula, even if wrong, not some formula that randomly varies a bit, day by day. OTOH, once I compute METs with my formula and am then analyzing some data, I should realize there is uncertainty in my data and thus do some stochastic simulation (my preferred approach over Bayesian) and present those results as a probability.
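As a sketch of what such a stochastic simulation might look like, here is a small Monte Carlo run in Python. It only models display roundoff and assumes the true calories and elapsed seconds each lie within half a unit of the displayed value, uniformly distributed. That noise model is my guess, not something measured, and it ignores any uncertainty in the regression constants themselves.

```python
import random

SLOPE = 87.435       # from the (CPH + 5.5483) / 87.435 formula
INTERCEPT = -5.5483


def simulate_mets(shown_calories, shown_seconds, trials=10_000, seed=1):
    """Sample average METs under an assumed +/-0.5 display roundoff."""
    rng = random.Random(seed)
    samples = []
    for _ in range(trials):
        calories = shown_calories + rng.uniform(-0.5, 0.5)
        seconds = shown_seconds + rng.uniform(-0.5, 0.5)
        cph = calories / (seconds / 3600)
        samples.append((cph - INTERCEPT) / SLOPE)
    return samples


# The 77 min 23 s, 678 calorie workout.
samples = sorted(simulate_mets(678, 77 * 60 + 23))
cut = len(samples) // 40  # 2.5% in each tail
print("mean METs:", round(sum(samples) / len(samples), 3))
print("central 95% range:", round(samples[cut], 3), "to", round(samples[-cut - 1], 3))
```

With only display roundoff in the model the spread is tiny, which suggests any real uncertainty comes mostly from the fitted formula rather than from reading a single workout off the panel.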

So, did I get this part right, Nate? Grade, please.

Addendum:

While actually doing the first bit of my workout today I realized I could easily collect data to demonstrate the point that the readout (of METs) on the True 450, with only one decimal place, is misleading. A way to demonstrate this is to get the indicated METs for the entire range of inclines at a single speed (2.7 mph in this case). Here is that plot:

[Figure: true6]

Now the first thing is that the slope (ΔMETs/Δslope) is an uneven number (in this case also equivalent to not being a “rational tangent”). Each increase in slope of 0.5 (the minimum increment) would result in 0.3714/2 = 0.1857 METs. You can see this pattern in the actual reported METs (starting with 0 slope): a Δ of 0.2 in some steps, a Δ of 0.1 in others [3.1 3.3 3.4 3.6 3.8 4.0 4.2 4.4 4.6 4.7 4.9 5.1 …]. You can actually see this effect in the graph: a series of points starts below the regression line, then rises above it, then falls below again.

Now if any of you are old enough to have lived in the times when we had to actually write code for graphics (rather than just use libraries) you’d recognize this as the Bresenham algorithm for rasterizing (converting math to pixels) the linear equation for the line.
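The staircase is easy to reproduce in Python by rounding a straight line to one decimal place. The slope per step (0.3714/2) is the regression value; the intercept of 3.07 is a number I picked because it happens to reproduce the displayed sequence, and rounding to nearest (rather than truncating) is likewise an assumption about the display.

```python
SLOPE_PER_STEP = 0.3714 / 2  # METs per 0.5 incline step, from the regression
ASSUMED_INTERCEPT = 3.07     # hypothetical unrounded METs at zero incline


def displayed_mets(steps: int):
    """One-decimal readout for the first `steps` incline settings."""
    return [round(ASSUMED_INTERCEPT + SLOPE_PER_STEP * k, 1) for k in range(steps)]


shown = displayed_mets(12)
deltas = [round(b - a, 1) for a, b in zip(shown, shown[1:])]
print(shown)
print(deltas)
```

Under those assumptions the mix of 0.2 and 0.1 steps falls out of the rounding alone, much as a rasterized line mixes one-pixel and two-pixel runs.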

All this just demonstrates one particular source of error (that I can easily measure), namely that the actual displayed value for METs on the True 450 is not exact. The other item, too hard to measure, is watching calories tick one by one and measuring the gap in seconds between each tick. I believe that would display similar behavior, but I don’t have any easy way to do that (maybe use an electronic stopwatch with a lap timer, but my reflexes, not to mention attention span, would probably add too much error).

Oh well, such is life. Empirical measurement, even of a precise and deterministic underlying equation, is fraught with error.

And silly scientists are trying to determine if the observed universe (and thus us) is a simulation by detecting errors since any stochastic simulation has some amount of precision error. Good luck, fellows, let me know what you find out.

 

About dmill96

old fat (but now getting trim and fit) guy, who used to create software in Silicon Valley (almost before it was called that), who used to go backpacking and bicycling and cross-country skiing and now geodashes, drives AWD in Wyoming, takes pictures, and writes long blog posts and does xizquvjyk.

5 Responses to Applied Nate Silver – modeling a True 450 treadmill

  1. Nona says:

    I don’t share your blog with my dieting husband, he thinks I make it hard enough already!

  2. I just run outside every morning and trust my body to figure out how many calories I’ve burned… 🙂

    • dmill96 says:

      Smart move, the body knows. I used to do the same thing (bicycling instead) but that was in a place where I could be outside without freezing or suffocating. Now condemned to the artificial exercise indoors it takes all these numbers and stats to overcome the boredom and thus keep going.
