Read studies carefully

You can do it! You don’t have to be a PhD researcher or a rocket scientist. You merely need to read carefully and thoughtfully and perhaps occasionally look up a word or two. I say this because writers in the popular press clearly don’t read the studies they report on (or at least don’t read them carefully). I think many, especially those who appear in HuffPost, merely cherry-pick from headlines what they want to hear.

So here is a study we’ll look at a bit. Unlike many studies this is not hidden behind a paywall so you can get the entire thing. I saw a link (plus an opinionated statement of what the study said) in a HuffPo slideshow with the usual bias that your health will be wonderful if you just eat enough kale. So let’s deconstruct a bit of it.

First the full title:

A Prospective Study of the Association Between Quantity and Variety of Fruit and Vegetable Intake and Incident Type 2 Diabetes

Sounds relevant. So what is a “prospective study”? Not hard to find, here’s a reasonable definition in Wikipedia. So now you know what that is.

Next, let’s look at some of the preliminary prose in the abstract:

OBJECTIVE: The association between quantity of fruit and vegetable (F&V) intake and risk of type 2 diabetes (T2D) is not clear, and the relationship with variety of intake is unknown.

This is an important statement. These authors are claiming that the answer to this question isn’t known. Presumably they’ve done a literature search before beginning their study so we’ll take this at face value. Note that this study was submitted for publication in 2011 AND THEREFORE statements about F&V prior to that time ARE NOT based on EVIDENCE (or else these authors wouldn’t need to do this study, except to confirm or contradict previous results). Imagine how many years nutritionists have been telling us about F&V without any evidence.

Next look at the conclusion they present in the abstract:

These findings suggest that a diet characterized by a greater quantity of vegetables and a greater variety of both F&V intake is associated with a reduced risk of T2D.

I’ve underlined two words in the quoted passages above: association and suggest. These are important words. Scientists, unlike pop writers, are conservative in their claims (or at least good ones are), so they don’t say their study “proved” anything, just that it suggests something. And what is the something? Causation, correlation, or association. An association is the weakest possible statistical connection between variables. Even correlation is stronger. Recall that correlation has been found between the length of women’s skirts and stock market prices. In short, if you crunch data enough you can find all sorts of nonsensical “correlations” (a correlation is just a simple statistical test, and in fact, correlation itself, even when the r^2 is provided, doesn’t tell you very much).

Causation is what matters. If you’re going to do something, based on the outcome of a study, you want to see causation, not association, not even correlation. Causation has to take the correlation statistical analysis further and demonstrate (at some threshold value) that the correlation cannot be due to mere chance. There are standard methods of doing this and good studies will burble on about the statistical tests they did to justify their belief the correlation is “statistically significant”. But causation is much tougher, because it means you have to rule out everything else (but the correlated independent variable) AND you must confirm your results by running the null test, that is, eliminate the correlated input and determine that the outcome is completely changed. You also have to demonstrate that the variable being correlated is truly independent, i.e. it’s not just the result of something else. So, for instance, many consumers of F&V might be vegetarian and thus perhaps the correlation you’re finding is merely due to not consuming meat, not the F&V themselves. In short, causation is a very high threshold and rarely met in any nutritional studies.
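To make the vegetarian-style confounding argument concrete, here is a minimal sketch with completely made-up counts (they are NOT from the study). In this invented population, F&V intake has no effect at all inside either group, yet the pooled numbers make high F&V intake look protective, simply because the "health conscious" people are the ones eating most of the F&V.

```python
# Hypothetical counts, invented purely for illustration (not from the study).
# Each entry: (group, fv_level) -> (people, t2d_cases)
counts = {
    ("health_conscious", "high_fv"): (800, 16),
    ("health_conscious", "low_fv"): (200, 4),
    ("not_health_conscious", "high_fv"): (200, 12),
    ("not_health_conscious", "low_fv"): (800, 48),
}


def stratum_rates(table):
    """T2D rate inside each (group, fv_level) cell."""
    return {key: cases / people for key, (people, cases) in table.items()}


def crude_rates(table):
    """T2D rate by fv_level only, ignoring which group people belong to."""
    totals = {}
    for (_, fv_level), (people, cases) in table.items():
        seen_people, seen_cases = totals.get(fv_level, (0, 0))
        totals[fv_level] = (seen_people + people, seen_cases + cases)
    return {fv: cases / people for fv, (people, cases) in totals.items()}


within = stratum_rates(counts)
crude = crude_rates(counts)

for key, value in within.items():
    print(key, f"{value:.1%}")   # 2.0% vs 2.0%, and 6.0% vs 6.0%
for fv_level, value in crude.items():
    print(fv_level, f"{value:.1%}")  # high_fv 2.8%, low_fv 5.2%
```

Inside each group the rates are identical, but the pooled comparison shows 2.8% versus 5.2%. That pooled gap is an association with nothing causal behind it, which is the kind of trap that adjusting for confounders is meant to avoid.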

But let’s go on with deconstructing this study, now that we know its goals are modest and that whatever its results, this is still not PROOF of the putative claim that F&V will prevent type 2 diabetes.

How did they do the study? Any good study will provide considerable detail on its methodology. If you don’t find that detail, discard the study. But then read carefully, with just a little logic and common sense, what they did:

The final sample for analysis consisted of 653 incident T2D cases and a subcohort of 3,166 individuals (including 115 incident T2D cases).

This bit was reported in the HuffPo article, IMO, to try to show that the conclusion is based on a fairly significant sample size (I’ve seen other studies reported where the number of participants is a few dozen – sure, right, that will tell you something, but thousands, well, that should be enough, right?). But let’s look more closely:

From the 25,639 participants in EPIC-Norfolk at baseline, we ascertained incident cases of T2D (n = 892) and selected a random subcohort of 4,000 participants. This subcohort was representative of the entire EPIC-Norfolk cohort in terms of age, BMI, education level, physical activity level, smoking status, and total energy intake …

Something is fishy in Norfolk, folks – can you see what it is?

They claim they selected a random subsample, BUT: in the entire population of 25,639 participants there are 892 with T2D, or about 1 in 29, but in their subsample of 3,166 there were 115 cases, or about 1 in 27.5 (not quite random). But of course, the numbers of cases don’t quite add up, so something is unclear. But they excluded a bunch of people:
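The two rates are easy to check yourself from the figures quoted in the study. A quick sketch (the function name is just something I made up):

```python
def one_in_n(population, cases):
    """Express an incidence as '1 in N' by dividing population by cases."""
    return population / cases


full_cohort = one_in_n(25_639, 892)  # whole EPIC-Norfolk cohort
subcohort = one_in_n(3_166, 115)     # final analysed subcohort

print(f"full cohort: 1 in {full_cohort:.1f}")  # 1 in 28.7
print(f"subcohort:   1 in {subcohort:.1f}")    # 1 in 27.5
```

The gap is small, and some difference is expected from sampling plus the exclusions, but it is worth knowing how to check a claim of "representative" rather than taking it on faith.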

Of the 4,749 participants, we excluded those … with prevalent myocardial infarction, stroke, or cancer were also excluded (n = 400). Oh, got rid of the sickies, eh?

those with fewer than 7 days of diary data (n = 435) or who did not return a diary (n = 15), Oh, got rid of the lazies, eh?

Now these exclusions are reasonable, but they aren’t random. But so be it, we’ve now got our 3166 people. What are we going to study about them? Recall they said “We examined the 11-year incidence of T2D“, so 11 years of studying these people, that sounds pretty good, but, here’s what they did:

In brief, participants residing in Norfolk, England, were recruited … and attended a baseline health check. Follow-up of participants constituted a postal questionnaire at 18 months, a second health check in 1998–2000, and a further postal questionnaire in 2002–2004.

Wow, twice they got a postal questionnaire in the mail and did one health check in 11 years! That’s really close observation of these people, isn’t it? But even more importantly:

At the baseline medical examination, participants were instructed by trained interviewers on how to complete the 7-day food diary. Completed diaries were returned by post to the coordinating center at the University of Cambridge.

Now this single 7-day diary is the ONLY data collected from these people about their F&V consumption. So it’s an 11-year study with data, at least about food consumption, from only one week. AND, the diary was returned in the mail so no one actually interviewed these people (just to see if they did the diary correctly) and certainly no one actually observed what they ate. Now self-reported nutritional data is known to be very very poor data. People are lazy and don’t do a good job on the diary, people make mistakes, and most of all people lie (trying to look good since someone is watching them). So out of 572 weeks of the study duration we have 1 week of dubious data. And from this all the rest of the conclusions flow.
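The "1 week out of 572" figure is just arithmetic, assuming 52 weeks per year:

```python
YEARS_OF_FOLLOW_UP = 11
WEEKS_PER_YEAR = 52  # rounding assumption; a year is slightly longer

study_weeks = YEARS_OF_FOLLOW_UP * WEEKS_PER_YEAR
diary_weeks = 1  # a single 7-day food diary

coverage = diary_weeks / study_weeks
print(f"{study_weeks} weeks of follow-up, diary covers {coverage:.2%}")
# 572 weeks of follow-up, diary covers 0.17%
```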

Now here’s a place where they’re being careful (more detail I didn’t quote):

Participants who gave a self-report of history of diabetes that could not be confirmed against any other sources of ascertainment were not considered as a confirmed case of T2D.

So we can be relatively sure of the T2D incidence data, it’s the dietary data that is dubious. There is a lot of tedious (but important) detail explaining how the diaries were analyzed, but this bit is interesting:

Variety of fruit, vegetables, and combined F&V intake was derived by calculating the total number of different items consumed at least once in a 1-week period, irrespective of quantity of intake.

So if these participants ate one grape or one green bean, wow, that’s “variety”. As to the total consumption, well, that hasn’t been explained by this point in the article. Notice also they’re ignoring whatever else they ate, so for instance, if the non-consumers of F&V also ate greasy fast food burgers, well, we’ll never know that, or if the consumers of F&V ate little else, we’ll never know that. IOW, F&V as a fraction of the total consumption is NOT being considered in the analysis of the results. What about fat, sugar, booze, carbs? What about exercise level? What about total calories? From other studies, it’s fairly well known that high consumers of F&V tend to be “health conscious” and generally have “good” calorie intake and exercise levels, so perhaps it is those factors that matter more than F&V consumption. Oh my, certainly one reason any conclusions, at most, are just an association.
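Here is a rough sketch of how a variety score like the one described (number of different items eaten at least once in the week, irrespective of quantity) behaves. The two diaries and the gram amounts are invented for illustration, and the function names are mine, not the study's:

```python
# Hypothetical one-week diaries: (item, grams eaten) per entry.
nibbler = [("grape", 5), ("green bean", 8), ("apple", 3), ("carrot", 4)]
apple_fan = [("apple", 150), ("apple", 150), ("carrot", 80), ("apple", 150)]


def variety_score(entries):
    """Count distinct items eaten at least once, ignoring how much was eaten."""
    return len({item for item, grams in entries if grams > 0})


def total_grams(entries):
    """Total quantity eaten across all entries."""
    return sum(grams for _, grams in entries)


for name, diary in [("nibbler", nibbler), ("apple_fan", apple_fan)]:
    print(name, "variety:", variety_score(diary), "grams:", total_grams(diary))
# nibbler variety: 4 grams: 20
# apple_fan variety: 2 grams: 530
```

The nibbler scores twice the variety on about 4% of the quantity, which is why a variety measure only means something when it is read alongside quantity.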

The statistical analysis section is well explained (although frankly somewhat overkill for the poor data), but as an exercise for the reader, note this tidbit:

all P values >0.32

So now what about conclusions and the commentary:

After accounting for potential confounding factors and the effects of quantity of intake, each different additional two item per week increase in variety of F&V intake was associated with an 8% reduction in the incidence of T2D.

Sounds impressive, but does it make any sense? Remember only about 1 out of 27 people in the subcohort showed T2D, IOW, around 100 people. And so a mere two items gives the 8% improvement! Think about it a minute: exactly how was this conclusion reached? Is it credible? Is it statistically significant?
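One way to think about that 8% is to turn it from a relative number into an absolute one. The sketch below is a back-of-the-envelope assumption on my part: it applies the 8% directly to the crude 11-year incidence in the whole cohort (892 out of 25,639), whereas the study's figure is an adjusted estimate, so treat the result as a rough feel for scale and nothing more.

```python
cohort_size = 25_639
incident_cases = 892
relative_reduction = 0.08  # the quoted "8% reduction"

# Crude 11-year incidence, then the same incidence cut by 8% (an assumption).
baseline_per_1000 = incident_cases / cohort_size * 1000
reduced_per_1000 = baseline_per_1000 * (1 - relative_reduction)
absolute_drop_per_1000 = baseline_per_1000 - reduced_per_1000

print(f"baseline: {baseline_per_1000:.1f} cases per 1000 over 11 years")
print(f"with 8% reduction: {reduced_per_1000:.1f} cases per 1000")
print(f"absolute difference: {absolute_drop_per_1000:.1f} cases per 1000")
# baseline: 34.8, reduced: 32.0, difference: 2.8
```

Under that assumption, "8% reduction" works out to something under 3 fewer cases per 1000 people over 11 years, which reads rather differently from the headline.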

These folks are scientists, so they’re honest and note what they report:

Previous epidemiological studies have reported inconsistent findings for an association between quantity of F&V intake and risk of T2D. Two separate meta-analyses have reported no overall association between fruit, vegetables, and combined F&V intake and diabetes risk,

IOW, they’re willing to pass on the information that other studies have different (and null) results. See how many politicians will do that! Or religious types? But is this just a ploy to gain credibility for this study? If so, why is this study reaching conclusions when the others aren’t? Sample sizes, perhaps? Quality of data, perhaps?

And here’s their answer:

in the EPIC-Norfolk study, mean consumption of F&V was much higher when assessed by FFQ (6.5 portions per day) than by a food diary (3.8 portions per day) (11). For this reason, FFQs are not ideal for examining adherence to, or for informing, public health guidelines

FFQ, btw, is

food frequency questionnaire (FFQ) to assess F&V intake, which is suitable for ranking individuals according to their relative but not absolute intake

yet, in terms of one of their conclusions in this study they ignore quantity when assessing the benefit of variety – let’s be consistent, fellows.

The authors carefully go through an argument why they think their study is better than others and it sounds convincing enough, so we’ll just grant that, but so what? Does this study actually show anything?

But let’s dig in a little more in science (biochemistry, not vague food diaries). They say:

The biological mechanisms for the inverse associations of F&V intake and diabetes risk are not clear.

Good, again I’ll compliment these authors on their openness and honesty (unlike the quick and overly broad summaries in the popular press). But let’s look at what they hypothesize (meaning guess about the biochemistry, since it is actually unknown, and opening up people’s small intestines to see what is going on isn’t likely to get IRB approval).

A plausible biological mechanism to explain the beneficial effect of quantity of F&V intake on T2D is via the low energy and high fiber content of F&V, and as such their ability to reduce the overall energy content of the diet. It has previously been demonstrated that those who consume the highest quantity of F&V, in comparison with low consumers, have a lower risk of weight gain (24,25), a major risk factor for diabetes (26). A decreased risk of T2D with increasing quantities of vegetable intake in particular may be explained by the fact that vegetables are generally consumed with other foods as part of a meal and therefore may displace or buffer the harmful effects of deleterious foods from the diet, such as energy-dense foods or foods that increase the risk of T2D

Note the use of the word plausible, meaning certainly possible and logical, but completely unproven (otherwise they could say proven). But what are they saying here? This is exactly the same argument in favor of whole grain bread over pure white flour bread – FILLER! If you eat F&V (or cotton balls or anything that would fill you up and clog your intestines) then you don’t get the bad stuff. Fine, F&V do the exact SAME THING as eating less would. I’ve made this argument about bread – eating 15% less white bread has exactly the same effect as eating whole grain bread, no magic, just fewer calories. Plus, they’re as much as saying it’s the weight gain from excess calories that drives the T2D, not any magical secret spiritual ingredient in F&V. So eat anything that isn’t the bad stuff and you’d get exactly the same benefit. So if F&V fill you up and keep you from getting hungry and eating too much, terrific, they’ve done their job. But that IS what F&V do – cut your calories.

And then they go on to say:

Alternatively, higher consumption of specific vegetables, particularly green leafy vegetables, might reduce the risk of T2D due to the presence of relatively high concentrations of potentially beneficial bioactive compounds

Ah, the magic of kale! But note what they DO NOT say – they do not claim they made any measurements of this “plausible” explanation. In their methodology they didn’t split out leafy green vegetables. They didn’t measure anything about this. And, being honest, they don’t claim anything – they just hint at it. IOW, they repeat the “conventional wisdom” about the wonders of green leafy vegetables with NOT A SHRED of PROOF. So the old saying goes: repeat some claim often enough and people will believe it. A lot of people say they’ve seen UFOs, so they must exist. A lot of people read astrology, so stars must rule our destiny. Sorry, popularity of an idea is NOT PROOF.

And finally they say:

These findings support current public health recommendations encouraging consumption of F&V as part of a balanced diet and place particular emphasis on the important and independent role that both quantity and variety in F&V intake may play in helping to prevent the development of T2D.

Big surprise – they go along with the crowd. Now proof of the crowd view would be interesting and contradiction would be even more interesting, but really their statement about all this is very timid, certainly not deserving the headlines HuffPo gave this study, as if it were a done deal, all wrapped up, with scientifically proven solid answers.

And then there is this:

This study was supported by grants from the Medical Research Council, the Food Standards Agency, Cancer Research UK, and the British Heart Foundation.

In general this is good. At least, unlike the Mediterranean Diet study, which was paid for by an olive oil company and the California Nut Association (guess what that study found was wonderful), they don’t have any corporate sponsorship. But they don’t say what portion of their funding came from the underlined agency, which, of course, has a vested interest in this conclusion. I’d be a lot more impressed if their funding came from Beef Growers or McDonalds or a beet sugar association, where the study would be biting the hand that feeds it, rather than being funded by the nutrition “establishment”.



