Keep doing that and you'll go blind

Statistical failure of A Population-Based Case–Control Study of Extreme Summer Temperature and Birth Defects

Guest Post by Willis Eschenbach

The story of how global warming causes congenital cataracts in newborn babies has been getting wide media attention. So I thought I’d take a look at the study itself. It’s called A Population-Based Case–Control Study of Extreme Summer Temperature and Birth Defects, and it is available from the usually-scientific National Institutes of Health here.

Figure 1. Dice with various numbers of sides. SOURCE

I have to confess, I laughed out loud when I read the study. Here’s what I found so funny.

When doing statistics, one thing you have to be careful about is whether your result happened by pure random chance. Maybe you just got lucky. Or maybe that result you got happens by chance a lot.

Statisticians use the “p-value” to estimate how likely it is that the result occurred by random chance. A small p-value means it is unlikely that it occurred by chance. Strictly speaking, the p-value is the probability of seeing a result at least as strong as yours if nothing but random chance were at work. So a p-value less than say 0.05 means that there is less than a 5% chance of a result like that occurring by random chance alone.

This 5% level is commonly taken to be a level indicating what is called “statistical significance”. If the p-value is below 0.05, the result is deemed to be statistically significant. However, there’s nothing magical about 5%; some scientific fields more commonly use a stricter criterion of 1% for statistical significance. But in this study, the significance level was chosen as a p-value less than 0.05.

Another way of stating this same thing is that a p-value of 0.05 means that one time in twenty (1.0 / 0.05), the result you are looking for will occur by random chance. Once in twenty you’ll get what is called a “false positive”—the bell rings, but it is not actually significant, it occurred randomly.

Here’s the problem. If I have a one in twenty chance of a false positive when looking at one single association (say heat with cataracts), what are my odds of finding a false positive if I look at say five associations (heat with spina bifida, heat with hypoplasia, heat with cataracts, etc.)? Because obviously, the more cases I look at, the greater my chances are of hitting a false positive.

To calculate that, the formula that gives the odds of finding at least one false positive is

FP = 1 – (1 – p)^N

where FP is the odds of finding a false positive, p is the p-value (in this case 0.05), and N is the number of trials. For my example of five trials, that gives us

FP = 1 – (1 – 0.05)^5 = 0.23

So about one time in five (23%) you’ll find a false positive using a p-value of 0.05 and five trials.
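
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the formula. The function name false_positive_odds is my own label, and the calculation assumes, as the formula does, that the trials are independent of each other.

```python
def false_positive_odds(p, n):
    """Chance of at least one false positive in n independent trials,
    each tested at significance level p: FP = 1 - (1 - p)^n."""
    return 1.0 - (1.0 - p) ** n

# Print the chance for one, five, and twenty-eight trials at p = 0.05.
for n in (1, 5, 28):
    print(n, round(false_positive_odds(0.05, n), 3))
```

The same function can be reused for any number of comparisons, which is handy because the number of trials is what drives the result.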

How does this apply to the cataract study?

Well, to find the one correlation that was significant at the 0.05 level, they compared temperature to no less than 28 different variables. As they describe it (emphasis mine):

Outcome assessment. Using International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM; Centers for Disease Control and Prevention 2011a) diagnoses codes from the CMR records, birth defect cases were classified into the 45 birth defects categories that meet the reporting standards of the National Birth Defects Prevention Network (NBDPN 2010). Of these, we selected the 28 groups of major birth defects within the six body systems with prior animal or human studies suggesting an association with heat: central nervous system (e.g., neural-tube defects, microcephaly), eye (e.g., microphthalmia, congenital cataracts), cardiovascular, craniofacial, renal, and musculoskeletal defects (e.g., abdominal wall defects, limb defects).

So they are looking at the relationship between temperature and no fewer than 28 different outcome variables.

Using the formula above, if we look at the case of N = 28 different variables, we will get a false positive about three times out of four (76%).
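
The 28-variable figure can also be checked by brute force rather than by formula. The sketch below is a simulation under stated assumptions, not a re-analysis of the study: it pretends there is no real effect anywhere, so each of the 28 tests gets a p-value drawn uniformly between 0 and 1, and it treats the tests as independent. The function name, the number of simulated studies, and the seed are all my own choices.

```python
import random

def share_with_false_positive(n_tests=28, alpha=0.05, n_studies=20000, seed=1):
    """Fraction of simulated no-effect studies in which at least one of
    n_tests comparisons comes out 'significant' at level alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        # With no real effect, each test's p-value is uniform on [0, 1).
        if any(rng.random() < alpha for _ in range(n_tests)):
            hits += 1
    return hits / n_studies

print(share_with_false_positive())
```

The printed share should land close to the three-in-four figure from the formula, give or take sampling noise.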

So it is absolutely unsurprising, and totally lacking in statistical significance, that in a comparison with 28 variables, someone would find that temperature is correlated with one of them at a p-value of 0.05. In fact, it is more likely than not that they would find at least one with a p-value at or below 0.05.

They thought they found something rare, something to beat skeptics over the head with, but it happens three times out of four. That’s what I found so funny.

Next, a simple reality check. The authors say:

Among 6,422 cases and 59,328 controls that shared at least 1 week of the critical period in summer, a 5-degree [F] increase in mean daily minimum UAT was significantly associated with congenital cataracts (aOR = 1.51; 95% CI: 1.14, 1.99).

A 5°F (2.8°C) increase in summer temperature is significantly associated with congenital cataracts? Really? Now, think about that for a minute.

This study was done in New York. There’s about a 20°F difference in summer temperature between New York and Phoenix. That’s four times the 5°F they claim causes cataracts in the study group. So if their claim held, that a little extra heat means your kids will be born blind, we should be seeing lots of congenital cataracts, not only in Phoenix, but in Florida and in Cairo and in tropical areas, deserts, and hot zones all around the world … not happening, as far as I can tell.

Like I said, reality check. Sadly, this is another case where the Venn diagram of the intersection of the climate science fraternity and the statistical fraternity gives us the empty set …

w.

UPDATE: Statistician William Briggs weighs in on this train wreck of a paper.

153 Comments
rgbatduke
December 20, 2012 6:49 am

Jason T, it is good to see that they did a better analysis than appears “at first blush” as you put it, but they ignored the elephant while focusing on the mouse. There is nothing in science that is more dangerous than fishing for hypotheses in some vast pile of highly multivariate data. It is, in fact, a rather complicated logical fallacy.
The elephant involves — as Willis (and I) point out — simply looking at accurately known prevalence rates drawn from large populations that live all the time at temperatures that vary by 5F or more. Of course one doesn’t expect those prevalence rates to match perfectly even for large populations drawn from different parts of the world, because the population genetics is generally different, the background of confounding variables like immunization, exposure to teratogenic substances, radiation levels (which are a function of height above sea level, latitude, and specific environmental factors local to particular places) are all different. So even if they were different for these huge populations one’s utter inability to control for the confounding variables (most of which are simply unknown and unknowable) makes it impossible to resolve a temperature-based difference, although one can place an upper bound on it quite easily. A very, very small one.
To do better, one would require all the data in the world. Literally. Then one could coarse grain it (splitting the globe up with some appropriate tessellation of the sphere), visit all the tesserae, measure or estimate all of the confounding factors and the mean temperature, and do god’s own fully multivariate statistical analysis. Ultimately, you’d need to show that temperature has unique explanatory power not confounded by mere accidental covariance with e.g. ultraviolet exposure in tropical climates, pollution in industrializing third world countries like India (which is currently phenomenal) and so on.
As for the author’s argument that this is somehow “plausible”– humans are homeothermic animals. This means that it does not matter as a general rule what the temperature outside your body is — inside, it is 98.6F plus or minus half a degree F. When it is higher, we say that the individual has a fever. When it is lower, we say that the individual is hypothermic. Both are serious conditions, no doubt, but fevers occur routinely and no doubt occurred throughout the two populations throughout pregnancy at a more or less normal rate, which makes it rather probable that individuals in the populations had a fever for at least a few days while pregnant.
And here again, we have a very, very simple statistical test, one that the authors would surely have investigated if they actually gave a damn about being accurate rather than creating hysteria. For example:
http://www.healthline.com/health-blogs/fruit-womb/fever-pregnancy
What? Fever (hyperthermia) during pregnancy is associated with birth defects? Damn skippy it is! And what is the most common cause?
Over the years, several studies have confirmed that temperature elevations in pregnant women accompanying influenza and common cold virus infections are associated with greater risk for congenital anomalies, multiple and isolated, especially, neural tube defects.
Wait, wait, did he say neural tube defects? Not congenital cataracts? So that if the women in question actually had elevated body temperatures (hyperthermia) from having been in a warmer climate for one whole week of their pregnancy (snowbird trips to Disney World during pregnancy now being contraindicated) then it is almost certain that an increase in neural tube defects would have been detected before an increase in congenital cataracts?
Oh, and while we are at it, why not look in the literature at the connection between actual fever-induced hyperthermia and congenital cataracts? Let’s see:
http://eyewiki.aao.org/Congenital_and_Acquired_Cataracts_in_Children
The causes of infantile cataracts have been the source of much speculation and research. Making a distinction between unilateral and bilateral cataracts may be useful when considering etiology.
The majority of bilateral congenital or infantile cataracts not associated with a syndrome have no identifiable cause. Genetic mutation is likely the most common cause.

and
Systemic associations include metabolic disorders such as galactosemia, Wilson disease, hypocalcemia and diabetes. Cataracts may be a part of a number of syndromes, the most common being trisomy 21. Intrauterine infections including rubella, herpes simplex, toxoplasmosis, varicella and syphilis are another cause.
Then there is this one:
http://www.ncbi.nlm.nih.gov/pubmed/16323161
(from 2005). Conclusion: Some isolated congenital cataracts are preventable by rubella vaccination and probably by influenza vaccination in the epidemic period. In addition, our results suggest that using antifever therapy for fever-related respiratory diseases may restrict the teratogenic risk…
although also (and very interestingly):
A higher prevalence of influenza or common cold during pregnancy was found in the case group (55.9%) than in the population control group (18.5%; adjusted odds ratios [ORs], 5.8; 95% confidence interval (CI), 4.0-8.4) or in the malformed control group (21.7%; adjusted OR, 4.7; 95% CI, 3.2-6.9).
In other words, a fever during pregnancy almost certainly is a causal factor in congenital cataracts! And what are the most common causes for fever? The flu or the common cold, although rubella (in places where MMR vaccination is not routine) or any other illness likely to cause a high fever increases your risk by a factor of three (more if you adjust it to reflect the gaussian distributions).
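
A short sketch of where the numbers come from, using only the two percentages quoted above. The "factor of three" is the plain ratio of those percentages, while an odds ratio compares odds rather than percentages and so comes out larger. The crude figure computed here is unadjusted, so it is only expected to be near the adjusted 5.8 quoted from the abstract, not equal to it. The variable and function names are mine.

```python
def odds(p):
    """Convert a proportion into odds, p / (1 - p)."""
    return p / (1.0 - p)

case_rate = 0.559      # flu or cold during pregnancy, case group (quoted above)
control_rate = 0.185   # flu or cold during pregnancy, population controls

# Plain ratio of the two percentages: the "factor of three".
prevalence_ratio = case_rate / control_rate
# Crude (unadjusted) odds ratio from the same two percentages.
crude_odds_ratio = odds(case_rate) / odds(control_rate)

print(round(prevalence_ratio, 2), round(crude_odds_ratio, 2))
```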
OK, so this is probably enough. We can see that sustained hyperthermia during pregnancy can very probably cause congenital cataracts as a birth defect outside of its association with familial inheritance or e.g. trisomy of chromosome 21. So, is sustained hyperthermia associated with “heat waves” in the summertime? Do people get a “fever” when it is hot out?
And what about flu and the common cold? Are they more likely when it is hot outside?
People do indeed become hyperthermic during a heat wave — primarily the very old and very young. Hyperthermia is defined in this case as a body temperature over 100F (sustained) without having a fever, that is, without an external cause such as a virus. It isn’t normal — as I said, people are homeothermic and unless they are stressed or exposed in an unrelenting way to higher temperatures (such as working out in the sun, engaging in outdoor sports in the sun) without access to water or shade on a high-humidity day they tend to stay under 100F even when it is hot out. Extended exposure leads to “heat exhaustion” or “heat stroke”, with symptoms very similar to those of correspondingly high fevers.
However, pregnancy is not listed as a risk factor in getting heat exhaustion when it is hot out. It isn’t an implausible one (women are under stress, their surface to volume ratio changes, their baseline metabolic rate can be higher), but surely heat exhaustion would show up in the individual case histories of the people in the study, not just “exposure” to high temperatures. Where I live (North Carolina) it is always five degrees Fahrenheit warmer than it is in upstate New York (where I also lived for a long time). But here we have air conditioning, or fans, or the sense not to sit outside in the sun on a hot and humid day when we are pregnant (if we can avoid it, and most can, not to protect their baby but because being hot and sticky is uncomfortable).
Then there is the flu/cold factor. According to the CDC (see e.g. the link below),
http://www.cdc.gov/flu/weekly/index.htm#MS
one is hundreds of times more likely to get the flu in the winter months than the summer. The cold (also associated with the birth defect, recall) follows a very similar pattern — summer colds are rare, winter colds are commonplace. In fact, almost all the respiratory infectious diseases seem to peak in the wintertime. One’s chance of having a fever in association with these diseases is therefore many times higher in the winter than in the summer, which one would expect to partially or completely cancel any general hyperthermia-linked bump in the summertime.
In other words, even if we accept the perfectly reasonable hypothesis that having an elevated core temperature for an extended time while pregnant is teratogenic and could result in a variety of birth defects, with neural tube defects leading the way, the risk of febrile infections peaks in the colder months and must at least partially counterbalance the risk of hyperthermia due to excessive summer heat. Indeed hyperthermia is a lot less likely to be sustained for periods longer than a few hours, while fevers can last for days, so one would expect it to be a smaller factor.
Of course, in both cases simple common sense measures can provide protection. Get a flu shot before you get pregnant, and avoid sick people while you are pregnant. If you get sick anyway, use an approved anti-febrile agent to reduce your fever until you get better. If you are pregnant, it isn’t a good time to play tennis on a sweltering day, and you need to drink plenty of fluids.
That takes us down to the final bit: linking this to climate change. First of all, there isn’t any evidence that the climate is “changing” at this particular moment. The weather every year is somewhat different, yes, there are heat waves and cold spells. If you look at the CDC data above, you’ll see that there were eight times as many pediatric deaths from flu in the cold winter of 2009-2010 as there were in last year’s mild winter, just as there were more deaths from heat stroke during last summer than there were in many a comparatively milder summer in years before. Weather extremes of any sort cause distinct problems, and weather extremes happen somewhere all the time and everywhere some of the time. There is no evidence that these extremes are changing their distribution. There is no evidence that the climate is “currently” warming, where the meaning of the word currently with respect to climate is a rather sticky question (averaged over just what window).
But the bottom line is that the paper above produces a weak statistical link — one that inside the paper it is acknowledged to be too weak to form any sort of conclusion, because they saw the incidence of a different sort of birth defect go down in the same population at the 0.05 significance level, and it is as absurd to assert that the warmer weather was protective of one as it is causative of the other. But the paper title says otherwise, and it has already been grabbed and dumped onto environmental sites as “proof” of one more danger of CAGW. No doubt this will get added to the absurd epidemiological studies that claim to tell us how many additional people die every year “because of AGW” (count the begged questions) while they never seem to subtract the people that lived “because of AGW”, like the 250 extra children who didn’t die of the flu in 2011-2012 compared to 2009-2010, or the people who didn’t starve to death because of a premature frost in farm country.
Given the flaws, the paper should not have been accepted, not with its title. Basically what they showed is that there are no statistically significant correlations between “heat waves” and birth defects visible in a study of this size. This is strongly supported by a two-minute analysis of prevalence rates in different climates plus the fact that neural tube defects (the “coal mine canary” as it were) should have the greatest sensitivity to hyperthermia from all causes and was not observed to bump, something they failed to look at and that forms a powerful Bayesian prior that still further reduces the probable significance of their result. Their title claims otherwise.
Sadly, their title is understandable. Nobody wants to publish null results (even though, as Feynman pointed out, often they are the most valuable results to publish). Who will fund further work if you don’t get anything the first time around? And here, they get to tie their work to not one, but two demons — birth defects and the horrors of Global Climate Change. Funding for more detailed work is assured, and if that future work produces a null result, not a single soul in the lay population will ever hear a word about it — the urban legend is already established and this paper will never die.
rgb

December 20, 2012 6:51 am

Scarface says:
December 20, 2012 at 12:27 am
I always thought people were homeotherm.
And that an unborn baby would be growing in a steady 37C environment.
How could a baby in the womb notice any change in temperature outside?
And how would it affect him?
================================================================
It’s a sighting issue.

Jimbo
December 20, 2012 6:57 am

Doesn’t the temperature of a mother’s womb stay roughly the same even if there is a 5-degree [F] increase in summer????
Causes of congenital cataracts.
http://www.nlm.nih.gov/medlineplus/ency/article/001615.htm

David A. Evans
December 20, 2012 6:59 am

The Harvard Nurse Health Study is another data dredge.
Nothing worthwhile ever comes out of that one either.
DaveE.

Gail Combs
December 20, 2012 7:01 am

E.M.Smith says:
December 19, 2012 at 8:53 pm
Just Amazing…
Another case of “Climate Science” done by folks who took one Stats class, then forgot most of it….
>>>>>>>>>>>>>>>>>>>>>>
I doubt they even took a statistics course at all depending on the school. Probably some prof. spent a day going over how to use a stat. computer program and that was it. (Based on my undergrad required courses in chemistry and the one day intro to the field of statistics in an analytical chemistry class.)
For example a Biological Sciences major at Illinois State University (link) requires NO STATISTICS at the undergraduate level but does require a course in Ecology and Biological Diversity.
In the masters program you finally get one course **BSC 490/420.27 Biostatistics/Biostatistics Lab – 4 Credit Hours and an elective of BSC 450.37 Advanced Studies in Biostatistics – 3 Credit Hours
BSC 490/420.27 Biostatistics/Biostatistics Lab
This is a graduate course introducing students to applied statistics and data analysis using SAS. The goal is to prepare graduate students for using and understanding common statistical methods in Biological Sciences.
Actual course outline link

COURSE GOALS: This course is an introduction to applied statistics. The ideas and methods discussed will be those most relevant to biologists in general. You will acquire a working knowledge of basic statistical methods, and will be able to determine which procedures are most appropriate for a given circumstance. All of the statistical techniques relevant to biologists cannot be covered in one semester, however, once you have mastered the material in this course, you will be better equipped to understand and use more advanced statistical methods.
In the laboratory portion of this course you will gain experience in the use of the SAS computer package for statistics. There are a number of good statistical packages available, and some of you may already know how to use some of these. I will give examples and explain how to do things in SAS, and all of you will do the assignments using SAS. By learning enough about general aspects of statistical computation and interpretation, you will be able to generalize to other packages if you so choose.

Depending on the teacher this could be just a course on how to plug numbers into a computer with little fundamental information on the correct use of statistics. The reading assignments do look reasonable however.

Area Man
December 20, 2012 7:03 am

There is actually a wonderful opportunity here; of the 28 groups, one showed a predicted increase with increasing temperatures. But it’s likely one or more also showed a predicted decrease (!) of birth defects with increased temperature. A request for the raw data from the researchers will confirm this. If in fact it is the case, one would wonder why the BENEFITS of warming were not reported!
Of course Willis is correct and neither the predicted increase nor the decrease is likely real, but it would be great to hear the researchers explain away any selective omission of a predicted BENEFIT.

Gail Combs
December 20, 2012 7:18 am

John West says:
December 19, 2012 at 11:40 pm
Forgive me Willis for not sharing in your amusement…..
>>>>>>>>>>>>>>>>>>>>
Occasionally we have to laugh to relieve the stress of realizing how very serious this whole subject is. It keeps sceptics from going postal. (High moral standards also do that)

Gail Combs
December 20, 2012 7:23 am

Scarface says:
December 20, 2012 at 12:27 am
I always thought people were homeotherm…..
>>>>>>>>>>>>>>>>>>>>>>>>>
Yes we are and even in 100F (38C) I normally run 97.0F (36C)
In NC I run into 95F to 100F a lot since I spend a lot of time outside.

December 20, 2012 7:23 am

D Böehm says:
December 19, 2012 at 10:15 pm
Being transparent to RF frequencies means that RF energy is not felt by our bodies’ cells. The cell phone/cancer scare is as fake as the AGW scare.
=============
Water is not transparent to RF. It is the principle behind microwave cooking.
HAM (amateur radio) warns of the dangers of transmitting with an antenna held near to your head. This warning is given repeatedly in bold faced type, in all certification manuals and at all levels.
The problem with new technology is that we have not been genetically selected for adaptation to it. Thus, by chance some individuals will be killed off by the technology until the gene pool consists mostly of those individuals that are resistant to any potentially harmful effects. A similar experiment with artificial fats introduced during WWII created an epidemic of heart disease in some western countries. Those countries that avoided the artificial fats, such as France, have low incidence of heart disease while eating a high cholesterol diet.
Yet to this day, while many scientists recognize the connection between fat and heart disease, they fail to recognize that only certain types of fats are a problem, and only to those individuals that are genetically susceptible. Those fats that have been eaten for many generations will not be a problem, because our ancestors would have long been killed off by them and we would not be here.
These fats may even have a protective effect, by blocking receptor sites in the body that would otherwise be occupied by artificial fats. Thus, by stressing a diet low in fat, scientists may in fact be giving the wrong advice and the wrong treatment. The Mediterranean diet shows that the cure to heart disease may well be to eat more fat rather than less, so long as the fat is the same as what your ancestors ate.
There are many parallels between this situation and climate science. Faulty scientific conclusions based on faulty use of statistics, with a cure that may be more harmful than the disease.

Gail Combs
December 20, 2012 7:34 am

Alan the Brit says: December 20, 2012 at 1:28 am
….I find it sad that as “ALL CHEMICALS” cause cancer….
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Perhaps we can persuade ALL the Environmental Activists to take up residence in a chemical-free environment for a week or so, or even a CO2-free sealed environment for a day. (snicker)

Gail Combs
December 20, 2012 7:50 am

Lewis P Buckingham says:
December 20, 2012 at 3:23 am
…Further, quoting animal studies, it should be remembered that congenital cataract formation is a feature of some lines of dogs, especially in the cocker spaniel and has a strong genetic component….
>>>>>>>>>>>>>>>>>>>>>>>>
Just to add, it is very strong in some lines of horses/ponies, esp. the Appaloosa. Appaloosas are one of a few breeds that can have cataracts present at birth.

rgbatduke
December 20, 2012 7:52 am

That the researchers tested 28 independent hypotheses at once rather than publishing 28 different papers is irrelevant to the significance of the individual results since they are, indeed, independent.
Sadly, this exhibits a profound lack of understanding of statistics. For one thing, the hypotheses were hardly independent, given similar mechanisms and an identical hypothesized cause. For another, the paper did present 28 results at once, but its focus was clearly on demonstrating a causal connection, not disproving one.
Look, one has a choice. One can publish a single paper that shows the distribution of p for all 28 tests because you assert that they are effectively independent trials, and perform a Kolmogorov-Smirnov test against the uniform distribution to see if the distribution of p is unexpected given 28 supposedly independent results or one can publish 28 independent papers, one per result, 26 of which are papers that announce a null result (each with its own value of p), one of which announces that summertime heating is associated with a protective effect against one kind of birth defect at the 0.05 level (yes, they in fact found that to be the case) and one of which announces a causative effect at the same level of significance. This did not happen. The paper did not present “the distribution of outcomes in a series of tests was totally normal and identical to what one would have likely gotten using simulation with fair dice and the null hypothesis” because that wouldn’t have been “exciting” and nobody would have read their paper. It was:
Population-based case-control study of extreme summer temperature and birth defects.
synopsized as:
Higher summer heat is associated with increased risk of a rare birth defect that can lead to blindness. A 5-degree increase in temperature was associated with a 51 percent increase in congenital cataracts. The strongest link in this new study was found during a specific time of pregnancy when the eye is developing. The finding could be worrisome given climate change and increases in extreme hot-weather events.
Where is the bit about its association with a reduction of a different sort of eye birth defect? Oh, only deep in the article.
All I can say is given 28 draws, it would have been very slightly remarkable if one of them had not had a 0.05 hit (and they got two, one on each end of things). Given 100 draws, it would have been very remarkable indeed.
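
The Kolmogorov-Smirnov check described above can be sketched in a few lines of plain Python. This is a rough illustration only: the two lists of p-values are invented for the example and are NOT from the paper, the function names are mine, and the cutoff 1.36 / sqrt(n) is the usual large-sample approximation to the 5% critical value, which is only approximate at n = 28.

```python
import math

def ks_statistic_uniform(p_values):
    """Largest gap between the empirical CDF of the p-values and the
    uniform CDF F(x) = x on [0, 1]."""
    xs = sorted(p_values)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        # Check the gap just after and just before each step of the ECDF.
        d = max(d, i / n - x, x - (i - 1) / n)
    return d

def looks_non_uniform(p_values, coeff=1.36):
    """Rough 5% check using the large-sample critical value coeff / sqrt(n)."""
    return ks_statistic_uniform(p_values) > coeff / math.sqrt(len(p_values))

# Hypothetical data: 28 p-values spread evenly, as pure chance would tend to give.
evenly_spread = [(i + 0.5) / 28 for i in range(28)]
# Hypothetical data: 28 p-values piled up near zero, as real effects might give.
piled_low = [0.01 * (i + 1) / 28 for i in range(28)]

print(looks_non_uniform(evenly_spread), looks_non_uniform(piled_low))
```

The point of the design is that the test looks at all 28 p-values together, so one hit at each end of the range does nothing to make the whole set look unusual.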
Hypothesis testing is one of my primary games. One of the dumbest things about it is that 0.05 is a commonly accepted measure of significance. This is absurd — 1 in 20 chances happen all the time. This just means that, on average, at least 5% of what is published with this as the standard is crap.
The real proportion of crap (in the medical profession) is much higher because the whole medical research establishment is “infected” with rampant confirmation bias and cherrypicking — where failure to produce a KS-test on all the tests or portray the distribution of p at all is a form of cherrypicking — and further distorted by the strong ties between positive results of any sort and fame, fortune and funding at the independent research level, and profit at the corporate research level. If you’ve dropped a billion on developing a drug, you’d better find a population that will pay you two billion for it. If getting tenure or keeping your funding is dependent on your finding something “interesting” as opposed to a null result, well, you’ll find something interesting.
Pick a disease, any disease. Let’s pick something easy, such as getting the flu, which (as it happens) has a known cause. Nay, pick twenty-eight such diseases. I don’t really care what they are. Now take a population (and again, I don’t really care what the population is, or how large it is; let’s say 66,000, why not). Pick any distinguishable variable you like, such as whether or not one is left handed, or blue eyed. Do a simple correlation between the binary values of this variable and all twenty-eight diseases, and you are likely to find that something (maybe Staph. aureus, maybe influenza, maybe HIV, maybe chicken pox) is correlated with being blue eyed at the 5% significance level. That’s what makes this sort of shotgun approach crap!
In fact, take a fair twenty sided die and a common, ordinary unbiased coin. Roll the one a million times, flip the coin a million times. Form the correlation of one with the other. On average (repeating this entire experiment a million times) one of the correlations will be unlikely at the 5% level in every trial.
You cannot infer that the coin and the die are not independent by looking at only this one case, and that’s the first question of relevance here. It is, after all, difficult to assert that the coin and some side of the die are correlated when the data do not suffice to reject the null hypothesis that they are, in fact, independent. One necessarily implies the other.
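
The die-and-coin experiment is easy to try at a smaller scale. The sketch below is one possible reading of it, with my own assumptions spelled out: "correlation" is taken to mean a simple two-sided z-test, per face of the die, of whether the coin's share of heads on that face differs from one half; the rolls are cut from a million to 2,000 and the repeats to 200 to keep the run short; the function name and seed are arbitrary.

```python
import math
import random

def count_significant_faces(n_rolls, rng, z_crit=1.96):
    """Roll a fair 20-sided die and flip a fair coin n_rolls times, then
    count the faces whose share of heads differs from 1/2 at the 5% level."""
    rolls = [0] * 20
    heads = [0] * 20
    for _ in range(n_rolls):
        face = rng.randrange(20)
        rolls[face] += 1
        heads[face] += rng.randrange(2)
    significant = 0
    for k, h in zip(rolls, heads):
        if k == 0:
            continue  # face never came up; nothing to test
        # Normal approximation to the binomial under the null of a fair coin.
        z = (h - k / 2) / math.sqrt(k / 4)
        if abs(z) > z_crit:
            significant += 1
    return significant

rng = random.Random(2012)
repeats = 200
average = sum(count_significant_faces(2000, rng) for _ in range(repeats)) / repeats
print(average)
```

With twenty faces each tested at the 5% level, the average count of "significant" faces per experiment should come out near one, even though the coin and the die have nothing to do with each other.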
So you tell me: does the evidence suffice to reject the null hypothesis that heat waves are independent of birth defects? Because if they are causative of a birth defect in particular, the data had better support this first, don’t you think?
But of course, nobody has heard of KS tests unless they actually know what they are doing, and sadly so very few people know what they are doing when it comes to hypothesis testing. Sometimes including actual statisticians — it is a difficult subject where Bayesian analysis and mad skills are often key.
The only good reason to believe that the heat is causative is because there is a secondary correlation with “the time that the eye is developing”, plus the fact that there is a known causative association with hyperthermic conditions produced by fevers. But there is no definitive proof that the individual women in question were, in fact, hyperthermic during their exposure, nor is there any sort of statistically significant difference between prevalence rates in radically different global communities with far greater temperature differences than studied.
Also be aware that we’re talking tiny numbers — expected prevalence in the control population is only 24 (really, a bit less!) and a “fifty one percent increase” is a dozen extra cases, which pushes it barely out to two sigma and a hair. A rule of thumb in statistics is that you need at least a population of 30 before one can really start to rely on the central limit theorem, and most people I know who do this sort of thing would cheerfully extend that to 100.
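
The arithmetic behind that estimate can be laid out explicitly. This is a back-of-the-envelope sketch that assumes simple Poisson counting noise, so that one sigma is the square root of the expected count; that assumption is mine, and the 24 expected cases and 51 percent increase are the figures quoted above.

```python
import math

expected_cases = 24     # expected count in the control population (quoted above)
increase = 0.51         # the reported 51 percent increase

extra_cases = expected_cases * increase
# Assumption: Poisson counting noise, so sigma = sqrt(expected count).
sigma = math.sqrt(expected_cases)

print(round(extra_cases, 1), round(extra_cases / sigma, 2))
```

Under that assumption the excess is about a dozen cases and sits between two and three sigma, which is the "barely out there" territory described above.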
I’d rather rely on the prevalence rate in south India, where most women go through summertime pregnancy exposed to temperatures that are routinely 10F warmer than those studied in New York and where whole populations live without air conditioning, which is not 51% higher than the prevalence in New York (where there is lots of air conditioning in the summertime besides being a lot cooler), the US, the UK, or anywhere else in the temperate or arctic zone.
Really, don’t you think that more or less squashes the question entirely, at least until the data suggests that external environmental temperature causes birth defects at all, ideally in association with specific cases of noted/measured hyperthermia? The latter I’d believe, given the fever-birth defect linkage already in the literature, but somewhere in there one has to show that pregnant women are almost certain to get hyperthermia in hot summers to make the connection in this paper at all plausible. And this I doubt.
rgb

TonyBerry
December 20, 2012 7:55 am

This is another example of poor meta-analysis from the department of spurious statistics. I spent 30 years in the pharma industry closely involved with clinical studies on drugs, an area where meta-analyses abound! As examples, we have all heard of the supposed correlation (or not) between electromagnetic fields and cancer, and between HRT and cancer. Both are the result of many confused meta-analyses chasing the golden grail of P <= 0.05, generally not worth the trees they are written on. Good analyses require prospective studies which address the issues of confounding variables and which are designed so that the outcome of the study is blinded to interested parties until the trial code is broken. Even in this case P <= 0.05 would be considered borderline, and strong correlations would need at least P <= 0.01. In open and uncontrolled meta-analyses these spurious P values are worthless unless they are used to indicate the direction of further real research. This principle also applies to most, if not all, meta-analyses used in climate science, which as far as I can see rarely show high statistical correlations and, since there is usually no attempt to control or analyze confounding variables, are again not worth the multitude of trees, both as subjects of analysis and as paper used to write them on. Unfortunately this also applies to both pro and anti global warming camps (just in case this upsets anybody, I'm strongly anti!). This is also the issue which divides Leif from the rest of the anti groups who regularly feature on this website. As I understand it, Leif's view is that something highly significant is happening to the sun right now, based on his observations; however, he believes he has yet to see any clear causal parameter which links changes in the sun with the climate on the earth. That's a rigorous position to take, but I think it lacks any proposal for the linking mechanism, i.e. a testable hypothesis.
In the anti-Leif camp correlations abound without any testable hypothesis which might indicate causality, with the exception perhaps of the Cloud Experiments, which I assume are still ongoing.
Finally, in the areas of science where meta-analyses abound you find frequent, invalid attempts to slice and dice the data to get a better fit (retrospectively), like the infamous tree ring studies (not worth a tinker's cuss) and switching the basis of the analysis (the "hiding the decline" method), all of which have nothing at all to do with the science and everything to do with protecting reputations, maintaining grant funding or political agendas (both non-government – the green lobby, and government – distraction from the real and difficult issues affecting the state). Such is the poor state of climate research. Ho Hum!

December 20, 2012 7:58 am

The main problem is, looking at so many different things one must feel bound to find something. I see lots of life science studies that have marginal or questionable statistical effects. Statistical significance is by definition to in the life sciences it often is essentially meaningless, except to the dogma pushers.

rgbatduke
December 20, 2012 8:08 am

There is actually a wonderful opportunity here; of the 28 groups, one showed a predicted increase with increasing temperatures. But it’s likely one or more also showed a predicted decrease (!) of birth defects with increased temperature. A request for the raw data from the researchers will confirm this. If in fact it is the case, one would wonder why the BENEFITS of warming were not reported!
There was and they were. Sort of. The paper correctly — at some point — notes that the data don’t suffice to show either one. They call — naturally — for more work to be done because of this. But AFAIK they didn’t do a full KS test on the distribution of p from all 28 tests (which wouldn’t have been very reliable, but might have been revealing in its own right even as a population histogram). The most important question is whether or not this distribution is or is not more or less flat. If it is flat enough that the KS test yields a reasonable p-value, one cannot reject the null hypothesis of “heat waves do not cause birth defects” just because one birth defect in the distribution tested came out with any particular value of p.
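For anyone who wants to see what a KS test on a batch of p-values looks like, here is a minimal pure-Python sketch. The function name `ks_uniform` is made up, and the p-value uses the standard asymptotic Kolmogorov series with the usual small-sample correction, which is only an approximation for 28 points. For real work an established statistics library is the better choice.

```python
import math


def ks_uniform(p_values):
    """One-sample KS test of p_values against Uniform(0, 1).

    Returns (D, approx_p): D is the largest gap between the empirical CDF
    and the uniform CDF; approx_p is an asymptotic approximation.
    """
    xs = sorted(p_values)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        # Empirical CDF jumps from i/n to (i+1)/n at x; uniform CDF is x.
        d = max(d, (i + 1) / n - x, x - i / n)

    # Small-sample correction, then the Kolmogorov series
    # Q(lam) = 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 lam^2).
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:
        return d, 1.0  # the series is indistinguishable from 1 here
    total = 0.0
    for k in range(1, 101):
        total += 2.0 * (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
    return d, min(1.0, max(0.0, total))
```

Fed the 28 p-values from the paper, a comfortable KS p-value would say the collection looks like what chance alone produces, whatever the single smallest p happens to be. A tiny KS p-value would say the distribution is visibly not flat.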
Here is a really lovely thing I remind people who don’t understand hypothesis testing. It is just as likely to get a p-value in between 0.475 and 0.525 — which nobody would reject — as it is to get a p-value in between 0.00 and 0.05, for any test where the null hypothesis is statistical independence so that outcomes are randomly distributed. You would be precisely as well-justified in rejecting a random number generator, or any other process designed to produce a p-value (that is, if correctly done, a uniform deviate per test) because it gave you a number in the former range as in the latter.
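The uniformity claim can be checked by simulation. The sketch below is an illustration under assumptions of my own choosing: each "study" is a two-sided z-test on the mean of 30 standard normal draws with the null hypothesis true by construction, and the helper names are hypothetical.

```python
import math
import random


def null_p_values(n_tests, sample_size=30, seed=0):
    """p-values from n_tests z-tests in which the null hypothesis is true."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(n_tests):
        mean = sum(rng.gauss(0.0, 1.0) for _ in range(sample_size)) / sample_size
        z = mean * math.sqrt(sample_size)  # known sigma = 1
        p_values.append(math.erfc(abs(z) / math.sqrt(2)))  # two-sided p
    return p_values


def fraction_in(p_values, lo, hi):
    """Fraction of p-values falling in [lo, hi)."""
    return sum(lo <= p < hi for p in p_values) / len(p_values)


ps = null_p_values(20000)
print("share in [0.000, 0.050):", fraction_in(ps, 0.0, 0.05))
print("share in [0.475, 0.525):", fraction_in(ps, 0.475, 0.525))
```

Both shares should land near 5%, because a correctly computed p-value under a true null is a uniform deviate and any interval of width 0.05 is equally likely.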
Hence the strong need for KS testing of the distribution of p, or the insistence that a p value be really low to reject the null hypothesis, not 0.05. I don’t even like 0.01. 0.001 is OK, depending on how many samples you plan to draw, but I play with random number generators far more than is healthy for any fully grown man, and they produce numbers smaller than 0.001 one thousandth of the time. I like to reject null hypotheses when p from a good KS test applied to the p-values produced by many independent “runs” of some testing process gets down to 10^{-6} or so. That’s starting to be pretty unlikely, and the distribution of p in that case would almost certainly be visibly non-uniform.
Or one can (sometimes) do one’s stats on the cumulated data and get to the same place, even more accurately. In this case one can easily do this by comparing population prevalence between tropical and temperate climate countries. If the paper we are discussing is correct, it predicts that there should be at least a 50% greater prevalence in the former than the latter, absolutely clearly resolved. Furthermore, one should be able to associate the outcome with specific cases of clinical hyperthermia, hyperthermia that somebody actually measures, and relate the surplus to increased risk due to e.g. fevers in a quantitative way.
It isn’t that the claim is falsified by the study. It is that it is absurd to claim that it is verified, and a meta-glance at the overall data suggests that it is unlikely and shouldn’t even be thrown out as anything but a null result without answering a lot of questions that this study does not answer.
rgb

Justthinkin
December 20, 2012 8:14 am

David L says:
December 20, 2012 at 4:18 am
By the way, is there anything bad that global warming won’t cause?
Well…it does seem to lower the IQ of so called “climate scientists” by about 2 orders, so…oh wait…that is bad.

rgbatduke
December 20, 2012 8:20 am

HAM (amateur radio) warns of the dangers of transmitting with an antenna held near to your head. This warning is given repeatedly in bold faced type, in all certification manuals and at all levels.
I looked at this, in detail. The issue is one of skin depth at the frequency in question, power in the transmitter, the utter lack of resonant transitions (this is not ionizing radiation). A cell phone is about as dangerous as a flashlight held against your head. Less dangerous than a powerful flashlight. Or going out into the sun. The sun is way, way more dangerous — it actually does hit you with dangerous ionizing UV (which is indeed carcinogenic in measured ways) in addition to some 300 to 700 times as much broadband electromagnetic intensity as your cell phone. The skin depth at the frequency in question is around 1 cm, meaning that most of the less than 1 W total power doesn’t even make it inside your skull, and becomes a warming “noise” utterly indistinguishable from your baseline core thermoregulated temperature.
I’m a physicist. If you want me to believe in a cell-phone/cancer connection, you’re going to have to show me how the radiation can degrade DNA. It ain’t doing it by heating it. Again, beware results at the edges of “statistically significant” by the paltry 0.05 standard, pulled out of shotgun population studies or advanced by people infected with confirmation bias and with a dog in the race. Beware even then of confounding factors: maybe cell phones leach heavy metals into the air while operating and it is these that are toxic, or people who own cell phones are likely to drink coffee decaffeinated by rinsing with an organic solvent that remains residual in the coffee and causes cancer. Correlation isn’t causality, and causality has to be physically plausible if not verified. Show me cell phones degrading DNA at standard powers from 2\pi steradian solid angle radiation at a distance of 2 cm and through a layer of watery skin and bone.
rgb
REPLY: I agree. As a broadcaster, I can say we don’t have such warnings mandated for people who work around TV/radio transmitters where ERP is in the tens to hundreds of kilowatts range. If there were issues at the VHF/UHF frequencies and power we routinely deal with, surely the FCC would have mandated warning labels. Microwave frequencies typically will cook your eyes before your brain (a standard warning from my days working on S band and C band weather radars) with pulsed ERPs in the megawatt range, so I tend to laugh at the worries over cell phone and WiFi router radiation in the milliwatt range. I’m also an amateur radio operator, and with the exception of a few worrywarts in my circle, few care about the issue. Note that police and firemen carry the same sort of equipment daily, and if there was some provable cancer causality, you can bet the police and fire unions would be all over it for improved health care/risk benefits, but they aren’t – Anthony

rgbatduke
December 20, 2012 8:30 am

TonyBerry says:
December 20, 2012 at 7:55 am

…a bunch of stuff I agree with so strongly that it makes me weep to read it. I swear, nobody understands either statistics or quantum mechanics (which is kind of like complex statistics, which might explain the latter difficulty).
Nothing helps you see the light regarding hypothesis testing more than writing something like dieharder, which is an engine for doing nothing but hypothesis testing. Play with dieharder and a perfectly good random number generator and you’ll quickly come to appreciate Marsaglia’s poignant observation that “p happens”. Read some of my remarks in the documentation and you’ll see why one cannot produce “certified random numbers” by producing sequences that pass the whole battery of tests, because an ensemble of such numbers will always fail the tests, they cannot possibly be random (you’ve cut out the tails!). Read about why you expect a handful of tests to fail on any given run at the 0.01 level (there are hundreds of tests that generate p values in a run) — and then there are the broken tests.
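The "p happens" arithmetic is short enough to write down. This sketch assumes the tests are independent, which is a simplification, and uses 200 tests as a stand-in for "hundreds"; both function names are hypothetical.

```python
def expected_false_failures(n_tests, alpha):
    """Expected number of tests that fail at level alpha when nothing is wrong."""
    return n_tests * alpha


def prob_at_least_one(n_tests, alpha):
    """Chance that at least one of n_tests independent tests fails at alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


# A battery of 200 tests at the 0.01 level (200 is an assumed figure).
print(expected_false_failures(200, 0.01), round(prob_at_least_one(200, 0.01), 3))

# The 28 birth-defect groups mentioned earlier in the thread, at the 0.05 level.
print(expected_false_failures(28, 0.05), round(prob_at_least_one(28, 0.05), 3))
```

Under these assumptions a perfectly good generator fails about two tests per run, and 28 comparisons at the 0.05 level turn up at least one "significant" result roughly three times out of four.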
Then we can talk about predictive modeling in strongly multivariate nonlinear milieu. Oh. My. God.
rgb

rgbatduke
December 20, 2012 8:45 am

One final reply that says it all — in fact, pulled from the link on the wikipedia page on data dredging — that could have been 100% of Willis’ article above as one series of pictures is worth a few thousand words:
Obligatory XKCD — jellybeans cause acne. Pardon me, only green jelly beans cause acne.
http://imgs.xkcd.com/comics/significant.png

December 20, 2012 8:55 am

Gail Combs says:
December 20, 2012 at 7:18 am
John West says:
December 19, 2012 at 11:40 pm
Forgive me Willis for not sharing in your amusement…..
>>>>>>>>>>>>>>>>>>>>
Occasionally we have to laugh to relieve the stress of realizing how very serious this whole subject is. It keeps sceptics from going postal. (High moral standards also do that)
==================================================
A gag a day helps prevent gagging.

December 20, 2012 9:26 am

Given that Lucy was found in the Rift Valley I would think that any problem with heat or heat waves would have been bred out of us a long time ago.

highflight56433
December 20, 2012 10:18 am

“However, our findings for congenital cataracts must be confirmed in other study populations.”
India: millions spent on cataract surgery, zero spent on sunglasses.

Matt G
December 20, 2012 10:28 am

How did this pass PEER REVIEW? It shows how dogma works.
Alarmists, do remember that many hot regions are already much warmer than elsewhere, so the claim that a globe warmer than x will cause this in future fails from the start. These hot regions are here now, so a prediction of x, y, z in future can already be shown to be false. Will you ever learn? Of course not, while dogma exists and there is no science involved supporting these conclusions.

John West
December 20, 2012 10:41 am

Gail Combs says:
“Occasionally we have to laugh to relieve the stress of realizing how very serious this whole subject is. It keeps sceptics from going postal. (High moral standards also do that)”
I reckon so. I guess this farce can’t go on forever and we’ll have the last laugh. Sorry to have been a wet blanket, perhaps I’m low on cholecalciferol, I’d better get outside for a bit.

michaeljmcfadden
December 20, 2012 10:46 am

Some excellent discussion above of p values and their meaning. 🙂 Heh, if I took a pair of dice and rolled snake eyes on my first roll, should I label the dice as biased? They’ve obviously produced a result with a rather small probability of occurring by chance after all!
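The snake-eyes number is easy to pin down by listing all 36 outcomes of two fair dice. The helper name here is made up for illustration.

```python
from fractions import Fraction
from itertools import product


def prob_two_dice(target):
    """Exact probability that two fair six-sided dice show the pair `target`."""
    outcomes = list(product(range(1, 7), repeat=2))  # all 36 ordered rolls
    hits = sum(1 for roll in outcomes if roll == target)
    return Fraction(hits, len(outcomes))


snake_eyes = prob_two_dice((1, 1))
print(snake_eyes, float(snake_eyes) < 0.05)  # prints: 1/36 True
```

So a single roll of perfectly fair dice already clears the 0.05 bar about once in 36 tries.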
One thing to guard against that I believe you have not yet been hit with: moving the acceptable p value for public policy decisions from .05 to .1, doubling the chances of the result being due to chance. Don’t get too comfy though: the EPA did it for “passive smoking” in their 1992 Report that ushered in US smoking bans and they can just as easily do it for studies regarding climate change. All they have to do to legitimize it is ignore studies going in the contrary direction and declare that it is “agreed” that causal results (of something like CO2 and warming) can only go in one direction. That will then allow them to use a “one-tailed” analysis, which halves the reported p-value and so has the same effect as doubling the threshold. The media tends to be uneducated in such things and will just take the word of “the authorities” as being correct: a 90% finding will then be hailed as “statistically significant proof” and the hole will be very difficult to dig your way out of.
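To make the one-tailed point concrete: for the same data, a one-tailed test in the assumed direction reports half the two-tailed p-value. The sketch below assumes a plain z-statistic with a normal reference distribution, and the function names are hypothetical.

```python
import math


def two_tailed_p(z):
    """P(|Z| >= |z|) for a standard normal Z."""
    return math.erfc(abs(z) / math.sqrt(2))


def one_tailed_p(z):
    """P(Z >= z) for a standard normal Z (effect assumed to be positive)."""
    return 0.5 * math.erfc(z / math.sqrt(2))


z = 1.645
print(f"two-tailed p = {two_tailed_p(z):.3f}")
print(f"one-tailed p = {one_tailed_p(z):.3f}")
```

A z of about 1.645 is only a "90% finding" on a two-tailed test (p near 0.10), yet it squeaks in at p near 0.05 once the other tail is declared impossible.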
rgbatduke: Very interesting info on solar radiation! I’ve used the “less dangerous than sunshine” analogy many times in my writings on smoking bans by comparing the need to “protect the workers” in outdoor dining patio situations: after all, as the true believers like to say, “Why should THEY be the only workers forced to risk their lives for a paycheck? (Outside venues) are neither inherent nor necessary for dining or drinking. Serviced patio dining needs to be banned!” But when I’ve made the argument it was based mainly on risks of skin cancer. I had no idea that there might be an even more bogus cell-phone-connected argument to make! LOL! Hey, does cell phone use contribute to global warming???
:>
MJM, Cell phone and sunshine exterminator! Reasonable rates. Termination with extreme prejudice incurs extra charges. Ask about our “80% statistical significance special” (available only on alternate Tuesdays.)