Oreskes on statistics
Naomi Oreskes' article in the New York Times the other day, in which she called for use of 90% rather than 95% confidence intervals, seems to be generating quite a lot of interest.
The best review is here:
Oreskes wants her readers to believe that those who are resisting her conclusions about climate change are hiding behind an unreasonably high burden of proof, which follows from the conventional standard of significance in significance probability. In presenting her argument, Oreskes consistently misrepresents the meaning of statistical significance and confidence intervals to be about the overall burden of proof for a scientific claim:
Ouch.
(This is also relevant)
Reader Comments (41)
Poor Prof Oreskes. She is a victim of the climate scaremongering. Like some other victims, she has become a promoter of the scare herself, and clearly is on the look-out for ways to make the alarm bells about our impact on climate ring more often. For anyone sufficiently agitated about something, it can be provoking to encounter other people who seem to be far less disturbed by it.
I have just posted a relevant comment on another recent post (http://www.bishop-hill.net/blog/2015/1/5/sceptics-are-from-mars-and-warmists-are-from-venus.html) using the analogy of setting the sensitivity of a fire alarm to introduce/clarify the concepts of Type I and Type II errors. Prof Oreskes wants to increase the size of alpha, the risk of a false alarm.
Having more people scared about our impact on climate would somehow ease her troubled mind. But I think it will not help with the progress of science, nor with the development of sensible policies.
The Error Statistics Philosophy article states (emphasis mine)
Until climate science starts dealing with biases and allowing for systematic errors then any confidence intervals are meaningless. With complex phenomena sparsely sampled (or sampling from a small sub-set of the full population - e.g. a short time period), even when you fully allow for all known biases there may be still unknown biases that have not been allowed for.
Kevin Marshall - I think the greatest bias is the temptation to make sure that the results support the settled science, driven by the noble cause of saving the planet, not to mention the funding, of course.
This reminds me of the cleverest man in the world, who could discuss any topic at the very highest level, provided he discussed quantum physics with a bishop and theology with a physicist.
Oreskes makes a series of mistakes.
She notes that "correlation is not causation". This is true. Correlation is a necessary condition for causation, but not a sufficient one.
Oreskes, however, argues that significant correlation IS causation. That is just nonsense.
She overlooks that, because both climate and greenhouse gas concentrations are trending, correlation is meaningless. We should instead test for cointegration.
She also overlooks that cointegration tests have been applied and, Beenstock notwithstanding, have generally rejected the null hypothesis of no relationship between radiative forcing and temperature.
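The point about trending series can be illustrated with a small simulation. This is only a sketch under stated assumptions: the two series are independent random walks with an arbitrary drift of 0.5 per step, the seed is arbitrary, and comparing correlations of levels and of step-to-step changes is a crude stand-in, not a cointegration test.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def drifting_walk(rng, n, drift=0.5):
    """Random walk with drift: each step adds `drift` plus unit Gaussian noise."""
    level, path = 0.0, []
    for _ in range(n):
        level += drift + rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def differences(series):
    """Step-to-step changes of a series."""
    return [b - a for a, b in zip(series, series[1:])]


rng = random.Random(2015)     # arbitrary seed, only for reproducibility
a = drifting_walk(rng, 500)   # hypothetical series A
b = drifting_walk(rng, 500)   # hypothetical series B, independent of A

corr_levels = pearson(a, b)
corr_changes = pearson(differences(a), differences(b))
print(f"correlation of levels:  {corr_levels:.2f}")
print(f"correlation of changes: {corr_changes:.2f}")
```

The levels correlate strongly simply because both series drift upwards, even though neither influences the other; the changes show essentially no correlation.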
Per Nic Lewis, the discussion is about effect size rather than significance.
Oreskes then pleads for a Bayesian approach to hypothesis testing, without going all the way, leaving her in the no man's land between Bayesian and frequentist lines where everything is muddled.
Finally, she equates "human-made warming" with "catastrophic warming".
Which leaves us with one question only: How did she ever get into Harvard?
When I see Naomi Oreskes I am constantly reminded of the mad cat lady in the Simpsons.
Given the lack of evidence for CAGW (as opposed to AGW) and her history of ad hominem attacks, why on Earth does anyone bother with her insane utterances?
This is actually the strategy used to initiate laws against second hand smoke.
Much as I enjoy entering smoke-free establishments, bending the science to justify a personal preference is a devious and dangerous path.
Naomi Oreskes suffers from a type III error - she talks total tosh* and we've stopped listening.
*no, "tosh" was not the word I was thinking of.
@TinyCO2
Note that Type III errors are conventionally defined as "testing the wrong hypothesis", which is a mistake not made by Oreskes in this case.
Some classic statements from Oreskes:
"In the case of climate change, we are not dumb at all." [we know it all, the science is settled]
"We are now seeing dangerous effects worldwide" [yeah, all those storms and floods we never saw before on our TV screens. Our grandparents never saw them so it must be worse]
"The evidence is mounting that scientists have underpredicted the threat." [yeah, like its warming faster than predicted]
" we have underreacted to the reality, now unfolding before our eyes, of dangerous climate change." [like we haven't spent nearly enough $billions on it yet]
"The year just concluded is about to be declared the hottest one on record," [is that with more than 90% confidence?]
Oreskes is one of the untouchable modern royalty.
There is literally nothing she, or her kind, can say or do that will make our betters in the MSM shun her.
Richard Tol: "Correlation is a necessary condition for causation, but not a sufficient one."
Technically, since you mention co-integration, it's possible for a causative connection to show no *linear* correlation -- if (say) Y varies with dX/dt, X and Y may have a zero correlation depending on the nature of X(t). With the broader definition of correlation, though, it's correct.
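A concrete instance of the dX/dt case, as a sketch: the choice X(t) = sin(t) sampled over whole periods is an assumption made for illustration, so that Y = dX/dt = cos(t). Y is fully determined by X, yet the same-time linear correlation vanishes; shifting X by a quarter period recovers it.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


n = 1000
ts = [2 * math.pi * k / n for k in range(n)]     # one full period, evenly sampled

x = [math.sin(t) for t in ts]                    # X(t), an assumed example signal
y = [math.cos(t) for t in ts]                    # Y(t) = dX/dt exactly
x_shifted = [math.sin(t + math.pi / 2) for t in ts]  # X a quarter period later

r_same_time = pearson(x, y)
r_shifted = pearson(x_shifted, y)
print(f"corr(X(t), Y(t))        = {r_same_time:.6f}")
print(f"corr(X(t + T/4), Y(t))  = {r_shifted:.6f}")
```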
Anyway, taking Oreskes at her word, I presume she is now shouting loudly that the multi-model mean trend in global average temperature is inconsistent with observations. No? Hmmm, imagine that.
Correlation is a necessary condition for causation, but not a sufficient one.
This statement requires some qualifying descriptions applied to 'correlation'.
A correlation form/function (1) that is derived from and is directly related to the fundamental laws of balance of mass, momentum, and energy, and the Equation of State for the material, (2) that contains, at the proper scale, the transport and thermo-physical properties of the material and (3) that is not a function of previous states that the material has experienced, might be good starting requirements.
@Harold
Sure, but let's not put the math hurdle too high for a Harvard professor.
And, of course, any function can be approached, locally, by a linear one.
...Since bystanders inhaled the same chemicals as smokers, and those chemicals were known to be carcinogenic, it stood to reason that secondhand smoke would be carcinogenic, too. That is why the Environmental Protection Agency accepted a (slightly) lower burden of proof: 90 percent instead of 95 percent....
I love it! It's a wonderful logical fallacy!
Since we know that climate warmists are wrong, I suggest that we adopt a lower level burden of proof for court cases charging them with fraud...
Richard Tol: He he...I wonder if there's an analogue to the "Gunning Fog" index which applies to maths?
It is interesting that in claiming 95% confidence levels are too restrictive Oreskes acknowledges that the people she is opposing are sceptical.
One wonders why she can't bear to call us Sceptics?
It seems that accurate naming of things is not her priority.
EO:
Just as well Rev Polkinghorne never made it to Bishop.
Briggs joins in. Naomi Oreskes Plays Dumb On Statistics And Climate Change.
Among many things, he picks up on her muddling of correlation with causation (also noted by Richard Tol above) "Typically, scientists apply a 95 percent confidence limit, meaning that they will accept a causal claim only if they can show that the odds of the relationship’s occurring by chance are no more than one in 20".
No no no no no no no no!
Discerning ‘blissful ignorance’ in our time
Perhaps it is possible for us to discern what blissful ignorance looks like on our watch. As long as experts willfully ignore the “system causation factor” of the human population explosion, as is occurring in our time, then the increasing food supply which is literally fueling the human population explosion will go on and on until there is no way to grow more food for human consumption. We will continue to see the promulgation of politically convenient thought, economically expedient, culturally prescribed happy talk about the soon to appear demographic transition, the automatic stabilization of human population numbers and the end of human population by the middle of Century XXI. Science regarding ‘why the human population is exploding’ will continue to be denied and endless preternatural, ideologically-driven chatter about ‘what is happening’ will pass for a complete sharing of scientific knowledge. ‘What is happening’ will be broadcast ubiquitously. ‘Why it is happening’ will be treated as the last taboo, about which no one speaks. Just for a moment, let us imagine that now we have all the greatest population experts speaking with one voice. They tell us that we are headed rapidly for 8 billion people on the surface of Earth, declining TFRs in many European countries and elsewhere notwithstanding. When that number is reached in the foreseeable future, we will have too much food, too little water and clean air, and no decent environment to speak of. Pollution will be visible to all, everywhere. In the meantime many species of birds and wildlife will go extinct because of the destruction of their habitat from land clearance to grow more food to support an exploding human population. What is happening is made evident. Why this situation is occurring with a vengeance now here is ignored, avoided and denied assiduously. Silence prevails over science. All this is good, they say, because things are getting better.
All these top rank population experts, inside and outside the scientific community, then go on to say that in order to have more and more happy, healthy people we need more and more people who can be counted upon to increase the depletion and degradation that will rapidly subtract from the source of that happiness and well being, our planetary home, until such time as Earth is no longer able to function as a source of happiness and well being. More importantly, because the self-proclaimed experts are supposedly ‘free to know and to speak’ but talk only of what is determined by the powers that be to be best for the rest of us to know, some scientific research can be and will be denied. While these experts do not lie, they deliberately refuse to give voice to the whole of what is true to them, according to the lights and scientific knowledge they possess. By their conscious silence, these experts will ensure that the unsustainable growth of the human species, the reckless depletion of resources and the irreversible degradation of ecology of the planet happens as soon and efficiently as possible. All this is good, they say, because we are making things better and better for all those generations in future space-time who follow the greatest generation.
“Speak out as if you were a million voices. It is silence that kills the world.” — St. Catherine of Siena, 1347-1380
Richard Tol, I've never been accused of being conventional but I've still stopped listening to Naomi whatever number the error type comes in at. For some reason I find her work more irritating and daft than that of Dr Lew.
'Oreskes consistently misrepresents the meaning of statistical significance and confidence intervals to be about the overall burden of proof for a scientific claim:'
So, in misrepresenting and BS-ing, in what way are they not following the normal practice of those working in climate 'science'?
Note too that Prof John Brignell argues that p = 0.05 already represents a dumbing down of statistical significance, let alone p = 0.1
" The one in twenty lottery accepted as the norm
Statistical significance normally has two components, the relative risk (RR) and the probability (P) that the result occurred by pure accident. The acceptable limits chosen for these measures in epidemiology and related fields were from the outset regarded as very lax by traditional scientists. In particular, the limit P=0.05, now almost universal in epidemiology, means that there is a one in twenty chance that the result is by accident. If there are a hundred such results in an issue of an epidemiological journal, then by their own admission five of them are completely wrong. It is, however, worse than that. The provenance of P and RR depends on the maintenance of strict conditions (large, blind, randomised, controlled trials etc.). More often than not epidemiological and psychological claims are based on small, ill-controlled observational studies, making them essentially worthless. So why do they persist with this charade? Well, there are simply not nearly enough “discoveries” available to justify the numbers of people employed in the epidemiological scare industry, including publishers of journals, so they have to keep the mouth of the net wide. In consequence, most people now believe that the “scientists” are always contradicting each other, whereby the whole of science is tainted in the public eye. By sheer weight of repetition they have established that their style of lottery is “normal”. It is no more normal than swine jumping over a cliff. The widespread smoking ban was justified by the EPA’s one in ten lottery, an all-time low in standards. "
http://www.numberwatch.co.uk/statistical_bludgeon.htm
Here's a reminder of what xkcd has to say on the subject of 95% significance, causality and green jelly beans.
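The "one in twenty" arithmetic behind this can be sketched in a few lines. The assumptions are loud ones: every test is independent and every null hypothesis is true, so any "significant" result is a false alarm. The count of twenty tests is chosen here only to match the one-in-twenty figure.

```python
def chance_of_any_false_alarm(alpha, tests):
    """Probability that at least one of `tests` independent tests of true
    null hypotheses comes out 'significant' at level `alpha`."""
    return 1.0 - (1.0 - alpha) ** tests


for alpha in (0.05, 0.10):
    p = chance_of_any_false_alarm(alpha, tests=20)
    print(f"alpha = {alpha:.2f}, 20 tests: P(at least one false alarm) = {p:.2f}")
```

Under these assumptions the chance of at least one spurious finding across twenty tests is roughly 0.64 at the 5% level and roughly 0.88 at the 10% level.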
I'm an actuary and my wife is a biostatistician. I thought Oreskes' statistics discussion was probably wrong. My wife was sure. She stopped reading the article early on.
This article makes me sad. Fifty years ago, Harvard U and the NY Times represented the height of professionalism and accuracy. Today, we can't count on either one. And, AFAIK no other institution has taken their place as exemplars of accuracy. Not having a data site that we can all rely on is a big loss.
Throughout my career in biomedical science, both in the laboratory and in a regulatory position, 95% confidence intervals have always been the gold (and only) standard, and I believe this is the same for all proper Life and Earth Sciences. Any 'science' that needs to drop to a 90% CI to get the 'right' results is not Science but religion or mumbo jumbo.
Richard you ask "How did she (Oreskes) ever get into Harvard?"
One could ask a similar question of that other serial failure and doomsayer, Paul Ehrlich. How on Earth did he get into Stanford?
What is it about these otherwise excellent centres of scholarship that tolerate such dross?
Interestingly, having first dealt in clinical biochemistry and then computers and basic inferential statistics for more than 30 years, I always frowned upon a 95% confidence level. As a rule, I (and the team) used 97.5%. But I understand that 90% confidence, which is utterly ridiculous in actual science, means an astonishing success in social sciences studies.
I keep thinking that part of what we see, sheer propaganda excepted, still is the difference between narrative for the social sciences, arts, etc. and experimentation for the actual sciences. I never forget that a couple of friends of mine, both with a degree in Literature, took pains to understand that we didn't reach our conclusions in the library, but in the lab with experimentation. They were totally astonished.
Prof. Oreskes demonstrates the final manifestation of the entropy of intellectual corruption and mindless fanaticism: Rejection of science in the alleged name of science.
"I will leave substance of the climate change issue to others, but Oreskes’ methodological misidentification of the 95% coefficient of confidence with burden of proof is wrong. Regardless of motive, the error obscures the real debate, which is about data quality. More disturbing is that Oreskes’ error confuses significance and posterior probabilities, and distorts the meaning of burden of proof. To be sure, the article by Oreskes is labeled opinion, and Oreskes is entitled to her opinions about climate change and whatever. To the extent that her opinions, however, are based upon obvious factual errors about statistical methodology, they are entitled to no weight at all."
I like the end paragraph of his deconstruction of Oreskes.
It's true the Oreskes NYT article is a real muddle. Not good that a Harvard history professor identifies as the originator of confidence intervals someone (Fisher) who actively opposed them. He preferred so-called 'fiducial arguments'.
But that seems to miss the main point. Oreskes is effectively saying the probability, however construed, of "dangerous climate change" has been clearly demonstrated to be between 90 and 95% (0.90 and 0.94999.. to be accurate). The only obstacle to the action she covets is apparently some form of quasi-religious obsession with needing a probability above 0.95. But this implicit claim seems quite untrue. Unlike some commenters, a p-value of 0.05 has no special status to me. Depending upon the circumstances, a probability in excess of 0.90 could be very compelling. But, as far as I can tell, climate research is a long, long way from that sort of evidence for 'dangerous climate change'.
There are plenty of confusions in the Oreskes article, but the drift of what she wants is abundantly clear. The principle of falsification is to be subordinated to the precautionary principle: in other words, scientific method is to be subordinated to practical politics. Note the last bit about the rise in temperature by 1 degree Celsius: lower than predicted temperatures are not less reason to believe in causation but more reason for action. Post normal science at work.
Oreskes simply hasn't done or read enough real science, that's her main problem.
It is often drummed into science students at undergraduate level that just because something is statistically "significant", that does not mean it is relevant or "true". The drumming in often needs to be repeated at postgraduate level. And beyond, it would seem.
The real world has a really nasty habit of ramming sticks into the wheel spokes of people who are racing downhill, drunk on the "success" of passing a p=0.05 significance test. Much cAGW has now fallen below the level at which it should be publishable, never mind "true".
The proponents of catastrophic Anthropogenic Global Warming need to adopt a weaker hypothesis in order to bring it back from over the horizon of credibility. Dropping the "catastrophic" might achieve this. But that would mean dropping a lot of political and economic arguments too. It says a lot about those who are willing to take that step, and those who are not.
95% isn't the level of the burden of proof for a scientific claim to be held true, it's the level at which a result is worth publishing.
The method works like this. A scientist makes an observation that provides some quantity of evidence to support a particular hypothesis over its alternatives. In the old days, he would write to other scientists and ask them to check it. Nowadays, with the volume of research going on this isn't practical, so journals collate such results and provide a selection of the most interesting ones to other scientists working in the same field (for a price, of course). The other scientists then check the observation - confirm that the reasoning and calculation is sound, consistent, and repeatable. They publish their confirmation or rebuttal, and the results that survive a sufficient number of attempts to destroy them become generally accepted.
Obviously, if every slight departure from background noise was reported, the journals would be full of junk, most of which wasn't true and would be immediately rebutted. So you need to set a threshold level before a result is interesting enough to be worth other people's time to check. The most common convention is 95%, but some high-volume fields may set a higher level such as 99% or 99.9%. The threshold for scientific acceptance (although nothing is ever final) is much higher, requiring some number of 95% confirmations and replications of the original result. The number required depends on how plausible the hypothesis was to start with - what Bayesians call the 'prior'. As Carl Sagan put it: "Extraordinary claims require extraordinary evidence", and you'd need many more replications to convince people of one of the more surprising results.
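How the number of replications depends on the prior can be sketched with Bayes' rule in odds form. Everything numerical here is an assumption for illustration: each study has power 0.8 and significance level 0.05, studies are independent, every one of them comes out significant, and "accepted" means a posterior probability of 0.99.

```python
def significant_results_needed(prior, target=0.99, power=0.8, alpha=0.05):
    """Count the independent significant results needed to lift a hypothesis
    from probability `prior` to at least `target`.

    Each significant result multiplies the odds by power / alpha, the
    likelihood ratio of 'significant' under the hypothesis versus the null.
    """
    odds = prior / (1.0 - prior)
    target_odds = target / (1.0 - target)
    likelihood_ratio = power / alpha
    count = 0
    while odds < target_odds:
        odds *= likelihood_ratio
        count += 1
    return count


for prior in (0.5, 0.1, 0.001):
    n = significant_results_needed(prior)
    print(f"prior {prior}: {n} significant results to reach 0.99")
```

With these made-up settings a coin-flip prior needs two significant results, while a one-in-a-thousand prior needs five.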
If 95% was really the standard of evidence for accepting a claim, science would be severely limited in the complexity of the reasoning it could support. A scientific result is generally built on a foundation - a chain of argument - of previously discovered scientific results. Only if all the steps of the argument are correct can you rely on the final result. If the steps are independent, the probability of n steps all being correct is 0.95^n. Thus, you can construct a chain of reasoning depending on only 13 results each demonstrated at 95% before the overall probability of the argument being valid drops below 50% (0.95^13 is about 0.51, 0.95^14 about 0.49).
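The chain-of-reasoning arithmetic is easy to check directly. This sketch assumes, as the argument does, that the steps are independent and each holds with the same probability; the function names are made up for the example.

```python
def chain_probability(steps, per_step=0.95):
    """Probability that `steps` independent results are all correct."""
    return per_step ** steps


def longest_chain_above(threshold=0.5, per_step=0.95):
    """Largest number of steps whose joint probability still exceeds `threshold`."""
    steps = 0
    while chain_probability(steps + 1, per_step) > threshold:
        steps += 1
    return steps


for per_step in (0.90, 0.95, 0.99):
    n = longest_chain_above(0.5, per_step)
    print(f"per-step {per_step:.2f}: {n} steps, "
          f"joint probability {chain_probability(n, per_step):.3f}")
```

Raising the per-step level lengthens the usable chain considerably, which is the point made in the next paragraph.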
The closer a result is to 100%, the more steps you can chain together, and the more complex you can make the science. Results that a lot of other work relies on are therefore particularly important to demonstrate at a higher level of certainty, by checking the argument and replicating the observation. Likewise, results for which the stakes are very high (like the economies of the free world) are also set a high burden of proof. Do you think we'd last 5 minutes if so much as 5% of our science was wrong?! You're having a laugh!
So in a sense, it doesn't really matter that much if you pick 95% or 90% - a lower standard just means you've got more rubbish to sort through. You need more replications and checks, it limits the complexity of the science you can do, and it takes you longer to get anywhere.
There's also the Bayesian argument that the confidence value is the wrong number anyway - really you need to consider the likelihood ratio. Confidence levels are among the most badly misunderstood statistics in all of statistics. They're only really of use because likelihood ratios are generally too difficult to calculate in more open-ended, speculative science, which is where all science starts.
But Naomi's purpose is clearly to lower the bar to bad science, which is what she's pushing. Point that out, and move on. It's not worth spending time over.
Likelihood ratios (or rather log-likelihood ratios, since the log of a ratio is a difference of logs, which is much easier to calculate for complex problems) are the domain of those of us steeped in detection and estimation theory. They are my friends.
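A minimal sketch of why the log form is convenient, using a textbook detection setup chosen for illustration: unit-variance Gaussian noise, H0 with mean 0 and H1 with mean 1. The sample values are made up.

```python
import math


def gaussian_log_pdf(x, mean, sigma=1.0):
    """Log density of a normal distribution at x."""
    return (-0.5 * math.log(2 * math.pi * sigma ** 2)
            - (x - mean) ** 2 / (2 * sigma ** 2))


def log_likelihood_ratio(samples, mean_h1, mean_h0=0.0, sigma=1.0):
    """Sum of per-sample log-density differences, H1 minus H0."""
    return sum(gaussian_log_pdf(x, mean_h1, sigma)
               - gaussian_log_pdf(x, mean_h0, sigma)
               for x in samples)


samples = [0.9, 1.4, 0.3, 1.1]   # made-up observations
llr = log_likelihood_ratio(samples, mean_h1=1.0)

# The same quantity computed the hard way: a product of density ratios.
ratio = math.prod(
    math.exp(gaussian_log_pdf(x, 1.0)) / math.exp(gaussian_log_pdf(x, 0.0))
    for x in samples
)

print(f"log-likelihood ratio (sum of differences): {llr:.4f}")
print(f"log of the product of ratios:              {math.log(ratio):.4f}")
```

For this setup the sum collapses to sum(x) minus n/2, so the decision statistic is just a running total compared against a threshold.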
Mark
I am also glad I was not the only one to catch Richard Tol's causality issue. His response, of course, was awesome. :)
Mark
Glad to see so many from this site coming over to my blog today. My main focus on the blog is philosophical foundations of statistics, in various realms of science and non-science. I hope some of you return; I'll check back here too. D. Mayo, errorstatistics.com
I'll leave Oreskes' statistics to D.G.Mayo and Nathan Schachtman, both of whom far exceed my statistical competence and who have done an excellent job of filleting Oreskes' op ed.
But what struck me as ironic is that she appears to be ignorant of non-linearity in both climate science and toxicology: she seems ignorant of the widespread acceptance of hormesis in the latter (which renders obsolete her suggestion that we should assume there is no safe dose of any chemical, including cigarette smoke); and she also appears ignorant of the logarithmic nature of climate forcing by carbon dioxide (so that further increases have a diminishing effect). CAGW, of course, depends entirely upon modelled positive feedback from water vapour, for which the observational evidence is poor.
Again ironically, Poster Gary at William Briggs' site shows that Oreskes' views on models seem to have changed:
'Oreskes was a lead author on a 1994 paper in Science titled “Verification, Validation, and Confirmation of Numerical Models in the Earth Sciences” that concluded with this paragraph:
'Finally we must admit that a model may confirm our biases and support incorrect intuitions. Therefore, models are most useful when they are used to challenge existing formulations, rather than to validate or verify them. Any scientist who is asked to use a model to verify or validate a predetermined result should be suspicious.'
'Read the whole paper and marvel.
http://www.likbez.com/AV/CS/Pre01-oreskes.pdf'
If you use 90% confidence intervals in comparing the temperature record with the models, wouldn't that make the record fall *further outside* the models' range?
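The interval-width side of this question can be sketched as follows. The numbers are entirely hypothetical (a normally distributed model spread with mean 0.20 and standard deviation 0.10, and an observed value of 0.02, in arbitrary units) and are not taken from any temperature record.

```python
from statistics import NormalDist


def central_interval(mean, sd, level):
    """Central interval of a normal distribution covering `level` of its mass."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return mean - z * sd, mean + z * sd


model_mean, model_sd = 0.20, 0.10   # hypothetical model spread
observed = 0.02                      # hypothetical observation

for level in (0.95, 0.90):
    low, high = central_interval(model_mean, model_sd, level)
    verdict = "inside" if low <= observed <= high else "outside"
    print(f"{level:.0%} interval: [{low:.3f}, {high:.3f}] -> observation is {verdict}")
```

For the same distribution a 90% interval is always narrower than a 95% one, so an observation sitting near the edge of the 95% range falls outside the 90% range.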
She actually endorses the EPA's manhandling of the secondhand-smoke issue. Perhaps she's been hearing lots of pushback on the claims frequently made by climate activists linking skeptics to tobacco.
Short version of this bizarre argument "Correlation is not causation except it is, and false results don't matter anyway".
This is simply too crazy to be true, even for a warmist, even for an ignoramus. I mean, come on... even schoolchildren could get this right!
My gut reaction was "this is/has to be WRONG"; some crazy "journalist" made this up (making stuff up is something French journalists do, they even have a name for it: "a narrative").
Now, with no reaction from the academic named... this would be really real?
And alarmists still want her in their team? Nobody is that desperate!
I feel dizzy.