Friday, Feb 3, 2012

Cherrypicking

There has been a lot of blog battling and twitter twootling started by this WSJ article
with climate crossness here and here and here, all brilliantly buffed and rebuffed by Matt Briggs

A lot of the argument was about 'how to do climate graphs' with input taken from Skeptical Science.

H/t MrSean2k at WUWT who finds a rather good example.

So here is a quick lesson in how to do Climate Graphs - please concentrate.

 

From Cartoonsbyjosh.com (click image for larger version)


Reader Comments (144)

Indeed.

Mark

Feb 4, 2012 at 4:04 AM | Unregistered CommenterMark T

I understand well that smoothing with a long period on wiggly data destroys peaks and troughs. Not quite so relevant if we are only looking at short periods, like five years.

We have a temperature series, for the sake of argument at a single place and a single time (to avoid the whole averages of averages of averages which besets climate). How do we show any trends over time? We can put a linear fit – but most people here really dislike that, with good reason. There is no justification for expecting an exponential, log, polynomial etc. But some gentle smoothing sometimes helps to see the underlying pattern.

No-one's suggesting that we discard the original data. No-one's suggesting that it won't hide some features. No-one's suggesting that it is a cracker statistical analysis on which scientific assessments can be made. All a five year smooth does is make it a lot easier to see some patterns that are otherwise hard to see.

So arguments above about how much information is lost are beside the point. Different analyses are made for different purposes. A five year smoothing can be done to see a general pattern in some noisy data without it meaning anything else. When we take an average we lose lots of information: should we stop ever using averages? Or do we accept that sometimes they are a helpful shortcut?

I know that there is a lot of poor statistics in climate science, but we shouldn't throw the baby out with the bath water. If a five year running average is made to distort the end period, then that should be dealt with as an issue in its own right. Not all running averages all over the world are automatically invalidated.

Despite some of the wild statements above, the world of statistics does use running averages and does use smoothing methods. When appropriate. A quick Google search shows plenty of Statistics Departments using them to help make data more easily visualised.

Feb 4, 2012 at 4:43 AM | Unregistered CommenterMooloo

Welcome Mark, I have been trying to say for some time that the climate scientists are "dentists" in three particular areas where they claim to be "heart specialists". These three areas are the domain of engineers, a profession in which, in my past life, I was educated. Unfortunately, with the passing of the years and the rise to a position where my "engineering" was mainly involved with raising and spending money, including on research, the details, if I ever remembered them, have disappeared from an already feeble mind. However, when reading the blogs and understanding what climate scientists do, I found them in three domains which I considered to be engineering, and where I know all or most of the experience and knowledge lies. These domains are:

1. Modelling
2. Signal processing ( my text books came from the University of Southampton)
3. Feedback theory and control systems

What I could remember was that engineers didn't of course model coupled non-linear chaotic systems, for two reasons, 1. it's impossible and 2. there is no value in doing so because of 1. And if ever I saw a coupled non-linear chaotic system it's the climate. Despite the fact that the climate scientists use weasel words like "projections" to give the outputs from their models, they are, in fact, making forecasts which are being used by policy makers. As in my day there weren't computer models, there were indeed practical models and I can well remember the boss sucking on the end of his pipe and shaking his head at any suggestion that the model did anything other than giving us an indication of what might happen, all other things being equal, in the real world.

The second problem I saw was with the signal the climate scientists are dealing with. Of course sans the raw data it is difficult to be emphatic that they are dealing with a considerably noisy signal (from tree rings, ice cores, sea shells etc), and although I could accept that if they all gave the same result there may be something in there to take as real, it is not enough to base any assumptions about climate, or climate change, on.

The third is on feedback, a topic almost as esoteric as signal processing in its own way. Not just engineers but biologists and other professions understand feedback and how it works, and although there may have been a time when for the sake of a paper in my exams I had a deep understanding of it, I don't now. My simple understanding, based on my memory, is that if a system is inherently unstable, prone to positive feedbacks, and has no in-built dampeners, it will swiftly destroy itself. I know astronomical timescales are a lot longer than we're used to, but at a guess over the course of 5bn years we have seen circumstances similar to, or the same as, today's. If we have not then what we are expected to believe is that today's climate conditions are unique in the last 5bn years. I don't believe that, but of course can't prove it because the signal is so noisy.

Anyway, welcome aboard. I look forward to more detailed discussions and will have to return to the text books (why did I feel like that meant "go to the mattresses"!?).

PS. Give BBD a break, he's had a Damascan moment in that he saw the real true god and it's environmentalism. I don't believe he fell off a horse though.

Feb 4, 2012 at 7:02 AM | Unregistered Commentergeronimo

Mark: I'm really sorry to hear about your marriage. If there's any way not to give up on it ... don't. I speak from the worst kind of experience.

Feb 4, 2012 at 8:27 AM | Unregistered CommenterRichard Drake

richard verney 12.10 AM
If you're having problems with URLs, why not think about using TinyURL. I find it very useful.

Feb 4, 2012 at 10:16 AM | Unregistered CommenterMike Jackson

Mark T (assuming it is the same one!) is a very experienced chap from the signal processing community who knows his stuff. He goes waaaay back to the early days of climate audit when we were having a good laugh at Mann's incursions into signal processing (e.g. his strange smoothing endpoint methods). Mark's comments would be recognised immediately by anyone else in the sig proc community as being accurate and well thought out with a good depth of understanding.

And yes, one hour at google university isn't going to be a replacement for some very serious studying. You don't have to do a degree to get an understanding, but you are going to have to put in some hard work. (There is a reason why EE degrees at uni may have a 35-40hr/wk of scheduled lectures/labs/etc - this excludes coursework, reading etc - in comparison to the average degree course which is a small fraction of that).

However, mooloo takes one important step towards a more accurate statement here:

the world of statistics does use running averages and does use smoothing methods. When appropriate.

The key word is "when appropriate", as opposed to "universal". Those two words convey rather different meanings, and the term "when appropriate" is key. You need to understand the characteristics of the signals you are processing in order to determine what effect averaging has. In many cases averaging can make signal to noise ratio worse. It depends on the underlying distributions, the independence of the samples, the signal to noise ratio of individual samples, the spectral characteristics of the time series, whether the components are linearly separable, coupled, etc. etc.

And simple averaging is far from always the right thing to do. There are many forms of averaging signals (arithmetic; geometric; harmonic; weighted; prewhitened; etc etc) and the right method to use is dependent on the underlying relationships between the samples. Once again, choosing the wrong method can make the signal to noise ratio worse, not better.

The first thing a sig proc engineer would do would be to get a good understanding of the characteristics of the signal and the noise. Without it, simply arguing we should average is dangerous. Even more so if it is asserted as being "universal", a word that triggers alarm bells in even the most moderate sceptic. And there are good reasons to doubt averaging does anything useful: if natural climate variability is fractal (as many have suggested, with evidence and physics to back their claims up), then averaging has no effect whatsoever on signal to noise. It merely removes information from the plot, makes it easier on the eye, and allows people who are easily deluded to "see" relationships in the data that aren't really there.

And sig proc engineers probably have more experience of this than most others. Because they do this stuff for a day job :-)

PS. Sorry to hear about your problems at home, Mark :-(

Feb 4, 2012 at 10:59 AM | Unregistered CommenterSpence_UK

Mark T

Every one of these statements is fundamentally incorrect:

Yes? Why?

Do explain yourself rather than just ranting.

Feb 4, 2012 at 11:23 AM | Unregistered CommenterBBD

Signal minus noise = signal.

What an astonishing exercise in waffle.

Feb 4, 2012 at 11:25 AM | Unregistered CommenterBBD

if the signal lies outside of the bandwidth of the smoother, you will not reveal any underlying signal. You will attenuate it.

Yes. This is obvious, and in the case of smoothing a temperature time series as above, utterly irrelevant. You have talked much, confused many, and said nothing.

Others - it's important to sort all kinds of noise from the signal. An exercise for the interested reader.

Feb 4, 2012 at 11:28 AM | Unregistered CommenterBBD

Mooloo

Thanks. A lone voice of informed reason. They're going to love you here.

Feb 4, 2012 at 11:30 AM | Unregistered CommenterBBD

If anyone is reading this thread, pause to ask yourself why so much vitriol has been sprayed. Why has bog-standard smoothing been 'attacked' with a barrage of envenomed pseudo-science?

Bizarre, isn't it? But of course the reason is obvious: the signal you see more clearly is one that many simply do not want to acknowledge.

Feb 4, 2012 at 11:47 AM | Unregistered CommenterBBD

Mark - I echo Richard Drake's comment at 8:27. Not a nice place to be.

Feb 4, 2012 at 12:10 PM | Unregistered CommenterJames P

Mark T, superb comments, concise and clear too, many thanks.

And I join Richard D and others here and wish you well.

BBD. You are teasing us, yes?

Feb 4, 2012 at 12:28 PM | Unregistered CommenterJosh

JJ

At 9am here it was -6C and two hours later, +6C. Looking forward to a barbecue this afternoon!

Feb 4, 2012 at 12:32 PM | Unregistered CommenterJames P

Mooloo

"No-one's suggesting that we discard the original data"

Although some people are very reluctant to share it.. :-)

Feb 4, 2012 at 12:38 PM | Unregistered CommenterJames P

Signal minus noise = signal.

Oh boy. Thank you BBD, for this quote. I think I shall print it out and pin it on the wall at work (with proper attribution, of course). I am confident it will bring fits of giggles for many a coming year.

(PS. Averaging is integration and division, subtraction is not involved.)

(PPS. Subtraction as an analogy is so flawed that only someone who does not understand what is happening during averaging would make it)

(PPPS. Assuming noise is skewless and bipolar, signal minus noise = signal plus noise for any valid statistical measure)

Feb 4, 2012 at 1:25 PM | Unregistered CommenterSpence_UK

Spence_UK

You do that. Here it is again:

signal minus noise = signal

Now, from the Olympian heights of your wisdom, explain why it is wrong.

(You will note that the statement makes no mention of averaging or indeed any specific method for removing the noise from the signal.)

But if we do use averaging to smooth for noise we can rest assured that it is an uncontroversial and standard method for examining temperature time series. Which was my original and substantively uncontested point.

This thread has been a master-class in diversionary nonsense. You have added to it in your usual pompously offensive way.

Feb 4, 2012 at 2:17 PM | Unregistered CommenterBBD

Josh

No, I am not teasing you. Smoothing a temperature time series by averaging (eg monthly to annual means etc) is standard, uncontroversial stuff. Also please see my comment at Feb 4, 2012 at 11:47 AM.

Feb 4, 2012 at 2:19 PM | Unregistered CommenterBBD

All

Before anyone posts any more snark about my supposed stupidity, they need to do one simple thing. Go and find a reputable source that states unequivocally that the kind of smoothing I have been doing here (eg monthly to annual means) is an inappropriate way of examining a temperature time series of several decades or more.

No link - no point. No point - no response.

Thank you.

Feb 4, 2012 at 2:23 PM | Unregistered CommenterBBD

BBD. Does this help?

http://wmbriggs.com/blog/?p=5154

Feb 4, 2012 at 2:29 PM | Unregistered CommenterJosh

BBD
The 'ranting' is born, partly, of frustration. That you pretend that what the IPCC did was correct, and won't listen to reason.

Why what they did was wrong, can be explained.

This is what the IPCC says:

Annual global mean observed temperatures (black dots) along with simple fits to the data. The left hand axis shows anomalies relative to the 1961 to 1990 average and the right hand axis shows the estimated actual temperature (°C). Linear trend fits to the last 25 (yellow), 50 (orange), 100 (purple) and 150 years (red) are shown, and correspond to 1981 to 2005, 1956 to 2005, 1906 to 2005, and 1856 to 2005, respectively. Note that for shorter recent periods, the slope is greater, indicating accelerated warming

This graph contains two fundamental errors:
[1] in drawing an inference, w.r.t., where 'we are', and,
[2] in its argument thereof

The first point is fairly simple. We are at 2005 with respect to the data shown in the graph. If I go backward, including longer and longer periods of time to calculate a rate of warming, the rate of warming actually goes down, and not up.

But then of course, the IPCC exposes this only inadvertently. How can I make a statement about the rate of temperature rise from 1980 to 2005, by comparing it to the rate of rise from 1860 to 2005? The latter includes the former!

How does one evaluate whether or not the rate of rise is going up or down then? After all, the data to be used is the same. The answer is simple - examine the graph afresh.

Though the IPCC has drawn its thick lines, one can discern warming and cooling episodes. The total warming must have taken place during the warming episodes. Draw straight trend-lines if you wish through the warming episodes: 1860-1885, 1900-1940, and 1980-2005. The rate of warming is roughly the same.

Though [1] is fairly simple, it is an error born of perspective, and such errors are amongst the hardest to detect and correct. No wonder BBD has a problem.

Feb 4, 2012 at 2:51 PM | Unregistered CommenterShub

And the reason for the warming is no longer in dispute. No wonder Shub has a problem.

Feb 4, 2012 at 2:56 PM | Unregistered CommenterBBD

Josh

Not really - it's not at all what I asked for.

Also, wrt all this fuss about variable start dates for trend fitting - remember that I provided one example of trends (arbitrarily fit to 1900 and 1950 - the half-century - no other reason) then went on to use averaging rather than trends (Feb 3, 2012 at 10:34 PM) to show the same thing. No start dates to argue about. But immediately the spurious nonsense begins about how smoothing is evil etc etc.

So, back to Feb 4, 2012 at 2:23 PM - where's that link?

Feb 3, 2012 at 10:34 PM

Feb 4, 2012 at 3:07 PM | Unregistered CommenterBBD

"Yes? Why?

Do explain yourself rather than just ranting."

I explained every one of my points. Can you not read?

"Yes. This is obvious, and in the case of smoothing a temperature time series as above, utterly irrelevant."

It is completely relevant. You do not know what the "signal" is. If you did, you would justify it.

"You have talked much, confused many, and said nothing."

I've confused you, yes, but that goes back to my original assertion that you have no clue what you are talking about.

"Others - it's important to sort all kinds of noise from the signal. An exercise for the interested reader."

Indeed. Learn something so you can add to the understanding.

"Go and find a reputable source that states unequivocally that the kind of smoothing I have been doing here (eg monthly to annual means) is an inappropriate way of examining a temperature time series of several decades or more."

Again, show me where anyone said this. This will be the FIFTH time in this very thread I've unequivocally stated my objections to your claims had nothing to do with "problems."

Mark

Feb 4, 2012 at 3:09 PM | Unregistered CommenterMark T

Though [1] is fairly simple, it is an error born of perspective, and such errors are amongst the hardest to detect and correct. No wonder BBD has a problem.

So use a running mean. Like this:

HADCRUT3vgl 5 year mean.

Clear as a bell. Now Shub, who's really being obtuse here?

Feb 4, 2012 at 3:13 PM | Unregistered CommenterBBD

MarkT

All this fuss over what? Smoothing a temperature time series by averaging? What is your problem? Really?

And would you mind either ditching the pissy tone or taking your frustrations out on someone else.

Feb 4, 2012 at 3:20 PM | Unregistered CommenterBBD

"You do that. Here it is again:

signal minus noise = signal

Now, from the Olympian heights of your wisdom, explain why it is wrong."

Because signal - noise = signal - noise, not signal, unless the noise is zero. That's sort of like saying 2 - x = 2... only works if x = 2.

Now, if you mean sampled data - noise = signal, sure. I've said as much above. What you don't seem to understand is that you need to define one or the other in a non-circular manner for this to have any meaning. In other words, you can't just say anything that's not noise is signal, without defining either what noise is, independent of signal, or what signal is, independent of noise.

By imposing your model of a moving average filter, you've stated that the best estimate of the "true signal" is an average of the neighboring points (the number of which depends upon the length of the filter). It is up to you to provide justification, a physical reason for this, for it to have meaning. Can you do that? Or is your excuse simply that other climate scientists do so?

Mark

Feb 4, 2012 at 3:29 PM | Unregistered CommenterMark T

"All this fuss over what? Smoothing a temperature time series by averaging? What is your problem? Really?"

Why don't you try reading my posts. I've made it clear.

Mark

Feb 4, 2012 at 3:30 PM | Unregistered CommenterMark T

"Because signal - noise = signal - noise, not signal, unless the noise is zero. That's sort of like saying 2 - x = 2... only works if x = 2."

Oops, if x = 0. Duh.

Mark

Feb 4, 2012 at 3:31 PM | Unregistered CommenterMark T

BBD, I'm rather puzzled by your statement "And the reason for the warming is no longer in dispute"... are you seriously suggesting that the driving factors behind the measured warming are fully understood, or are you just being 'provocative'?

Feb 4, 2012 at 3:32 PM | Unregistered CommenterDave Salt

BBD

And the reason for the warming is no longer in dispute.

Right. Have you looked outside if you live just about anywhere in Europe today? Hell, even the canals in Venice are icing up.

about the thread

I really appreciate the effort of a number of you, Mark T in particular, in attempting to explain that all the world is not a nice neat straight line. Also an excellent effort out of Spence_UK. But it must be to geronimo that I award the Don Pablo Lance of Explanation for pricking the real issue with:

What I could remember was that engineers didn't of course model coupled non-linear chaotic systems, for two reasons, 1. it's impossible and 2. there is no value in doing so because of 1. And if ever I saw a coupled non-linear chaotic system it's the climate.

Sadly, several others, who like our politicians look at the world through eyes trained in Natural Philosophy and not Physics, really believe you can obtain a signal far smaller than the noise by simple averaging.

I guess my question is how do you explain to them "it ain't the way it works"? That is the real challenge, I am afraid.

I want to thank a number of you for refreshing my knowledge of signal processing. This has been a most refreshing read.

Feb 4, 2012 at 3:33 PM | Unregistered CommenterDon Pablo de la Sierra

It is up to you to provide justification, a physical reason for this, for it to have meaning. Can you do that? Or is your excuse simply that other climate scientists do so?

Oh I get it. You are denying that GAT has risen over the C20th. Okay.

You're talking to yourself, Mark. I'm not wasting time with that 'argument'.

Feb 4, 2012 at 3:36 PM | Unregistered CommenterBBD

Dave Salt

It's CO2. This isn't contentious except to a few contrarians such as (apparently) yourself. And I'm not going to argue about it here.

Feb 4, 2012 at 3:38 PM | Unregistered CommenterBBD

"Despite some of the wild statements above, the world of statistics does use running averages and does use smoothing methods. When appropriate. A quick Google search shows plenty of Statistics Departments using them to help make data more easily visualised."

I have not suggested anything other than for BBD to actually learn what it is that he is saying and why it is incorrect. When statisticians use smoothing, they have a physical reason to make the claim "the signal I am looking for is a low frequency term." Same goes for signal processing types. This is the type of assumption that goes into ANY linear transformation - an assumption regarding the physical structure of the underlying "signal" as well as the junk (noise) that you hope to ignore.

When I apply a filter to data, I know precisely where my signal lies in the spectrum. The filter I apply is designed specifically to deal with said signal. I also understand exactly what I've done to the data in both time and frequency spaces.

BBD seems to think he has not imposed anything on the data when he applies a filter. This is simply not true. He thinks because climate scientists do this regularly, it must be acceptable. This too, is untrue, or at the very least, needs to be caveated. Climate scientists regularly do many things that are either inappropriate or outright incorrect. They don't have the training and for whatever reason, refuse to listen to those of us that do. It is maddening.

Mark

Feb 4, 2012 at 3:43 PM | Unregistered CommenterMark T

" You are denying that GAT has risen over the C20th. Okay."

What? I've said no such thing.

Bish: this is why he gets the snark. He can't even directly respond to anything.

Mark

Feb 4, 2012 at 3:44 PM | Unregistered CommenterMark T

Spence_uk: yes, same Mark T as over at CA. I've been posting here for a while, though never in this much detail. There's less noise here than over at WUWT, excepting the diehards, and it tends to be not quite as focused as tAV or CA.

Sometimes you just need to step up and correct obviously incorrect statements. Perhaps if BBD would actually think about what I wrote, not what he thinks I wrote or what he thinks I implied in what I wrote, he'd learn something. I mean, c'mon, "why not explain why instead of ranting" after I posted a detailed description regarding every single one of my objections, all numbered from 1 through 4. One can only reason that he either does not care, is incapable of understanding, or realizes that I am correct but cannot admit it.

Figures.

Mark

Feb 4, 2012 at 3:57 PM | Unregistered CommenterMark T

Actually Mark, I didn't realise that comments had gone to two pages. It was indeed careless of me.

However, when I did realise my mistake and read the rest, I found very little actual substance. There was this:

if the signal lies outside of the bandwidth of the smoother, you will not reveal any underlying signal. You will attenuate it.

Which I happily agree with. But it does not apply when averaging monthly means to annual means in a multi-decadal temperature time series like HADCRUT. You are making a great deal of noise about nothing.

Although I note the usual tactical swipes at the competence and integrity of earth system scientists. Which is why I don't take you very seriously now.

Feb 4, 2012 at 4:02 PM | Unregistered CommenterBBD

You first whined that no one has offered a simple explanation for why what the IPCC did was wrong:

Here it is:

"We are at 2005 with respect to the data shown in the graph. If I go backward, including longer and longer periods of time to calculate a rate of warming, the rate of warming actually goes down, and not up."

If you don't have an answer to the above, just man up and admit it. You did not make that graph - you don't have to shill for it.

Feb 4, 2012 at 4:03 PM | Unregistered CommenterShub

And since I've had a fair bit of unpleasantness from you now, let's get something clear. You say:

Sometimes you just need to step up and correct obviously incorrect statements. Perhaps if BBD would actually think about what I wrote, not what he thinks I wrote or what he thinks I implied in what I wrote, he'd learn something.

The sole reason this exchange went so badly is that you began it badly. So badly that BH told you to wind in your neck. You repeatedly refused to clarify what your problem was. Repeatedly. You created trouble here - which is technically known as trolling.

Think on that, please.

Feb 4, 2012 at 4:07 PM | Unregistered CommenterBBD

Mark T. - Thanks for coming over here, appreciated.

BBD - as you know I am "one of the few" who thinks that CO2 has feck all to do with it. Talking of signals, I have probably asked you this before, but don't recall getting a reply; could you please let me know where the CO2 signal is in this graph?

wood for the trees...

Feb 4, 2012 at 4:09 PM | Unregistered Commenterlapogus

Sadly, several others, who like our politicians look at the world through eyes trained in Natural Philosophy and not Physics, really believe you can obtain a signal far smaller than the noise by simple averaging.

Well, you can do this, but you have to recognize what it is you have done to the original data. Moving averages have a rather noticeable slope across the passband, so any really low frequency signal will be "brought out" much more than one closer to the foldover point. This means that if the dominant term is closer to the foldover point, it may be attenuated enough that a lower frequency term becomes the dominant term.
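
As a rough illustration of that passband slope, here is a minimal Python sketch of the magnitude response of a plain boxcar (moving average) filter. The 12-point window and the example periods are assumptions chosen for illustration (think monthly data averaged over a year); they are not taken from any temperature series discussed in this thread.

```python
import math


def moving_average_gain(freq_cycles_per_sample: float, window: int) -> float:
    """Magnitude response of a `window`-point boxcar average at one frequency."""
    if freq_cycles_per_sample == 0.0:
        return 1.0  # a constant passes through an average unchanged
    num = math.sin(math.pi * freq_cycles_per_sample * window)
    den = window * math.sin(math.pi * freq_cycles_per_sample)
    return abs(num / den)


WINDOW = 12  # assumed: 12 samples per average, e.g. monthly to annual

# Periods (in samples) of some hypothetical sinusoidal components.
for period in (240, 60, 24, 12):
    gain = moving_average_gain(1.0 / period, WINDOW)
    print(f"period {period:>3} samples: gain {gain:.3f}")
```

The gain falls steadily as the period shortens and drops to essentially zero when the period equals the window length, so components at different frequencies are reweighted relative to one another, not just "cleaned up".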

I guess my question is how do you explain to them "it ain't the way it works"? That is the real challenge, I am afraid.

You can't. They don't want to hear it. Look at BBD's responses to me. Not once has he even acknowledged that he understands my objections to his terminology. He's been tilting at windmills the whole time. If he directly addresses my statements, he will have to admit I am right or admit that he does not understand, and he knows this. Instead he dodges, argues points I never made.

I want to thank a number of you for refreshing my knowledge of signal processing. This has been a most refreshing read.

Wait till I get going on feedback (control) theory. Damn they abuse that one even worse. The problems I had w.r.t. this stuff don't make much of an impact other than one of understanding what it is that is being discussed. There's nothing wrong with smoothing data then looking at it (but little else), you just have to realize what you've done - few do. But their claims w.r.t. feedback DO have a huge impact.

Mark

Feb 4, 2012 at 4:11 PM | Unregistered CommenterMark T

Shub

Got as far as 'whined'. Realised you were being offensive and nearly stopped reading.

Now I'm going to have to struggle to be civil to you.

Your deconstruction of the IPCC cartoon is deeply odd. The IPCC states that the rate of warming has increased during the C20th. So if you "go backward, including longer and longer periods of time to calculate a rate of warming, the rate of warming actually goes down, and not up".

Well obviously it does. Sometimes I really don't know what to say to you. This is certainly one such occasion.

Feb 4, 2012 at 4:13 PM | Unregistered CommenterBBD

"You repeatedly refused to clarify what your problem was. Repeatedly."

Have you even read post #s 43, 46, 47 and 48? I explicitly clarified in those posts.

Now you can apologize for lying.

Mark

Feb 4, 2012 at 4:13 PM | Unregistered CommenterMark T

Mark T

Look at BBD's responses to me. Not once has he even acknowledged that he understands my objections to his terminology.

See Feb 4, 2012 at 4:03 PM

Feb 4, 2012 at 4:14 PM | Unregistered CommenterBBD

BBD, you are hardly in a position to complain about snark.

Allow me to fix your wrong equation for you, and then explain why the fixed version of your equation is pointless, then explain how averaging actually works and some assumptions required to support it, and then explain why there is a fundamental problem with its application in temperature time series. (My first few notes are with regard to signal to noise generally, not specifically to the climate case)

OK. Fixing your equation. What you should have written is "Signal plus noise minus noise equals signal"*. That would have at least been algebraically correct. But there is a problem with the application: here, the two "noise" terms must be identical samples of noise. I'll call them N1 to indicate they are the same noise samples. (N2, N3... would be different samples). The problem is all we have to begin with is the estimate (i.e., the measurement), which I will call E, of the signal S:

E = S + N1

What you want to do is rearrange the equation to be

S = E - N1 ( = S+N1-N1, per your equation)

Trouble is, we know E, but we don't know N1. Two unknowns, one equation. It cannot be determined. So we cannot reach S via your method of subtracting noise.

Now, what is really going on is integration (summation in discrete terms) and division. Say we have two estimates of S, E1 and E2. They are estimates of the same variable (i.e. S is unchanged - not necessarily true for the climate case, but simple to aid explanation). These have independent noise added to them, N1 and N2, giving us the expressions:

E1 = S + N1
E2 = S + N2

We can now sum E1 and E2 to give:

E1+E2 = (S+S) + (N1+N2)

Now S+S is the same variable twice, so this comes to 2S, but if we assume N1 and N2 are independent identically distributed Gaussian random variables, they can be assumed orthogonal and summed as such; we then have (N1+N2) = sqrt(N1^2 + N2^2); statistically, they can then be combined as (N1+N2) = sqrt(2N3^2) = sqrt(2)N3.

E1+E2 = 2S + sqrt(2)N3

We then divide both sides by 2:

(E1+E2)/2 = S + sqrt(2)N3/2

And we have the true effect of averaging; not your equation (which is wrong and meaningless), but a reduction of the noise relative to the signal by a factor of sqrt(2)/2, or 1/sqrt(2). But you'll note when I was doing my algebra, to make it work out, I had to make a whole host of assumptions. If the assumptions are wrong, these calculations do not apply.

Let us just look at one assumption: the assumption of independence of samples. If the noise is not independent, then you end up with

E1 = S + N1
E2 = S + N1
(E1+E2)/2 = S + N1

In this example, the signal to noise ratio is exactly how it started out. OK, for climate, the noise is neither independent nor identical from sample to sample. So we need a method for dealing with the non-independence. That's where the Hurst exponent comes in.
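The independent case and the identical-noise case are the two ends of a standard result for a pair of equally noisy measurements: if the two noise terms each have standard deviation sigma and correlation rho, the noise on their average has standard deviation sigma * sqrt((1 + rho) / 2). A small sketch of that formula follows; rho, sigma and the function name are labels introduced here for illustration, not anything used elsewhere in this thread.

```python
import math


def averaged_noise_std(sigma, rho):
    """Std of (N1 + N2) / 2 when N1 and N2 each have std sigma and correlation rho."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    # Var((N1 + N2) / 2) = (sigma^2 + sigma^2 + 2 * rho * sigma^2) / 4
    return sigma * math.sqrt((1.0 + rho) / 2.0)


independent = averaged_noise_std(1.0, 0.0)  # iid case: 1/sqrt(2)
identical = averaged_noise_std(1.0, 1.0)    # same noise sample twice: no gain
print(f"rho=0: {independent:.3f}, rho=1: {identical:.3f}")
```

Anything in between gives an improvement somewhere between sqrt(2) and none at all, which is the gap a single two-sample formula cannot describe for long, persistent series.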

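The Hurst exponent is an exponent which corrects the rate at which noise reduces through averaging. The noise magnitude, measured by the standard deviation of the noise (as per above), reduces by averaging N samples as follows (note that the quantity written SNR_N here is the noise magnitude remaining after averaging N samples, so smaller is better):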

SNR_N = SNR_1 / (N ^ (1-H))

Where H is the Hurst exponent for the series. For the iid gaussian random variables as per the example above, H = 0.5, so averaging gives:

SNR_N = SNR_1 / (N ^ 0.5)

So, for two samples, we get 2 ^ 0.5, or the square root of 2, matching my equations above (see, when you do algebra properly you get the same answers using different methods). For 12 samples (e.g. monthly to yearly for iid noise), we get 12 ^ 0.5, or a reduction in noise by a factor of about 3.5.
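The scaling rule is simple enough to put in a few lines of code. This is a sketch of the N ^ (1-H) factor only; the function name and argument names are invented for illustration.

```python
def noise_reduction_factor(n_samples, hurst):
    """Factor by which averaging n_samples shrinks the noise std: N ** (1 - H)."""
    if n_samples < 1:
        raise ValueError("need at least one sample")
    if not 0.0 < hurst <= 1.0:
        raise ValueError("Hurst exponent expected in (0, 1]")
    return n_samples ** (1.0 - hurst)


two = noise_reduction_factor(2, 0.5)      # iid case, two samples
twelve = noise_reduction_factor(12, 0.5)  # iid case, monthly to yearly
print(f"N=2:  {two:.3f}")     # 1.414
print(f"N=12: {twelve:.3f}")  # 3.464
```

Passing a different value for hurst is all it takes to see how the benefit of averaging changes once the samples are no longer independent.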

But climate doesn't have an exponent H = 0.5. We can measure the Hurst exponent of climate: for the global temperature record, estimators yield a value somewhere between 0.94 and 1. For a value of say H = 0.97, we find that averaging 12 numbers gives us a signal to noise ratio improvement of:

SNR_12 = SNR_1 / (12 ^ 0.03) = SNR_1 / 1.077

Virtually nothing. We probably can't even estimate the noise power to that accuracy with just 12 samples. In fact, to get a noise reduction equivalent to that achieved by averaging just two samples of iid random variables, we find we need to average X points as follows:

X^0.03 = sqrt(2) => 0.03 log X = 0.5 log 2
log X = (0.5/0.03) log 2
X = 104,030 monthly samples or 8669 years of data

So to get a signal-to-noise improvement equivalent to averaging two throws of a die, we have to average more than 8 millennia of monthly global mean temperature data. Not 30 years, not 60 years, not 110 years, but 8,669 years. To get *2* independent samples.
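For anyone who wants to redo the arithmetic, here is a short sketch that inverts N ^ (1-H) to find the number of samples needed for a given improvement. H = 0.97 is the illustrative value used above, and samples_needed is a name made up for this example.

```python
import math


def samples_needed(target_factor, hurst):
    """Solve N ** (1 - H) = target_factor for N."""
    if hurst >= 1.0:
        raise ValueError("no finite N gives any improvement when H >= 1")
    return target_factor ** (1.0 / (1.0 - hurst))


H = 0.97                                  # illustrative value from the text
gain_12 = 12 ** (1 - H)                   # improvement from averaging 12 months
months = samples_needed(math.sqrt(2), H)  # samples to match two iid samples
years = months / 12

print(f"12-month gain: {gain_12:.3f}")
print(f"monthly samples needed: {months:,.0f}")
print(f"years of data: {years:,.0f}")
```

The answer is extremely sensitive to H: because 1 - H sits in the denominator of the exponent, small changes in the estimated Hurst exponent near 1 move the required record length by orders of magnitude.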

And this is even BEFORE we consider the problems that S is not constant; that the distribution may not be gaussian; etc., etc.

You ask for a link to an article, and as ever I can provide you with one: here. But be warned, you're going to need to sharpen your algebra up a bit to even get started on this stuff.

* Mark beat me to this. It appears my post took a bit too long to write!

Feb 4, 2012 at 4:15 PM | Unregistered CommenterSpence_UK

Mark T

'Lying'?

See Feb 4, 2012 at 4:02 PM

Feb 4, 2012 at 4:15 PM | Unregistered CommenterBBD

Yes, lying:

You repeatedly refused to clarify what your problem was. Repeatedly.

And I have REPEATEDLY told you to read my posts, now specifically post #s 43, 46, 47, 48, in which I do exactly what you ask. I clarified what my problem was.

Are you going to read them and acknowledge that I have addressed your concerns, or are you going to continue to pretend I did not?

Mark

Feb 4, 2012 at 4:18 PM | Unregistered CommenterMark T

Mark T

This blog doesn't do numbered comments, so ??

I was referring to your first appearance on the thread and my responses to you. Specifically your comments at:

Feb 3, 2012 at 10:51 PM

Feb 3, 2012 at 11:28 PM

Feb 3, 2012 at 11:44 PM

Feb 4, 2012 at 12:01 AM

I get the strong impression that you should calm down. Have a cup of tea or something. Perhaps leave this (or at least re-read the thread).

Feb 4, 2012 at 4:20 PM | Unregistered CommenterBBD

And Mark

You are trolling.

Feb 4, 2012 at 4:20 PM | Unregistered CommenterBBD

This is called spamming

The comments section does not exist for your whim to respond to every arrow that flies your way. Don't do it and choke the board. I realize you have lots of time on your hands but have some mercy on the reader.

The IPCC says:

"Note that for shorter recent periods, the slope is greater, indicating accelerated warming"

If what I say dawned on you, you must agree that it is the opposite of what the IPCC did.

If I travel from A to B, covering the distance in four different phases, and in each my speed being greater than the previous, and if I wished to compare the phases and make a statement about the rate of speed being higher in the later ones than the earlier - I must compare the speed, taken separately.

Do you agree?

Feb 4, 2012 at 4:22 PM | Unregistered CommenterShub
