Mathematical models for newbies
Reader Andrew sent me his summary of the basics of mathematical models, which I think readers will find useful.
I have been devising mathematical models (simulations) of physical processes for over 20 years, and I just wanted to point out some of the basics that might help people understand these types of models:
1. The physics of the process (to be modelled) may be well understood, but although this helps it is somewhat irrelevant to the accuracy of all but the most simple model (although you will almost certainly not get a good model if you don't understand the physics). Nearly all computer models are based on mathematical formulae, commonly binomial expansions, that are representative of the physical situation. These expansions are typically of the form: A + Bx + Cx^2 + Dx^3 + Ex^4 + . . . and are truncated at some power of x (x representing the physical quantity under investigation; A, B, C etc. are calculated constants). I always tried to make it the x^4 term, but this could lead to (in the 1990s) excessive calculation times. (One commercial program, still in widespread use, truncates the series at the x^2 term.) Thus there is always a 'remainder term', or 'residual', which the model will (hopefully be programmed to) attempt to estimate.
2. The problem, or 'domain', over which the model is to be applied (unless trivial) cannot be simulated as a whole. Thus it is divided into small regular shapes (squares, cubes, or more normally now triangles and tetrahedra, that are usually called 'elements') - a process known as 'gridding' or 'meshing', over which the (truncated) equations representing the physical situation can be relatively easily applied. Smaller elements usually produce more accurate results, but the computation time increases - and see 4 below. These can now be used to give a 'spot' or 'node' value for the physical quantity being simulated, for each of these elements (this is somewhat simplistic). A further mathematical process is now used to combine the results for all these elements. A single calculation through each element node within the domain is known as a 'sweep' or, more commonly, an 'iteration'.
3. Many iterations are undertaken until a programmed 'convergence' criterion is met. This is sometimes that the changes in node values between one iteration and the next are all below a certain value, or that the residuals (see 1 above) are all below a certain value. This is generally known as 'convergence'. This process is somewhat easier for a 'static' situation where the physical values to be calculated are constant. If you add a time-based (dynamic) component, as in the atmosphere, to the calculation it usually gets much more complex.
4. I hope it can be seen that this process is 'absolutely riddled' with scope for errors, incorrect assumptions, and erroneous simplifications. Not only that, the whole process can become mathematically unstable, due to interaction between the various steps, leading to the calculations 'exploding' to infinity, or crashing to zero. This is a particular problem with dynamic situations where the calculation 'time-step' can interact with the mesh/grid spacing, leading to the whole model 'falling-over' or collapsing.
5. Even if the model does converge to a solution - it does not mean that this is a correct (or accurate) one. In another commercial program (to that in 1 above), users are warned that an incorrect choice of the element type to be used - can lead to solutions that are up to 2000% (yes, two thousand percent) away from the correct value. One big problem with ascertaining the accuracy of computer simulations is that you generally have to have some idea of what the answer should be, so that you can compare the calculated solution.
6. Bear in mind that the process (simplistically) outlined above must be undertaken for each physical attribute being investigated, and it can be seen that this is a hugely non-trivial problem (for an atmospheric model).
7. In research work I have found that computer models are a very useful tool for qualitative analysis, but much less so for accurate quantitative analysis. The models I have worked on have generally been used for automated process control - and invariably these require 'calibration' or 'tuning' to real world measurements. Furthermore, these process control models are made so that any calculated solution outside the physically measured range is 'viewed with suspicion'.
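As a rough illustration of points 1 to 3, here is a minimal Python sketch. It is not Andrew's code, nor that of any commercial program. The series is the Taylor expansion of exp(x), chosen only as a stand-in for the A + Bx + Cx^2 + . . . form, and the 'domain' is a one-dimensional bar held at a fixed temperature at each end. The function names, node count and tolerance are all assumptions made for the example.

```python
import math


def truncated_series(x, order):
    """Sum x**k / k! for k = 0..order: the series for exp(x), cut off at x**order."""
    return sum(x**k / math.factorial(k) for k in range(order + 1))


def relax(n_nodes=21, left=0.0, right=100.0, tol=1e-6, max_iter=100_000):
    """Repeated sweeps over a 1D grid until node values stop changing.

    Each sweep replaces every interior node by the mean of its two
    neighbours (Jacobi relaxation for steady-state heat conduction).
    Returns (node values, sweeps used, converged flag).
    """
    values = [left] + [0.0] * (n_nodes - 2) + [right]
    for iteration in range(1, max_iter + 1):
        new = values[:]
        for i in range(1, n_nodes - 1):
            new[i] = 0.5 * (values[i - 1] + values[i + 1])
        # Convergence criterion: largest change in any node value this sweep.
        change = max(abs(a - b) for a, b in zip(new, values))
        values = new
        if change < tol:
            return values, iteration, True
    return values, max_iter, False


if __name__ == "__main__":
    exact = math.exp(1.0)
    for order in (2, 4):
        remainder = exact - truncated_series(1.0, order)
        print(f"truncated at x^{order}: remainder = {remainder:.5f}")

    values, sweeps, converged = relax()
    print(f"converged={converged} after {sweeps} sweeps")
    print(f"mid-point value = {values[len(values) // 2]:.5f} (exact answer is 50)")
```

Truncating at x^2 leaves a much larger remainder than truncating at x^4. The relaxation loop stops when the largest change between sweeps falls below the tolerance, which is not the same thing as the error being below the tolerance (see point 5).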
Reader Comments (90)
For some reason the words 'teapot' and 'chocolate' spring to mind.
Now tell us how leaving out a variable (cosmic rays) affects the model?
thanks
JK
Absolutely!
Point 1, that the physical model of the process is correct, is often ignored. What is often used in real problems is an approximation to the physical process - more scope for error.
I agree that the use of modelling, scientifically as opposed to precise control, is that it enables one to sketch out scenarios that give one some idea of what might be happening. Precision in modelling is, in my experience, illusionary.
Perhaps it should be emphasised that, if a physical process is not known and the output of a model is not what the designer expects or wants, it is a breeze to add a simple equation to make the model do what is desired. And the code is so complex that nobody else would be able to find the equation. This is how positive feedbacks are entered in the current climate models, to amplify the supposed effects of CO2.
Computer models are fine, but they never are a substitute for reality.
I find this statement absolutely mind-blowing. I mean how do you know what the "correct value" should be? I would have thought that the whole aim of the model is to open your mind to something you don't know. And 2000% is a value I can't even conceive.
That said Andrew, this mathematical newbie appreciates your article very much. Thanks for that.
I worked on mathematical models (simulations) of thermal-hydraulic and other physical processes for over 40 years in the nuclear industry and agree totally with Andrew. We used hundreds of experiments to derive the sub-models and dozens more experimental results to validate the complete model. The boundaries of usage of the models were strictly defined by the range of conditions of the experiments. We did blind calculations and double-blind calculations. Convergence, both temporal and spatial, was always a major concern. We knew the model results would have a large uncertainty and so, in safety calculations, we always applied the models with a large uncertainty margin.
One only has to look at the application of CFD models to know that physical models and wind tunnels are always used because of the large model uncertainties.
My faith in climate models would rate close to zero in comparison.
This all seems a bit too hard for Geographers.
"One big problem with ascertaining the accuracy of computer simulations is that you generally have to have some idea of what the answer should be, so that you can compare the calculated solution."
Indeed, in structural engineering, one needs to know the order of size an element should be, otherwise one never really knows if it is correct. Therefore I always tell graduates to use a thing called a pencil, & some stuff called paper, to sketch out & do a back of fag packet calculation to see the orders of magnitude one is looking at! However, there is the possibility in puter modelling, as Albert Stienstra has observed, of fine tuning it to produce the answer one is looking for. I pointed this out to a Met Office "scientist" a few years ago down the local. He seemed positively nonplussed by my observation as if I was speaking Greek & or heresy!!!! He seemed to be unable to grasp the fact that he & his colleagues had to tell their little big puter to do what they wanted it to, it wouldn't do it all by itself!
Last year the Virgin F1 was designed purely with CFD (Computational Fluid Dynamics). They believed no wind-tunnel testing was possible.
CFD can be tested against REAL models in wind tunnels and flow systems. It is state of the art modelling. Huge investments refined over many years. Known physics. Known mathematics. Focused application of knowledge.
The result? The car was an absolute dog.
Ah...
"you generally have to have some idea of what the answer should be"
And that would be: man is causing global warming. Simple really.
I started writing physico-chemical models in 1967. Two observations on Andrew's fine summary. (1) You can never be confident that you've included every effect that matters. It's the omissions that limit the usefulness of modelling. (2) The other big problem is the modellers themselves. Modelling quickly became a specialised business, performed by people who didn't perform experiments or make observations. This distanced most of them from reality and they became more and more focussed inwardly on their numerical methods instead of outwardly on the physics and chemistry that it was their job to simulate.
I have come across "predictions" that simply made no sense - that violated fundamental physical laws or mathematical requirements. It's common to come across predictions that are never compared intelligently to measurements. Almost never do you come across successful predictions that were made from first principles: almost always there has been essentially arbitrary selection of parameter values to deliver an acceptable prediction. Then you must ask "Acceptable to whom; for what purpose?"
Alan the Brit mentions the 'scientist' down the pub who was bemused by the notion of doing a rough back-of-fag-packet estimate before embarking on detailed calculation.
I'm sneaking up on this explanation of the malaise of climate science: fragmentation. There is a modern tendency to become so specialised that the big picture is lost. Polymaths seem to be a relic of the past, at least in the eyes of the upcoming generation.
The chappie behind the excellent Notalotofpeopleknowthat has been challenging NASA GISS on the outrageous revisionism going on there - e.g., knocking two whole degrees off Reykjavik's temperatures in the 1940s giving the appearance of a modern warming trend. He displays the emails he gets from GISS saying, "yeah, we did that!". Doesn't the GISS guy realise that he's spilling the beans on a scientific fraud? Well, no, he doesn't. A polymath would either stonewall for 'the cause' or yell 'mea culpa - our dodgy data is creating a spurious warming trend for the Arctic'.
Watch this space: http://endisnighnot.blogspot.com/#!/2012/03/giss-strange-anomalies.html
The forensic activity some of us are engaged in is revealing the detail of 'adjustments' of the historical record.
I've had 30 odd years of involvement with rock mechanics models, an area with similar levels of uncertainty and complexity as climate models.
Many a time I've been assured that a model has been calibrated to real world observations. What is actually meant by this is that real world, laboratory derived values have been used as input and these give results that do not accord with actual measurements. This is to be expected even if the model was a true representation of all the processes involved (which it ain't due to aforementioned uncertainty and complexity), as a rock mass has different properties to the rocks of which it is comprised. The modellers then "calibrate" the model by introducing correction factors to some of the input parameters that are simply designed to get the model to produce the right result. The calibration does not consider the various reasons why the model output and measured observations differ as this in itself is too complex to properly account for. I am then assured that this calibrated model can be used in future similar situations in nearby areas even though.
My primary uses of models are to produce pretty coloured outputs that some find more convincing than the output of my brain and to illustrate a point. I have never, and never would, use the output of a model as the sole basis for a design. If the model differs from other sources of knowledge, often experience and past measured observations, then I assume the model to be wrong. Not all feel the same way. I remember my old boss nearly suffering apoplexy as a young modeller assured him that a 30m stope was not stable. "But I can take you underground and show you 30m stopes standing up" he assured the modeller, expecting that to be an end to the matter. Alas, the modeller was not to be so easily dissuaded that his model was wrong and so we had a surreal pantomime of "oh yes it is stable", "oh no it can't be stable" as my boss went steadily redder and redder.
'Er indoors was a primary school teacher for most of her adult life and she always taught classes to look at the result of a calculation and see if it made sense. 2*3=6, so 2.3*3.4 is unlikely to equal 782! Try again.
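That classroom check is easy to write down. The sketch below is only an illustration: the function name and the "within one factor of ten" threshold are assumptions, not a standard rule.

```python
import math


def is_plausible(result, rough_estimate, decades=1.0):
    """Return True if result is within `decades` powers of ten of a rough estimate.

    Both values must be non-zero; the threshold of one decade is an
    arbitrary choice made for this example.
    """
    if result == 0 or rough_estimate == 0:
        raise ValueError("order-of-magnitude check needs non-zero values")
    return abs(math.log10(abs(result) / abs(rough_estimate))) <= decades


if __name__ == "__main__":
    rough = 2 * 3  # back-of-fag-packet estimate for 2.3 * 3.4
    print(is_plausible(2.3 * 3.4, rough))  # the true product, about 7.82
    print(is_plausible(782, rough))        # decimal point in the wrong place
```

The first check passes and the second fails, because 782 is more than two powers of ten away from the rough estimate of 6.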
I think the real problems arise when the answer turns out to be not just what you expect it to be but what you want it to be!
This all rather reminds me of Babbage's questioner - the one who asked whether, if he put the wrong numbers into his calculating machine, it would still produce the correct answers?
Sadly, I suspect many politicians wouldn't see anything wrong with that question.
Our Calibrated Model has No Predictive Value
"One big problem with ascertaining the accuracy of computer simulations is that you generally have to have some idea of what the answer should be, so that you can compare the calculated solution."
Sums up the problem. Is there anybody in the alarmist movement who didn't "know" what the answer was before starting?
Many years ago, I was attempting to generate a rather simple model for a biological project. I presented all the variables I had measured and asked "What would happen to the model if I had left out an important one?" "Don't worry" came the reply. "The importance of all the other variables will adjust to make up for the deficiency". Always worried me a bit thereafter because I probably had left out one or two quite important ones, as one is inclined to do.
Seems odd that the 23 models are in agreement on hindcasting, but in disagreement on forecasting ....
Have they
(a) only just recently started disagreeing on forecasts, or
(b) did they in the past also disagree on the then forecasts?
If (a), why ?
If (b), were they all just tweaked into agreement ?
http://www.guardian.co.uk/environment/georgemonbiot/2009/mar/10/coal-miners-strike-scargill-opencast-protests-emissions
Do you think Metropolitan Elitist Islingtonish George Monbiot Hates White Working Class People?
"although you will almost certainly not get a good model if you don't understand the physics"
I wonder if anyone told the climate modellers?
I can describe Weather and Climate prediction in 3 words TOO MANY VARIABLES
With Climate Change the one most important Variable is SELF INTEREST
Boeing likely knows as much about making and using models as any organization in the world. Yet, they spent about 15,000 very expensive hours in wind tunnels during development of the 787. And to date they have completed over 5,400 hours of test flights in nine aircraft.
Docbud
Interesting. Just after university I worked a little as a geophysicist. One of my roles was to look at pages and pages of graphical output of seismographic data, and pick the most likely points representing speed of sound. I was really tweaking or tuning a model, picking in a way a computer could not be programmed to do. The results were then of course compared with reality when cores were taken and exploratory wells were drilled, but even before that the resulting 3D geological map was analysed by geologists, who used real-world experience to confirm whether it was realistic or not.
If we were to work like the climate "scientists" we would have taken the computer models and used them to decide where to start production drilling. Outside the area surveyed. Without any hope of finding enough oil to cover the cost of the rig. Even then we would have had more justification, because at least our techniques were proven in the real world.
Somehow I cannot help being reminded of a rather cunning computer program called "Reason" which features in Douglas Adams's Dirk Gently stories.
Hi,
It is amazing the number of commentators here who really have used modelling and know its limitations.
It is also interesting that they are all from practical engineering fields!
I started my engineering career in the '60s, at Bristol Aircraft guided weapons. In those days we might get one run on the ONE E.E. "Deuce" computer that Bristol Aircraft had, or the Eliot patch panel analogue machine, if it would stay up long enough; apart from that we used Fredon (sp?) mechanical calculators, log tables or slide rules. I still have my Otis King model L cylindrical calculator, with scales 66 inches long. It was standard practice always to guesstimate (fag packet calculation) what the correct size of the answer should be, so as to get the decimal point in the correct place. Most of the missiles I worked on actually flew OK, so we were doing most things right.
Last weekend I bumped into the neighbour's son, who is reading mechanical engineering at uni. I inquired how he was getting on, to which he replied he was exhausted: he had been working for hours and hours on a course work project involving designing a boat propeller. How do you go about that, I asked. He replied he had fed the required parameters into a CAD program and it produced the design. So I asked about cavitation and the limiting ratio of pitch to speed to avoid cavitation. I received a blank look; I think cavitation was an unknown! I got an even more incredulous look when I said that I had similarly designed a prop in my college days and drawn the blade profile up using geometrical construction on the drawing board.
He had an absolute belief in the output from his CAD program, after all the computer (or the S/W) never lies in his world.
It is really scary!
P
I have a general comment to make about climate models.
GCMs are computer programs financed by the public purse. Every serious computer program that I've ever come across (though not in scientific computing) has a degree of free-form documentation associated with it, no matter what its purpose. Principal items of documentation include system documentation (an outline description), program documentation (a detailed description) and user documentation (how to run the program).
Each of these documents has its own characteristics and purposes. The first, the system spec., gives a rough outline of what the program does, where its input comes from and where its output goes, and gives program structures in fairly wishy-washy hand-waving outline terms.
The level of detail in the second, the program spec., should be sufficient to allow a new member of the programming team to pick up the essentials of the program in conjunction with the source code. It could be argued that it is possible to reverse engineer the program from a program spec. but that's never possible in practice – the spec. will not be detailed enough to do this.
The third document, the user documentation, will specify how the program is run, including the parameters that should be input – dialogs for example - and the output that should be expected – specifying what the output actually means. In the case of a climate model, this spec. will include descriptions of fudge parameters and their intended purpose. Online help information associated with this program would ideally be documented here too.
The point of my comment? Whilst it will probably be argued by those wishing to hide their activities that disclosure of the program and system specs. would allow “competing academics” to pinch their ideas, that argument is not in the least tenable regarding documentation intended for the end user. There is no way to reverse engineer a climate model from the detailed description of its input and output! Nevertheless, this document would give valuable insight into the level of fudging that these models are designed to support.
So my question is, have FOI requests ever been made specifically for the user level documentation of climate models?
@punksta: it is (b), and all tweaks were different
Great stuff. This is surely a very useful post. Let me have a shot at making models even simpler – at, if you like, a pre-mathematical stage (but not without simple arithmetic). I have done a lot of process analysis and some process modelling for industrial processes, and in each case would usually start out with post-it notes and a general definition of the primary purpose of the process and the main steps or stages. All this goes on to post-it notes and gets stuck on a wall in left-to-right sequence. That’s a simple model right away, and a great one for discussing the process with operators and engineers – bits of data about rates, volumes, trouble-spots, queues, and so on can be added, as can inspection and rework loops. Japanese and American industries have in particular made huge gains from the use of such models to help encourage the clarification of problems and stimulate solutions. Occasionally I have created software models, from simple spreadsheets capturing the details on all the post-its, through to graphical simulations with statistical distributions of arrival and completion times, as well as estimated failure rates of equipment or periods when technicians were not available, holiday rosters for key operators, and so on. These were also great for discussions, and for showing before and after scenarios to management, and for helping identify good opportunities for cost-savings or additional expenditures.
Very commonly, people wanted to add more and more detail, to try to make the models more and more ‘realistic’. Very commonly, the data and the insight did not actually exist to make this feasible and so one was reduced to making guesses. The complexity of the software goes up very rapidly, and the fragility of the model with it unless you are very thorough indeed. Suddenly the model becomes a focus of huge amounts of effort, and risks becoming an end in itself. You have come a long way from the post-its on the wall that everyone could see and understand. But it was rarely progress, and I soon advised clients very strongly against jumping into such complexity without a great deal of thought, and even then to do it in small steps carefully checked for learning on a regular cycle of Plan, Do, Study, Act which begins with specific verifiable objectives, and which includes a careful evaluation of data throughout – including wherever feasible a technical evaluation of errors in any measurement or observation process used to get the data.
The post above also triggered a memory from when I was studying atmospheric physics in London, about a lifetime ago, and we had an opportunity to attend a series of lectures on programming. The lecturer got our attention right from the start by words to the effect that you should not use a computer to compute anything that you don’t already know the answer to, or can't find out by other means. Now that’s a bit severe, and I may not be remembering it correctly, but the message was clear: complex software is a can of worms, and very careful verification is required. Since then, I have spent a little time on statistical models for defect densities in complex software, and on the problem of establishing the quality of it. From that, I took the lesson that the way to go is to have tremendous levels of discipline in building and checking the code, testing and verifying at every turn. The software mess revealed in HarryReadMe was I think for the relatively simple task of data quality control and homogenisation – relatively simple compared with GCMs for example, but not necessarily easy to do. If anything like that lack of discipline and quality control has gone into GCMs, then for that reason alone they would not be fit to be allowed out into the world of practical decision-making. Perhaps they have been thoroughly audited by independent experts? I don’t know of that having happened. I do know it would have happened if they were to be used to provide support for a new headache treatment. But standards seem slacker when it comes to providing support for wrecking economies, scaring the children, and creating massive, potentially totalitarian, government agencies.
The essential problem with all climate models is that there's simply no way of verifying whatever predictions they're making. Whatever they're telling you is simply not falsifiable. If it fails Popper's rule, it's not science, it's a belief.
Expressed in less formal terms - http://thepointman.wordpress.com/2011/01/21/the-seductiveness-of-models/
Pointman
johninfrance said "And 2000% is a value I can't even conceive."
You've obviously never borrowed money from a pay-day loan company.
Regarding climate models,
1. We don't know what the dominant physical processes determining the climate are (solar variations, cosmic rays, statistical fluctuations, chaotic oscillations, or even the change of proportion of some gas from 0.0003 to 0.0004).
2. Even if we did know the physical processes we don't know how to parameterise them accurately (we don't know how the sun's going to vary, or how aerosols affect cloud albedo).
3. Even if we did know 1 and 2 we would not be able to predict anything because the system is chaotic.
4. Even if we did know all the controlling equations and parameters we couldn't solve them properly because of the vast range of length- and time-scales involved, hence major fudges and simplifications are used. This is Andrew's point 4 - the code 'crashes' unless you add large artificial damping.
(From maths field, not engineering, grumpygrandad!)
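Andrew's point 4, the time-step interacting with the grid spacing, shows up in even the simplest dynamic model. The sketch below is a toy and is not taken from any climate code. It steps the one-dimensional heat equation forward with the textbook explicit scheme, where r = (diffusivity x time-step) / (grid spacing)^2. For this scheme a value of r above 0.5 is unstable. The function name, grid size and step count are assumptions made for the example.

```python
def diffuse(r, n_nodes=21, steps=200):
    """Explicit (forward-time, centred-space) stepping of the 1D heat equation.

    r = diffusivity * time_step / grid_spacing**2.
    Starts from a single hot node in the middle, with both ends held at zero.
    Returns the largest absolute node value after `steps` time-steps.
    """
    u = [0.0] * n_nodes
    u[n_nodes // 2] = 1.0
    for _ in range(steps):
        new = u[:]
        for i in range(1, n_nodes - 1):
            new[i] = u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1])
        u = new
    return max(abs(v) for v in u)


if __name__ == "__main__":
    print("r = 0.4:", diffuse(0.4))  # heat spreads out and the peak decays
    print("r = 0.6:", diffuse(0.6))  # same physics, larger time-step: it 'explodes'
```

Nothing about the physics changes between the two runs. Only the ratio of time-step to grid spacing squared differs, and that alone decides whether the calculation settles down or blows up.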
One thing you'll notice about academic climatologists is they never prove their ideas with physical emulators. You will not find any demonstration of an added 10% (33C) heating from "back radiation" that will work in your living room (or a lab). I used to work for a company that filled giant tanks with salt water with various stratifications. Then a submarine-shaped model was towed through it. The surface turbulence was photographed and analyzed in great detail (for obvious reasons). A lot of expense and effort was invested to duplicate the ocean as closely as possible. Ever seen anything like this to prove 'Travesty' Trenberth's energy balance? Academic climatology is a great place for activist flimflammers to gather and hatch their plots.
I am moved to repeat a formula for modeling from my post a few days ago at Judith's site.
One of the important things we learn from climate models is how to build your own climate model. Here is an easily-followed step-by-step do-it-yourself recipe, largely an extension of the splendid material to be found in the “Harry Readme” portion of Climategate I.
1. Learn a little Fortran. (But don't bother to get overly familiar with Excel).
2. Learn pi to at least three significant figures (necessary for the next step).
3. Divide the earth's atmosphere into many equally-sized tiny cells, making sure that the total number of cells exceeds the capacity of anything that will fit on a personal computer.
4. Find some meteorological data to stick into the cells. (See Harry about this)
5. Insert an algorithm to blow data out of each cell into the next, preferably from the Southwest. Now you have the beginnings of a Global Circulation Model (GCM).
6. Make up some more algorithms to heat and cool cells (look in Wikipedia under INSOLATION and RADIATION). Spend some time doing this, so that you are able to follow Dr. Curry's advice to create the “perception that complexity = scientific credibility; sheer complexity and impenetrability, so not easily challenged by critics”.
7. Find a supercomputer, or get a U.S. government agency to buy you one. (Best to do this before November 6, 2012). Test-run your new GCM, adding fudge factors so the output for future temperatures wiggles up and down a little.
8. Now inject special new algorithm, so that global temperature output wiggles with carbon dioxide concentration (see Wikipedia or take a trip to Hawaii to get this). Set this fudge factor to at least 3 for each doubling of carbon dioxide.
9. Back-test your GCM, making output more-or-less match 1970-98 historical patterns. Add nudge factors as necessary. To get anywhere close to 1940-70 reality, you may have to invent some new historical data on sulfur and soot -- call this the smudge factor.
10. Do some more computer runs out to 2100, and mail the outputs to IPCC.
Now wasn't that easy?
It is extremely difficult to get students (and many practitioners) to view computers and computer output realistically. For many years, I taught environmental engineering students how to model biological processes. If only we had anything like the degree of established science that computational fluid mechanics has!
I included traditional rules of thumb for judging the reliability of computer projections. I don't know how many times my teaching evaluations contained complaints about these old-fashioned things.
In my field, computers provided qualitative suggestions, even when calibrated and verified. They were used by EPA regulators to set stream standards and effluent permits, but experienced regulators didn't take them too seriously. Many academic researchers, however, swore by them.
Dear grumpygrandad, those were Friden calculators, and they cost $2,000 or more in 1965. At my school, we only had Monroes, plus seven-place log and trig tables, slide rules, pencil and paper. We did have IBM 1620s, but they wouldn't accept more than 300 punchcards. When I began teaching in the 70s, the big issue was whether we should allow students to use HP 35s. One of those cost $400 and only did logs (base 10 and natural), exponentials, trig and hyperbolic trig functions. Benign neglect solved the problem.
This is a fascinating thread - allow me to dilute it a bit.
Dad was EE, U of Minnesota '41. He used to like to describe two classes they taught in those days. One was modeling which included designing various systems at scales wildly different from usual, and the other was called "guessing" by the students and wasn't limited to decimal point location for slide rule calcs. The idea was to ascertain the accuracy - (order of magnitude or maybe spread) of the required result and then devise a way to arrive at it. I think each of these courses was a semester, may have been options, and were, according to him, the most useful he had at engine school.
As an industrial architect, I often found myself first guesser, the guy who had to imagine what the thing would be like before we engaged the guys who could actually design the process systems. We weren't always able to get people who had either done it, or could do it and the most frequent problem was that an answer (design) was thought unacceptable to them and couldn't be pursued when they were unable to refine it to the fourth decimal. They'd lock up.
Don't ask why my purview wasn't restricted to designing houses for the process stuff.
Very frequently, these things only needed to work. Their relative efficiency would be lost in the shuffle, and yet I couldn't get these guys to accept that a "sorta" answer could sometimes be ok, especially if there was no reliable way to get a better one.
It may be that "sorta" has gone the way of the slide-rule, at least in the minds of younger engineers.
Here's something to add, Tetragrammaton. They don't just use one run. They use "ensembles" of runs. If you don't like the "garbage out" then mix in more garbage. Check this out:
Ensembles
Climate models are an imperfect representation of the earth’s climate system and climate modelers employ a technique called ensembling to capture the range of possible climate states. A climate model run ensemble consists of two or more climate model runs made with the exact same climate model, using the exact same boundary forcings, where the only difference between the runs is the initial conditions. An individual simulation within a climate model run ensemble is referred to as an ensemble member. The different initial conditions result in different simulations for each of the ensemble members due to the nonlinearity of the climate model system. Essentially, the earth’s climate can be considered to be a special ensemble that consists of only one member. Averaging over a multi-member ensemble of model climate runs gives a measure of the average model response to the forcings imposed on the model.
The CCSM IPCC simulations consist of 26 different runs with the CCSM3:
• One 500-year control run
• A 5-member ensemble simulating the 1870-2000 historical period
• Four 5-member ensembles corresponding to the IPCC A2, A1B, B1 and constant-20th-Century-forcing future scenarios.
The control run defines the long-term climate of the simulated system. This simulated climate will be slightly different from the actual climate of the earth. Five 1870-2000 simulations were branched from five different points in the 1870 control run and four IPCC scenarios (i.e., A2, A1B, B1 and 20th-Century-constant-emissions) were run into the future starting from the end of each of the five different 1870-2000 historical simulations. The following figures step through this progression.
http://www.gisclimatechange.org/runSetsHelp.html
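A toy sketch of the ensembling idea in the quoted text, for anyone who wants to see it in a few lines. This is not CCSM or any real climate code: the logistic map stands in for a nonlinear model, the parameter r plays the part of the fixed "forcing", and the function names and numbers are invented for illustration. The only thing that differs between members is the initial condition.

```python
import statistics

def run_member(x0, r=3.9, steps=200):
    """One 'model run': iterate the logistic map x -> r*x*(1-x) from x0."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        x = r * x * (1.0 - x)
        trajectory.append(x)
    return trajectory

def run_ensemble(base_x0=0.4, n_members=5, perturbation=1e-6, **kwargs):
    """Same model and same 'forcing' r; only the starting value differs."""
    return [run_member(base_x0 + i * perturbation, **kwargs)
            for i in range(n_members)]

ensemble = run_ensemble()
final_values = [member[-1] for member in ensemble]
ensemble_mean = statistics.mean(final_values)
ensemble_spread = max(final_values) - min(final_values)

print("final values:", [round(v, 3) for v in final_values])
print("ensemble mean:", round(ensemble_mean, 3))
print("ensemble spread:", round(ensemble_spread, 3))
```

The members start within a few millionths of each other and end up scattered, which is the nonlinearity the quoted text refers to. The ensemble mean summarises the model's average response; it says nothing about whether the model itself is right.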
Nice summary Andrew. Some additional points about models that might be worth mentioning (techy, but I see we have a number of modellers here), and are certainly important in the models I use:
1. The objective function is a more explicit way of stating that the model has converged. Many models are not run in isolation; they are run to optimally estimate some parameters that would explain the real world observations or data. The free parameters are "tuned" until a reasonable fit to the real world data is obtained. A linear regression line is a simple analogy for such a model. Under the assumption that y is linearly related to x, find the parameters (slope and intercept of the line) such that the data misfit, or residuals in y, are minimised. In least squares linear regression the objective function is then the sum of the squares of the differences between the model predictions of y and the actual measured values of y, given x.
2. A forward model is where some measurements are known and another type of measure is simulated. This is commonly used in geophysical applications (where I work). If we know the velocity and density profile of the earth in an oil well then we can forward model what the seismic reflection response would be for that location. In the geophysical case the assumptions of the forward model are clearly stated and so the geophysicist immediately understands the limitation of the synthetic response obtained from the model. In geophysics these types of models are well understood and quite reliable. A very important point about forward models based on physics of this type is that the synthetic data generated for a given measured input data set is unique.
3. An inverse model is where we try and deduce the physical properties of the system by using the observations. Inverse modelling usually includes a forward model (to simulate the result from the current model), an objective function (to evaluate how closely the model prediction fits the actual observations/data) and an inverse engine that updates the model to improve the fit (hopefully achieve convergence). Inverse problems can be very simple or very difficult to solve. The two big problems with inverse models (and I think this applies to climate models) are:
(a) Non-uniqueness is the situation where many different models actually give almost identical synthetic (forward model) output. This is almost universally the case in geophysical problems (of which climate is an example). There is a huge range of model parameters that could all explain the observations quite happily and you cannot know which is correct. I believe this is the first problem of climate models - you could get just as good a fit to the observations by removing the anthropogenic part and changing other (natural) parameters.
(b) Local versus global solutions is the way in which the inverse solver converges. In a well posed problem the solution space is ideally very simple and unique - typically a parabolic misfit function solved with simple mathematical tools. In a badly posed problem there may be multiple good fits (non-uniqueness), but there may also be a unique solution that is not found because the parameter range tested is too narrow, so a fit is obtained locally within the bounds of the tested parameters, but this is not the best fit possible. If the problem is very badly posed then a highly non-linear solver may be required such as simulated annealing (brute force trial and error by computer). To find the unique best solution may then require far too many iterations/too large a range of parameters to be tested and so the runtime becomes extreme because the problem is very high dimensional. This is almost certainly the case with climate simulations. This is where supercomputers come in. However a supercomputer, whilst great for making modellers feel like "masters of the universe", is not the solution really. What is needed is a better understanding of the problem - something clearly lacking in climate simulations.
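The three points above can be sketched with a deliberately tiny example. Everything here is an assumption made for illustration, not a geophysical or climate code: the forward model is a sine wave with a frequency and two amplitude factors, the objective function is the least squares misfit, and a plain scan over candidate frequencies stands in for the inverse engine.

```python
import math

xs = [0.1 * i for i in range(50)]

def forward(freq, amp_a, amp_b):
    """Forward model: given parameters, the synthetic data are unique."""
    return [amp_a * amp_b * math.sin(freq * x) for x in xs]

def objective(params, observed):
    """Least squares objective: sum of squared residuals."""
    synthetic = forward(*params)
    return sum((s - o) ** 2 for s, o in zip(synthetic, observed))

# 'Observations' made from known parameters, so the right answer is known.
observed = forward(3.0, 2.0, 1.5)

# (a) Non-uniqueness: only the product amp_a * amp_b is constrained,
# so different parameter sets fit the observations equally well.
misfit_true = objective((3.0, 2.0, 1.5), observed)
misfit_other = objective((3.0, 1.0, 3.0), observed)

# (b) Local versus global: scan the frequency over a narrow and a wide range.
def best_freq(lo, hi, n=400):
    candidates = [lo + (hi - lo) * i / n for i in range(n + 1)]
    return min(candidates, key=lambda f: objective((f, 2.0, 1.5), observed))

narrow = best_freq(1.0, 2.0)  # tested range excludes the true frequency
wide = best_freq(1.0, 5.0)    # tested range includes it

print("misfit, true parameters: ", misfit_true)
print("misfit, other parameters:", misfit_other)
print("best frequency, narrow scan:", narrow)
print("best frequency, wide scan:  ", wide)
```

The narrow scan still reports a "best" frequency, and nothing in its output warns that a far better fit lies outside the range tested. With one free parameter a wider scan is cheap; with thousands of parameters it is not, which is the runtime problem described in (b).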
Richard Betts?
You're right, I've always tried to pay my way. But what I was referring to was 2000% error.
Grumpygrandad:
"I started my engineering career in the '60's, at Bristol Aircraft guided weapons"
You must have known my father who passed away last year. Can our host put us in touch or do you have an email address that you are happy to have in the open?
@ThinkingScientist I believe this is the first problem of climate models - you could get just as good a fit to the observations by removing the anthropogenic part and changing other (natural) parameters.
It is routinely claimed by the Gavinites, that only by adding the anthro-CO2 effect can a fit be obtained.
Mar 20, 2012 at 12:21 PM | Speed
To be fair to Boeing, not all the flight test hours will be directly attributable to aerodynamics (the wind tunnel testing).
Punksta:
"It is routinely claimed by the Gavinites, that only by adding the anthro-CO2 effect can a fit be obtained."
I know, but given the number of free parameters available in the model I don't believe that claim. They are implying a level of uniqueness which is surely not possible with such a poorly understood and ill posed problem (initial conditions alone must be a nightmare as there is no static startup condition as there is in say an oil reservoir engineering model). This claim is also predicated on the assumption that the measured data/observations are also correct. I don't think that is true either - see the recent post at Judith Curry summarising problems with the corrections to hadsst3 for an example of this.
http://judithcurry.com/2012/03/15/on-the-adjustments-to-the-hadsst3-data-set-2/#more-7719
Alan the Brit
I smiled when I read this because I remember the original modeling tool -- a slide rule. Yes, I am old enough to not only remember them, but how to use them. While you could use them to get two or three significant digits in the calculation, keeping track of the magnitude was your problem, which we did on the back of envelopes as we were too young to smoke.
As for Andrew's essay on the dangers of modeling, I would add that after a couple billion calculations your computer probably accumulated a significant amount of truncation and rounding error even with 64-bit math. In my experience, 32-bit was totally useless unless you were very careful in how you did the programming. Nowadays, that is all a lost art and I often wonder how many of these "important breakthroughs" in modeling are simply rounding error.
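A small sketch of the accumulation effect, assuming nothing more than repeated addition of 0.1. Python floats are 64-bit, so 32-bit arithmetic is emulated here by rounding the running total to single precision after every addition using the standard `struct` module; the helper names are made up for this example.

```python
import math
import struct

def to_float32(value):
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", value))[0]

def accumulate(step, n, single_precision=False):
    """Add `step` to a running total n times, optionally in emulated 32-bit."""
    total = 0.0
    for _ in range(n):
        total += step
        if single_precision:
            total = to_float32(total)
    return total

n = 1_000_000
target = 100000.0  # what a million additions of 0.1 should give

total32 = accumulate(to_float32(0.1), n, single_precision=True)
total64 = accumulate(0.1, n)
compensated = math.fsum([0.1] * n)  # error-compensated summation

err32 = abs(total32 - target)
err64 = abs(total64 - target)
print("32-bit error:     ", err32)
print("64-bit error:     ", err64)
print("math.fsum error:  ", abs(compensated - target))
```

A million additions is a trivial workload next to a real simulation, yet the naive 32-bit total is already visibly wrong, while the 64-bit total is off only far down in the decimal places. Compensated summation (`math.fsum` here) is one example of the kind of careful programming that keeps the error in check.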
Mar 20, 2012 at 10:12 AM | johninfrance
"I find this statement absolutely mind-blowing. I mean how do you know what the "correct value" should be? I would have thought that the whole aim of the model is to open your mind to something you don't know. And 2000% is a value I can't even conceive."
Excellent question. Models, when used properly, are used for analytic purposes only. The process being modeled has to be well understood apart from the model. I think you might have believed that models can be used for scientific purposes. Models cannot be used as substitutes for scientific laws or well confirmed hypotheses. Science is qualitatively different from modeling.
Mar 20, 2012 at 12:38 PM | allchangenow
An example:
http://www.giss.nasa.gov/tools/modelE/modelE.html
j ferguson
Hopefully, he will pop on by. Maybe we can catch him in the pub if he doesn't make it here.
Richard Hamming (1915-1998) was a mathematician who worked on the Manhattan Project during WW2. His contribution to that project was simulation of a nuclear explosion to predict if the bomb would ignite the atmosphere and thus destroy the planet. This was a major concern of the physicists at the time. (Talk about global warming!) In the dedication of his text, "Introduction to Applied Numerical Analysis" he states, "The purpose of computing is insight, not numbers."
In my 35 years of engineering practice, I have worked with mathematical models of a variety of physical systems. I still have my Post Versalog slide rule, and my HP-35 calculator. They are, however, relegated to a glass-fronted wall cabinet, with the sign, "In Emergency, Break Glass." I find common sense and a sense of history to be useful in evaluating any modeling prediction. I don't reject GCMs as tools to gain insight, but when they are used to generate worst case scenarios that magically become predictions of future events, I see other driving forces at work. For the scientists doing the work I see the interference of job security concerns and the desire for fame or fortune. We all need to be mindful that an array of scenarios does not have predictive value that can be quantified.
My reply to punksta above actually reminded me of an important point about models - start conditions. I work as a geophysicist and I am heavily involved in building models of the geology in the sub-surface to describe a reservoir. The model is populated with properties in each cell of the 3D model. These properties are educated guesses. Uncertainty is represented by stochastic simulation, i.e. random numbers. The model is called the static model - it is the framework that might describe the earth.
These models are given to the reservoir engineer to perform a dynamic simulation. In the dynamic simulation the reservoir engineer places possible wells, and using the physics of fluid flow simulates how much oil the reservoir might produce, how quickly it could be produced, what percentage of the oil in place might be produced and so forth. Many runs, using different initial static runs are tested to build up an idea of the uncertainty in oil production (and commercial viability) as a function of the geological uncertainty.
This basic idea is also how climate models are run, I believe.
However, there is a fundamental difference in the reservoir engineering models: the initial conditions. Although there are exceptions, in almost every real world situation the reservoir engineer knows that the oil sitting in the reservoir is at equilibrium in the earth. So they test this for their model - they take the static model and start the time step calculation process for fluid flow without putting any wells in at all. In climate terms, without any external forcing. And they observe the model to see whether, when they initialise it, the oil starts to move about. Clearly in a good model, when initialised, the oil just sits in situ until some external forcing is applied. The start conditions for the model are therefore known and this is a powerful constraint on the modelling.
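The equilibrium check can be sketched with a toy one-dimensional grid. This is not a reservoir simulator: pressure diffusing between neighbouring cells stands in for the fluid flow physics, and the function names, cell count and rates are assumptions chosen for illustration.

```python
def step(pressure, diffusivity=0.1, well_rates=None):
    """One explicit time step of 1D diffusion with closed (no-flow) ends."""
    n = len(pressure)
    well_rates = well_rates or [0.0] * n
    new = []
    for i in range(n):
        left = pressure[i - 1] if i > 0 else pressure[i]
        right = pressure[i + 1] if i < n - 1 else pressure[i]
        flux = diffusivity * (left - 2.0 * pressure[i] + right)
        new.append(pressure[i] + flux - well_rates[i])
    return new

def max_change(initial, n_steps, **kwargs):
    """Largest change in any cell after n_steps, relative to the start."""
    state = list(initial)
    for _ in range(n_steps):
        state = step(state, **kwargs)
    return max(abs(a - b) for a, b in zip(state, initial))

equilibrium = [200.0] * 10   # a properly initialised static model
disturbed = list(equilibrium)
disturbed[4] = 210.0         # a badly initialised cell
wells = [0.0] * 10
wells[5] = 1.0               # external forcing: one producing well

print("no wells, equilibrium start:", max_change(equilibrium, 100))
print("no wells, disturbed start:  ", max_change(disturbed, 100))
print("one well, equilibrium start:", max_change(equilibrium, 100, well_rates=wells))
```

With no wells and a correct start, nothing moves. If the unforced run drifts, the initialisation is wrong, and that is found out before any forcing is applied.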
In a climate model the start conditions are unknown - there is no initial static condition we can assume and run from there. If you don't know the initial conditions (temperature, pressure, etc.) how can you know your model is starting at the right point? This problem can be seen very clearly in weather forecasting. Let's say you start with a nearly complete picture of today's pressure, temperature, humidity etc. condition in every model cell. You then perform the time step calculations, but if there is even a tiny error in the initial conditions then very quickly the forward prediction becomes more and more in error because the model is highly non-linear. Hence short range forecasts are ok, long range forecasts are very poor if not useless.
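To put rough numbers on the forecasting point, here is a sketch using the Lorenz (1963) equations, a standard three-variable toy system and not a weather model. The starting state, the size of the initial error, the step size and the fourth-order Runge-Kutta integrator are all choices made for this example.

```python
def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz (1963) equations."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

def rk4_step(state, dt):
    """Advance the state by one time step with classical Runge-Kutta."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = lorenz(state)
    k2 = lorenz(shifted(state, k1, dt / 2))
    k3 = lorenz(shifted(state, k2, dt / 2))
    k4 = lorenz(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def forecast_error(initial_error, n_steps, dt=0.01):
    """Error in x between a 'true' run and one with a slightly wrong start."""
    truth = (1.0, 1.0, 20.0)
    forecast = (1.0 + initial_error, 1.0, 20.0)
    errors = []
    for _ in range(n_steps):
        truth = rk4_step(truth, dt)
        forecast = rk4_step(forecast, dt)
        errors.append(abs(truth[0] - forecast[0]))
    return errors

errors = forecast_error(1e-6, 5000)
short_range = max(errors[:50])    # early part of the forecast
long_range = max(errors[4000:])   # late part of the forecast
print("largest early error:", short_range)
print("largest late error: ", long_range)
```

Both runs use identical equations and differ only by one part in a million in one starting value. Early on the forecast is excellent; late on the error is as large as the signal itself.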