03 June 2013

Building a climate model

Last Friday I mentioned a model, and I will be getting to how it connects to Saturn's hurricane.  But some byways occurred to me that I hope you'll find interesting.

Let's start with the notion of a model.  Sometimes people sarcastically quote George Box's observation "All models are wrong.  Some models are useful."  Often they omit the second half.  And often they ignore the fact, well-known to any observationalist, that the same applies to observations.

Models are idealizations of the real thing.  As idealizations, they don't represent reality fully.  This is mandatory for my kind of model.  Suppose you want to model ice ages, which span hundreds of thousands of years.  A complete, non-idealized model would be exactly an entire duplicate Earth, in a duplicate solar system, that we could control for our experiments.  Which might be fine as far as that goes, but would also mean we'd have to wait 100,000 years to see the result of 1 ice age experiment.  'Real time' modeling doesn't cut it for climate.  Or for weather -- if it takes 24 hours to make a 24 hour forecast of the weather, you really can't get much use from the model.

Being able to get an approximate answer much faster than real time is crucial to weather and climate modeling.  I backed into this by way of some computer-sciency experimentation I was doing.  The important element is how much faster than real time you can get an answer -- how much 'lead' you can get.  One figure of merit, for instance, is to get a 24 hour model forecast or 'run' in only 1 hour.  This gives you 23 hours to make use of the model before the weather hits.  Obviously the more powerful the computer, the more computing you can do in 1 hour.  But this runs into some other issues.

My simple model has only 1 layer.  It's not clear whether this is the ocean or the atmosphere, as both are fluids (if you can wave your hand through it, it's a fluid).  Any realistic climate model has many, many layers.  One realistic circumstance where you have only one layer is ocean tides, which tipped the scales of my thinking in favor of the ocean.

Another aspect of a model is how large your horizontal 'cells' are.  That is, we average ocean velocity and surface elevation over an area, a cell.  The larger the area, the faster we can get answers (because the globe has fewer cells hence fewer computations).  But, of course, the larger the area, the less detailed our answers.  Real models always engage in a compromise between how fast you can get an answer and how detailed the answer is. 

For my really simple model on 1 processor at home, with cells that are about 16 km^2, I can get answers for 1 day in about 1 hour.  If we make the cells larger, say 140 km^2 (6 arcminutes on a side), we get answers in about 2.5 minutes per day.  Call it 75 minutes per month.  If we're patient, we could run off a year in about 15 hours.  I'm seldom patient.  If we make the cells more like 2500 km^2 (30 arcminutes on a side), we can get a year's simulation done in about 6 minutes.  10 years per hour.  For climate modelling purposes, 10 years is not very long, or even all that interesting.

But let's consider a significant climate period.  The Milankovitch cycles for climate are 20,000 years to 100,000 years (and some that are longer).  For those 2500 km^2 cells, it would take 2000 hours to get an answer about the 20,000 year cycle, about 83 days.  Ok, again, I'm not that patient.  Now, we've got 3 dimensions -- latitude, longitude, and time -- so we can predict pretty accurately that to simulate 20,000 years in 1 day, we need to back down our resolution (cell area and time step) to an area of 41,211 km^2 per grid cell (about 2.2 degrees latitude-longitude).  I've done this and will share the animations as soon as I figure out a good way.
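
The time budget in that paragraph is just arithmetic, and a minimal sketch of it looks like the following.  The names here (YEARS_PER_HOUR, TARGET_HOURS, wall_clock_hours) are made up for illustration; the only inputs are the figures quoted above.

```python
# Back-of-envelope run-time budget, using the figures quoted in the text.
YEARS_PER_HOUR = 10.0   # throughput with ~2500 km^2 cells (from the text)
CYCLE_YEARS = 20_000.0  # shortest Milankovitch cycle mentioned
TARGET_HOURS = 24.0     # the "get it done in 1 day" goal

def wall_clock_hours(sim_years, years_per_hour):
    """Hours of computing needed to simulate sim_years of model time."""
    return sim_years / years_per_hour

hours = wall_clock_hours(CYCLE_YEARS, YEARS_PER_HOUR)
days = hours / 24.0
# How many times faster the model must run to fit the whole cycle in one day.
speedup_needed = hours / TARGET_HOURS

print(f"{hours:.0f} hours, about {days:.0f} days")
print(f"speedup needed for a 1-day run: about {speedup_needed:.0f}x")
```

That speedup has to be bought by giving up resolution, which is where the coarser cells come in.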

Ok, not many of us are concerned about Milankovitch cycles and 100,000 year spans of climate.  I am, maybe a few of you are.  But this same principle -- you have to give up some detail to get an answer soon enough to be useful -- applies even more vigorously to weather prediction than to my simple model.  My model has only 3 dimensions.  Real numerical weather prediction has 4 dimensions -- latitude, longitude, time, _and_ height above the surface.  After all, we need to know just how thick those thunderstorm clouds are in order to decide how much rain we'll get, how likely it is there will be hail, and how likely it is that there will be tornadoes.

The difference is this: for my trivial model, we need 8 times as much computing to get answers on cells that are half as long on each side (2 x 2 times as many cells, and twice as many time steps).  For a numerical weather prediction model, which adds the vertical, we need 16 times as much computing to get those answers.
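
Here is a rough sketch of that rule of thumb.  It assumes the computing cost grows by the same factor in every dimension you refine, time included, which is a simplification of what real models do; compute_factor is a name invented for this example.

```python
def compute_factor(refinement, dims):
    """Growth in computing when each of `dims` dimensions is refined by `refinement`.

    Assumes cost is proportional to the number of grid points times the
    number of time steps, and that all of them are refined together.
    """
    return refinement ** dims

# Halving the spacing (refinement = 2):
print(compute_factor(2, 3))  # latitude, longitude, time -> 8
print(compute_factor(2, 4))  # latitude, longitude, height, time -> 16
```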

There's been some discussion about the huge increase that may be coming in the National Weather Service's main computer.  On one hand it is large -- something like 2000 units as opposed to a current 150 units.  On the other hand, the factor of 12 on computing power means only a factor of 1.9 on the grid spacing (the detail in the model output).  It's a big improvement, but not nearly what you might think by looking at the original figures.
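
Running the same rule of thumb backwards gives a feel for that number.  This is only a sketch: it assumes all of the extra computing goes into finer spacing across the 4 dimensions of a weather model, and spacing_improvement is a hypothetical helper, not anything the Weather Service uses.

```python
def spacing_improvement(compute_ratio, dims=4):
    """Factor by which grid spacing can shrink for a given increase in computing.

    Inverse of cost = refinement ** dims, under the same simplifying assumption
    that every dimension (time included) is refined by the same factor.
    """
    return compute_ratio ** (1.0 / dims)

print(round(spacing_improvement(12), 2))          # 1.86
print(round(spacing_improvement(2000 / 150), 2))  # 1.91
```

Whether you take the ratio as 12 or as 2000/150, the answer comes out at about 1.9.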
