4. VALIDATION RESULTS
The approaches to validation can be divided along the lines of spatial scale and assessment of the quality of the input data. The discussion in this section is structured accordingly.
The quality of the input data will to a large extent determine the quality of the final product. The input data from the ISLSCP CD-ROM have been independently assessed by a review group (Kerr et al. 1995). Details may be found on the CD-ROM and in Sellers et al. (1996). The soil data used in the GSWP, however, present new problems in their application, as the soil hydraulic parameters generally scale very non-linearly with soil moisture and show great spatial variability. In an effort to identify potential errors in the input data, the 1° × 1° gridded parameter fields were compared with high resolution parameter maps derived from high resolution soil maps over Europe (Dolman et al., 1997).
Aggregation of soil hydraulic parameters is possible within certain limits (Kabat et al. 1997). In general it tends to conserve some of the spatial variability, as opposed to taking a dominant class (Noilhan and Lacarrère 1992). This issue is not without contention. Kabat et al. (1997) suggested that soil hydraulic parameters can be aggregated in a manner similar to that proposed for vegetation parameters (Dolman and Blyth 1996). This implies that the parameters are weighted by their fractional coverage in the domain and averaged according to the structure of the equations in which they appear, i.e. logarithmically, reciprocally, linearly, etc. Boone and Wetzel (1998) illustrate the potential impact of linear versus non-linear averaging of soil properties on simulation of the surface water balance.
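The three averaging rules named above can be illustrated with a minimal sketch. The function name `aggregate`, the mode labels and the Ksat values below are hypothetical and chosen for illustration only; which rule applies to which parameter depends on the equations of the model in question.

```python
import math

def aggregate(values, fractions, mode="linear"):
    """Coverage-weighted aggregate of one parameter over sub-grid soil classes."""
    total = sum(fractions)
    weights = [f / total for f in fractions]  # normalise the coverage fractions
    if mode == "linear":        # weighted arithmetic mean
        return sum(w * v for w, v in zip(weights, values))
    if mode == "logarithmic":   # weighted geometric mean (average of the logs)
        return math.exp(sum(w * math.log(v) for w, v in zip(weights, values)))
    if mode == "reciprocal":    # weighted harmonic mean (average of the inverses)
        return 1.0 / sum(w / v for w, v in zip(weights, values))
    raise ValueError(f"unknown mode: {mode}")

# Hypothetical grid cell with three soil classes; Ksat values (m/s) are made up.
ksat = [1e-4, 1e-6, 1e-7]
cover = [0.2, 0.5, 0.3]
results = {m: aggregate(ksat, cover, m)
           for m in ("linear", "logarithmic", "reciprocal")}
for mode, value in results.items():
    print(f"{mode:12s} {value:.3e}")
```

For a parameter that spans orders of magnitude, the three rules give very different effective values (reciprocal < logarithmic < linear), which is why the choice of averaging matters for the simulated water balance.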
Figure 9 shows a comparison of the parameter Ksat for the ISLSCP CD-ROM field and a field derived from the European Soil map (Lilly 1995). It is clear that considerable differences can be found between these two approaches. The effect of this on the water balance is the subject of ongoing research.
Figure 10. Winter available soil moisture content calculated from ISLSCP soils data (top) and the maximum available water content derived from the EU soils data set (bottom). Units are mm.
The input data from the EU map can also be used to assess the overall validity of some of the estimates. Dolman et al. (1997) used information on rooting depth and soil type to calculate physical limits of available water for the European domain. Figure 10 compares winter soil moisture (generally assumed to be at its maximum) with the maximum soil water availability according to soil type, subsoil and slope. Rooting depth is the prime determinant of available water content, so the high resolution data effectively provide an upper limit to the available moisture content. It is clear that considerable improvement may be made to the input data: in high-altitude regions with rocky, sloping subsoil, the rooting depth in the ISLSCP CD-ROM data appears to be too deep. This is visible in the Alps and Scandinavia.
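A minimal sketch of such an upper-limit check follows. It assumes that the limit is the difference between field capacity and wilting point integrated over a uniform root zone; this formulation, the function names and all numbers are illustrative assumptions and are not taken from Dolman et al. (1997).

```python
def max_available_water_mm(root_depth_m, theta_fc, theta_wp):
    """Assumed upper bound on plant-available water (mm) for a uniform root zone.

    theta_fc and theta_wp are volumetric water contents (m3/m3) at field
    capacity and wilting point; both are hypothetical inputs here.
    """
    if theta_fc < theta_wp:
        raise ValueError("field capacity must not be below wilting point")
    return (theta_fc - theta_wp) * root_depth_m * 1000.0  # metres of water -> mm

def exceeds_limit(simulated_mm, root_depth_m, theta_fc, theta_wp):
    """True if a simulated available water content is physically too large."""
    return simulated_mm > max_available_water_mm(root_depth_m, theta_fc, theta_wp)

# Hypothetical shallow, rocky mountain soil with a 0.3 m root zone.
limit = max_available_water_mm(0.3, theta_fc=0.30, theta_wp=0.10)
print(f"upper limit: {limit:.1f} mm")
print("150 mm exceeds limit:", exceeds_limit(150.0, 0.3, 0.30, 0.10))
```

A grid cell whose winter soil moisture exceeds this bound points to a rooting depth that is too large in the coarse-resolution data.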
Zhang et al. (1998) found evidence that the specified surface air temperatures over grid points in Russia where soil moisture and meteorological observations were available are cooler than observed during winter and spring. This may have contributed to errors in the simulation of snow accumulation and the timing of melt in SiB2, and may have also affected the simulation of the annual cycle of soil moisture at these locations.
Off-line validation of LSPs against single point datasets has been the norm rather than the exception in land surface modeling. The experience with PILPS has shown that considerable differences exist between various models. It is not the intention of the GSWP to duplicate the effort by PILPS, but we nevertheless want to be sure that the predicted global fields of soil moisture are realistic and, moreover, realistic for the right physical reasons. Two studies were executed in the GSWP validation framework using data from local point sources; one of them was a classical off-line study. We treat these studies together here, because they address similar questions. We also used point data from a field experiment in the Netherlands to check the range of predictions by the GSWP LSPs.
It is to be expected that these differences translate into the model predictions. Matsuyama and Nishimura (1999) conclude that their model dries out too quickly. They argue that the soil hydraulic parameters used in GSWP cannot be responsible for this, and suggest instead that root water uptake is too strong as a result of drier than actual meteorological conditions in the CD-ROM data.
They further conclude that the use of CD-ROM forcing data jointly with in situ observations of soil moisture is questionable: soil moisture reacts strongly to precipitation input, so small (or large) differences between observed and CD-ROM precipitation can lead to appreciable differences between modeled and observed soil moisture.
Entin et al. (1999) use four observed data sets from Russia, Mongolia, Illinois and China to compare plant available soil moisture with the GSWP. These data have previously been used successfully in PILPS and AMIP intercomparisons (Robock et al. 1995; Schlosser et al. 1997). They consist of soil moisture measured over a depth comparable to the rooting depth of agricultural plants, generally taken manually every two weeks. Details on the data can be found in Entin et al. (1999). Figure 12 shows the spatial distribution and geographical location of the data. Figure 13 shows the plant available soil moisture for the 10 models used in the GSWP and the observations. The spread is tremendous. The wetter models are consistently wetter (by 100 mm) than the observations. The model range is roughly 150 mm from the driest to the wettest model. This is not a simple correctable offset, as can be inferred from the Place model, which is wet in the agriculture group but belongs to the dry models in the forest group. In general the so-called SiBling group (SiB-2, SiB, COLA-SSiB and Place) gives wetter than observed results on average. The drier group consists of BATS, Bucket, Mosaic, SSiB-H and the ISBA model.
Chen and Mitchell (1999) compare the results for the LSP used operationally in the Eta regional forecast model of the National Centers for Environmental Prediction (NCEP; Chen et al. 1996, 1997) to observational soil moisture data from the Illinois soil moisture network (Hollinger and Isard 1994). The Illinois data are from a series of point stations with a spacing finer than the 1° × 1° grid of GSWP. They find that their LSP simulation of the phase and amplitude of the seasonal cycle of soil moisture, aggregated over the grid boxes covering Illinois, compares well with observations and falls within the range of variability of the point observations. Comparisons between single grid boxes and the nearest point observations do not match as well, but still appear skillful.
4.3 Regional scale measurements
Writing the regional water balance as:
dS = P - E - R
with P precipitation, E evaporation, R runoff and dS the change in surface water storage, it is possible to estimate P - E by the atmospheric water balance method (Oki et al., 1995). We can then derive the change in storage by comparing against observations of runoff of large rivers such as the Amazon (Matsuyama, 1992; Matsuyama and Masuda, 1997) or Congo (Matsuyama et al., 1994). It is worth noting that dS represents the total change in water storage, including snow, groundwater table changes, lakes and water in flooded plains. The accuracy of this method restricts its use to large river basins having a relatively dense network of radiosonde stations. In practice, the analysis products of the large weather centers are often used.
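As a minimal sketch of this bookkeeping, the storage change can be computed month by month from an atmospheric estimate of P - E and an observed basin runoff. The function name and all numbers below are hypothetical and serve only to show the arithmetic.

```python
from itertools import accumulate

def storage_change(p_minus_e, runoff):
    """Storage change dS = (P - E) - R for each period, all in mm per period."""
    if len(p_minus_e) != len(runoff):
        raise ValueError("series must have the same length")
    return [pe - r for pe, r in zip(p_minus_e, runoff)]

# Made-up monthly values for one basin (mm per month).
p_minus_e = [120.0, 90.0, 40.0, 10.0]  # from the atmospheric water balance
runoff = [60.0, 80.0, 70.0, 40.0]      # basin-mean observed river discharge

ds = storage_change(p_minus_e, runoff)
cumulative = list(accumulate(ds))      # storage anomaly relative to the start
print("dS per month:      ", ds)
print("cumulative storage:", cumulative)
```

The cumulative series traces the seasonal filling and depletion of total basin storage, which is the quantity the basin-scale comparison can constrain.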
In the GSWP, Oki et al. (1999) use the runoff of large rivers directly as a means to assess the quality of the GSWP product. This is a logical step in the context of comparison with Land Surface Parameterizations, as these models also produce runoff by various mechanisms (saturation overflow, etc.). Oki and Sud (1998) produced a 1° × 1° global river channel network (TRIP; Total Runoff Integrating Pathways) that allowed comparison of the GSWP output with the global runoff observations.
Figure 14. Mean annual runoff by 11 LSMs (top) and observed annual runoff (bottom) over each drainage area for 1988. Units are mm y⁻¹.
A crucial element in this comparison is the assessment of the quality of the data. In Figure 15, the discrepancy between observed and modeled runoff is plotted as a function of the density of rainfall gauges. The scatter in the bias of the estimates is large for areas of low rain gauge density and decreases as the density increases. Unfortunately, areas with high rainfall gauge density are sparse over the globe. The minimum density of gauges required to achieve a meaningful comparison between model and data is 30-50 gauges per 10⁶ km². This graph dramatically illustrates some of the difficulties in comparing global models with observations. Only in small parts of the world are enough data available to do this successfully. In large parts of the world, model predictions are our only source of information. Nevertheless, the relative RMS error of the 11 schemes participating in the GSWP is 40% for runoff and 18% for annual evaporation. This is comparable to PILPS results, but it should be borne in mind that the latter refer to local comparisons only. Using the river routing model, the monthly runoff was also compared to the observations. Routing considerably improved the monthly comparison, and the system shows good promise for application in coupled models.
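The screening by gauge density can be sketched as follows. The basin records are invented, and the definition of relative RMS error used here (RMS difference divided by the mean observed value) is an assumption, since the exact definition is not given in the text.

```python
import math

MIN_GAUGES_PER_1E6_KM2 = 30.0  # lower end of the 30-50 range quoted in the text

def gauge_density(n_gauges, area_km2):
    """Number of rain gauges per 10^6 km2."""
    return n_gauges / area_km2 * 1e6

def relative_rms_error(modeled, observed):
    """RMS of (modeled - observed) divided by the mean observation (assumed form)."""
    n = len(observed)
    rms = math.sqrt(sum((m - o) ** 2 for m, o in zip(modeled, observed)) / n)
    return rms / (sum(observed) / n)

# Hypothetical basins: annual runoff in mm per year, areas in km2.
basins = [
    {"name": "A", "area_km2": 2.0e6, "gauges": 20, "obs": 400.0, "mod": 650.0},
    {"name": "B", "area_km2": 0.5e6, "gauges": 40, "obs": 300.0, "mod": 330.0},
    {"name": "C", "area_km2": 1.0e6, "gauges": 55, "obs": 500.0, "mod": 450.0},
]

# Keep only basins whose forcing precipitation is adequately constrained.
usable = [b for b in basins
          if gauge_density(b["gauges"], b["area_km2"]) >= MIN_GAUGES_PER_1E6_KM2]
error = relative_rms_error([b["mod"] for b in usable],
                           [b["obs"] for b in usable])
print("usable basins:", [b["name"] for b in usable])
print(f"relative RMS error: {error:.3f}")
```

In this invented example the poorly gauged basin A, which also has the largest discrepancy, is excluded before the error statistic is computed.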
Zhang et al. (1998) also examined basin scale runoff as a means of validating the performance of their SiB2 model over large scales. Using the routing formulation of Miller et al. (1994) with the TRIP routing map, they try to find on what spatial and temporal scales their simulation of the surface water balance compares well with observations. They find that SiB2 underestimates the discharge for large river basins, while underestimating small river discharges with a net bias toward underestimation of about 10%.
4.4 Validation against other model estimates
One further option to assess the usefulness of the GSWP is to compare it against crude but simple equations relating evaporation to the annual water and energy balance constraints. One such well-known equation was derived by Budyko and starts from:
Off-line comparison of land surface models with local data has been the traditional way of looking at the performance of LSPs. A critical question is how the use of independent meteorological forcing and the absence of feedbacks with the atmosphere corrupt the results. Koster et al. (1998) approach this issue by comparing the outputs of their LSP with a simple empirical description relating precipitation to available energy and evaporation, developed by Budyko.
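For orientation, a minimal sketch of the form in which Budyko's curve is commonly quoted is given below: the ratio of annual evaporation to precipitation as a function of the dryness index, i.e. net radiation expressed as an equivalent water depth divided by precipitation. The text does not show which form was used in the comparison, so the formula, the constant latent heat and the input values are assumptions for illustration.

```python
import math

LATENT_HEAT = 2.5e6                # J/kg, approximate latent heat of vaporisation
SECONDS_PER_YEAR = 365.0 * 86400.0

def dryness_index(net_radiation_wm2, precip_mm_per_year):
    """Annual net radiation as a water depth (mm) divided by annual precipitation."""
    # 1 kg of water per m2 corresponds to a depth of 1 mm.
    potential_mm = net_radiation_wm2 * SECONDS_PER_YEAR / LATENT_HEAT
    return potential_mm / precip_mm_per_year

def budyko_evaporation_ratio(dryness):
    """E/P from the commonly quoted Budyko curve."""
    if dryness <= 0:
        raise ValueError("dryness index must be positive")
    return math.sqrt(dryness * math.tanh(1.0 / dryness)
                     * (1.0 - math.exp(-dryness)))

# Hypothetical grid cell: 100 W/m2 annual mean net radiation, 1000 mm/yr rain.
phi = dryness_index(100.0, 1000.0)
ratio = budyko_evaporation_ratio(phi)
print(f"dryness index = {phi:.3f}, E/P = {ratio:.3f}")
```

The curve respects both constraints: E/P approaches 1 in very dry climates (evaporation limited by water) and approaches the dryness index in very wet climates (evaporation limited by energy).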
Chen and Mitchell (1998) used the NCEP reanalysis products for 1987 and 1988 as a means to validate the performance of their LSP. They found that, compared to reanalysis soil moisture, their GSWP run generally has a smaller annual cycle. At middle and high latitudes, however, the NCEP reanalysis has a wetter and less spatially variable distribution of soil moisture throughout the year. These differences exist despite the similarity of the GSWP atmospheric forcing fields to reanalysis, and the similarity of the land surface model used in both. It should be noted that the NCEP reanalysis procedure includes a damping of soil moisture to the climatology of Mintz and Serafini described by Mintz and Walker (1993), instituted to prevent drift in the reanalysis model to a dry regime. This may contribute to the uniformity of soil moisture, and an apparent lack of interannual variability in the reanalysis.