GSWP-2 Data FAQ

Contacts
Basics
DODS/GDS
NetCDF
Fortran
Other Errors

Last update: 31 August 2004

We will update this FAQ as users submit queries and our experts find answers.

Clarifications, corrections, etc:

Soil properties table (1 November 2005)
Some of the properties in the soil properties table were swapped among categories. This affected categories 4-6 and 8 & 9.  The table has been corrected.

Time stamps of ISCCP radiation data (31 August 2004)
The 3-hourly surface shortwave and longwave radiation data from ISCCP have different time characteristics from the other radiation data (SRB, NCEP, and ECMWF).  NCEP and ECMWF represent a 3-hour mean; the SRB data were originally instantaneous, but were interpolated to a 3-hour mean for consistency with the reanalysis products.  On a GDS, a time period that represents the mean from 0000UTC to 0300UTC is labeled as 0000UTC, but the NetCDF file metadata label each record with the end time (e.g., 0000UTC to 0300UTC is labeled as 0300UTC).
The ISCCP data have a time stamp in the NetCDF files and on the GDS that represents the middle of the 3-hour period, not the end as with the other radiation data sets.  Thus, a record labeled as 0300UTC represents the period between 0130UTC-0430UTC.  Strictly speaking it is not a time average, but rather the center of a 3-hour sampling window.  However, for our purposes, it can be treated as an average.
If the interpolation subroutine drv_finterp is used, the time flags should be chosen as follows:
Access over GDS: "C" or "c" for ISCCP data; "N" or "n" for SRB, ERA and NCEP.
Navigating NetCDF files using their metadata: "C" or "c" for ISCCP data; "L" or "l" for SRB, ERA and NCEP.
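
The three labeling conventions can be illustrated with a minimal Python sketch. It is not part of drv_finterp; the function name interval_for_label is hypothetical, and it assumes "N" means the label marks the start of the 3-hour window, "L" the end, and "C" the center, which is our reading of the description above.

```python
from datetime import datetime, timedelta

PERIOD = timedelta(hours=3)  # length of one forcing interval


def interval_for_label(label, flag):
    """Return (start, end) of the 3-hour window a time label refers to.

    Assumed meanings (inferred from the FAQ text, not from drv_finterp itself):
      "N": label is the start of the window
      "L": label is the end of the window
      "C": label is the center of the window
    """
    flag = flag.upper()
    if flag == "N":
        return label, label + PERIOD
    if flag == "L":
        return label - PERIOD, label
    if flag == "C":
        return label - PERIOD / 2, label + PERIOD / 2
    raise ValueError(f"unsupported time flag: {flag!r}")


# An ISCCP record labeled 0300UTC covers 0130UTC to 0430UTC.
start, end = interval_for_label(datetime(1986, 7, 1, 3, 0), "C")
```

With the "L" flag the same 0300UTC label would instead map to 0000UTC-0300UTC, which is why the flag must match both the data source and the access method.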

High near-surface air temperatures over Greenland  (13 November 2003)
Kenji Tanaka has discovered some unusual behavior in temperatures over some locations in Greenland (PDF of sample plots here).  The errors exist at least in 1983 and 1985.  We have not yet determined the source of the error, but the data will not be changed again.  If these errors cause problems in your model, please screen for unrealistic temperature swings in your model (e.g., a maximum gradient check between surface and air temperature).  Similar screens may be necessary for specific humidity at these locations as well.
Kenji has kindly provided some Fortran code to adjust the extreme temperatures at those gridpoints where there are obvious problems.  You will also need the grid mask for large temperature fluctuations.  A list of filenames, read by the program, is given here.  This code assumes you have the files on local disk, with one year's temperature data per file.
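
For those who prefer to screen inside their own model, a maximum gradient check might look like the following Python sketch. This is not Kenji's code; the function name screen_air_temperature and the 20 K default for max_diff are hypothetical and should be chosen to suit your LSS.

```python
def screen_air_temperature(t_air, t_surf, max_diff=20.0):
    """Clamp air temperature (K) to within max_diff kelvin of surface temperature (K).

    max_diff is an illustrative threshold, not a value recommended by GSWP-2.
    """
    low = t_surf - max_diff
    high = t_surf + max_diff
    return min(max(t_air, low), high)


# A forcing value 60 K warmer than the surface is pulled back to the limit.
screened = screen_air_temperature(310.0, 250.0)
```

Values already inside the allowed range are returned unchanged.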

Units error for slope data  (22 August 2003)
Values for average land slope (Slope) were reported in the wrong units.  The original ISLSCP-2 slope data were recorded as an angle in degrees, and not as a percentage grade as we had believed.  So our conversion to fractional grade (tan(slope angle) or "rise over run") was incorrect, leading to values that were too small by nearly a factor of two.  This data set should be corrected in the next few days.
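
The size of the error can be checked with a short Python sketch. It assumes the mistaken step was treating the angle in degrees as a percentage grade (dividing by 100); the note does not spell this out, and the function names are hypothetical.

```python
import math


def grade_from_degrees(angle_deg):
    """Correct conversion: fractional grade ("rise over run") from an angle in degrees."""
    return math.tan(math.radians(angle_deg))


def grade_from_percent(percent):
    """Assumed erroneous conversion: treat the number as a percentage grade."""
    return percent / 100.0


angle = 1.0  # a gentle slope of one degree
correct = grade_from_degrees(angle)
wrong = grade_from_percent(angle)
ratio = correct / wrong  # about 1.75 for small angles
```

For small angles the ratio approaches 100*pi/180, or roughly 1.75, which is consistent with the values being too small by nearly a factor of two.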

Zero values for some vegetation/aerodynamic fields  (22 August 2003)

This message is in regard to a problem found initially with the roughness length data (originally from the ISLSCP-2 data provided by Sietse Los) by Helin Wei at NCEP/EMC.  Roughness length values are 0 over land ice points, as well as over some severe desert points.  Because of the algorithm used by Los to calculate roughness length (based on vegetation coverage), points that register no vegetation in his data set were assigned no value for aerodynamic parameters.  Some of the desert points may have intermittent zero values (i.e., not all months have positive values at the same point).  The list below summarizes what Maggie Zhao has found from scanning the data sets:
1) Fapar: oceans=0 and ice=0.001
2) Green LAI: oceans=-0.1 and ice=0
3) Total LAI: oceans=-0.1 and ice=0.01
4) Roughness length (z0): oceans=-0.1 and ice=0
5) zero plane displacement (d): oceans=-0.1 and ice=0
6) Vcover: oceans=-0.1 and ice=0 (ONLY ONE FILE)
The best solution is to set a minimum acceptable value in your model code, consistent with your LSS.  This, we feel, is better than having us arbitrarily change values in the ISLSCP-2 data set.
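
A minimal Python sketch of such a check follows. The function name apply_minimum, the land mask argument and the 0.01 floor are all hypothetical; a mask is used rather than the ocean flag value because the flag differs between fields (0 for Fapar, -0.1 for the others).

```python
def apply_minimum(values, is_land, minimum):
    """Raise land values below `minimum` to `minimum`; leave non-land points untouched."""
    return [max(v, minimum) if land else v for v, land in zip(values, is_land)]


# One ocean point (flagged -0.1), one ice point with z0 = 0, two vegetated points.
z0 = [-0.1, 0.0, 0.05, 0.3]
land = [False, True, True, True]
z0_screened = apply_minimum(z0, land, 0.01)
```

The same function can be applied month by month, which also covers desert points whose values are only intermittently zero.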

Extreme values in some CSU albedo fields  (18 August 2003)
This message is in regard to a problem found initially by Helin Wei at NCEP/EMC.  Some zero values appeared over land, due to a problem in the approach to land-sea masking that was applied to the monthly dataset.  This has been fixed and rerun - new files have been posted to DODS and FTP servers for the file:
Albedo_csu.nc
These updates were made on or after 18 August.

Also, there appear to be some unrealistically high or low values of snow-free surface albedo in the CSU surface albedo field (monthly-varying total albedo). These appear to be in the original data - the high values are consistent with inappropriate screening for snow in some areas.  We have no theory about the low values.  As with the data sets described above, please apply checks for minimum/maximum acceptable values in your model code, consistent with your LSS.



Very low values for SWdown (1 August 2003)

This message is in regard to the problem found with the shortwave radiation control forcing data (SWdown_srb variable). While running their model, Chris Milly's group uncovered some unusual behavior in this variable. For example, at the gridpoint (12.5E, 14.5S - coast of Africa) in March and May 1987, daily-average SWdown_srb is less than 10 W/m2 for most of these months, whereas no such behavior appears at adjacent gridpoints. Such unusual behavior might crash some models.
 
We have checked the files. We think this problem may be caused by missing values in the original SRB data, since during data processing we set all missing points in high-latitude bands to 10 W/m2.
 
Apparently, the land-sea mask for the SRB data is not static in time (we assumed it was), since this problem is occurring at a coastal point.  We will look through the shortwave radiation data for similar problems.  If anyone else encounters similar problems with the radiation data, please let us know, and we will attempt a satisfactory solution.  If this turns out to be an isolated incident, we may just suggest a work-around.
In the meantime, does anyone have a clever idea how to screen the SWdown files for similar glitches?
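
One possible screen, offered only as a sketch and not as a tested solution, is to flag gridpoints whose daily mean stays below a threshold for most of a month while their neighbors do not. In the Python below, the function names, the 10 W/m2 threshold, the 0.5 fraction and the dictionary layout are all assumptions made for illustration.

```python
def fraction_below(series, threshold):
    """Fraction of daily means in `series` that fall below `threshold`."""
    return sum(1 for v in series if v < threshold) / len(series)


def flag_suspect_points(daily_mean, neighbours, threshold=10.0, min_fraction=0.5):
    """Return points that are dim for most of the month while all their neighbors are not.

    daily_mean: dict mapping a point id to a list of daily-average SWdown (W/m2)
    neighbours: dict mapping a point id to a list of adjacent point ids
    """
    suspects = []
    for point, series in daily_mean.items():
        if fraction_below(series, threshold) <= min_fraction:
            continue  # this point looks normal
        nearby = [daily_mean[n] for n in neighbours.get(point, []) if n in daily_mean]
        if nearby and all(fraction_below(s, threshold) <= min_fraction for s in nearby):
            suspects.append(point)
    return suspects


# Synthetic month: point "a" sits at 5 W/m2 for 20 of 30 days, its neighbors do not.
daily_mean = {"a": [5.0] * 20 + [200.0] * 10, "b": [220.0] * 30, "c": [210.0] * 30}
neighbours = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
suspects = flag_suspect_points(daily_mean, neighbours)
```

Requiring the neighbors to look normal avoids flagging whole regions that are legitimately dark, such as high latitudes in winter.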


ALMA names and units of soils output data (30 July 2003)
Clarification and change to Table 11 to distinguish the output soil properties of wilting point, field capacity and saturation in units of water depth (m) as opposed to the optional input parameters for these fields, which are volumetric.  The variable names have been changed from W_* to M_* to distinguish them.

Differences in timestamps for 3-hourly data:

Other minor glitches that have been repaired:
Averaging of output data
Clarifications have been made on the web documentation regarding the treatment of state variables (averaged for daily data, but not for the 3-hourly data).  All fluxes are averaged rates.  

Choice of land surface parameters
There have been questions about which parameter fields to use when more than one option is available (e.g., prescribing wilting point).  The answer: use your best judgement, whatever is most consistent between your model and the experiment.  Please document any choices you make and submit that information with your results following the guidelines for ancillary information (9 July 2003).


The Basics:

Proper use of DODS/GDS (P. Dirmeyer, 12 May 2003)
There continues to be some confusion about the proper application of the DODS technology. Hopefully this will clarify some misconceptions:
This is a new data distribution technology, and GSWP-2 is one of the first experiments of this type (distributed modeling; centralized data) to be attempted.  There is a learning curve, and some time to invest up front to get the DODS libraries functioning properly.  Once you have your DODS client(s) (Fortran, C, IDL, Matlab, GrADS, Ferret,...)  working, a whole new data universe will be open to you.  For example, NCEP is making their operational forecasts available on GDS.  The web page http://cola.gmu.edu/grads/gds/index.html lists a few of the GDS servers that are available - in the US there are also GDSs at GFDL, GSFC and NCAR.  Unidata maintains a broader list of DODS data servers, and the Global Change Master Directory is now available via DODS.

What does GDS provide that DODS doesn't? (P. Dirmeyer, 5 May 2003)
There are 3 main advantages to the GDS over a generic DODS server:  
  1. Server-side analysis.  This is not particularly useful for modelers who are just accessing forcing data, but it could be very advantageous for analysis and comparison of the model results.  See the section titled "Evaluate expressions on the server side when appropriate" at: http://cola.gmu.edu/grads/gds/doc/user.html for more information on how to have the GDS do your number crunching and send back only the final results to you over the Internet.
  2. Templating of file names in GrADS.  GrADS data descriptor files allow for defining parts of file names as having meanings, such as time-stamps, that allow GrADS and the GDS to access a large number of files (e.g., a time series of data with each time in a different file) with a single "open" statement.  This was developed to access GCM output easily (e.g., forecasts from NCEP or ECMWF).  But it is also useful here because ISLSCP-2 (from which the GSWP-2 forcing data are derived) insisted on having separate files for each time step.
  3. Dual-access to compressed ALMA data sets. This is a new feature developed just for GSWP.  ALMA data sets in the NetCDF CF "compressed by gathering" format, where all water points are squeezed out, can be accessed as either the native land-only vectors or complete repopulated grids.  This is especially cool, and requires the 1.9 beta version of GrADS on the server side, but nothing special for the client/user.  The different access to the same data is accomplished by having two different data descriptor files (.ctl files) for the same data set.  One describes it as a vector (e.g., for model access), and the other describes it as a compressed grid and gives the "pdef" to uncompress it "on-the-fly".  A special binary mapping table has been created for the ISLSCP2/GSWP2 grid (60S-90N), and can be used to view gridded versions of your ALMA vector model output with GrADS.
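
To illustrate what "repopulating" a land-only vector means, here is a minimal Python sketch. It is not how GrADS or the GDS implements the pdef; the function name scatter_to_grid is hypothetical, and it assumes each land point carries a zero-based index into the flattened, row-major grid.

```python
def scatter_to_grid(land_values, land_index, nlat, nlon, fill=None):
    """Rebuild a full nlat x nlon grid from a land-only vector.

    land_index[k] is assumed to be the zero-based position of land point k
    in the flattened (row-major) grid; water points receive `fill`.
    """
    flat = [fill] * (nlat * nlon)
    for value, idx in zip(land_values, land_index):
        flat[idx] = value
    return [flat[r * nlon:(r + 1) * nlon] for r in range(nlat)]


# Three land points on a tiny 2 x 3 grid.
grid = scatter_to_grid([1.5, 2.5, 3.5], [1, 3, 4], nlat=2, nlon=3)
```

The reverse operation (gathering) simply reads the grid at the same index positions, so the land-only vector and the full grid hold the same information.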
FTP access to GSWP-2 data. (J. Adams, P. Dirmeyer, 20 June 2003; updated 2015)
Data are now being made available by anonymous FTP for those who cannot access any of the servers above because of firewall, compiler, or other issues. Access to the forcing data and fixed fields is at ftp://cola.gmu.edu/gswp/data/.  Access to the Multi-model analysis is at: ftp://cola.gmu.edu/gswp/data/GSWP2_B0/.  GrADS control files that unpack the NetCDF data files are at: ftp://cola.gmu.edu/gswp/ctl/ as is the necessary "pdef" file: ftp://cola.gmu.edu/gswp/ctl/gtd.filepdef.

DODS/GDS:

Building the DODS package (J. Wielgosz, 7 April 2003)
You will need three source tarballs:

ftp://unidata.ucar.edu/pub/dods/DODS-3.2/3.2.1/source/DODS-packages-3.2.5.tar.gz
ftp://unidata.ucar.edu/pub/dods/DODS-3.2/3.2.1/source/DODS-dap-3.2.9.tar.gz
ftp://unidata.ucar.edu/pub/dods/DODS-3.2/3.2.1/source/DODS-nc3-dods-3.2.7.tar.gz

Untar these all in the same directory; then

cd DODS; ./configure; make World

and it will start building everything. Expect to spend some time helping it along if you are building on Alpha; various scripts and source files will probably need tweaking.

Testing your access to DODS data servers (Z. Guo, 30 April 2003)
We have supplied a simple script and FORTRAN code to test your access to the North American GSWP-2 GDS (with minor modification it can be used to test access to other GDSs as well).  The Unix script and source code are at:

http://cola.gmu.edu/gswp/util/run_test

http://cola.gmu.edu/gswp/util/test_monsoon.f90

The script will delete .dods_cache and build the executable file. Before you run the test program, change the .dods_cache directory and the path to the DODS libraries to suit your system, then type:

  run test_monsoon

to run the test program. The program reads only nine variables for one month of data, so anybody who suspects problems with server stability can use test_monsoon.f90 to test access to the DODS server data from their local computer. The source code can be changed to point to the European or Japanese mirror GDSs (URLs not available at the time of this posting).


Diagnostics to help debug DODS problems in your code (J. Adams, 14 April 2003; P. Dirmeyer, 12 May 2003)
If you are having problems accessing the data on the GDS, it would be most helpful to note the following information when reporting your problem to the support personnel:

Periodically deleting the DODS cache to improve stability (Z. Guo, 29 April 2003)
Using the DODS client library generates entries in your home directory: .dodsrc is a configuration file, and .dods_cache is a cache directory for DODS. Both are hidden files. They can be found in your home directory by typing:

ls -la

The file .dodsrc cannot be deleted, while .dods_cache can be deleted before or after your code runs, but not in the middle of a data transfer. The .dods_cache directory stores information about your data access history, for fast re-access of DODS data you have accessed before. It caches stuff without you knowing about it. This facility can be turned off by setting

USE_CACHE=0

in .dodsrc (see: http://www.unidata.ucar.edu/packages/dods/user/guild-html/guide_71.html). However, this might result in errors when you access a large amount of DODS-served data. At the same time, if you leave the cache alone and too many directories are created in .dods_cache, your local cache can become corrupted and your program will end in a "segmentation fault". So the best approach is to keep a reasonable amount of information in .dods_cache, and to delete it before or after your program executes.

If your local DODS cache grows very large or is used for multiple sessions, you may experience program crashes. It is advisable to delete the .dods_cache directory periodically. For example, if you are running your land surface model globally for one month at a time (restarting the model each month from a restart file), you should delete the cache between each month's run. Do not delete the cache while the model (or any other program using the DODS libraries) is still active and has remote files open.
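
If your monthly runs are driven by a script, the cleanup can be automated. The Python sketch below is one way to do it; the function name clear_dods_cache is hypothetical, and it assumes the cache lives in .dods_cache under your home directory unless you pass another path.

```python
import shutil
from pathlib import Path


def clear_dods_cache(cache_dir=None):
    """Remove the DODS cache directory if it exists.

    Returns True if a directory was removed, False if there was nothing to remove.
    Call this only when no program using the DODS libraries is running.
    """
    if cache_dir is not None:
        path = Path(cache_dir)
    else:
        path = Path.home() / ".dods_cache"  # assumed default location
    if path.is_dir():
        shutil.rmtree(path)
        return True
    return False
```

A driver script would call clear_dods_cache() after each month's model run has exited and before the next one starts.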

Changing the maximum number of DODS files that you can open (J. Wielgosz, 7 April 2003)
The limit of 32 files open at once is part of the DODS library. To change this limit you need to edit DODS/src/nc3-dods*/lnetcdf/netcdf.h line 830:

#define MAX_NC_OPEN 32

in the DODS distribution, then recompile libdap++.a and libnc-dods.a.


NetCDF


NetCDF utilities that are DODS-compatible (J. Wielgosz, 6 May 2003)
There exist "NetCDF Operators" (http://nco.sourceforge.net) that may be useful for some DODS data access situations. They are a bunch of C utilities that do simple things to netCDF data, like splitting it, averaging it, etc.  In particular, ncks ("netcdf kitchen sink") copies data from one netCDF file to another, or dumps it as IEEE binary or ASCII (like ncdump). So one can generate a local copy of data on a DODS server as simply as this:
    ncks http://dods-url localfile.nc
There are a bunch of command line switches that allow control of subsetting - individual variables, and dimension constraints within the variables - as well as other details.  For example:
    ncks -d time,0,9,2 -d lev,6,10 -v t http://cola8.iges.org:9191/dods/eta/eta2003050612i eta.nc
retrieves the variable named "t" for vertical levels 6-10 (whatever those correspond to), at 12 hour intervals, from today's ETA output, and writes it to eta.nc.
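
To see which records a "-d" argument selects, the index arithmetic can be sketched in Python. This is an illustration only, not part of NCO; the function name hyperslab_indices is hypothetical, and it handles only the simple integer form "dim,min,max[,stride]" with zero-based indices and an inclusive maximum.

```python
def hyperslab_indices(spec):
    """Expand a 'dim,min,max[,stride]' index specification to (dim, [indices]).

    Only the integer form with both min and max present is handled here;
    ncks itself accepts more forms than this sketch does.
    """
    parts = spec.split(",")
    dim = parts[0]
    start, stop = int(parts[1]), int(parts[2])
    stride = int(parts[3]) if len(parts) > 3 else 1
    return dim, list(range(start, stop + 1, stride))


time_sel = hyperslab_indices("time,0,9,2")   # every second record of the first ten
lev_sel = hyperslab_indices("lev,6,10")      # five consecutive levels
```

In the example above, a stride of 2 on the time dimension is what turns the model's native output interval into 12 hour steps.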
Of course, many people who *think* they want to generate local files from DODS, actually just misunderstand the system.  But for those who really do need local files for something, and aren't/don't want to be GrADS users, it might be worth a try.

GrADS and the CF convention of NetCDF (ALMA) (P. Dirmeyer, 12 May 2003)
If you are a GrADS user, soon you will be able to use GrADS to directly view gridded versions of your ALMA vector model output.  Hopefully this new version of GrADS (with many other new features) will be ready this summer.

Conflicts with more than one NetCDF library (Z. Guo, 23 May 2003)
Using NetCDF libraries other than the DODS-NetCDF libraries may cause problems such as unrecognized routines if mixed with the DODS-enabled code.  To ensure compatibility with your site, you may need to build the DODS-NetCDF library yourself from the tar-files on the FAQ page, rather than rely on a library compiled elsewhere.

NetCDF routines appear undefined (Z. Guo, 27 May 2003)
Library compilation problems may lead to the following problems.

First: routines like nc__create_mp, nc_delete_mp, nc__open_mp, nc_delete are undefined. Downloading and compiling the source codes from the tar-files listed on the FAQ site should avoid this problem (see conflict note above).

Second: nf_open__, nf_enddef__, nf_close__, etc. are undefined. Note that these references have an extra underscore on the end. If that is the case, you need to turn on the '-Df2cFortran' option (change DEFS) in the Makefile (or Makefile.in), and rebuild the DODS netcdf library. This should make it work.

Using the F90 NetCDF routines with DODS (Z. Guo, 7 June 2003)
I believe the libraries currently available from UCAR Unidata are not linked with the Fortran 90 NetCDF interface, so you can only link Fortran code that uses the F77 interface with those libraries. If you do want to use F90 NetCDF subroutine calls to access DODS data, you have to link the NetCDF F90 interface with the DODS libraries to build the DODS/netCDF libraries, and use that set of libraries to link with your F90 code. Two messages at
http://www.unidata.ucar.edu/projects/coohl/mhonarc/MailArchives/dods-tech/msg01554.html
http://www.unidata.ucar.edu/projects/coohl/mhonarc/MailArchievs/dods-tech/msg01752.html
might be useful.

The NetCDF User's Guide for Fortran 90 is online at http://www.unidata.ucar.edu/packages/netcdf/f90/Documentation/guide.book.pdf
All the interfaces for netcdf/f90 functions can be found there.

Fortran

Sample Code to read and write NetCDF in a Fortran model driver.


Other Errors

IOT Trap error (Z. Guo, 29 April 2003)
"IOT trap" is the message printed when a process is terminated by the SIGIOT signal, which on most Unix systems is the same signal as SIGABRT (raised, for example, when a program calls abort()).  The "IOT Trap" error message is not specific to DODS, NetCDF or Fortran.  It is usually related to threads or compiler issues. It seems that the DODS library aborts under certain conditions (the .dods_cache has grown too large to manage, or some other reason, I guess). This problem is hard to trace or solve.  In my experience, reducing the number of data access transactions, and keeping the number of .dods_cache directories to no more than 400, is a practical solution for the "IOT trap" problem.  Note that the subdirectories are automatically created in the .dods_cache directory.  One directory is created per data retrieval transaction, so retrieving one month of GSWP forcing data for nine variables will create ~30*9 directories if one day's data for one variable is retrieved per transaction.
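
A quick way to keep an eye on the directory count is sketched below in Python. The function names are hypothetical, and the 400 limit is simply the practical figure quoted in the note above, not a documented DODS limit.

```python
from pathlib import Path

MAX_CACHE_DIRS = 400  # practical limit suggested in the note above


def count_cache_dirs(cache_dir):
    """Count the subdirectories directly inside the cache directory (0 if it is absent)."""
    path = Path(cache_dir)
    if not path.is_dir():
        return 0
    return sum(1 for entry in path.iterdir() if entry.is_dir())


def expected_new_dirs(days, variables):
    """Directories created if one day of one variable is fetched per transaction."""
    return days * variables


# One month of nine forcing variables adds about 270 directories,
# so a second month without cleanup would exceed the suggested limit.
per_month = expected_new_dirs(30, 9)
```

Comparing count_cache_dirs(...) plus the expected number of new directories against MAX_CACHE_DIRS before each run indicates whether the cache should be deleted first.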