Marine Data Literacy 2.0

Providing instruction for managing, converting, analyzing and displaying oceanographic station data, marine meteorological data, GIS-compatible marine and coastal data or model simulations, and mapped remote sensing imagery






4.13 Quality Control of Marine Data Collections with the GLODAP Benchmark Dataset

  • Exercise Title:  Quality Control of Marine Data Collections with the GLODAP Benchmark Dataset

  • Abstract:  In this exercise an ODV collection of very carefully quality-controlled ocean station data, the GLODAP dataset, is compared visually with a second collection, using identical scatter graphics [although other modes may be used], to identify possibly bad or questionable data points against known "good" data.  Editing of incorrect quality flags is demonstrated.

  • Preliminary Reading (in OceanTeacher, unless otherwise indicated):

  • Required Software:

  • Other Resources: 

    • ODV collection osd_all_liberia_wod.odv

    • GLODAP Bottle Data - ODV collection constructed from an extremely clean global dataset, collated from cruises also included in WOD; the extraordinary quality control given to these data makes them ideal for comparisons with other data.

    • Global Ocean Data Analysis Project - Project website

  • Author:  Murray Brown

  • Version:  August 2010

1.  Take a minute to read over the GLODAP program information above, at the website link.  Especially look for the descriptions of how carefully the data were quality controlled, even to the extent that data from separate cruises were adjusted to match exactly at cross-over points.
2.  Use the above link to download the GLODAP ODV collection to your folder PRODUCTS > ODV
3.  Unzip the collection file, and open it in ODV by double-clicking on the *.ODV file (the one with the small colored icon).  [You'll have to navigate into a small folder structure to find it.]
4.  The GLODAP collection opens with a strange default view.  We need to open the view to the entire globe.
6.  A partial global map opens.  Right-click on it, and select GLOBAL MAP.
7.  This global map indicates the huge scope of GLODAP, especially considering the fact that so much quality control was performed.

There are about 12 thousand bottle data stations in GLODAP, compared to 2.5 million bottle stations in WOD.  So although there is an overlap, it is not serious enough to defeat the quality control method to be used below.

8.  Right-click on the map, and select STATION SELECTION CRITERIA.
9.  Click on the DOMAIN tab, and enter the outer coordinates of the Liberia project map area.

Then click OK.
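
The DOMAIN selection above amounts to a bounding-box test on each station's position.  Here is a minimal sketch of that idea outside ODV.  The `Station` class, the box limits and the three stations are all made up for illustration; substitute the limits of your own project map.

```python
from dataclasses import dataclass


@dataclass
class Station:
    name: str
    lon: float  # degrees east (negative = west)
    lat: float  # degrees north (negative = south)


def in_domain(st, west, east, south, north):
    """True if the station lies inside the box (edges included)."""
    return west <= st.lon <= east and south <= st.lat <= north


# Hypothetical box limits, NOT the actual Liberia project coordinates.
WEST, EAST, SOUTH, NORTH = -15.0, -5.0, 0.0, 10.0

# Hypothetical stations.
stations = [
    Station("A", -10.5, 5.2),
    Station("B", 20.0, -35.0),
    Station("C", -7.0, 4.1),
]

selected = [s for s in stations if in_domain(s, WEST, EAST, SOUTH, NORTH)]
print([s.name for s in selected])
```

This simple test assumes the box does not cross the 180-degree meridian, which is fine for a small coastal area.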

10.  This reduces the data down to our project area.
11.  If you want, you can use ZOOM to see the stations. 
12.  Now we can export the Liberia data as a separate small collection, to avoid the huge size of the global GLODAP set.


13.  Navigate to PRODUCTS > ODV and enter the filename osd_liberia_glodap_odv for the collection.
14.  You might as well take all the variables, because it is not a large subset.

Then click OK.

15.  This indicates how many stations were exported.
16.  You can close the global version of GLODAP, because you don't need that huge dataset for the work to follow.
17.  Open your Liberia bottle data collection.
18.  Open your new collection of GLODAP data for the Liberia area.
19.  Use the methods you learned previously to prepare a T-S diagram for the GLODAP collection of data for the Liberia area.  This will take a few minutes, and it's ok to check back to see how you did it in the SCATTER PLOTS exercise.
20.  Before we start comparing values in the two collections, we need to look at the units involved.
  • Salinity and temperature in GLODAP and WOD are immediately comparable
  • Most nutrients in GLODAP have units of umol/kg, but most nutrients in WOD have units of umol/l.  With an error of only about 3% these can be directly compared visually.  For more careful work, perform the exact conversions.
  • Oxygen in GLODAP uses umol/kg, but oxygen in WOD uses ml/l.  A rough conversion factor would be to multiply the WOD values by 45 to get the same value range as the GLODAP data.  We've taken that into account below, where the X-RANGE has been set to 0-450 for GLODAP and 0-10 for WOD.  For more careful work, perform the exact conversions.
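
For the more careful conversions mentioned above, the arithmetic can be sketched as follows.  Two values here are assumptions not stated in this exercise: 44.66 micromoles of oxygen per ml of gas (from the molar volume of oxygen at standard temperature and pressure), and a constant seawater density of 1.025 kg/l.  Careful work would compute the density of each sample from its temperature and salinity rather than use a constant.

```python
# Assumed constants (not given in the exercise text).
UMOL_PER_ML_O2 = 44.66    # micromoles of O2 in 1 ml of gas at STP
DENSITY_KG_PER_L = 1.025  # typical seawater density, used as a constant here


def nutrient_umol_l_to_umol_kg(value, density=DENSITY_KG_PER_L):
    """Convert a nutrient concentration from umol/l to umol/kg."""
    return value / density


def oxygen_ml_l_to_umol_kg(value, density=DENSITY_KG_PER_L):
    """Convert dissolved oxygen from ml/l to umol/kg."""
    return value * UMOL_PER_ML_O2 / density


# Factor for 1 ml/l of oxygen, to compare with the rough factor of 45.
print(round(oxygen_ml_l_to_umol_kg(1.0), 2))
# A nutrient value of 30 umol/l expressed in umol/kg.
print(round(nutrient_umol_l_to_umol_kg(30.0), 2))
```

With these assumed constants the oxygen factor comes out a little below 45, so the rough factor is adequate for setting matching axis ranges but not for numerical comparison.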
21.  When you have the GLODAP Liberia collection T-S diagram, it should look like this (T and S only, no oxygen).
22.  In your Liberia bottle collection from WOD, use VIEW > LOAD VIEW to re-load temp_sal_oxy_liberia_wod_tsdiagram.xview (i.e. the T-S plot)
23.  Here is the T-S diagram for the Liberia bottle collection from WOD.  For simplicity the Z-VARIABLE on this plot has been changed from OXYGEN to NONE.

Obviously, there are huge differences between these two figures, reflecting the realities of marine data collections.  Many small and some large errors are included in the WOD collection, and the very carefully controlled GLODAP data might not actually capture the complete picture of natural variability.

We can use the GLODAP data as a "benchmark" dataset to investigate the apparent differences between these collections, and the patterns in those differences.
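
The visual comparison in the steps that follow can also be expressed numerically.  The sketch below is one possible approach, not something ODV does for you: it builds a minimum-maximum "envelope" of benchmark values in each depth bin, then lists the samples from another collection that fall outside it.  The function names, the 500 m bin width, the margin and all data values are hypothetical.

```python
from collections import defaultdict


def depth_bin(depth_m, width_m=500):
    """Index of the depth bin that contains depth_m."""
    return int(depth_m // width_m)


def build_envelope(benchmark, width_m=500):
    """Map each depth bin to the (min, max) of the benchmark values in it."""
    bins = defaultdict(list)
    for depth, value in benchmark:
        bins[depth_bin(depth, width_m)].append(value)
    return {b: (min(v), max(v)) for b, v in bins.items()}


def outside_envelope(samples, envelope, margin=0.0, width_m=500):
    """Return samples outside the benchmark range for their depth bin.

    Samples in bins with no benchmark data are skipped, not reported.
    """
    suspects = []
    for depth, value in samples:
        limits = envelope.get(depth_bin(depth, width_m))
        if limits is None:
            continue
        low, high = limits
        if value < low - margin or value > high + margin:
            suspects.append((depth, value))
    return suspects


# Made-up (depth in m, salinity) pairs for illustration only.
benchmark = [(10, 35.1), (200, 35.4), (450, 35.0), (800, 34.7), (950, 34.6)]
candidate = [(50, 35.2), (300, 33.0), (900, 34.65), (2500, 34.9)]

envelope = build_envelope(benchmark)
print(outside_envelope(candidate, envelope, margin=0.1))
```

A point outside the envelope is only a candidate for inspection.  As noted above, the benchmark may not capture the full natural variability, so the margin should be chosen generously.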

24.  Here's the GLODAP scatter plot for temperature.
25.  And here's the WOD data, using the same scales.  The close similarity of the figures is quite reassuring, considering the fact that temperature data are much older than all other data, and so many methods/instruments have been used.

We can see some obvious issues, however.

26.  For example, this profile seems completely different in shape, and internally the structure is consistent.

Could this be a profile from a different area of the world?  This sort of mistake is not unknown.

27.  Here's the GLODAP salinity data.
28.  And here's the WOD salinity data.  The biggest differences are in the very large area of lower salinities at depths less than 1000 m, as well as a general "scattering" at all depths.
29.  Using the cursor to investigate individual low values in the area, it should be noted that many of them are "spike" low values in profiles that otherwise seem ok.
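
A "spike" of this kind can also be searched for numerically.  The sketch below uses a neighbour-comparison spike test of the sort found in common profile quality-control procedures; it is a substitute for the cursor inspection described here, not part of this exercise's ODV workflow.  The threshold and the profile values are hypothetical.

```python
def find_spikes(values, threshold):
    """Return indices of interior points that stand out from both neighbours.

    For each point v2 with neighbours v1 and v3 the test value is
    |v2 - (v1 + v3) / 2| - |(v3 - v1) / 2|.
    Subtracting the second term keeps the points next to a spike,
    and points on a steady gradient, from being reported.
    """
    spikes = []
    for i in range(1, len(values) - 1):
        v1, v2, v3 = values[i - 1], values[i], values[i + 1]
        test_value = abs(v2 - (v1 + v3) / 2) - abs((v3 - v1) / 2)
        if test_value > threshold:
            spikes.append(i)
    return spikes


# Made-up salinity profile, ordered by depth, with one low spike.
salinity = [35.2, 35.3, 35.3, 33.1, 35.2, 35.0, 34.9]
print(find_spikes(salinity, threshold=0.5))  # index of the low value
```

The first and last samples of a profile have only one neighbour, so this test cannot judge them.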
30.  Here's the silicate data from GLODAP (chosen as an example of the ordinary main nutrients).
31.  And here's the WOD silicate data.  Of note is the large number of high values above 1000 m.
32.  Checking in this area with the cursor, you'll find a number of stations that appear to provide uniformly high profile values, rather than single erratic points.  These profiles are both deep and shallow, so it's not just a coastal-versus-deep issue.
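
The distinction made here, between whole profiles that run high and single erratic points, can be sketched as a count of how many samples in each station exceed some limit.  Everything in this example is hypothetical: the station names, the values, the limit of 55 and the 80% majority rule.  The values are meant to stand for samples shallower than 1000 m only.

```python
def classify_station(values, upper_limit, majority=0.8):
    """Label a profile by the share of its samples above upper_limit."""
    if not values:
        return "empty"
    share = sum(v > upper_limit for v in values) / len(values)
    if share == 0:
        return "ok"
    if share >= majority:
        return "whole profile high"
    return "isolated high values"


# Made-up silicate values for three stations.
profiles = {
    "st1": [5, 12, 30, 48],
    "st2": [60, 75, 90, 110],
    "st3": [4, 95, 28, 45],
}

for name, values in profiles.items():
    print(name, classify_station(values, upper_limit=55))
```

A station labeled "whole profile high" points toward a problem with the whole cast (units, calibration or location), while "isolated high values" points toward individual bad samples.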
33.  Here's the GLODAP oxygen data.
34.  And here's the WOD oxygen.  The same general patterns, with lots of scatter, both deep and shallow.
35.  Checking some of the obviously questionable values with the cursor reveals one or just a few deep values in profiles that otherwise appear normal.
36.  And so it goes.  You can compare values from all the different variables found in GLODAP with the WOD, with a view toward identifying some trends in apparent differences.
37.  Now, with the cursor still on that apparently bad value in Panel 35, take a look at the station/sample information window on the right side of the ODV screen.

The bad value of 3.68 ml/l is at depth 2975, and recall that the two values immediately above it also looked bad.

38.  Right-click on the oxygen value, and select EDIT DATA.
39.  This EDIT DATA window opens.  Block the bad values (2875 m, 2755 m and 2400 m).


40.  Set the quality flag value to 3, and click OK.
41.  Check to see that the flag was changed for all 3 bad values.

Then click OK.

42.  The points are still visible.
43.  Right-click on the graphic and select REDRAW.
44.  The newly-identified bad points are not shown now (due to our quality settings in SAMPLE SELECTION CRITERIA).
45.  This introduces a general method to investigate quality in a new collection, and how to flag data points that don't appear to be good, or are at least questionable.  You should investigate other possible "benchmark" datasets for use in the same way.
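
The flagging in steps 38 to 44 follows a simple logic: mark the chosen samples with a flag value, then leave out samples with rejected flags when drawing.  Here is a minimal sketch of that logic outside ODV.  The sample values and the starting flag of 0 are made up, and the meaning of flag 3 depends on the quality flag scheme of your collection.

```python
FLAG_SET_IN_STEP_40 = 3  # meaning depends on the collection's flag scheme


def set_flag(samples, depths, flag):
    """Set the quality flag on every sample whose depth is in depths."""
    changed = 0
    for sample in samples:
        if sample["depth"] in depths:
            sample["flag"] = flag
            changed += 1
    return changed


def accepted(samples, rejected_flags):
    """Return the samples whose flag is not in rejected_flags."""
    return [s for s in samples if s["flag"] not in rejected_flags]


# Made-up oxygen samples; only the three depths come from step 39.
samples = [
    {"depth": 2000, "oxygen": 5.40, "flag": 0},
    {"depth": 2400, "oxygen": 3.90, "flag": 0},
    {"depth": 2755, "oxygen": 3.75, "flag": 0},
    {"depth": 2875, "oxygen": 3.68, "flag": 0},
    {"depth": 3200, "oxygen": 5.55, "flag": 0},
]

changed = set_flag(samples, depths={2400, 2755, 2875}, flag=FLAG_SET_IN_STEP_40)
print(changed)  # number of samples whose flag was set

visible = accepted(samples, rejected_flags={FLAG_SET_IN_STEP_40})
print([s["depth"] for s in visible])
```

Note that the data values themselves are never changed or deleted.  Only the flag changes, so the decision can be reviewed or reversed later.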