Marine Data Literacy 2.0

Providing instruction for managing, converting, analyzing and displaying oceanographic station data, marine meteorological data, GIS-compatible marine and coastal data or model simulations, and mapped remote sensing imagery






4.12 Quality Control of Marine Data Collections with Value Range Checking in ODV

  • Exercise Title:  Quality Control of Marine Data Collections with Value Range Checking in Ocean Data View (ODV)

  • Abstract:  Global value range tables, and a few regional ones, have been compiled to give data managers an idea of the "reasonable" values to expect for certain variables.  In the best cases, these tables take depth and season (or even month!) into account.  ODV lets users automatically compare the data values in a collection against these published ranges.  This exercise shows how to set up these comparisons.

  • Preliminary Reading (in OceanTeacher, unless otherwise indicated):

  • Required Software:

  • Other Resources: 

    • ODV collection osd_all_liberia_wod.odv

  • Author:  Murray Brown

  • Version:  January 2016

1.  This is approximately what you should see after the previous exercise.  [You may have to relax the current STATION and SAMPLE criteria to make sure you are seeing all the GOOD DATA.]
2.  Now let's make sure that we are viewing only the GOOD DATA.  Right-click on the small station map and select SAMPLE SELECTION CRITERIA.
3.  Make sure that Flag 1 is selected.  Click APPLY TO ALL VARIABLES.  Then click OK.
5.  Review the basic value range information for the Atlantic Ocean in the resource article above.
6.  For DEPTH, set 0 to 7000, which looks reasonable from the GEBCO bathymetric contour data.
7.  For TEMPERATURE, set -3 to 35.
8.  For SALINITY, set 20 to 40.
9.  For OXYGEN, set 0.01 to 12.

NOTE:  The next 3 variables (PHOSPHATE, SILICATE and NITRATE) are listed with units of umol/l in ODV, but the tables give them in umol/kg.  Because one liter of seawater weighs only slightly more than one kilogram, using the table values unchanged introduces an error of only about 3%, which is acceptable for a basic demonstration.  In real work, you might want to convert the table values exactly to ODV units.

10.  For PHOSPHATE, set 0 to 3.6.
11.  For SILICATE, set 0 to 160.
12.  For NITRATE, set 0.01 to 48.
13.  We don't have any good range values for NITRITE yet.  Skip this one.
14.  For pH, set 7 to 9.1.
15.  For CHLOROPHYLL, set 0 to 64 (same units as ug/dm3 in the table).

NOTE:  64 is actually very high compared to the usual range of 0-4 found in most open ocean areas.  You might want to go back and use this smaller range for a second check.

Now, click OK to begin the checking process.
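
The idea behind the comparison can be sketched outside ODV.  The Python example below is not ODV's own algorithm; it only illustrates checking each value against the minimum and maximum entered in steps 6 to 15.  The function names, the made-up sample records, and the assumed seawater density of 1.025 kg/l (used to convert the umol/kg table limits to umol/l) are all illustrative assumptions.

```python
# Illustrative sketch only; this is not how ODV implements its range check.

# Assumed typical seawater density (kg/l), used to convert umol/kg to umol/l.
SEAWATER_DENSITY_KG_PER_L = 1.025

# Ranges from steps 6 to 15, stored as (minimum, maximum).
RANGES = {
    "depth": (0, 7000),
    "temperature": (-3, 35),
    "salinity": (20, 40),
    "oxygen": (0.01, 12),
    "phosphate": (0, 3.6),
    "silicate": (0, 160),
    "nitrate": (0.01, 48),
    "ph": (7, 9.1),
    "chlorophyll": (0, 64),
}

# These limits come from tables in umol/kg, but the data are in umol/l.
NUTRIENTS_IN_UMOL_PER_KG = ("phosphate", "silicate", "nitrate")


def to_umol_per_l(value_umol_per_kg, density=SEAWATER_DENSITY_KG_PER_L):
    """Convert a concentration from umol/kg to umol/l."""
    return value_umol_per_kg * density


def ranges_in_data_units(ranges):
    """Return a copy of the ranges with nutrient limits converted to umol/l."""
    converted = dict(ranges)
    for name in NUTRIENTS_IN_UMOL_PER_KG:
        low, high = ranges[name]
        converted[name] = (to_umol_per_l(low), to_umol_per_l(high))
    return converted


def find_outliers(samples, ranges):
    """Return (sample index, variable, value) for each value outside its range."""
    outliers = []
    for index, sample in enumerate(samples):
        for name, value in sample.items():
            if name not in ranges or value is None:
                continue  # no range defined, or no measurement
            low, high = ranges[name]
            if not (low <= value <= high):
                outliers.append((index, name, value))
    return outliers


# Made-up sample records, for demonstration only.
samples = [
    {"depth": 10, "temperature": 27.5, "salinity": 35.1, "oxygen": 4.6},
    {"depth": 50, "temperature": 24.0, "salinity": 3.52, "oxygen": 4.1},
    {"depth": 100, "temperature": 18.2, "salinity": None, "phosphate": 3.65},
]

outliers = find_outliers(samples, ranges_in_data_units(RANGES))
for index, name, value in outliers:
    print(f"sample {index}: {name} = {value} is outside the allowed range")
```

With the converted limits, only the salinity value of 3.52 is reported.  If the unconverted table limits in `RANGES` are used instead, the phosphate value of 3.65 is flagged as well, which shows how the roughly 3% unit difference can matter for values close to a limit.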

16.  This window opens, indicating that a list of outliers is available and can be saved.  You can change the file name or simply keep the default, then click SAVE.
17.  This window indicates that only a very small number of outliers were found.

Check both the boxes, and then click OK.

18.  This is the file OUTLIERS.LST opened in Notepad.  All 28 outliers are due to SALINITY problems.
19.  This is the OUTLIER ACTION window, where you can go through the values and make choices about them.

On your own time, you can go through these and KEEP the same flags, or APPLY changes, as you feel appropriate.

20.  What does it mean if the OUTLIERS.LST file above contains no data?
  • A miracle has happened, and all the bad data have already been flagged as bad.  Sometimes this happens, especially if the data in question have already been examined or edited by others.
  • Something is not working in the search algorithm.  To see if this might be the case, manually edit a single station record value so that it lies outside the desired range, then run the tool again to see if the "bad data" are now identified in the list.
    • Make sure you can easily find this record again to change it back afterward.  Alternatively, you could make this check on a temporary copy of the collection that you delete afterward.
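
The second possibility, seeding a known bad value into a temporary copy, can be sketched in Python as follows.  This is independent of ODV; the `out_of_range` helper, the station names, and the salinity values are hypothetical and serve only to show the logic of the test.

```python
import copy


def out_of_range(records, variable, low, high):
    """Return indices of records whose value for `variable` lies outside [low, high]."""
    return [
        i for i, rec in enumerate(records)
        if rec.get(variable) is not None and not (low <= rec[variable] <= high)
    ]


# Made-up records standing in for a collection that reports no outliers.
collection = [
    {"station": "A1", "salinity": 35.2},
    {"station": "A2", "salinity": 34.9},
    {"station": "A3", "salinity": 35.6},
]

# Work on a temporary copy so the original never has to be changed back.
temporary = copy.deepcopy(collection)
temporary[1]["salinity"] = 99.0  # deliberately outside the 20 to 40 range

clean_hits = out_of_range(collection, "salinity", 20, 40)
seeded_hits = out_of_range(temporary, "salinity", 20, 40)

if clean_hits == [] and seeded_hits == [1]:
    print("The check caught the seeded value; the empty list is believable.")
else:
    print("The check did not behave as expected; investigate the search.")
```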
21.  This is a very simplified introduction to the quality control process, using value ranges.  Because these ranges are taken from ocean-scale statistics, they are too broad to identify all problems.  In any particular project area, you should develop much tighter ranges for your own data, and for various seasons, depths, etc.
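
One possible way to derive tighter local ranges is sketched below in Python.  It is only an illustration, not a recommended standard: the depth bins, the choice of mean plus or minus 3 standard deviations, the `local_ranges` function, and the temperature values are all assumptions made for the example.

```python
from statistics import mean, stdev

# Hypothetical depth bins in meters, as (top, bottom).
DEPTH_BINS = [(0, 50), (50, 200), (200, 1000)]


def local_ranges(samples, variable, bins=DEPTH_BINS, k=3.0):
    """Return {bin: (low, high)} as mean +/- k standard deviations per depth bin."""
    ranges = {}
    for top, bottom in bins:
        values = [
            s[variable] for s in samples
            if top <= s["depth"] < bottom and s.get(variable) is not None
        ]
        if len(values) < 2:
            continue  # too few values to estimate a spread
        centre, spread = mean(values), stdev(values)
        ranges[(top, bottom)] = (centre - k * spread, centre + k * spread)
    return ranges


# Made-up temperature samples, for demonstration only.
samples = [
    {"depth": 5, "temperature": 28.0},
    {"depth": 20, "temperature": 27.0},
    {"depth": 40, "temperature": 26.0},
    {"depth": 80, "temperature": 20.0},
    {"depth": 150, "temperature": 16.0},
    {"depth": 500, "temperature": 8.0},
]

for (top, bottom), (low, high) in local_ranges(samples, "temperature").items():
    print(f"{top}-{bottom} m: {low:.1f} to {high:.1f}")
```

Ranges computed this way should be based only on data already judged to be good, since bad values left in the input widen the resulting limits.  The same grouping could be repeated by season or month where enough data exist.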