8.7 Data Browsing/Mining in SOS Servers

  • Exercise Title:  Data Browsing/Mining in Sensor Observation Service (SOS) Servers

  • Abstract:  SOS servers (see resource material below) have been established to allow the user to connect directly to in-situ, active measurement devices (including remotely placed instruments).  This technology was designed within the OGC family of functionalities with a view toward bridging the gap between operational earth sciences and geospatial methods.  Methods to access and obtain data through known SOS sites are presented here.  Uncertainty surrounds the future of SOS, which is not being widely used yet.  Many initial test sites are apparently now inoperable.

  • Author:  Murray Brown

  • Version:  July 2013

1.  Install the latest version of EDC, and check for a newer version each time you use it.
2.  Open the edcconfig.xml file in your C:\EDC folder, and change <CLOSE_AFTER_PROCESSING> to false
3.  Run EDC.
4.  You'll find these 3 tabs at the beginning:
  • BROWSE - Locate and open a SOS server
  • DATA VIEWER - Infrequently used by SOS; simple raster viewer
  • LOG - Status of command operations
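
As an alternative to editing the file by hand in step 2, the change can be scripted. The following is a minimal Python sketch and is not part of EDC. It assumes that CLOSE_AFTER_PROCESSING is an XML element whose text holds the setting; the surrounding layout of edcconfig.xml is not shown in this exercise, so check your own copy first.

```python
import xml.etree.ElementTree as ET

def set_close_after_processing(xml_text, value="false"):
    """Return the XML with every <CLOSE_AFTER_PROCESSING> element set to `value`.

    Assumption: the setting is stored as element text, not as an attribute.
    """
    root = ET.fromstring(xml_text)
    for element in root.iter("CLOSE_AFTER_PROCESSING"):
        element.text = value
    return ET.tostring(root, encoding="unicode")

# Usage (path taken from step 2; keep a backup of the original file):
#   from pathlib import Path
#   config = Path(r"C:\EDC\edcconfig.xml")
#   config.write_text(set_close_after_processing(config.read_bytes()))
```

Rewriting the file this way discards any XML comments it contains, which is one reason to keep a backup copy.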


5.  At the bottom of this exercise you'll find a small list of SOS servers; most are known to be in operation now, but that is not guaranteed.  The only "global" source is the US National Data Buoy Center (NDBC), which is responsible for many international sites.
6.  Copy the NDBC link and paste it into the space by SOS.
7.  Then click CONNECT (on the far right).
8.  The LOG tab automatically opens, and you'll see a message like this.

Keep watching until you see PROCESSING COMPLETE, or an error message.

08:32:06 Building Request
08:32:06 Reading SOS URL:
08:32:13 Fetched GetCapabilities in 6.984 seconds
08:32:13 Size of SOS capabilities request (mB)= ~1.70023
08:32:13 Parsing for Offerings...
08:32:15 Seconds to parse SOS capabilities: 1.531
08:32:15 Found 838 valid sensors.
08:32:15 Parsing out unique variables
08:32:15 Found 9 unique variables in 0.0 seconds
08:32:15 Adding sensors to map (this could take awhile)
08:32:15 Processing of SOS Server complete
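
The log above shows EDC fetching a GetCapabilities document and then parsing it for offerings and variables. The Python sketch below illustrates the same idea outside EDC; it is not EDC's own code. It assumes an SOS 1.0 capabilities document, in which each station appears as an ObservationOffering and each variable as the xlink:href of an observedProperty element. The server address and the sample document are hypothetical, and EDC's count of "valid sensors" may apply further checks that are not reproduced here.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

SOS_NS = "http://www.opengis.net/sos/1.0"      # SOS 1.0 namespace (assumed version)
XLINK_NS = "http://www.w3.org/1999/xlink"

def capabilities_url(base_url):
    """Append the OGC key-value parameters for a GetCapabilities request."""
    query = urlencode({"service": "SOS", "request": "GetCapabilities"})
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + query

def summarize_capabilities(xml_text):
    """Return (number of offerings, sorted unique observed-property URIs)."""
    root = ET.fromstring(xml_text)
    offerings = list(root.iter(f"{{{SOS_NS}}}ObservationOffering"))
    properties = set()
    for offering in offerings:
        for prop in offering.iter(f"{{{SOS_NS}}}observedProperty"):
            href = prop.get(f"{{{XLINK_NS}}}href")
            if href:
                properties.add(href)
    return len(offerings), sorted(properties)

# Hypothetical two-station document, for illustration only
SAMPLE = """<sos:Capabilities xmlns:sos="http://www.opengis.net/sos/1.0"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <sos:Contents><sos:ObservationOfferingList>
    <sos:ObservationOffering>
      <sos:observedProperty xlink:href="http://example.org/def/winds"/>
      <sos:observedProperty xlink:href="http://example.org/def/waves"/>
    </sos:ObservationOffering>
    <sos:ObservationOffering>
      <sos:observedProperty xlink:href="http://example.org/def/winds"/>
    </sos:ObservationOffering>
  </sos:ObservationOfferingList></sos:Contents>
</sos:Capabilities>"""

count, variables = summarize_capabilities(SAMPLE)
print(capabilities_url("https://sos.example.org/sos"))
print(count, "offerings;", len(variables), "unique variables")
```

To try it against a live server, the text returned from capabilities_url(...) would be downloaded first (for example with urllib.request.urlopen) and passed to summarize_capabilities.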

9.  Now a new tab has appeared, SOS-SUBSET & PROCESS, and the page is filled with five large new items (discussed next).
10.  On the left are the SENSORS, a list of all measurement sites covered by this SOS server.  You could look in this list for a desired location and check it.
11.  In the middle is a map, currently in 3-D mode.  Click on 2D just below it, and you'll see this flat map of the 838 stations.  You can also click directly on any station dot to see information about it, and to select it for data download.
12.  On the right you'll find this list of the nine available variables.  [Beside each variable name is a link to that term in the MMI Ontology Registry and Repository (ORR), a web application through which you can create, update, access, and map ontologies and their terms.]

NOTE:  Although it is not listed here, just about all these stations also have SEA LEVEL.  The author doesn't know why EDC has not picked this up, and has asked for clarification.

13.  In the lower-left corner is a typical geographic location tool, where you could enter the coordinates of an area of interest.
14.  Occupying most of the bottom of the display is a calendar tool where you can select the times of interest.  This data collection goes back 23 years!  The default selection seems to be the last 10 days.
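
The geographic tool and the calendar in steps 13 and 14 amount to two simple filters: a bounding box on station coordinates and a start/end time. The Python sketch below shows both, as an illustration rather than EDC's implementation. The station IDs and coordinates are hypothetical, and the box test does not handle areas that cross the 180° meridian.

```python
from datetime import datetime, timedelta, timezone

def last_n_days(now, days=10):
    """Return (start, end) as ISO 8601 UTC strings covering the last `days` days."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    start = now - timedelta(days=days)
    return start.strftime(fmt), now.strftime(fmt)

def stations_in_bbox(stations, west, south, east, north):
    """Keep the IDs of stations whose (lon, lat) falls inside the box."""
    return [
        station_id
        for station_id, lon, lat in stations
        if west <= lon <= east and south <= lat <= north
    ]

# Hypothetical stations: (ID, longitude, latitude)
stations = [("A", 95.0, -5.0), ("B", -80.0, 25.0), ("C", 110.0, -12.0)]

now = datetime(2013, 7, 31, 9, 15, tzinfo=timezone.utc)
print(last_n_days(now))                               # default 10-day window
print(stations_in_bbox(stations, 90.0, -20.0, 120.0, 0.0))
```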

15.  Now, let's click the SELECT BY BBOX tool, and drag a rectangle around some stations in the eastern Indian Ocean.
16.  EDC automatically puts checks by the station items in the left-hand list, so you don't have to do it.
17.  We'll accept the time range of the last 10 days, and we'll request all the variables, so just click SELECT ALL there.
18.  When TIMES, STATIONS and VARIABLES have been selected, a GET OBSERVATIONS control appears.  Click it.
19.  This page appears.  There's really nothing to do here but click START.
20.  The LOG page opens again, and you can watch the progress of your request.  The most important parts are the final lines.  They show that EDC results are saved in a folder named EDC\OUTPUT.  This is followed by:
  • SOS Server Name - NDBC in this case
  • Date and Time - For example 2013-07-31_0915AM; the date part follows ISO format
  • Filename.CSV - Where filename is the station ID; CSV is an ASCII comma-separated values format, very easy to read in most programs

This very organized method will let you always get back to these data later on, without difficulty.

Loading DIF to ERSI CSV XSL Schema from /resources/schemas/ioos_gmlv061_to_arc.xsl
[long log text deleted for brevity]
Saved Files:
- C:\EDC\output\sdf_ndbc_noaa_gov\2013-07-31_0915AM\station_23227.csv
- C:\EDC\output\sdf_ndbc_noaa_gov\2013-07-31_0915AM\station_23228.csv
- C:\EDC\output\sdf_ndbc_noaa_gov\2013-07-31_0915AM\station_23401.csv
- C:\EDC\output\sdf_ndbc_noaa_gov\2013-07-31_0915AM\station_28906.csv
- C:\EDC\output\sdf_ndbc_noaa_gov\2013-07-31_0915AM\station_56001.csv
- C:\EDC\output\sdf_ndbc_noaa_gov\2013-07-31_0915AM\station_56003.csv
- C:\EDC\output\sdf_ndbc_noaa_gov\2013-07-31_0915AM\station_58951.csv
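
Because the saved paths follow the fixed pattern of server name, date and time, and station file, they can be taken apart programmatically. The Python sketch below does this for one of the paths listed above. It assumes the folder timestamp always has the form shown (date, underscore, 12-hour time with AM/PM) and that file names always start with "station_".

```python
from datetime import datetime
from pathlib import PureWindowsPath

def describe_saved_file(path_text):
    """Split an EDC output path into server name, request time, and station ID."""
    path = PureWindowsPath(path_text)          # parses Windows paths on any OS
    server, stamp = path.parts[-3], path.parts[-2]
    return {
        "server": server,
        # Timestamp layout assumed from the log, e.g. 2013-07-31_0915AM
        "requested": datetime.strptime(stamp, "%Y-%m-%d_%I%M%p"),
        "station": path.stem.replace("station_", "", 1),
    }

info = describe_saved_file(
    r"C:\EDC\output\sdf_ndbc_noaa_gov\2013-07-31_0915AM\station_23227.csv"
)
print(info["server"], info["requested"].isoformat(), info["station"])
```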

21.  You can check the results by examining the folders in Windows Explorer, as you see here.  Each CSV file also has an XML twin (it contains the same data, but in a rather awkward XML format).

Stations without any data can be recognized immediately by their file size of 1 KB.
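
The same size check can be automated when a request returns many files. The Python sketch below is one way to do it. The 1024-byte threshold is an assumption, based on Windows Explorer displaying any file of that size or smaller as 1 KB; the file sizes in the example are hypothetical.

```python
from pathlib import Path

def csv_sizes(folder):
    """Map each CSV file name in `folder` to its size in bytes."""
    return {p.name: p.stat().st_size for p in Path(folder).glob("*.csv")}

def likely_empty(sizes, threshold=1024):
    """Return the names of files at or below `threshold` bytes (assumed to hold no data)."""
    return sorted(name for name, size in sizes.items() if size <= threshold)

# Hypothetical sizes, standing in for csv_sizes(<one of the output folders>)
example_sizes = {"station_23227.csv": 18432, "station_23228.csv": 312}
print(likely_empty(example_sizes))
```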

22.  What you do with these data is up to you, but here's the first CSV file in the list, to show you what you have.  Notice that the only data item here is WATER LEVEL, which strangely does not appear in the list of VARIABLES in step 12.
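
A quick way to see which data items a station file actually contains is to read its header row. The Python sketch below does this with the standard csv module. The column names in the sample are hypothetical, since the exact columns EDC writes are not listed in this exercise.

```python
import csv
import io

def summarize_csv(text):
    """Return (column names, number of data rows) for CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    return list(reader.fieldnames or []), len(rows)

# Hypothetical content, for illustration only
sample = "station_id,date_time,water_level\n23227,2013-07-30T00:00:00Z,1.2\n23227,2013-07-30T00:15:00Z,1.3\n"
columns, row_count = summarize_csv(sample)
print(columns, row_count)

# For a real file:
#   with open(path_to_station_csv, newline="") as handle:
#       print(summarize_csv(handle.read()))
```
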
23.  That concludes this short exercise, provided in MDL just to show you a new technology that doesn't offer a lot of data yet.  The burden is on you to monitor the situation and see if any projects of interest to you might have SOS servers online.  Then of course the further burden on you is to find out what the data contain and how they can be used.

24. Selected SOS Servers - Web searches for SOS servers return dozens of site addresses, but most are not in operation.