Marine Data Literacy 2.0

Providing instruction for managing, converting, analyzing and displaying oceanographic station data, marine meteorological data, GIS-compatible marine and coastal data or model simulations, and mapped remote sensing imagery






3.9  Managing PANGAEA Research Data, with Visualization Methods for Ocean Data View

  • Exercise Title:  Managing PANGAEA Research Data, with Visualization Methods for Ocean Data View

  • Abstract:  In this exercise you'll find and explore the PANGAEA database (which should still be called the World Database of Environmental Research Data).  Thanks to a massive effort by the publishers, it contains thousands of research measurements, carefully digitized into standard formats, and fully indexed by robust metadata.  After finding and obtaining data through a very easy graphical interface, you can use a special application provided by PANGAEA to convert the data to the Ocean Data View (ODV) software format.  The conversion is so easy that, once it is done, the new file opens immediately as an ODV collection.  This allows users to pursue and gather needed parameters not typically found in existing world datasets.

  • Preliminary Reading (in OceanTeacher, unless otherwise indicated):

    • N/A

  • Required Software:

    • Microsoft Visual C++ 2010 Redistributable Packages - 64- or 32-bit auxiliary programs to support Pan2Applic

    • Pan2Applic - Special program to convert single or aggregated PANGAEA files to Ocean Data View format

  • Other Resources: 

    • PANGAEA - "An Open Access library aimed at archiving, publishing and distributing georeferenced data from earth system research." [Website].  In basic terms, the publishers have undertaken the monumental (and admirable) task of digging into thousands of scientific research publications to find the original data tables from in situ sampling and measurements, and have organized them into their own standard format.

    • PANGAEA Wiki

    • Digital Object Identifier (DOI) - Permanent character string (a "digital identifier") used to uniquely identify an object such as an electronic document (or more recently a datafile).

  • Author:  Murray Brown

  • Version:  4-1-2014

1.  Carefully install the Pan2Applic software on your computer.  The website pages about installation are hopelessly complex, so just follow the directions on the MDL software page (above link).
2.  Open the main PANGAEA webpage, and take some time to read about this amazing project.  In essence, they are working to retrieve and digitize huge amounts of environmental data that went into research but were never formally published for use by the world.

You can see from the top-level categories how big this job will be.  It is to the credit of the authors that they have done a magnificent job so far in bringing tons of data back from the grave.  Click in the ADVANCED SEARCH control to continue.


3.  In the SEARCH TERMS part of the search page, if you leave the ENVIRONMENT set to ALL you sometimes get an incomplete search listing, so make sure to select WATER.  For PARAMETER we've specified primary productivity, an extremely important variable that is not included in the World Ocean Database yet.

HELP - Click here any time to see search options.  There does not appear to be a data dictionary in play here, so be flexible and creative in your searches.

4.  You can explore the temporal search function later on your own.  For now, just leave the spaces empty.
5.  In the GEOGRAPHIC COVERAGE part of the search page, you have several options for selecting an area.
6.  We'll take the easiest route, and simply enter the limits of the Liberia area (carefully including the negative signs where needed).  Then click on the "compass" symbol in the midst of the 4 geographic limits.  This zooms you into the area, and draws a nice rectangle.
7.  Click on SEARCH (either top or bottom of the page), and after a short wait you'll get a report like this if there are any corresponding data.  We have access now to 24 datasets.

There is an interesting message that tells us how PANGAEA breaks all multi-word search terms into separate tokens, and uses them as if they were joined by "OR".

Take the time to scan through the 24 listings, just to get an idea of the types of original sources.  Many (most?) of them do not have "primary" or "productivity" in the titles, so considerable examination and careful indexing has been performed by the PANGAEA folks.
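To make the "OR" behavior concrete, here is a minimal Python sketch of how a multi-word search can be split into tokens that each match on their own.  It only illustrates the idea described in the message above; it is not PANGAEA's actual search code, and the function name and sample titles are hypothetical.

```python
def matches_any_token(query: str, text: str) -> bool:
    """Return True if ANY whitespace-separated token of the query occurs in the text."""
    tokens = query.lower().split()
    haystack = text.lower()
    return any(token in haystack for token in tokens)


# Hypothetical titles, invented for illustration only.
titles = [
    "Primary production in the eastern tropical Atlantic",
    "Zooplankton productivity off West Africa",
    "Sediment grain size at a coastal station",
]

# "primary productivity" behaves like "primary OR productivity".
hits = [title for title in titles if matches_any_token("primary productivity", title)]
for title in hits:
    print(title)
```

With an OR-style match, a listing needs only one of the two words to be returned, which is one reason to scan the results rather than trust the titles alone.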

8.  At the very top-right of the report, click on SHOW MAP.  This really well-done chart appears.  Take the time to read about the symbols used (lower-right corner) and click on the objects to see where they came from.

9.  For an example, click on one of the sites in the small row along the Liberia coast, to see the title of the data.  Morel, Behrenfeld and Falkowski are some of the greatest names in ocean productivity, so these are probably very fine data.  But these data points are for remote sensing data, which we don't need right now.
10.  Click on one of the aggregate symbols, and you'll see this report from in situ measurements.  Note that there is a DOI number for these data.
11.  Click on the DOI to bring up the full text of its metadata.  This may be the longest list of authors we've ever seen, but this is perfectly OK due to the collegial nature of complex ocean survey measurements.
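If you keep track of datasets by their DOI, a small helper can turn the DOI string into a web address.  This is a minimal Python sketch, assuming that PANGAEA DOIs have the form 10.1594/PANGAEA.<number> and that they resolve through the general doi.org resolver.  The function names are hypothetical, and the number used is one of the dataset numbers quoted later in this exercise.

```python
import re

# Assumed pattern for PANGAEA dataset DOIs: 10.1594/PANGAEA.<number>
PANGAEA_DOI_PATTERN = re.compile(r"10\.1594/PANGAEA\.(\d+)")


def doi_to_url(doi: str) -> str:
    """Build a resolver address for a DOI (assumes the doi.org resolver)."""
    return "https://doi.org/" + doi.strip()


def pangaea_dataset_number(doi: str) -> int:
    """Extract the numeric part of a PANGAEA-style DOI, or raise ValueError."""
    match = PANGAEA_DOI_PATTERN.fullmatch(doi.strip())
    if match is None:
        raise ValueError(f"Not a PANGAEA-style DOI: {doi!r}")
    return int(match.group(1))


example_doi = "10.1594/PANGAEA.765636"
print(doi_to_url(example_doi))
print(pangaea_dataset_number(example_doi))
```

The numeric part is handy for building filenames like the ones used in the steps that follow.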

12.  Look closely at the very bottom left corner, and you'll find a DOWNLOAD DATASET link.  Click this link to obtain these data as a tab-delimited, ASCII text file.
13.  Save the file in the folder DATA > OCEAN > PANGAEA > WORK with the filename  That's the original filename from PANGAEA plus a little more added by this author from the DOI number itself.

NOTE:  Everything in the WORK folder will be merged and analyzed below, so make sure you have removed any data files from previous PANGAEA work.
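If you want to peek inside a downloaded file before converting it, the tab-delimited text can be read with a few lines of Python.  This is only a sketch, assuming the file begins with a description block enclosed in /* ... */ followed by a tab-delimited table with one row of column labels.  The function name, column labels and values below are invented for illustration; check them against your own file.

```python
import csv


def split_pangaea_tab(text: str):
    """Split a tab-delimited text into (description_lines, data_rows).

    Assumes an optional leading description block that starts with '/*'
    and ends on a line finishing with '*/', followed by a tab-delimited
    table whose first row holds the column labels.
    """
    lines = text.splitlines()
    description, table = [], lines
    if lines and lines[0].startswith("/*"):
        for index, line in enumerate(lines):
            if line.strip().endswith("*/"):
                description = lines[: index + 1]
                table = lines[index + 1:]
                break
    rows = list(csv.DictReader(table, delimiter="\t"))
    return description, rows


# Invented sample content, for illustration only.
sample = (
    "/* DATA DESCRIPTION:\n"
    "Citation:\tHypothetical example (not a real dataset)\n"
    "*/\n"
    "Event\tDepth water [m]\tPrimary production\n"
    "ST-1\t0\t12.5\n"
    "ST-1\t10\t9.8\n"
)

description, rows = split_pangaea_tab(sample)
print(len(description), "description lines,", len(rows), "data rows")
print(rows[0]["Depth water [m]"])
```

To use it on a real download, read the file with open(path, encoding="utf-8").read() and pass the text to the function.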

14.  Go back to the map, and find another aggregate data symbol nearby, from the same research group, and save it in the same way.
15.  While you are saving it, make sure that the data report shows the variable you really want.  [You can ignore the others.]  This list does include primary productivity, so we should go ahead and get these data.
16.  Save these data in the folder DATA > OCEAN > PANGAEA > WORK with the filename  This is also the author's invention, based on the original PANGAEA filename and the DOI.
17.  Now we're ready to merge these data and convert them to Ocean Data View format.  If you read through the Pan2Applic instructions, you'll find a number of different routes from PANGAEA to ODV, different from the method shown below.  Please explore them all and select the method that works best for you.
18.  Run Pan2Applic.  Select FILE > SELECT FOLDER.
19.  Navigate to your WORK folder, and select CHOOSE to make sure this is where Pan2Applic will do its job.
20.  Make sure you can see your files there.
21.  Go back to the main menu of Pan2Applic and select CONVERT > OCEAN DATA VIEW.

NOTE:  You have other choices of interest, including shapefiles.  Remember that if you need to add station locations to your own project map.

22.  This rather frightening page opens.  By trial-and-error (my favorite method) this author has gotten excellent results with the settings you see here.  You are invited to contact me and set me straight if you see a better way.
  • Make sure that for OUTPUT FILE, you use the BROWSE control to set the location to the DATA > OCEAN > PANGAEA folder.
  • Make sure that for PROGRAM, you use the BROWSE control to locate the ODV4.EXE file in your most current version of Ocean Data View.
  • Make sure START OCEAN DATA VIEW is checked.

Click OK, and the conversion process will begin.  When prompted again, just click OK.

23.  Now you can see that Pan2Applic has gone through both datafiles, and combined their field structures into a single list.  The upper items (indicated by 1:) are from the metadata for the tables, holding information about the circumstances of the collection.  The lower items (indicated by 2:) are the actual data value labels.

24.  Use Ctrl-Click to select the data items shown here.  Then use the ==> control to move them to the target area on the right.  This obviously oversimplifies some matters that you should investigate on your own later.  For now, click on OK to create the combined new table file WORK.ODV.TXT.

NOTE:  Words to the wise.  We know from experience that Pan2Applic will aggregate all the station data into a single data file with the name WORK.ODV.TXT, and into an ODV collection with the name WORK.ODV, so you might want to add the AREA, CAMPAIGN or EVENT fields when you are working on your own data archeology.  That way the resulting collection can be examined more critically than is possible with a simple 1-file aggregation.  In fact, you'll probably need all the data eventually, but the author is just taking the shortest route here.
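The idea of combining the field structures of several files into one table can be sketched in a few lines of Python.  This is an illustration of the general technique only (a union of column labels, with blanks where a file lacks a column), not a description of how Pan2Applic works internally.  The function name, column labels and values are hypothetical.

```python
def merge_tables(tables):
    """Merge row lists with different columns into one table.

    Columns are collected in first-seen order; values missing from a
    row are filled with an empty string.
    """
    columns = []
    for rows in tables:
        for row in rows:
            for name in row:
                if name not in columns:
                    columns.append(name)
    merged = [
        {name: row.get(name, "") for name in columns}
        for rows in tables
        for row in rows
    ]
    return columns, merged


# Two invented tables with partly different columns.
file_a = [
    {"Event": "A-1", "Depth water [m]": "0", "Primary production": "12.5"},
    {"Event": "A-1", "Depth water [m]": "10", "Primary production": "9.8"},
]
file_b = [
    {"Event": "B-7", "Primary production": "20.1", "Chlorophyll a": "0.4"},
]

columns, merged = merge_tables([file_a, file_b])
print(columns)
for row in merged:
    print(row)
```

Keeping a field such as Event in the merged table is what lets you tell afterwards which original file each row came from.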
25.  Amazingly, because we checked START OCEAN DATA VIEW above, ODV immediately opens with the new data table already imported into a new collection, named as above.  This is entirely familiar territory to MDL students, so you can take it from here.
26.  Here's a map of the new collection's stations, from ODV's map mode.
27.  And here's a scatter plot of the primary productivity values.  The pattern is entirely normal looking and clean.
28.  To understand better exactly what happens during this process, here are separate collections created from DOI 765636 (above) and DOI 684809 (below).  They both contain surface values, but only the upper collection has values at depth.  The total number of samples is coincidentally 48 in both cases.

29.  Pan2Applic typically makes the new ODV collection in the folder just above the WORK folder, with a very generic name.  You might want to take a minute in ODV to use the command COLLECTION > RENAME to give it a more useful name, similar to the filenames used above.
30.  PANGAEA is an immensely valuable resource that will tax your ability to keep up with the new possibilities it offers.  You should examine its existing data very carefully before you spend time and money looking at things that may already be "out there."  We owe a huge debt of gratitude to the gnomes, elves and graduate students who have done such a magnificent job.