Marine Data Literacy 2.0

Providing instruction for managing, converting, analyzing and displaying oceanographic station data, marine meteorological data, GIS-compatible marine and coastal data or model simulations, and mapped remote sensing imagery

 

 

 

 

Home > Gridded Data > 5.17 IDV Histogram

5.17 Creating a Histogram in Excel for IDV Grids

  • Exercise Title:  Creating a Frequency Distribution Graph (Histogram) in Excel for Integrated Data Viewer (IDV) Grids

  • Abstract:  Until IDV gets its own histogram tool, this method works very quickly and easily, and can be run alongside IDV with no trouble.

  • Preliminary Reading (in OceanTeacher, unless otherwise indicated):

    • N/A

  • Required Software:

    • Excel - Must have the DATA ANALYSIS add-in installed.  Just run Excel and go to FILE > OPTIONS > ADD-INS.

  • Other Resources: 

  • Author:  Murray Brown

  • Version:  9-8-2014 (re-revised after much coffee)

1.  This is the Excel spreadsheet you can export from IDV for any raster product.  Note that it is essentially an extended XYZ table, with separate columns for each time step.

2.  We can't work with the #NUM! values, so use Excel's FIND AND REPLACE function to replace every occurrence with nothing, i.e. delete them all.
3.  This message indicates successful deletion.  The table is ready for use.
4.  This is the spreadsheet with the no-data values removed. We want to make a frequency distribution (i.e. a histogram) for any of the data columns.  We don't have time to study them to find out the value ranges, so we need a quick trick to see them automatically.
5.  Find the DATA ANALYSIS control.  If you can't find it, them you must install the ADD IN.
6.  Select the HISTOGRAM tool, and click OK.
7.  For this example, let's make a histogram for the E column of data values.  Select the INPUT RANGE control (the tiny spreadsheet icon to its right), and then select the E column.  Excel will automatically show $E:$E to indicate this column.

Then check these:

  • LABELS
  • CHART OUTPUT

Then don't do anything else, but click OK.

8.  Here you see a "quick and dirty" histogram for the whole dataset, made automatically by Excel.  You can easily see that the value range should be 33-36 (or similar).
9.  Now, in the real world, you'll need some value ranges, arranged into evenly spaced "bins" for your data.  Each type of data, e.g. salinity, temperature, will have at least one favorite value range, based on the area where you work, the seasons, etc.  We provide you a generic salinity value range set in he next panel, where you can copy and use the values you need to make a really good histogram in Excel.

The author has performed some experiments with Excel, and discovered how the bin values work with the HISTOGRAM data tool.  You'll need to do work on your own later to derive your own bin value scales.

10.  How distribution bin values (in bold) work with the data value ("x") distributions in Excel:
  • 0:  x = 0
  • 1:  0 < x <= 1, centered on 0.5
  • 2:  1 < x <= 2, centered on 1.5
  • 3:  2 < x <= 3, centered on 2.5
  • etc.

So this means that the bin value is the maximum value of the data in that bin.  The minimum value of data in that bin depends on the next lower bin value.

11.  There is a very easy way to create a good set of bin values in Excel, which we'll use below.  But you must take responsibility for the selection of the values and for defending your selections scientifically.
12.  Decide on the bin values you want to use for your final histogram, based on the "quick and dirty" histogram.  For example, in the above example, we might chose these values:
  • 32
  • 32.5
  • 33
  • 33.5
  • 34
  • 34.5
  • 35
  • 35.5
  • 36
  • 36.5
  • 37
13. Make a new column anywhere in your data spreadsheet and type the first few bin values into it (starting with the lowest).

 

14.  Now Excel can finish the list for you.  Grab the tiny handle at the bottom right corner of the selected area and slowly drag it down.  Check the values that Excel automatically adds, until you reach the limit you chose (in this case, 37).
15.  Now that you have a set of bin values, you can run the histogram tool again.  But this time you should specify the column for the BIN RANGE.  Excel "knows" exactly what you're doing, and is quite flexible in how you make this designation.

Click OK to make the analysis, and be prepared to wait a few minutes for all the calculations to finish.

 

16.  This results in a nice clean histogram with easily understood abscissa values.

NOTE:  The graphic is extremely easy to edit, as you see here.

17.  Look at the bottom of the Excel spreadsheet to see the tab structure.  SHEET0 is the original dataset, with the bin value column added.  The other SHEETS are different experiments you have performed, so you can go back and see everything again.

As usual, you can save the whole set of spreadsheets as a workbook in Excel.

18.  You can set up the workbook however you like for quickly adding data and making the histograms.  And you'll probably want to make a "canonical" set of bin values for use with commonly displayed parameters.