Data analysis and interpretation

Basic Statistics in Excel
Measures of Central Tendencies and Dispersion
We will be using the insert formula function (fx) to calculate statistical parameters. Click on any empty cell and then point your cursor to fx and then click. The window below will appear:


Select the function that you want to calculate from the insert menu. A brief description of the selected function appears at the bottom of the function window.
These are some of the basic statistical parameters used in data analysis. You can directly type the following command to calculate the parameters.

=SUM(cell:range), to obtain the sum or total
=MIN(cell:range), to obtain the minimum
=MAX(cell:range), to obtain the maximum
=AVERAGE(cell:range), to obtain average or mean
=MEDIAN(cell:range), to obtain the median
=MODE(cell:range), to obtain the most frequently occurring value
=STDEVA(cell:range), to obtain standard error
=CORREL(cell:range,cell:range), to obtain correlation
Data analysis for T-test
T tests can be used to compare two groups or treatments. Follow the instructions below:

  • Copy and paste the data in excel
  • Click on any empty cell.
  • Type = sign in the empty cell.
  • Type TTEST (=TTEST(array1,array2,tails,test)
  • Type ( , open parenthesis
  • If "TTEST" is not on the list, click "more functions", choose "statistical", then "TTEST".
  • Highlight the data for the first treatment or group as the "Array 1"
  • Type , (comma)
  • Highlight the data for the second treatment or group as the “Array 2”
  • Type , (comma)
  • Type 1 if group or treatment is predicted to be higher or lower than the other group or treatment OR
  • Type 2 if you did not predict one group to be higher or lower than the other group
  • Type , (comma)
  • Type 3 for unequal variances or 1 and 2 for paired and unpaired t test, respectively
  • Type ), close parenthesis
  • Hit Enter
  • The number indicated on the cell is the p-value

This p-value indicates whether the means are significantly different or not. A p-value       ≤  0.05 is considered statistically significant, otherwise if the p-value > 0.05 indicates no difference between the groups or treatments.

Graphing with Excel

We will be using sample data on dissolved oxygen and water temperature as water quality parameters. Dissolved oxygen in water and temperature of that water have inverse relationship. As water temperature increases, the concentration of dissolved oxygen decreases. Bubbles of oxygen can be seen streaming out of water when it is heated. The ability of water to uptake greater concentration of dissolved oxygen increases as water cools. Other factors such as agitation can also increase the concentration and speed of uptake of oxygen into water. These relationships are important to living things and the ecosystems in which they live.

We will be using different types of graphing techniques that will help characterize the data that we will collect in the field.

Entering and Formatting the Data in Excel

Open excel and begin by formatting the spreadsheet cells. We will format the cells that will contain the values for the date column.

  • Click and drag over the range of cell in the date column
  • Click format, cells
  • Click the text tab and then click OK
  • You are now ready to type your data


Creating the Line Plot

  • Highlight the data that you want to graph
  • Start the chart wizard (Choose the Chart wizard icon from the tool bar)

  • The first window of the wizard comes up
  • Select line plot, then click next

  • Select series, change the Name series1 to “Dissolved Oxygen”
  • Add another series by clicking the add button
  • Label the name as “Temperature”


  • Click on values and highlight the data on the dissolved oxygen column (do not highlight the column title)

 

  • After highlighting the data on the DO column, click the icon (with a red circle) to go back to the main window, click next to label the plot
  • Label the Chart title, category (X) axis and value (Y) axis. Do not forget to include the unit of measurements for both X and Y axes
  • Click next and then finish

 

 

Creating scatter plot and regression analysis

Regression analysis is used to determine the relationship between two variables. In our example, we will use temperature as a predictor (Y) of dissolved oxygen (X). If there is a strong relationship between the 2 variables, we will then be able to predict dissolve oxygen from water temperature measurements.

  • Highlight the data (both columns containing the data on Temperature and DO)
  • Choose Chart Wizard icon from the tool bar menu
  • Choose XY (Scatter) and the unconnected point icon for the chart subtype
  • Click next



  • Label the scatter plot diagram

  • Delete series legend since we do not need to show this in our final scatter plot
  • Right click on any of those data points on the scatter plot (all the points turn yellow which indicates that all them are highlighted)

  • Select add Trendline on the menu, then select options
  • Check Display equation on chart and Display r-squared value on chart



                                  

 


Having a very high r2 (0.9686) means that temperature and dissolved oxygen are highly related. Hence, we can use the regression equation:

Dissolve Oxygen = 14 + (0.3929 x Temperature)

If we have to measure only the temperature, we can use the above equation to predict the amount of dissolve oxygen of the water.

Creating the Bar Graph

Bar graphs are created using the same methods used in line and scatter plots.

  • Highlight the data that you want to graph
  • Click on the Chart Wizard icon, select Column graph and the first graph as subtype, click next


  • Label series1 as Temperature
  • Add series2 and label as Dissolved Oxygen; click values icon (indicated by red circle)


  • Highlight the dissolved oxygen column containing the data, click next and label the graph.



Background
Introduction
 
Modules
Module 1
Module 2
Module 3
* Table 2
Module 4
Module 5
 
Tutorials
AEJEE Tutorial
Data Analaysis and Interpretation
 
Data Worksheets
Tables and figures in Excel
 
Conclusion
Summary and Conclusion