The Biotechnology Curriculum
Collection of the California
Community Colleges

Graphing Scientific Data

Bio 211, Mr. Hoyt
Southwestern College
Chula Vista, CA, USA

Introduction

Why graph data? Scientists must be able to communicate data to other scientists. Often, the most concise way to do this is to present the data in the form of a graph. Therefore, when reading a scientific paper, usually the first thing a scientist looks at is the graph.

Additionally, graphing can allow transformation of data in ways that can demonstrate aspects of the data more clearly and meaningfully. You will do both of these in this exercise.


Learning Objectives

Upon completion of this lab you should be able to:
 

1. Define:

dependent variable
independent variable
scatter graph
best-fit line
line graph
extrapolation
slope
column graph
frequency distribution

2. Make a well-constructed and labeled scatter graph with a best-fit line, a line graph, and a column graph.

3. Use Cricketgraph™ or another graphics package to construct a graph.

4. Verbally describe a graph of simple data.

5. Calculate and interpret slope.


What Makes a Well Constructed Graph?
 

1. On hand-drawn graphs, always use graph paper for accuracy, or use graphing software such as Cricketgraph™.

2. A descriptive title is important. For example, suppose you want to graph shoot length vs. time in an experiment in which you manipulated the amount of light bean plants received each day. You construct a nice graph and then name it "Shoot Length". What is the problem with this name? It doesn’t describe the experiment. Someone looking at the graph will have to read the full experiment before understanding the graph, and that takes a lot of time. A much better title would have been "Shoot Length in Bean Plants with Various Day Lengths". Now there is no question as to what the graph represents.

3. Scale and label your axes accurately. Space the intervals evenly along the axes and use a maximum value on each axis that is only slightly greater than the greatest data value for that axis. This gives your graph the greatest sensitivity and doesn’t waste space. Label each axis with what it measures, including the units.

4. Put the independent variable (usually time) on the horizontal or "X" axis. The independent variable is usually a scale that measures the progress of the experiment. The dependent variable is placed on the vertical or "Y" axis. The dependent variable is usually what is being measured as the experiment proceeds.

5. Do not make up data. Plot data points only at coordinates where data was collected.

6. Extrapolate data only when called for. Extrapolation is an assumption made when a trend in an experiment is clear. For example, if a line on a graph extends upward at a 48-degree angle and does not deviate from that angle, you would probably be correct in assuming that the line will continue at 48 degrees. Say that you need information about the experiment at a coordinate that is beyond the last data point collected. You could extend the line on the graph at the same 48-degree angle to the coordinate that you need, assuming that this is what the experiment would have produced at that coordinate. Realize that this may not be a good assumption.
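If you would like to see the arithmetic behind extrapolation, here is a minimal sketch in Python. The handout itself works with graph paper or Cricketgraph™, so the language, the function name `extrapolate`, and the sample numbers are all assumptions made for illustration. The sketch extends a straight line through the last two data points out to a new coordinate, which is only reasonable when the trend really is steady.

```python
def extrapolate(x1, y1, x2, y2, x_new):
    """Extend the straight line through (x1, y1) and (x2, y2) to x_new.

    Hypothetical helper for illustration: it assumes the trend between
    the two points continues unchanged beyond the last data point.
    """
    if x2 == x1:
        raise ValueError("the two points must have different x values")
    slope = (y2 - y1) / (x2 - x1)       # rise over run
    return y2 + slope * (x_new - x2)    # continue at the same slope


# Made-up measurements: 5.0 cm on day 10 and 6.0 cm on day 12.
predicted = extrapolate(10, 5.0, 12, 6.0, 16)
print(f"Extrapolated value on day 16: {predicted} cm")
```

Remember that the printed value is an assumption about the experiment, not a measurement, so it should never be plotted as a data point.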

Types of Graphs
 
1. The scatter graph 

The scatter graph is a simple graph of data points between two axes. See Figure 1 for an example. Usually the independent variable is placed on the horizontal axis and the dependent variable is placed on the vertical axis. 

Figure 1: scatter graph

As you can see, the data in Figure 1 is hard to interpret because the data points are scattered. To help interpret the data in a scatter graph, a line is often used to show trends in the data. Often the most useful line for this purpose is a best-fit line (also known as a best-fit curve). A best-fit line is a line on a graph that best represents the trend in a set of data points when the data points themselves have considerable variability in value. This variability may be due to minor inconsistencies in measurement or to factors inherent in the experiment itself. See Figure 2 for a graph of the same data as in Figure 1 but with a best-fit line.

The calculation of a best-fit line is complex, but put simply, it is a method of averaging the data points to give a line that best depicts the data. Cricketgraph™ can easily do it for you.

Figure 2: scatter graph with a best-fit line
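The handout does not say which method Cricketgraph™ uses to draw a best-fit line. One common choice is ordinary least squares, and the sketch below assumes that method. The function name `best_fit_line` and the sample ages and scores are hypothetical; they are not the data behind Figure 1.

```python
def best_fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points.

    Assumes xs and ys have the same length, at least two points,
    and that the x values are not all identical.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How far each point sits from the averages, combined over all points.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x   # the line passes through the averages
    return slope, intercept


# Made-up data: age in years and a test score in points.
ages = [4, 6, 8, 10, 12, 14]
scores = [0, 20, 30, 55, 60, 95]
slope, intercept = best_fit_line(ages, scores)
print(f"score = {slope:.2f} * age + {intercept:.2f}")
```

The line always passes through the point (average x, average y), which is one way to see why the handout describes it as a kind of averaging.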

2. The line graph

There are times when using a best-fit line on a scatter graph is not very useful. Because a best-fit line is an average, it sometimes does not accurately represent what is really happening. For example, if you graph the number of sunspots (dark, cooler regions on the surface of the sun) that appear over a period of time, the data will not be well represented by a scatter graph with a best-fit line. The reason is that the number of sunspots is known to oscillate in roughly 11-year cycles. See Figure 3. Does this graph represent the data well?

What the best-fit line in Figure 3 really shows is an average of the number of sunspots over roughly a 120-year period. This doesn’t tell you much about the actual number of sunspots in any one year. Now look at Figure 4. This is a graph of the same data treated as a line graph (i.e., connect the dots). I think you will agree that the line graph in Figure 4 is more representative of the data because it makes it easy to see what happened from year to year.

 
Figure 3: scatter graph with a best-fit line

Figure 4: line graph
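To see numerically why a best-fit line hides an oscillation, here is a small sketch using synthetic data. The counts below are generated from a cosine wave with an 11-year period; they are an assumption for illustration and are not real sunspot counts. The least-squares fit and the name `fit_line` are likewise assumptions.

```python
import math


def fit_line(xs, ys):
    """Least-squares (slope, intercept); a hypothetical helper for this sketch."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic counts that rise and fall on an 11-year cycle (not real data).
years = list(range(0, 23))
counts = [50 + 40 * math.cos(2 * math.pi * y / 11) for y in years]

slope, intercept = fit_line(years, counts)

# How far the line is from the data in its worst year.
largest_miss = max(abs(c - (slope * y + intercept)) for y, c in zip(years, counts))

print(f"best-fit slope: {slope:.4f} counts per year")
print(f"largest gap between the line and a data point: {largest_miss:.1f} counts")
```

The fitted line is nearly flat, yet individual years sit far from it, which is the situation described for Figure 3. Connecting the points in order, as in a line graph, keeps the year-to-year pattern visible.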


3. Calculation of Slope
Graphical data can be manipulated in many different ways that can yield useful information. One way that a scatter graph with best-fit line or a line graph can be useful is in the determination of slope. Slope is a way of expressing the rate at which the experimental data is changing.

Slope is defined as the "rise over the run" or, numerically,

m = (y2 - y1) / (t2 - t1)

where m = slope,
y1 = the data value at time 1,
y2 = the data value at time 2,
t1 = time 1, and t2 = time 2.
This gives you the rate at which your data is changing (an example could be how fast an enzyme is making a product). The slope can be calculated for the whole experiment or for a particular time frame in the experiment.
For example, return to Figure 1. How do you calculate the slope between age 4 and age 14? Here t1 = 4 years and t2 = 14 years. y1 = the data point at time 1, which is 0 points, and y2 = the data point at time 2, which is about 95 points. Plug the numbers into the slope equation and you should get

m = (95 points - 0 points) / (14 years - 4 years)
m = 9.5 points per year

This is the average rate at which the math score increases per year over that interval. However, you can see that the increase in score is not the same for each year. What if you want to determine the rate of increase of the score between 12 and 14 years old? Slope can do that too, because slope can be applied to a small portion of the graph. Just make t2 = 14 years and t1 = 12 years. Make y2 = 95 points and y1 = about 60 points. If you got about 17.5 points per year, go on to the column graph!

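The two slope calculations from the Figure 1 example can be checked with a few lines of Python. This is only a sketch: the function name `slope` is a hypothetical choice, and the values are the approximate readings quoted in the example.

```python
def slope(y1, y2, t1, t2):
    """Rise over run: the change in the measurement divided by the change in time."""
    if t2 == t1:
        raise ValueError("t1 and t2 must be different times")
    return (y2 - y1) / (t2 - t1)


# Whole interval: age 4 (0 points) to age 14 (about 95 points).
whole = slope(y1=0, y2=95, t1=4, t2=14)

# Short interval: age 12 (about 60 points) to age 14 (about 95 points).
late = slope(y1=60, y2=95, t1=12, t2=14)

print(f"ages 4 to 14:  {whole} points per year")
print(f"ages 12 to 14: {late} points per year")
```

Note that the rate over the short interval is larger than the average over the whole interval, which is why it can be worth calculating slope for a particular time frame.
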
4. The Column Graph
 
This is a type of graph that shows frequency distributions well. A frequency distribution measures how many of a certain thing fall into each specific category. For example, suppose you are interested in the age of the Pacific yew trees in a forest. This tree is important as a source of anticancer drugs. You go out into the forest and take core samples of many yew trees. Then you count the rings in the core samples to give you the age of each tree. Now you are ready to construct a column graph.

Construction of a Column Graph

1) Divide your data into convenient categories, e.g. first age category: 0 to 50 yrs, second age category: 51 to 100 yrs, third: 101 to 150 yrs, etc.

2) Draw the axes of a graph with the age categories on the "X" axis. In this case, age is the independent variable.

3) Label the "Y" axis as "# of individuals" and scale the axis appropriately. Now count the number of individuals in each age category and draw a bar to that height on the "Y" axis. Because the number of individuals is what is being measured, it is the dependent variable.

4) Don’t forget a descriptive title. For an example of a column graph, see Figure 5.

Figure 5: column graph

What you have produced is a graph with age categories, showing for each category the number of yew trees of that age. You have created a graph of the frequency distribution of the ages of the yew trees in that forest. It is now very easy to see which age category is the most common among the yew trees in the forest you surveyed.
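The counting step in the construction above can be sketched in Python as follows. The tree ages are made up for illustration, the function names are hypothetical, and the 50-year category width simply follows the example categories (0 to 50 yrs, 51 to 100 yrs, and so on). Ages are assumed to be whole numbers of rings.

```python
from collections import Counter


def category_index(age, width=50):
    """Return 0 for ages 0..width, 1 for width+1..2*width, and so on."""
    return max(age - 1, 0) // width


def category_label(index, width=50):
    """Build a label such as '51 to 100 yrs' for a category index."""
    low = 0 if index == 0 else index * width + 1
    high = (index + 1) * width
    return f"{low} to {high} yrs"


def frequency_distribution(ages, width=50):
    """Count how many ages fall in each category, keeping empty categories."""
    if not ages:
        return []
    counts = Counter(category_index(age, width) for age in ages)
    return [(category_label(i, width), counts.get(i, 0))
            for i in range(max(counts) + 1)]


# Hypothetical ring counts from ten core samples.
tree_ages = [12, 48, 51, 75, 99, 101, 130, 149, 150, 230]

# Print a rough text version of the column graph, one row per category.
for label, count in frequency_distribution(tree_ages):
    print(f"{label:>15} | {'#' * count} ({count})")
```

Empty categories are kept in the result so that gaps in the age distribution still show up as zero-height columns.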

Now you are ready to tackle some graphs on your own!


Graphing Assignments

Work individually, but you may discuss methods with your fellow students. Construct good and complete graphs with the data provided. Use descriptive labels and titles! You may be called on to place your graphs on the board for discussion.

Assignment 1
 

a. Plot the first plant’s root growth as a line graph using the data provided. Make the graph by hand in the manner described earlier in this exercise.

b. For the first plant, calculate slope to determine the growth rate over the whole experiment and for the time interval between day 8 and day 12.

c. Plot the second plant’s root growth on the same graph. This makes it easy to compare data from plant 1 to plant 2. For the second plant, also include slope calculations to determine the growth rate over the whole experiment and for the time interval between day 8 and day 12.

d. Using the same data, use Cricketgraph™ to make a single scatter graph with a best-fit line for both plants. Label and title.

Data for assignment 1

Plant 1                            Plant 2
Root length (cm)   Age (days)      Root length (cm)   Age (days)
1                  0               0.5                0
3                  3               1                  2
5                  6               2                  5
6.5                8               3                  8
7                  10              5                  10
8.5                13              8                  12
9                  14              9.5                14

 
Questions for Assignment 1
a. From the hand-drawn graph, what is the slope (the root growth rate) for plant 1 over the 14-day period and from day 8 to day 12? Remember to state the slopes in the form of a rate.

b. From the hand-drawn graph, what is the slope (the root growth rate) for plant 2 over the 14-day period and from day 8 to day 12? Remember to state the slopes in the form of a rate.

c. From your line graphs, describe the difference in growth rate of the roots of plant 1 and plant 2. 

d. Compare the line graphs with the scatter graphs. Which graph type best represents this data and why?

Assignment 2
a. This time, construct a column graph using Cricketgraph™ for the California Live Oak, using the data provided for assignment 2.

b. Construct a column graph using Cricketgraph™ for the Coulter Pine, using the data provided for assignment 2.

Data for assignment 2 (a count of the frequency of ages of 2 species of trees in 10 square km of forest near the summit of Mount Palomar in San Diego County, CA)

Age (yrs)      California Live Oak (# of trees)   Coulter Pine (# of trees)
0 to 25        27                                 1
26 to 50       12                                 2
51 to 75       8                                  8
76 to 100      45                                 33
101 to 125     56                                 43
126 to 150     121                                29
151 to 175     237                                40
176 to 200     261                                36
201 to 225     234                                38
226 to 250     278                                23
251 to 275     302                                18
276 to 300     345                                2
301 to 325     326                                0
326 to 350     288                                1
351 to 375     218                                0
376 to 400     223                                0
401 to 425     103                                0
426 to 450     2                                  0
451 to 475     0                                  0
476 to 500     0                                  0

Questions for Assignment 2
a. Are these column graphs good for determining the age interval that contains the greatest and least number of California Live Oaks and Coulter Pines in this forest? Why or why not?

b. Are these column graphs good for determining the average age of California Live Oaks in this forest? Why or why not?

c. Describe the difference in the frequency distribution of ages of the California Live Oaks and the Coulter Pines in this forest.

Assignment 3

a. Using Cricketgraph™, make a line graph for the data provided. 

Data for assignment 3 (length of small intestine in mouse embryos during development; each data point is an average of 10 mouse embryos)

Length of small intestine (cm)   Days post conception
0                                1
0                                2
0                                4
0.05                             5
0.2                              6
1.2                              9
2.3                              11
3.8                              14
4.1                              16
4.3                              18
4.4                              19

Questions for Assignment 3
a. Calculate the slope of the line for the whole experiment, between day 5 and day 9, and between day 9 and day 14. State the three rates of development.

b. Does the overall rate of development mean very much, or is there a way of stating the rate of development for this experiment that might be more meaningful?

c. Do you think that the data in this graph could be easily extrapolated to day 23? Why or why not?


This lab exercise was developed in part with the support of the National Science Foundation (Division of Undergraduate Education) grant # DUE 9552290 and the California Community College Chancellor’s Office (Curriculum and Instructional Resources Division, Special Projects) grant # FII 95-621-001.



CCBC is operated by Ventura College.

For more information, please contact: jharber@vcccd.net
Tel: (805) 648-8901   Fax: (805) 648-8988