In this lab you will become familiar with the joys of the WSFR network (if you aren't already). Mainly, you need 1) have a userid and password, to allow access to certain network programs; 2) know how to access and copy files from central network locations (a class directory) to your local directory (directory= 'folder' in windows jargon), 3) a few more odds and ends that will become evident as we proceed.
In most cases, we will start labs by doing what we're doing now- accessing a web page. In many cases the programs that we will run will be imbedded as files accessed from the web page, simply by clicking on a hilited link. E.g., clicking on runme.xls should open Microsoft Excel and run the previously-named file. In other cases we will want to save files (often these will be in self-extracting archives) and then open these using programs such as R, MARK, or other programs. In general, it is in fact usually better to save even Excel files and open these directly in Excel, rather than using the web browser to open the program. The general procedure is 1) right click on the link, 2) save the indicated file to a network or local drive, 3) run the necessary program from the program task bar in Windows.
For this lab we will use two commonly-available programs, Microsoft Excel and R. There are a number of reasons why we will use Excel to do many of the tasks in lab: 1) it's available, 2) I know it, and you may already know it, 3) it's easy and transparent, 4) has many built in functions and graphics features, and 5) it does a whole lot more than I ever thought it would. This is not to say that Excel is perfect or ideal-- it has many limitations, and is extremely inefficient for very large problems. It is no substitute for real computer languages or statistical software in a research environment. For example, I would not recommend using the random number generator for a serious research application (it is fine for teaching purposes). However, it gives me a pretty convenient 'electronic blackboard' for getting across many of the principles in class, and it encourages (requires?) you to be more engaged in thinking about what you are doing, as opposed to cookbook, plug-and-chug modeling and statistical software.
On the other hand, Excel, being a spreadsheet program, forces one into a 2-dimensional view of modeling that is really very limiting. In addition, most models that we will use are iterative in nature: that is the processes involved, such as birth and death, occur over and over again, although the values may change. This makes the computational device known as a "do loop" a natural for population modeling. Some of you already know much more about this than I do. For those of you who don't, we're going to use R, which is an easy-to-use (and free, open-source) programming language, to do the lifting. However, if you understand the basic approaches that we use in R you will be in a very good position to apply that knowledge in more advanced languages such as BASIC or C, or for that matter Python or SAS (which I formerly used for these labs).
We're going to use both approaches to solve 2 problems that we will commonly encounter in this course: 1) dynamic population growth, and 2) finding the maximum (or minimum) of a function.
I. SIMPLE POPULATION GROWTH
First read the brief discussion of Population growth models; then return here for the Excel and Python exercises.
Coding in Excel
First a few Excel 'rules'.
'A1', 'B2', etc. refer to column/row labels, and will be used to keep track of some of our numbers; for instance these can be used as variable names in an equation. These are known as 'relative cell references', and will change if you copy or move cells from one location to another. So for example if you refer to 'A4' in an equation, and then copy the contents of that cell to another location, (say 'P9') the equation will now refer to the new location. Another type of reference is an 'absolute' reference, denoted by '$A$1', '$B$2' etc. Here the reference stays fixed to the original cell location regardless of copying, etc. I find relative cell references handy when I am trying to repeat a calculation in different rows or columns, as is common in iterative procedures like our modeling. On the other hand if you always want the reference to stay put to the same cell, use the absolute cell reference. This might include cases where the cell takes on a fixed value such as parameter in a model; however, in these cases I prefer named variables. I'll explain how these work in the spreadsheet example.
Ok, go to the spreadsheet models.xls, and look in column 'A'. Keep your eye on the equation box (after the '=' sign). In the top row, nothing is happening: just labels for time and abundance. Go down to A2. This just contains the first time period, which is zero. In A3 things are more interesting: '=A2+1'. This expression tells the program (yes, it's a program) to 'take the contents of cell A2 and add 1'. This repeats all the way down the column to time 21 (we could have kept going but my mouse finger ran out). Now, this is sort of a model, but a trivial one: we are just counting by ones. I could have typed all these numbers in by hand, but by copying cell A3 to the other cells, I saved myself the work. I used some of the time saved to continue to work on the novel I am writing. Now go to column B. Notice in cell B2 that '=n_0' appears in the equation box, but '100' appears in the cell. How did this happen? Check out columns G and H. Here I used named variables. What I did was go to H2 and H3, go to the menu toolbar and hit 'insert' , then 'name' , and then the names that appear in column G (these are both labeled as 'parameters' even though n_0 is really an initial condition, not a parameter). Now whenever I used 'r_max' or 'n_0' in an equation, the equation will pick up the values in H2 and H3, which are 0.1 and 100 now, but could be anything I change them to. Pretty slick, huh? Now lets see how we used this. Go to B3. The equation box says 'B2*(1+r_max)' This will take my previous value of B2 (100) and multiply it by (1+r_max)=1.10. The result is 110, printed in B3. In B4 we have 'B3*(1+r_max)', etc. Note that this is exactly the same process as expressed in
equation (1), that is
N(t+1) = (1 + r ) N(t)
is just coded as
Bn+1=Bn*(1+r_max)
where n is the row number (corresponding to time).
Coding in R
First, you will need to save and extract the archive containing the R code needed to run our example models. Save these files to a convenient place, like the C:\temp directory.
Programming this model in R is very simple, if you keep a few basic concepts in mind. These are illustrated in program file popmod.R. First open R, and then import the file popmod.R First, the program initialized population size (N) at 100 ( N<-100); note that the left arrow is used to assign values to a variable. We specify years <-20 for 20 years of growth, and a constant growth rate of r<-0.10. The next statement is a “do” or “for” loop that updates population size in year i (for 1 to 20 years) by multiplying by the constant (r+1); note that the expression N[i+1]<-N[i]*(r+1) is in curly brackets, indicating at it repeats as long as i <20. This is equivalent to the algebraic model
Nt+1 = (1+rmax)Nt
We can specify any number of years to run the model, but let’s be reasonable and run it for 20 years. The basic model
is done by appending the value of Nt+1 each year. Once the loop is completed, the N data are put into a matrix and plotted over time, creating a nice, smooth plot of abundance over time. The program also prints values of abundance at each year but this can be suppressed (simply by deleting or commenting out the print line).
The very same model has also been programmed in Python. Either program can be summarized by the
following flowchart: