Class 1: Jan 24


1. Installation


      - Tcl is a scripting language; Tk is its standard GUI toolkit; you probably want it

      - customize: MDI vs SDI (I prefer SDI); others up to you

      - can have multiple R installations (different versions); can have same version running in multiple workspaces

      - take a look through initial GUI menus: scripts, workspaces, history


2. Command Line

      - the prompt; calculations, related math functions, e notation for powers of 10 (eg 1e3 is 1000)

            log, exp, log10, pi, cos, sin, round, abs, factorial
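
The calculations and functions listed above can be tried directly at the prompt. A minimal sketch; the numbers are arbitrary illustrations, not values from class:

```r
# Arithmetic typed at the prompt prints its result immediately
2 + 3 * 4          # usual operator precedence: 14
1e3                # e notation: 1 times 10^3, i.e. 1000
2.5e-2             # 2.5 times 10^-2, i.e. 0.025

# A few of the math functions listed above
log(exp(1))        # natural log undoes exp: 1
log10(1000)        # 3
round(pi, 2)       # 3.14
abs(-7)            # 7
factorial(5)       # 120
cos(0)             # 1
sin(0)             # 0
```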

      - commands are recorded in the history and objects are kept in the workspace, so both can be saved.  The history is a text file of commands, while a saved workspace holds all objects (much larger)

      - assignment: (x=9 or x<-9)

            - note variable names cannot start with a number or contain spaces, and are case-sensitive

            - assignments will print if the line is enclosed by parentheses

            - anything entered that is not a number or in quotes is taken to be the name of a function or assigned variable
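
A short sketch of the assignment rules above; the object names (x, y, z, total, Total) are made up for illustration:

```r
x = 9            # assignment with =
y <- 9           # assignment with <- (equivalent here)
(z <- x + y)     # wrapping the line in parentheses assigns and prints: 18

# Names are case-sensitive: total and Total are different objects
total <- 1
Total <- 2
total == Total   # FALSE
```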

      - creating vectors: c( ), rep, seq

      - logical assignments

            - logical operators: <, >, <=, >=, ==, !, &, | (less than, greater than, less than or equal to, greater than or equal to, equal to, not, and, or)

            - logical operators return logical vectors (eg x<10 gives TRUE or FALSE for every element of x)

            - huge advantage of this is that indexing with a logical vector returns only the elements where it is TRUE; this is the essence of subsetting

            - as.numeric turns TRUE and FALSE into 1s and 0s

            - is.element for comparing vectors of different lengths

            - union (puts values together), intersect (in common), setdiff (in the first vector but not the second)
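
The vector and logical items above fit together as in this sketch; the values in x and y are arbitrary:

```r
x <- c(3, 12, 7, 15, 1)      # c() combines values into a vector
zeros <- rep(0, 3)           # 0 0 0
evens <- seq(2, 10, by = 2)  # 2 4 6 8 10

small <- x < 10              # logical vector: TRUE FALSE TRUE FALSE TRUE
x[small]                     # keeps the TRUE positions: 3 7 1
x[x > 2 & x < 10]            # both conditions must hold: 3 7
x[x < 2 | x > 12]            # either condition: 15 1
as.numeric(small)            # 1 0 1 0 1
sum(small)                   # counts the TRUEs: 3

y <- c(7, 1, 99)             # a vector of a different length
is.element(y, x)             # TRUE TRUE FALSE
union(x, y)                  # 3 12 7 15 1 99
intersect(x, y)              # 7 1
setdiff(x, y)                # in x but not in y: 3 12 15
```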


      - character vectors

            - use of quotes; note the built-in vectors letters and LETTERS (not functions, so no parentheses), which give the alphabet in lower and upper case.

            - c() vs paste().  c() is concatenate, combining all into a vector;  paste() joins character strings together into single elements
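
A sketch of the difference between c() and paste(); the strings are invented for illustration:

```r
letters[1:3]                       # "a" "b" "c"
LETTERS[1:3]                       # "A" "B" "C"

v <- c("plot", "A", "1")           # c() keeps three separate elements
length(v)                          # 3

p <- paste("plot", "A", "1")       # paste() joins them into one element
p                                  # "plot A 1"
ids <- paste("plot", 1:3, sep = "_")   # works element by element: "plot_1" "plot_2" "plot_3"
one <- paste(v, collapse = "-")        # collapses a vector into one string: "plot-A-1"
```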


3. Scripts

      - editor in R (open new script)

      - other editors (Textpad); syntax highlighting

      - editors with execution (Tinn-R)

      - note on the workspace: save it if you want to keep all defined objects when you next open R.  A saved workspace stores every object, so the files are much larger.

            - recommendation: instead create text file (script) that records all of your work, including data input and the creation of all objects; this can be re-executed each time


4. Help

      - ? is basic for functions

      - R manuals on CRAN; R search websites (eg on Fridley web page)

      -" ") for finding text strings, all loaded libraries

      - apropos(" ") gives all functions with that text string in them

      - find(" "): tells you where (what package) to find a particular function (loaded libraries only)

      - R function reference cards (handout, several available online)
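
A small sketch of the search helpers above, using "mean" as an arbitrary search string:

```r
# ?mean                   # opens the help page for mean (interactive use, not run here)
hits <- apropos("mean")   # names of visible objects containing the text "mean"
head(hits)
where <- find("mean")     # which attached package holds an object called mean
where                     # includes "package:base"
```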


5. Data input

      - read.table and read.csv; recommendation is read.csv (easiest to use csv files, and save excel sheets in that format)

      - note path syntax (\\ or /)

      - also handy is the file.choose function: read.csv(file.choose()) lets you browse to where the file was saved on the computer

            - can change working directory so don't have to specify paths each time: setwd()

      - useful: scan("clipboard"); also scan() for manual input

            - can also write to clipboard with writeClipboard(as.character( )).  This is useful for smaller data sets.

      - to name the dataset when loading it, assign the result: name = read.csv()

      - look at arguments to read.table: header, skip (sets the start row), colClasses

      - import our dataset: save to machine, then use read.csv

      - having a look around: "str" function is REALLY handy.  Gives each column class.

      - note 'edit' and 'fix' functions; but use Excel

      - 'objects()' lists available objects you created

      - data output: write and write.csv
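
A round trip through write.csv and read.csv might look like the sketch below. The data frame, its column names, and the temporary file are all hypothetical stand-ins for the class dataset, chosen so the example runs on any machine:

```r
# Hypothetical example data, written to a temporary file
csv_path <- file.path(tempdir(), "example.csv")
out <- data.frame(site = c("a", "b", "c"), cover = c(10.5, 3.2, 8.0))
write.csv(out, csv_path, row.names = FALSE)

# Naming the dataset is just assigning the result of read.csv
dat <- read.csv(csv_path, header = TRUE)
str(dat)          # shows the class of each column
objects()         # lists the objects created so far

# Interactive alternatives (not run here):
# dat <- read.csv(file.choose())   # browse to the file
# setwd("C:/mydata")               # hypothetical folder; use / or \\ in paths
```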


6. Data frames and variables

      - variable classes: numeric, character, factor, Date, logical

      - understand the basic format: response variables and explanatory variables get their own columns

      - $ for columns of data frames and nested objects within other objects

      - names(); rownames; colnames

      - indexing

            - lists ([[ ]]) vs others ([ , ]); note blank subscript means 'everything', ie x[,1] means all rows, first column

            - negative subscripts mean 'do not include'

            - making a matrix, changing to a matrix; rownames, colnames; dim, diag; apply for matrices; row, col functions; rbind

            - length and dim
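
The indexing rules above, sketched on small made-up objects (df, lst, and m are illustrative names):

```r
df <- data.frame(x = c(5, 2, 9), y = c("a", "b", "c"))
df$x              # column by name: 5 2 9
df[, 1]           # all rows, first column: 5 2 9
df[2, ]           # second row, all columns
df[-1, ]          # negative subscript: everything except the first row
names(df)         # "x" "y"

lst <- list(nums = 1:3, label = "plot")
lst[[1]]          # double brackets pull out the element itself: 1 2 3
lst$label         # "plot"

m <- matrix(1:6, nrow = 2)      # filled column by column
dim(m)                          # 2 3
length(m)                       # 6
rownames(m) <- c("r1", "r2")
row_sums <- apply(m, 1, sum)    # apply sum over rows: r1 = 9, r2 = 12
m2 <- rbind(m, r3 = 7:9)        # add a third row
diag(2)                         # 2 x 2 identity matrix
```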

      - making subsets: logical or 'subset'

      - Example we used:

            settled = subset(dat, dat$disturb == "settle")

      - sort dataframes with the order function (note rev function); also multiple columns in order listed in order(); eg df[order(x,y),]

      - na.omit and na.exclude get rid of rows with any NAs (basically the same)

      - you can always add columns to a df by just defining a new column with df$new = ; this also applies to calculated columns (eg, log(df$x) )

      - create dataframes with the data.frame function
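
Subsetting, sorting, NA removal, and adding a column can be sketched together. The data here are invented; only the column name disturb and the value "settle" follow the class example:

```r
# Hypothetical data frame standing in for the class dataset
dat <- data.frame(disturb = c("settle", "burn", "settle", "burn"),
                  x = c(4, NA, 1, 9))

settled <- subset(dat, disturb == "settle")   # rows where the condition is TRUE
same    <- dat[dat$disturb == "settle", ]     # logical-index equivalent

sorted <- dat[order(dat$disturb, dat$x), ]    # sort by disturb, then by x
rev(sort(dat$x))                              # descending: 9 4 1 (sort drops the NA)

complete <- na.omit(dat)                      # drops the row containing NA
dat$logx <- log(dat$x)                        # new calculated column
```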

      - Date objects

            - Sys.time(), date()

            - you can compare dates if they are Date objects; also difftime

            - note Date format; eg, Excel usually has "month/day/year", which is "%m/%d/%Y"

            - best part about date objects is graphing with them: pretty labels
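
A sketch of reading month/day/year text into Date objects and comparing them; the two dates are made up for illustration:

```r
# Hypothetical dates in the month/day/year layout
d <- as.Date(c("01/24/2011", "03/15/2011"), format = "%m/%d/%Y")
class(d)                                # "Date"
d[1] < d[2]                             # TRUE: Date objects can be compared
gap <- d[2] - d[1]                      # Time difference of 50 days
difftime(d[2], d[1], units = "weeks")   # the same gap expressed in weeks
Sys.Date()                              # today's date as a Date object
```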


7. Simple graphing

      - plot(x,y) is the basic function for x vs. y

            - categorical x variables: box and whisker plots; note that weird-looking notches indicate an unreliable test

      - points, lines, text add to an existing plot (eg symbols or labels instead of the basic dot)

      - plot arguments: cex, cex.axis, cex.lab, pch, col, ylab, xlab, xlim, ylim, main, sub, lty, lwd

      - colors() lists names of all available colors
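
The plotting calls above can be combined as in this sketch. The data are invented, and drawing to a temporary PDF is an assumption made so the example runs without a display; at the console you would normally skip the pdf() and dev.off() lines:

```r
plot_file <- file.path(tempdir(), "sketch.pdf")
pdf(plot_file)                                 # assumption: draw to a file instead of the screen

x <- 1:10
y <- x^2
plot(x, y, pch = 16, col = "blue", cex = 1.2,
     xlab = "x", ylab = "x squared", main = "Basic scatterplot",
     xlim = c(0, 11), ylim = c(0, 110))
lines(x, y, lty = 2, lwd = 2)                  # add a dashed line
points(5, 25, pch = 17, col = "red", cex = 2)  # add one highlighted symbol
text(5, 40, "x = 5")                           # add a label

g <- factor(rep(c("a", "b"), each = 5))
plot(g, y)                                     # categorical x gives box-and-whisker plots

n_colors <- length(colors())                   # how many named colors are available
dev.off()                                      # close the PDF device
```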


8. Installing packages (used to be 'libraries' in S)

      - easiest way is via GUI; select a CRAN mirror and package(s)

      - also install.packages(" ")

      - search() tells which packages are attached

      - more advanced ways of loading other functions not in established libraries; eg "source" to load code

      - to load a package from the command line, give its name: library(packagename)
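
A sketch of the package commands above. The package named in the install.packages comment and the script file are hypothetical; stats is used for library() only because it ships with every R installation:

```r
# install.packages("vegan")   # hypothetical package choice; needs an internet connection
library(stats)                # attach an installed package by name
attached <- search()          # everything currently attached
"package:stats" %in% attached # TRUE

# source() runs the code in a script file, eg one holding your own functions
script <- file.path(tempdir(), "myfuns.R")            # hypothetical script file
writeLines("double_it <- function(v) 2 * v", script)  # write a one-line function to it
source(script)
double_it(4)                  # 8
```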