Introduction to R

Download Files

Download Files for today


To save either file, right-click the link, choose ‘Save link as…’, and pick a location to save the file.

Introductions

Why learn to code?

  • efficiency
  • transparency
  • flexibility in application
  • shareable
  • automated processes/report writing
  • marketable skill
  • needed for publications

Software

What is R?

R is a “suite of software facilities for data manipulation, calculation and graphical display.”


R uses packages that are collections of functions, data, and compiled code in a “well-defined format”.


Packages are downloaded from the Comprehensive R Archive Network (CRAN), R’s central software repository. They are also available on GitHub, GitLab, Bitbucket, or other code-sharing platforms.

Why use R?

  • open-source and free
  • small total user base / large in ecology and statistics
  • find help online, e.g., Stack Overflow
  • statistics
  • plotting / graphics
  • data management

What is RStudio?

RStudio is an “Integrated Development Environment (IDE)”.


RStudio brings tools/languages together.


We use R within RStudio.

Why use RStudio?

Online resources to learn R

Today

Goal

‘Get familiar with fundamentals of R useful for data’


‘To get beyond the initial shock or fear of programming and start using R’

Today

Learning Objectives

  • Write and execute code in R via RStudio
  • R language vocabulary
  • Read/write data
  • Find help
  • Manipulate data efficiently
  • Plot data/results

Today

Execution

  • Presentation / code walk through
  • Challenges (independent or in teams of 2-3)

Today

Schedule

  • 900 - 930: Introductions and Setup
  • 930 - 1015: RStudio and R (objects and functions)
  • 1015 - 1130: Data Input and Output
  • 1130 - 1200: Finding Help
  • 1200 - 1300: Lunch
  • 1300 - 1400: Data Management
  • 1400 - 1500: Plotting
  • 1500 - 1600: Final Challenge

Showcases


Brian - R Shiny application


Kyle - Lights out alerts


Georgia - The Orion Nebula?

RStudio

Installing Packages

Packages for Workshop

Please install from CRAN

  • tidyverse
  • readxl
  • ggridges
  • gridExtra
install.packages(c("tidyverse",
                   "readxl",
                   "ggridges",
                   "gridExtra"))
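Before installing, it can help to check which of these packages are already present. A minimal sketch, assuming the four package names listed above; `pkgs`, `installed`, and `missing_pkgs` are illustrative names, and nothing is installed unless the last line is uncommented:

```r
# Workshop packages listed above
pkgs <- c("tidyverse", "readxl", "ggridges", "gridExtra")

# TRUE for each package whose namespace can be loaded, FALSE otherwise
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
installed

# Names of the packages that still need installing
missing_pkgs <- pkgs[!installed]
missing_pkgs

# Uncomment to install only what is missing (needs an internet connection)
# install.packages(missing_pkgs)
```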

The language of R

Objects

A storage place for information; stored in the “Environment”


‘Attributes’ describe the structure of, or information about, the object
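A small sketch of both ideas, using base R only; the object names `y` and `m` are just for illustration:

```r
# A single number stored as an object
y <- 3
exists("y")         # TRUE: y is now stored in the environment
attributes(y)       # NULL: a plain number carries no attributes

# A matrix stores its shape as a 'dim' attribute
m <- matrix(1:6, nrow = 2)
attributes(m)$dim   # 2 3
```

In RStudio, the same objects also appear in the Environment pane.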

The language of R

Objects

# y is an 'object' that is assigned the value 3
y = 3
y
[1] 3


# '=' and '<-' do the same thing here: both assign a value to an object
y <- 3
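The two operators are interchangeable for top-level assignment, but inside a function call '=' names an argument rather than creating an object. A short sketch (the names `a`, `b`, and `avg` are illustrative):

```r
# Both forms create an object holding the same value
a = 10
b <- 10
identical(a, b)   # TRUE

# Inside a function call, '=' matches the value to the argument 'x'
avg <- mean(x = c(2, 4, 6))
avg               # 4
```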

The language of R

Objects

# We can create new objects from objects
y2 = y-2
y2
[1] 1


# We can do math with our objects
# Mind your parentheses (order of operation)
y*2 / y*4
[1] 8
y*2 / (y*4)
[1] 0.5

The language of R

Functions

‘does stuff’; creates or manipulates objects

‘Arguments’ are the types of things a function is asking for; the inputs

The language of R

object = function(argument = input1, argument = input2)


object = function(input1, input2)


this = sign(x = -5)


sign(-5)
[1] -1
sign(5)
[1] 1
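Arguments can be matched by name or by position. A minimal sketch using the base function `seq`, whose first three arguments are `from`, `to`, and `by`; the object names are illustrative:

```r
# Named arguments: order does not matter
s1 <- seq(from = 1, to = 10, by = 3)
s2 <- seq(by = 3, to = 10, from = 1)

# Positional arguments: order must follow the function definition
s3 <- seq(1, 10, 3)

s1                  # 1 4 7 10
identical(s1, s2)   # TRUE
identical(s1, s3)   # TRUE
```

Naming arguments is longer to type but makes code easier to read later.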

The language of R

Functions

# function - 'c' - concatenate
y = c(1,2,3,4,5,6)


is.numeric(y)
[1] TRUE


# The function 'is.numeric' has the argument 'x'
is.numeric(x = y)
[1] TRUE

The language of R

Functions

# How to find out the arguments of a function?
?is.numeric
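Besides the help page, the arguments of a function can be listed directly in the console. A short sketch using base R's `args` and `formals` on the `matrix` function; `arg_names` is an illustrative name:

```r
# Print the argument list with default values
args(matrix)

# Just the argument names, as a character vector
arg_names <- names(formals(matrix))
arg_names   # "data" "nrow" "ncol" "byrow" "dimnames"
```

Note that `formals` returns NULL for primitive functions such as `is.numeric`; for those, `args` or the help page is the place to look.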

The language of R

Wrapping functions

# Functions are commonly 1) wrapped (one function inside another) and 2) given multiple arguments
x = matrix( 
            data = c(1,2,3,4,5,6),
            nrow = 2,
            ncol = 3
            )
x
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
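Wrapping is a shortcut for doing the same work in separate steps. A sketch comparing the two styles; `vals`, `x_steps`, and `x_wrapped` are illustrative names:

```r
# Step by step: build the vector first, then pass it to matrix()
vals <- c(1, 2, 3, 4, 5, 6)
x_steps <- matrix(data = vals, nrow = 2, ncol = 3)

# Wrapped: c() is evaluated first, and its result becomes the 'data' argument
x_wrapped <- matrix(data = c(1, 2, 3, 4, 5, 6), nrow = 2, ncol = 3)

identical(x_steps, x_wrapped)   # TRUE
```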

The language of R

Values

  • numeric
  • integer
  • character
  • factor

Objects

  • vector
  • matrix
  • array
  • list
  • dataframe
  • S3, S4, Reference Classes (sometimes called R5), and beyond

Types of Values

Numeric

y = 3
class(y)
[1] "numeric"


Integer

# The L suffix stores 3 as an integer; integer(3) would instead create three zeros
y = 3L
class(y)
[1] "integer"


Character

y = "habitat"
class(y)
[1] "character"


Factor

y = factor("habitat")
class(y)
[1] "factor"
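Values can be converted between types with the `as.*` family of functions. A short sketch, including a common pitfall when converting factors; the object names are illustrative:

```r
as.numeric("3.5")    # 3.5
as.integer(3.9)      # 3 (truncates, does not round)
as.character(42)     # "42"

# Pitfall: converting a factor directly returns the internal level codes
f <- factor(c("10", "20", "10"))
codes  <- as.numeric(f)                 # 1 2 1
values <- as.numeric(as.character(f))   # 10 20 10
```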

Types of Objects

Vector

# An ordered collection indexed 1,2,...n
# Using the function 'c' to concatenate
z1 = c(4,5,6)
z1
[1] 4 5 6

The value 4 is in element/index/position 1 of the vector

The value 6 is in element/index/position 3 of the vector


# the length (number of elements) of a vector
length(z1)
[1] 3


# A vector of characters
z2 = c("dog","cat","horse")
z2
[1] "dog"   "cat"   "horse"


z3 = c("dog","1","horse")
z3
[1] "dog"   "1"     "horse"
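A vector holds a single type of value, so mixed inputs are converted to a common type. A minimal sketch of this coercion; `mixed` and `num_lgl` are illustrative names:

```r
# A number mixed with characters becomes a character
mixed <- c("dog", 1, "horse")
mixed          # "dog" "1" "horse"
class(mixed)   # "character"

# Logical values mixed with numbers become numbers (TRUE = 1, FALSE = 0)
num_lgl <- c(1, TRUE, FALSE)
num_lgl        # 1 1 0
class(num_lgl) # "numeric"
```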


Types of Objects

Subsetting a vector

z3 = c("dog",
       "1",
       "horse",
       "chicken"
       )
z3[2]
[1] "1"


2:4
[1] 2 3 4


z3[2:4]
[1] "1"       "horse"   "chicken"


z3[c(2,4)]
[1] "1"       "chicken"


z3[-1]
[1] "1"       "horse"   "chicken"
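Vectors can also be subset by a condition instead of by position. A sketch using the same `z3` vector; `is_long` is an illustrative name:

```r
z3 <- c("dog", "1", "horse", "chicken")

# A logical vector: TRUE where the element has more than 3 characters
is_long <- nchar(z3) > 3
is_long               # FALSE FALSE TRUE TRUE

# Keep only the elements where the condition is TRUE
z3[is_long]           # "horse" "chicken"
z3[z3 != "1"]         # "dog" "horse" "chicken"

# which() returns the positions where the condition is TRUE
which(z3 == "horse")  # 3
```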

Types of Objects

Vector of factors

z4 = factor(
            c("dog", 
              "dog", 
              "cat",
              "horse"
              )
           )


z4
[1] dog   dog   cat   horse
Levels: cat dog horse


levels(z4)
[1] "cat"   "dog"   "horse"


summary(z4)
  cat   dog horse 
    1     2     1 
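Levels are sorted alphabetically by default, and each value is stored as an integer code pointing to a level. A sketch showing the codes and how to set a custom level order; `animals` and `z4_ordered` are illustrative names:

```r
animals <- c("dog", "dog", "cat", "horse")
z4 <- factor(animals)

# Integer codes: position of each value in levels(z4) = "cat" "dog" "horse"
as.integer(z4)       # 2 2 1 3

# Set the level order explicitly with the 'levels' argument
z4_ordered <- factor(animals, levels = c("horse", "dog", "cat"))
levels(z4_ordered)   # "horse" "dog" "cat"
table(z4_ordered)    # counts follow the new order: horse 1, dog 2, cat 1
```

The level order matters later, for example in the order of categories on a plot axis.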

Types of Objects

Matrix

x = matrix(
            c(1,2,3,4,5,6),
            nrow = 2, 
            ncol = 3
           )


x
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6


#rows and columns
dim(x)
[1] 2 3
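By default, `matrix` fills values down the columns. A sketch of filling across rows instead with the `byrow` argument; `x_col` and `x_row` are illustrative names:

```r
vals <- c(1, 2, 3, 4, 5, 6)

# Default: fill column by column
x_col <- matrix(vals, nrow = 2, ncol = 3)
x_col[1, ]   # 1 3 5

# byrow = TRUE: fill row by row
x_row <- matrix(vals, nrow = 2, ncol = 3, byrow = TRUE)
x_row[1, ]   # 1 2 3

# Both matrices have the same dimensions
dim(x_row)   # 2 3
```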


Types of Objects

Subsetting a matrix

# get element of row 1 and column 2
x[1,2]
[1] 3


# get all elements of row 2
x[2,]
[1] 2 4 6


# same as
x[2,1:3]
[1] 2 4 6
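Columns are subset the same way, and selecting a single row or column returns a plain vector unless `drop = FALSE` is used. A sketch with the same matrix; `row2` is an illustrative name:

```r
x <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, ncol = 3)

# All elements of column 2 (returned as a vector)
x[, 2]          # 3 4

# Columns 1 and 3 (still a matrix, 2 rows by 2 columns)
x[, c(1, 3)]

# Keep the matrix structure when selecting one row
row2 <- x[2, , drop = FALSE]
dim(row2)       # 1 3
```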

Types of Objects

Array

# ARRAY - more than two dimensions
z5 = array(
            c("a","b","c","d","1","2","3","4"), 
            dim = c(2,2,2)
           )


z5
, , 1

     [,1] [,2]
[1,] "a"  "c" 
[2,] "b"  "d" 

, , 2

     [,1] [,2]
[1,] "1"  "3" 
[2,] "2"  "4" 
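Subsetting an array works like subsetting a matrix, with one index per dimension. A sketch using the same `z5` array:

```r
z5 <- array(
  c("a", "b", "c", "d", "1", "2", "3", "4"),
  dim = c(2, 2, 2)
)

# Dimensions: rows, columns, slices
dim(z5)       # 2 2 2

# Row 1, column 2 of the second slice
z5[1, 2, 2]   # "3"

# The whole first slice, returned as a 2 x 2 matrix
z5[, , 1]
```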

Types of Objects

List

# LIST - a bucket - will take anything
my.list = list(z1, z2, z3, z4, z5)


#Subset a list
my.list[[1]]
[1] 4 5 6


my.list[[4]]
[1] dog   dog   cat   horse
Levels: cat dog horse
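List elements can be named, which makes subsetting easier to read. A sketch with a smaller, hypothetical list (`named.list` and its element names are for illustration) that also shows the difference between single and double brackets:

```r
named.list <- list(
  numbers = c(4, 5, 6),
  animals = c("dog", "cat", "horse")
)

names(named.list)          # "numbers" "animals"

# Three ways to pull out the contents of an element
named.list$animals
named.list[["animals"]]
named.list[[2]]

# Single brackets return a smaller list; double brackets return the contents
class(named.list[1])       # "list"
class(named.list[[1]])     # "numeric"
```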

Types of Objects

Data frame

A table of data with, e.g., a row for each observation and a column for each variable; columns can hold different types.

x = data.frame(outcome = c(1,0,1,1),
               exposure = c("yes", "yes", "no", "no"),
               age = c(24, 55, 39, 18)
               )
x
  outcome exposure age
1       1      yes  24
2       0      yes  55
3       1       no  39
4       1       no  18

Types of Objects

Subset data.frame

x$exposure
[1] "yes" "yes" "no"  "no" 


x['exposure']
  exposure
1      yes
2      yes
3       no
4       no


x[,2]
[1] "yes" "yes" "no"  "no" 
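Rows can be subset by a condition on a column, in the same `[rows, columns]` style as a matrix. A sketch with the same data frame; `older`, `exposed_age`, and `age_months` are illustrative names:

```r
x <- data.frame(outcome  = c(1, 0, 1, 1),
                exposure = c("yes", "yes", "no", "no"),
                age      = c(24, 55, 39, 18))

# Rows where age is greater than 30, all columns
older <- x[x$age > 30, ]
older$age       # 55 39

# Age of the exposed observations only
exposed_age <- x[x$exposure == "yes", "age"]
exposed_age     # 24 55

# Add a new column computed from an existing one
x$age_months <- x$age * 12
dim(x)          # 4 4
```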






Next:
Data input and output (Kyle)