# Chapter 4 Data Structures

Data structures are different ways of organizing data. Before looking at the individual data structures, I want to talk a little bit about the *attributes* that an object can have. Not every object in `R` necessarily has attributes, but attributes can be part of an object in `R`. Some of the most common attributes that we will encounter are `names` or `dimnames`, dimensions (`dim`), and `length`. There might also be other user-defined attributes. For example, a matrix has dimensions: a number of rows and a number of columns; a multidimensional array has more than two dimensions.

Attributes of an object can be accessed using the `attributes()` function. This function also allows us to set or modify the attributes of an `R` object.
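As a small illustration of the paragraph above, the following sketch inspects the attributes of a matrix and then attaches a user-defined attribute. The attribute name `note` and its value are made up for this example.

```r
# A matrix is a vector that carries a "dim" attribute
m <- matrix(1:6, nrow = 2)
attributes(m)
## $dim
## [1] 2 3

# attr() reads or sets a single attribute; "note" is a made-up name
attr(m, "note") <- "toy example"
names(attributes(m))
## [1] "dim" "note"
```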

## 4.1 Vectors

The most fundamental object in `R` is called a vector. In a vector we can store multiple values of a *single* type. So you can have a vector of characters or a vector of integers, but one thing you cannot do with a standard vector is mix types. For example, you cannot have a vector that holds both characters and numerics, or numerics and integers, or integers and logicals, etc. Everything in a vector has to be of the same class.

In summary, a vector is a collection of elements, which are *all of the same type*.

**Creating Vectors**

The concatenate (or combine) function `c()` can be used to create a vector of objects. In the following example, we create each vector with a different type of data:

```
a <- c(3, -1.6) ## numeric vector
b <- 1:10 ## integer vector
d <- c("a", "b", "c") ## character vector
e <- c(1+0i, 2+4i) ## complex vector
f <- c(TRUE, FALSE) ## logical vector
```

We can check the class of these vectors with the `class` function.
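For instance, applying `class` to the five vectors created above should report one class per vector. The vectors are re-created here so that the sketch runs on its own.

```r
a <- c(3, -1.6)
b <- 1:10
d <- c("a", "b", "c")
e <- c(1+0i, 2+4i)
f <- c(TRUE, FALSE)

class(a)
## [1] "numeric"
class(b)
## [1] "integer"
class(d)
## [1] "character"
class(e)
## [1] "complex"
class(f)
## [1] "logical"
```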

**Accessing Elements of a Vector**

To access a certain subset of the elements of a vector we use the following structure: `<vector>[<indices of desired elements>]`. The following example demonstrates various possibilities:

```
# create a vector of integers
x <- 1:10
# access a single element:
x[7]
## [1] 7
# access all elements from the first to the fifth:
x[1:5]
## [1] 1 2 3 4 5
# access only the first and fifth elements:
x[c(1,5)]
## [1] 1 5
```

Note that in the last case, we first created an auxiliary vector with elements 1 and 5 and then fed this into `x` to extract the first and fifth elements of `x`.
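The same bracket structure also accepts negative indices and logical vectors. These two forms are not covered in the text above; the sketch below is only a brief extension of the idea.

```r
x <- 1:10
# negative indices drop the listed positions
x[-(1:3)]
## [1] 4 5 6 7 8 9 10
# a logical vector keeps the positions where the condition is TRUE
x[x > 7]
## [1] 8 9 10
```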

Here we collect a number of built-in `R` functions that can operate on vectors:

```
x <- 1:50
# __ Built-in Functions for a Numeric Vector__
length(x)
## [1] 50
mean(x)
## [1] 25.5
var(x)
## [1] 212.5
sum(x)
## [1] 1275
max(x)
## [1] 50
summary(x)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.00 13.25 25.50 25.50 37.75 50.00
```

Here `x` is clearly an integer vector and the `R` functions above (`length`, `mean`, etc.) treat `x` accordingly.

Now, suppose that we have a data set on the employment status of 6 people:

`x <- c("E", "U", "E", "E", "E", "U") # E: employed, U: unemployed`

This is clearly a character vector, and most of these functions no longer make sense for it: `sum(x)` raises an error and `mean(x)` returns `NA` with a warning. `length` and `summary`, however, are still legitimate functions.

```
x <- c("E", "U", "E", "E", "E", "U") # E: employed, U: unemployed
length(x)
## [1] 6
summary(x)
## Length Class Mode
## 6 character character
```

Note that the output of the `summary` function depends on whether the vector is a numeric or character vector.
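A short sketch of what happens when a numeric function meets the character vector above. `tryCatch` is used only so that the script keeps running, and the message string is arbitrary. `table`, which appears later in this chapter, is usually a better summary for such data.

```r
x <- c("E", "U", "E", "E", "E", "U")

# sum() refuses character input, so we catch the error
result <- tryCatch(sum(x), error = function(e) "sum() failed")
result
## [1] "sum() failed"

# table() counts how often each value occurs
table(x)
## x
## E U
## 4 2
```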

**Vector Operations**

`R` is a *vectorized language*, which means that operations are applied to each element of the vector automatically, without the need to loop through the vector. In other words, `R` can operate on a whole vector in a single stroke. This feature makes it a very efficient computational tool. Let's look at some examples:

```
x <- 1:5
x + 1
## [1] 2 3 4 5 6
x*5
## [1] 5 10 15 20 25
sqrt(x)
## [1] 1.000000 1.414214 1.732051 2.000000 2.236068
rep(1, times = 10)
## [1] 1 1 1 1 1 1 1 1 1 1
rep(x, times = 2)
## [1] 1 2 3 4 5 1 2 3 4 5
```

Operating on two vectors of the same length:

```
x <- c(1,2,3,4,5,6)
y <- 3:8
x + y
## [1] 4 6 8 10 12 14
```

Things get a little more complicated when operating on two vectors with different lengths. In this case, the shorter vector gets *recycled*—that is, its elements are repeated, in order, until they have been matched up with every element of the longer vector. An example will clarify this important point:

```
x <- c(1,2,3,4,5,6)
y <- c(30,40,50)
x + y
## [1] 31 42 53 34 45 56
```
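To make the recycling rule concrete, the sketch below builds the recycled version of the shorter vector by hand with `rep_len` and checks that it gives the same sum. The name `y_recycled` is introduced only for this illustration.

```r
x <- c(1, 2, 3, 4, 5, 6)
y <- c(30, 40, 50)

# repeat y, in order, until it is as long as x
y_recycled <- rep_len(y, length(x))
y_recycled
## [1] 30 40 50 30 40 50

identical(x + y, x + y_recycled)
## [1] TRUE
```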

If the length of the longer vector is not a multiple of the length of the shorter one, a warning is given, but the computation is still carried out in the same way:

```
x <- c(1,2,3,4,5,6)
y <- c(30,40,50,60)
x + y
## Warning in x + y: longer object length is not a multiple of shorter object
## length
## [1] 31 42 53 64 35 46
```

Comparisons also work on vectors elementwise:

```
x <- c(1,2,3,4,5,6)
y <- c(1,2,3)
x <= y
## [1] TRUE TRUE TRUE FALSE FALSE FALSE
```

Note that the shorter vector again gets recycled.

There are some very useful built-in comparison functions in `R`: `any` and `all`.

```
x <- 1:4
y <- 2:5
# any and all return a single logical value, either TRUE or FALSE
any(x>y)
## [1] FALSE
all(x<y)
## [1] TRUE
# if you want to compare two vectors componentwise
x <= y
## [1] TRUE TRUE TRUE TRUE
## if you want to save the result as a numeric value
z <- as.numeric(x <= y)
z
## [1] 1 1 1 1
```
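Because logical values convert to 0 and 1, the result of a comparison can also be counted or located. The following sketch uses `which`, `sum` and `mean` on a made-up vector `scores`.

```r
scores <- c(3, 8, 1, 9)   # made-up data
passed <- scores > 2      # logical vector

which(passed)  # positions where the condition holds
## [1] 1 2 4
sum(passed)    # how many elements satisfy the condition
## [1] 3
mean(passed)   # proportion of elements that satisfy it
## [1] 0.75
```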

**Combining Two Vectors into a Single Vector**

```
x <- 1:3
y <- 40:45
z <- c(x, y)
z
## [1] 1 2 3 40 41 42 43 44 45
z <- c(y, x) # order of the vectors matters
z
## [1] 40 41 42 43 44 45 1 2 3
```

### 4.1.1 Mixing Objects and Coercion

**Implicit Coercion**

What happens if you create a vector by mixing two different types of objects? `R` will not give you an error, but it will *coerce* the vector to the class that is the least common denominator, that is, the most general class that can represent every element. When different objects are mixed in a vector, *coercion* occurs so that every element in the vector is of the *same class*. Let's look at the following example:

```
y <- c(1.7, "a") ## coerced to a character vector
y <- c(TRUE, 2) ## coerced to a numeric vector
y <- c("a", TRUE) ## coerced to a character vector
```

In the first example, we are concatenating the number `1.7` and the letter `a`, which are clearly not in the same class: `1.7` is `numeric` and `"a"` is `character`. The least common denominator is character, so `y` is going to be a character vector. In the second example, we are going to get a numeric vector and the `TRUE` is going to be converted into a number. By convention, `TRUE` is represented as the number 1 in `R` and `FALSE` is represented as the number 0, so what we get here is the vector 1, 2. In the last example, we are concatenating the letter `a` and `TRUE`, which is `logical`. Here the least common denominator is again character, so the vector that we end up with has `"a"` as its first element and the *string* `"TRUE"` as its second element.

You need to be a little bit careful when you mix different types of elements in a vector: you won't get an error, but the coercion will happen behind the scenes and the result might not be what you wanted.
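One way to see implicit coercion at work is to print the mixed vectors and ask for their class. The last line, which mixes an integer with a double, is an extra case not discussed above.

```r
c(1.7, "a")
## [1] "1.7" "a"
c(TRUE, 2)
## [1] 1 2
c("a", TRUE)
## [1] "a" "TRUE"

class(c(1.7, "a"))
## [1] "character"
class(c(TRUE, 2))
## [1] "numeric"
class(c(1L, 2.5))   # integer mixed with double becomes numeric
## [1] "numeric"
```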

**Explicit Coercion**

So far we have talked about the implicit coercion that occurs behind the scenes, but we can *explicitly* coerce objects from one class to another using functions that usually start with the word `as`. So for example, if you want to convert something to a numeric you can use the function called `as.numeric`. If you want to convert something into character you can use the function `as.character`. Let us look at the following example:

```
x <- 0:6
class(x)
## [1] "integer"
as.numeric(x)
## [1] 0 1 2 3 4 5 6
as.logical(x)
## [1] FALSE TRUE TRUE TRUE TRUE TRUE TRUE
as.character(x)
## [1] "0" "1" "2" "3" "4" "5" "6"
```

Here, we first create an object called `x`, which is the sequence from 0 to 6. This is an integer sequence, and as you can see, when we call `class` on the object it says `integer`. But we can convert it into a numeric sequence by calling `as.numeric(x)`, as we did next in the example. It prints out 0, 1, through 6, which looks like an integer object but is actually a numeric object now. I can also convert it into a logical object by calling `as.logical` on it. In this case, `0` corresponds to `FALSE` by convention and any nonzero number is going to be `TRUE`. Lastly, when I call `as.character` on `x`, it converts all the numbers into character strings. So now we get the string zero, `"0"`, the string one, `"1"`, and so forth.

Explicit coercion doesn't always work, and when it doesn't you get what are called `NA` values. In other words, nonsensical coercion results in `NA`s. For example, if we try to convert the vector `c("a", "b", "c")` into a `numeric` object by calling `as.numeric`, we end up with a vector of `NA`s, as shown in the following example. This happens because there is really no way to convert the letters `a`, `b`, and `c` to numbers, i.e. `R` has no clue about a numerical correspondence for these letters. Similarly, if you call `as.logical` on `x`, you are again going to get a vector of `NA`s.

```
x <- c("a", "b", "c")
as.numeric(x)
## Warning: NAs introduced by coercion
## [1] NA NA NA
as.logical(x)
## [1] NA NA NA
as.complex(x)
## Warning: NAs introduced by coercion
## [1] NA NA NA
```
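A few more coercion cases, sketched with made-up inputs: negative numbers also become `TRUE`, `as.logical` recognizes only certain strings, and `is.na` can be used to find the elements that failed to convert. `suppressWarnings` is used here only to keep the output short.

```r
as.logical(c(-1, 0, 2))          # any nonzero number is TRUE
## [1] TRUE FALSE TRUE
as.logical(c("TRUE", "false", "T", "yes"))
## [1] TRUE FALSE TRUE NA

y <- suppressWarnings(as.numeric(c("1.5", "abc", "3")))
is.na(y)                         # flags the element that could not be converted
## [1] FALSE TRUE FALSE
```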

### 4.1.2 Factor Vectors

Factors are used to represent categorical data. Factors can be ordered or unordered and are an important class for statistical analysis and for plotting. Factor vectors are treated differently by some `R` functions, for example by regression functions like `lm()` and `glm()`.

The function `factor` is used to encode a vector as a factor, i.e. it sorts the data into different categories:

```
x <- factor(c("yes", "yes", "no", "yes", "no"))
x
## [1] yes yes no yes no
## Levels: no yes
table(x)
## x
## no yes
## 2 3
unclass(x)
## [1] 2 2 1 2 1
## attr(,"levels")
## [1] "no" "yes"
attr(x,"levels")
## [1] "no" "yes"
```

So `x` is a factor. You can see that it prints a little differently from a character vector: it prints the values, `yes, yes, no, yes, no`, and then a separate attribute called `levels`. The levels of this factor are `no` and `yes`. I can call `table` on this factor and it will give me a frequency count of how many of each level there are: two `no`s and three `yes`es. The `unclass` function strips the class from a vector. So if I call `unclass` on `x`, it brings it down to an `integer` vector, in which the factor is represented as `2 2 1 2 1`. So it is really an `integer` vector with a `levels` attribute of `no` and `yes`.

Note that the input to the `factor` function is typically a character vector.

Factors are stored as integers, and have labels associated with these unique integers. While factors look (and often behave) like character vectors, they are actually integers under the hood, and you need to be careful when treating them like strings.
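A common place where the integer storage causes surprises is a factor whose labels look like numbers. The sketch below uses a made-up factor `f` to show the difference between converting the codes and converting the labels.

```r
f <- factor(c("10", "20", "10"))

as.numeric(f)                 # the integer codes, not the original numbers
## [1] 1 2 1
as.numeric(as.character(f))   # convert the labels first, then to numbers
## [1] 10 20 10
```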

Once created, factors can only contain a pre-defined set of values, known as levels.

```
x <- c("Good", "Bad", "Excellent", "Good", "Bad")
x <- factor(x)
x
## [1] Good Bad Excellent Good Bad
## Levels: Bad Excellent Good
```

`R` stores the vector `x` internally as `(3,1,2,3,1)` and associates it with `1 = Bad`, `2 = Excellent` and `3 = Good`. This can be seen by calling `as.numeric` on `x`:

```
as.numeric(x)
## [1] 3 1 2 3 1
```

By default, factor `levels` for character vectors are created in alphabetical order, but you can override the default by specifying a `levels` option. For example, in the previous example it would be more natural to order the levels as `Bad < Good < Excellent`:

```
x <- c("Good", "Bad", "Excellent", "Good", "Bad")
x <- factor(x, ordered=TRUE, levels=c("Bad", "Good", "Excellent"))
x
## [1] Good Bad Excellent Good Bad
## Levels: Bad < Good < Excellent
as.numeric(x)
## [1] 2 1 3 2 1
```

```
factor(x=c("High School", "College", "Masters", "College", "Doctorate"),
       levels=c("High School", "College", "Masters", "Doctorate"),
       ordered=TRUE)
## [1] High School College Masters College Doctorate
## Levels: High School < College < Masters < Doctorate
```
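One practical consequence of ordering the levels is that comparison operators become meaningful. A minimal sketch, reusing the `Bad < Good < Excellent` ordering from above:

```r
x <- factor(c("Good", "Bad", "Excellent", "Good", "Bad"),
            levels = c("Bad", "Good", "Excellent"),
            ordered = TRUE)

# elementwise comparison against a level
x < "Excellent"
## [1] TRUE TRUE FALSE TRUE TRUE

# the largest level that actually occurs in x
as.character(max(x))
## [1] "Excellent"
```

For an unordered factor these comparisons are not meaningful, which is one reason to set `ordered = TRUE` when the categories have a natural ranking.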

### 4.1.3 Special Built-in Vectors

`R` has many built-in functions that can create vectors easily. For example, we have already seen that the expression `1:5` creates a vector of integers from 1 to 5. Here, `:` is a built-in operator.

In this section, we mention some other special vectors built into `R`.

**Letters**

For example, the built-in constants `letters` and `LETTERS` hold the lower case and upper case letters, respectively. These special vectors might be useful in some cases:

```
x <- 1:10
names(x) <- LETTERS[1:10]
x
## A B C D E F G H I J
## 1 2 3 4 5 6 7 8 9 10
```

**Empty Vector**

We can create an empty vector with the `vector` function. The `vector` function has two basic arguments. The first argument is the class of the objects that you want to have in the vector and the second argument is the length of the vector itself.

```
x <- vector("numeric", length = 10)
x
## [1] 0 0 0 0 0 0 0 0 0 0
```
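The default fill value depends on the class that is requested. The sketch below shows two more classes and, as a further way of building vectors that is not discussed above, the `seq` function.

```r
vector("character", length = 3)   # filled with empty strings
## [1] "" "" ""
vector("logical", length = 2)     # filled with FALSE
## [1] FALSE FALSE

# seq() builds a regular sequence with a chosen step
seq(0, 1, by = 0.25)
## [1] 0.00 0.25 0.50 0.75 1.00
```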

## 4.2 Matrices

Vectors do not have a dimension, meaning there is no such thing as a column vector or a row vector. These vectors are not like mathematical vectors, where there is a difference between row and column orientation.

Technically speaking, vectors do not have a *dimension* attribute. But if we need data with row and column orientation, `R` has another class of objects called `matrix`; matrices are simply vectors with a *dimension* attribute.

There are several different ways of creating matrices.

**Building Matrices Using cbind() or rbind()**

We can build matrices from vectors by using the `cbind()` or `rbind()` functions:

```
x <- 21:25
A <- cbind(x, 1:5)
A
## x
## [1,] 21 1
## [2,] 22 2
## [3,] 23 3
## [4,] 24 4
## [5,] 25 5
y <- letters[1:5]
A <- rbind(x, y)
A
## [,1] [,2] [,3] [,4] [,5]
## x "21" "22" "23" "24" "25"
## y "a" "b" "c" "d" "e"
```

How is the second matrix different than the first one?

```
class(y)
## [1] "character"
dim(y)
## NULL
```

All elements of a matrix must be of the same type (numeric, character, etc.); unlike in a data frame, different columns cannot hold different types of data. So if you try to construct a matrix out of two vectors of different classes, `R` will give you a *coerced* matrix. This is very similar to the vector coercion we saw before.

**Building Matrices Using matrix()**

Another way of creating a matrix is to use the `matrix` function:

```
A <- matrix(1:6, nrow = 2, ncol = 3)
A
## [,1] [,2] [,3]
## [1,] 1 3 5
## [2,] 2 4 6
A <- matrix(1:6, nrow=2) # R can figure out the column dimension
A
## [,1] [,2] [,3]
## [1,] 1 3 5
## [2,] 2 4 6
```

Matrices are constructed *column-wise*, so entries can be thought of starting in the upper left corner and running down the columns.
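If the data should instead be filled in row by row, the `byrow` argument of `matrix` can be set to `TRUE`. A minimal sketch:

```r
B <- matrix(1:6, nrow = 2, byrow = TRUE)
B
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
```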

**Building Matrices Using dim()**

We can also create a `matrix` by giving a vector the dimensions that match the length of the vector:

```
A <- 1:6 # a vector
dim(A) <- c(3,2) # add a dimension attribute
A # a matrix
## [,1] [,2]
## [1,] 1 4
## [2,] 2 5
## [3,] 3 6
#dim(A) <- c(3,3) #try this. you will get an error.
```

**Referencing via Brackets:**

General structure: `<matrix>[<row indices>,<column indices>]`

```
A <- matrix(1:6, nrow=2)
A[2,3]
## [1] 6
A[, 1]
## [1] 1 2
A[2, ]
## [1] 2 4 6
```

Matrices act similarly to vectors with element-by-element addition, multiplication, subtraction, division and equality.

```
A <- matrix(1:6, nrow=2)
B <- matrix(11:16, nrow=2)
A+B
## [,1] [,2] [,3]
## [1,] 12 16 20
## [2,] 14 18 22
A*B
## [,1] [,2] [,3]
## [1,] 11 39 75
## [2,] 24 56 96
A/B
## [,1] [,2] [,3]
## [1,] 0.09090909 0.2307692 0.3333333
## [2,] 0.16666667 0.2857143 0.3750000
```

Note that `A*B` does not give us the usual matrix multiplication. Recall that matrix multiplication requires the number of columns of the left-hand matrix to be the same as the number of rows of the right-hand matrix. Try the following:

```
A <- matrix(1:6, nrow=2)
B <- matrix(11:16, nrow=2)
B <- t(B) # compute the transpose of B
A %*% B # matrix multiplication works
## [,1] [,2]
## [1,] 125 134
## [2,] 164 176
```

There are also objects called **arrays** that can have more than two dimensions. These objects can be created with the `array` function:

```
A <- array(1:12, c(2,3,2))
A
## , , 1
##
## [,1] [,2] [,3]
## [1,] 1 3 5
## [2,] 2 4 6
##
## , , 2
##
## [,1] [,2] [,3]
## [1,] 7 9 11
## [2,] 8 10 12
```
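Elements of an array such as `array(1:12, c(2,3,2))` are referenced with one index per dimension, in the same bracket style as matrices. A short sketch:

```r
A <- array(1:12, c(2, 3, 2))
dim(A)
## [1] 2 3 2

A[2, 3, 1]      # row 2, column 3 of the first slice
## [1] 6
A[1, 2, 2]      # row 1, column 2 of the second slice
## [1] 9
dim(A[, , 2])   # a whole slice is an ordinary 2 x 3 matrix
## [1] 2 3
```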

## 4.3 Data Frames

Data frames are the most common data structures for statistics that are used to store tabular data. A data frame is a rectangular object like a matrix but can hold different types of data in different columns. Every column represents a variable (or a factor) in the dataset and every row represents a case, either an object or an individual about whom data have been collected.

In a data frame, each column is a vector, each of which has the same length. This is very important because it lets each column hold a different type of data. This also implies that within a column each element must be of the same type, just like with vectors.

The simplest way to construct a data frame is to use the `data.frame` function:

```
v1 <- c('UT', 'IA', 'MN', 'AL', 'WI')
v2 <- c(4.6, 4.8, 5.2, 6.3, 6.8)
df <- data.frame(v1, v2)
df
## v1 v2
## 1 UT 4.6
## 2 IA 4.8
## 3 MN 5.2
## 4 AL 6.3
## 5 WI 6.8
```

Notice the names of `df` are simply the vector names. We could have assigned names during the creation process:

```
df <- data.frame(States = v1, Scores = v2)
df
## States Scores
## 1 UT 4.6
## 2 IA 4.8
## 3 MN 5.2
## 4 AL 6.3
## 5 WI 6.8
```

We can learn about a data frame using various functions:

```
v1 <- c('UT', 'IA', 'MN', 'AL', 'WI')
v2 <- c(4.6, 4.8, 5.2, 6.3, 6.8)
df <- data.frame(v1, v2)
df
## v1 v2
## 1 UT 4.6
## 2 IA 4.8
## 3 MN 5.2
## 4 AL 6.3
## 5 WI 6.8
nrow(df) # number of rows in df
## [1] 5
ncol(df) # number of columns in df
## [1] 2
dim(df) # dimensions of df
## [1] 5 2
names(df) # checking names in df
## [1] "v1" "v2"
class(df) # check class of df
## [1] "data.frame"
str(df) # learn more about structure of df
## 'data.frame': 5 obs. of 2 variables:
## $ v1: Factor w/ 5 levels "AL","IA","MN",..: 4 2 3 1 5
## $ v2: num 4.6 4.8 5.2 6.3 6.8
```
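The `Factor` column in the `str` output above reflects versions of `R` before 4.0.0, where `data.frame` converted character vectors to factors by default. Since `R` 4.0.0 the default is `stringsAsFactors = FALSE`, so the same call keeps `v1` as a character column. The sketch below makes the choice explicit; the names `df_chr` and `df_fac` are introduced only for this illustration.

```r
v1 <- c('UT', 'IA', 'MN', 'AL', 'WI')
v2 <- c(4.6, 4.8, 5.2, 6.3, 6.8)

# keep strings as character vectors
df_chr <- data.frame(v1, v2, stringsAsFactors = FALSE)
class(df_chr$v1)
## [1] "character"

# convert strings to factors, as in the output shown in this chapter
df_fac <- data.frame(v1, v2, stringsAsFactors = TRUE)
class(df_fac$v1)
## [1] "factor"
levels(df_fac$v1)
## [1] "AL" "IA" "MN" "UT" "WI"
```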

**`head` and `tail` Functions**

Usually a data frame has far too many rows to print them all to the screen, so thankfully the `head` function prints out only the first few rows. Similarly, the `tail` function prints only the last few rows. After you read in data, it is a good idea to use these two functions together to get a sense of the data.

```
v1 <- c('UT', 'IA', 'MN', 'AL', 'WI')
v2 <- c(4.6, 4.8, 5.2, 6.3, 6.8)
head(df, n=3) # only prints the first three rows
## v1 v2
## 1 UT 4.6
## 2 IA 4.8
## 3 MN 5.2
tail(df, n=2) # only prints the last two rows
## v1 v2
## 4 AL 6.3
## 5 WI 6.8
```

There are multiple ways of accessing a column of a data frame:

**Referencing columns (variables) via names:**

```
v1 <- c('UT', 'IA', 'MN', 'AL', 'WI')
v2 <- c(4.6, 4.8, 5.2, 6.3, 6.8)
v3 <- c(10, 20, 30, 40, 50)
df <- data.frame(v1, v2, v3)
df$v1
## [1] UT IA MN AL WI
## Levels: AL IA MN UT WI
```

**Referencing via brackets:**

```
df[3,2]
## [1] 5.2
df[3,2:3]
## v2 v3
## 3 5.2 30
df[3,c(1,3)] # picks elements on the third row and the first and third columns
## v1 v3
## 3 MN 30
df[, 1] # returns the first column
## [1] UT IA MN AL WI
## Levels: AL IA MN UT WI
df[2, ] # returns the second row
## v1 v2 v3
## 2 IA 4.8 20
df[, 1:2] # returns the first two columns
## v1 v2
## 1 UT 4.6
## 2 IA 4.8
## 3 MN 5.2
## 4 AL 6.3
## 5 WI 6.8
df[,"v2"] # returns the second column
## [1] 4.6 4.8 5.2 6.3 6.8
df["v2"] # returns the second column as a data frame
## v2
## 1 4.6
## 2 4.8
## 3 5.2
## 4 6.3
## 5 6.8
```

Note that these methods can return different kinds of output: some return a vector and some return a single-column data frame. To ensure a single-column data frame while using single square brackets, we need to use a third argument: `drop=FALSE`. This also works when specifying a single column by number. Compare the output of the following cases:

```
df[, 1] # returns a vector
## [1] UT IA MN AL WI
## Levels: AL IA MN UT WI
df[, 1, drop=FALSE] # returns a single-column data.frame
## v1
## 1 UT
## 2 IA
## 3 MN
## 4 AL
## 5 WI
df[, "v2"] # returns a vector
## [1] 4.6 4.8 5.2 6.3 6.8
df[, "v2", drop=FALSE] # returns a single-column data.frame
## v2
## 1 4.6
## 2 4.8
## 3 5.2
## 4 6.3
## 5 6.8
```
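Double square brackets are another way to pull out a single column as a vector. This form is not shown above, so the sketch below, built on a small made-up data frame, is only a brief extension that uses `is.data.frame` to check what each form returns.

```r
df <- data.frame(v1 = c("UT", "IA", "MN"), v2 = c(4.6, 4.8, 5.2))

df[["v2"]]                                # a plain vector
## [1] 4.6 4.8 5.2
is.data.frame(df[["v2"]])
## [1] FALSE
is.data.frame(df["v2"])                   # single brackets keep the data frame
## [1] TRUE
is.data.frame(df[, "v2", drop = FALSE])
## [1] TRUE
```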

**Referencing via $:**

The symbol `$` can be used to identify certain columns:

```
df$v2 # returns the second column
## [1] 4.6 4.8 5.2 6.3 6.8
```

For example, if you want to cross-tabulate two columns of a data frame:

```
v1 <- c('UT', 'IA', 'MN', 'AL', 'AL', 'IA', 'MN', 'MN')
v2 <- c("A", "B", "A", "A", "B", "B", "A", "A")
df <- data.frame(v1, v2)
table(df$v2, df$v1)
##
## AL IA MN UT
## A 1 0 3 1
## B 1 2 0 0
```

Some other features of data frames:

- Data frames have a special attribute called `row.names`, so every row of a data frame has a name. This can be useful for annotating the data. For example, each row might represent a subject enrolled in a study, and the row names would then be the subject IDs.
- Data frames are usually created by calling `read.table()` or `read.csv()`.
- Data frames can be converted to a `matrix` by calling the function `data.matrix()`.
- Hadley Wickham’s package **dplyr** has an optimized set of functions designed to work efficiently with data frames. We will see this package in a later chapter.
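The following sketch illustrates the `row.names` attribute and the conversion with `data.matrix()`. The data frame `scores` and its column `Rank` are made up for this example.

```r
scores <- data.frame(Score = c(4.6, 4.8, 5.2),
                     Rank = c(3, 2, 1),
                     row.names = c("UT", "IA", "MN"))

row.names(scores)
## [1] "UT" "IA" "MN"
scores["IA", "Score"]   # row names can be used as row indices
## [1] 4.8

m <- data.matrix(scores)
is.matrix(m)
## [1] TRUE
dim(m)
## [1] 3 2
```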

## 4.4 Lists

The next data structure that we are going to talk about is the list. Lists are very helpful when we have ragged data arrays in which the variables have *unequal numbers of observations* or in which variables are of *different types*. In other words, lists store any number of items of any type; a list can contain all numeric data or characters or a mix of the two or data frames or other lists. That feature makes lists very handy for carrying around different types of data.

Lists are created with the `list` function where each argument to the function becomes an element of the list:

```
x <- list(1:2, 5 + 2i, "hello", TRUE)
x
## [[1]]
## [1] 1 2
##
## [[2]]
## [1] 5+2i
##
## [[3]]
## [1] "hello"
##
## [[4]]
## [1] TRUE
```

So we created a list called `x` by using the `list` function. The first element is an integer vector, the second is a complex number, the third is a character string and the fourth is a logical value. Note that it doesn’t print out like a vector because every element is different. The elements are indexed by double brackets. Elements of a list have double brackets around them, whereas elements of other vectors have just single brackets, so that is one way to differentiate a list from other types of vectors.
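A quick way to confirm that each element keeps its own type is to apply `class` to every element with `sapply`, a function that is used again at the end of this section. A minimal sketch:

```r
x <- list(1:2, 5 + 2i, "hello", TRUE)

sapply(x, class)
## [1] "integer" "complex" "character" "logical"
length(x)
## [1] 4
```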

Another example:

```
v <- c(4.6, 4.8, 5.2, 6.3, 6.8)
A <- matrix(1:10, 5)
L1 <- list(v, df, A)
class(L1)
## [1] "list"
```

This list contains a vector, data.frame and matrix.

Lists can have names just like other data types. Each element can have a name, and the names can be either viewed or assigned using `names`:

```
names(L1)
## NULL
names(L1) <- c("vector", "data.frame", "matrix")
names(L1)
## [1] "vector" "data.frame" "matrix"
# another way to assign names
L1 <- list("vector" = v, "data.frame" = df, "matrix" = A)
```

**Referencing Components**

To access an individual element of a list, use double square brackets, specifying either the element number or name. Note that this allows access to only one element at a time.

```
L1[[1]]
## [1] 4.6 4.8 5.2 6.3 6.8
L1[[2]]
## v1 v2
## 1 UT A
## 2 IA B
## 3 MN A
## 4 AL A
## 5 AL B
## 6 IA B
## 7 MN A
## 8 MN A
class(L1[[1]])
## [1] "numeric"
class(L1[1])
## [1] "list"
```
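When the elements of a list are named, they can also be reached by name, either with double brackets or with `$`. Single brackets return a shorter list rather than the element itself. The sketch below uses a small made-up list `L`.

```r
L <- list(vector = c(4.6, 4.8), label = "demo")

L$vector            # access by name with $
## [1] 4.6 4.8
L[["label"]]        # access by name with double brackets
## [1] "demo"
class(L["label"])   # single brackets return a list
## [1] "list"
length(L[c("vector", "label")])   # single brackets can select several elements
## [1] 2
```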

It is possible to append elements to a list simply by using an index (either numeric or named) that does not exist.

```
# see how long the list currently is
length(L1)
## [1] 3
# add a fourth element,
L1[[4]] <- 4
length(L1)
## [1] 4
L1
## $vector
## [1] 4.6 4.8 5.2 6.3 6.8
##
## $data.frame
## v1 v2
## 1 UT A
## 2 IA B
## 3 MN A
## 4 AL A
## 5 AL B
## 6 IA B
## 7 MN A
## 8 MN A
##
## $matrix
## [,1] [,2]
## [1,] 1 6
## [2,] 2 7
## [3,] 3 8
## [4,] 4 9
## [5,] 5 10
##
## [[4]]
## [1] 4
# add a fifth element,
L1[["NewElement"]] <- 2:7
length(L1)
## [1] 5
L1 # note that in this case output is different
## $vector
## [1] 4.6 4.8 5.2 6.3 6.8
##
## $data.frame
## v1 v2
## 1 UT A
## 2 IA B
## 3 MN A
## 4 AL A
## 5 AL B
## 6 IA B
## 7 MN A
## 8 MN A
##
## $matrix
## [,1] [,2]
## [1,] 1 6
## [2,] 2 7
## [3,] 3 8
## [4,] 4 9
## [5,] 5 10
##
## [[4]]
## [1] 4
##
## $NewElement
## [1] 2 3 4 5 6 7
```

**Example.** Suppose that I teach three sections of the same statistics course, each with a different number of students and the final grades look like the following:

```
section1 <- c(61.4, 63.0, 66.6, 74.8, 71.8, 63.2, 72.3, 61.9, 70.0)
section2 <- c(79.6, 70.2, 67.5, 75.5, 68.2, 81.0, 69.6, 75.6,69.5, 72.4, 77.1)
section3 <- c(74.9, 81.9, 80.3, 79.5, 77.3, 92.7, 76.4, 82.0, 68.9,77.6)
```

Form a list:

```
allSections <- list(section1, section2, section3)
allSections
## [[1]]
## [1] 61.4 63.0 66.6 74.8 71.8 63.2 72.3 61.9 70.0
##
## [[2]]
## [1] 79.6 70.2 67.5 75.5 68.2 81.0 69.6 75.6 69.5 72.4 77.1
##
## [[3]]
## [1] 74.9 81.9 80.3 79.5 77.3 92.7 76.4 82.0 68.9 77.6
```

Some statistics on the list:

```
sectionMeans <- sapply(allSections, mean)
sectionMeans
## [1] 67.22222 73.29091 79.15000
sectionStdev <- round(sapply(allSections, sd), 2)
sectionStdev
## [1] 5.11 4.71 6.12
```

We combined the three classes into a list and then used the `sapply` function to find the means and standard deviations for the three classes. The `sapply` function produces a simplified view of the means and standard deviations. Note that the `lapply` function works here as well, as the calculation of the variances for the separate sections shows, but it produces a different kind of output from that of `sapply`, making it clear that the output is yet another list:

```
lapply(allSections, var)
## [[1]]
## [1] 26.06194
##
## [[2]]
## [1] 22.17491
##
## [[3]]
## [1] 37.47167
```
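The difference between the two functions can be checked directly. The sketch below uses a shortened, made-up version of the sections list with named elements, and also shows `vapply`, a stricter relative of `sapply` in which the expected type of each result is stated up front.

```r
# shortened, made-up grades; the names s1, s2, s3 are for illustration only
sections <- list(s1 = c(61.4, 63.0, 66.6),
                 s2 = c(79.6, 70.2),
                 s3 = c(74.9, 81.9, 80.3, 79.5))

means_list <- lapply(sections, mean)   # always returns a list
is.list(means_list)
## [1] TRUE

means_vec <- sapply(sections, mean)    # simplified to a named numeric vector
is.numeric(means_vec)
## [1] TRUE
names(means_vec)
## [1] "s1" "s2" "s3"

# vapply() checks that every result is a single number
means_checked <- vapply(sections, mean, numeric(1))
identical(means_vec, means_checked)
## [1] TRUE
```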