# Chapter 3 Data Types

In this part we are going to see the different data types that are used in `R` and some basic operations on these data types.

Just to get the language right, everything we manipulate or encounter in `R` is called an **object**. For example, typing the expression `x <- 2` creates the object `x`, and `y <- "hello"` creates another object `y` containing the word “hello”. So objects can contain different kinds of data.

`R` has five low-level types of objects, which are called the *basic atomic classes*:

- numeric data (real numbers)
- integers
- character
- complex numbers
- logicals (`TRUE`/`FALSE`)

The type of data contained in an object is checked with the `class()` function:

```
x <- 2
class(x)
## [1] "numeric"
```
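As a quick illustration, the sketch below calls `class()` on one value from each of the five atomic classes. The variable names are made up for this example.

```r
# One example value for each atomic class (variable names are illustrative)
num <- 2.5        # numeric (double precision)
int <- 2L         # integer, note the L suffix
chr <- "hello"    # character
cpl <- 1 + 2i     # complex
lgl <- TRUE       # logical

# sapply() applies class() to each element of the list
sapply(list(num, int, chr, cpl, lgl), class)
## [1] "numeric"   "integer"   "character" "complex"   "logical"
```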

## 3.1 Numeric Data

Numbers in `R` are generally treated as numeric objects (i.e. double precision real numbers). Numeric data handles both integers and decimals.

```
x = 3.5 # assign a value
x # print the value of x
## [1] 3.5
class(x) # print the class name of x
## [1] "numeric"
```

Even if we assign a whole number to a variable `x`, it is still saved as a numeric value.

```
x = 1
x # print the value of x
## [1] 1
class(x) # print the class name of x
## [1] "numeric"
```

There is also a special number `Inf` which represents infinity; e.g. `1/0` gives `Inf`. `Inf` can be used in ordinary calculations. For example, `1/Inf` is 0.

The value `NaN` represents an undefined value (“not a number”), e.g. `0/0` would produce `NaN`. `NaN` can also be thought of as a missing value. We are going to say more on that later.
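The special values described above can be tried out directly at the console. The following is a small sketch using only base `R` functions; `is.nan()`, `is.na()` and `is.finite()` are not introduced in the text but are standard tests for these values.

```r
1 / 0             # division by zero gives Inf
## [1] Inf
1 / Inf           # Inf can be used in ordinary arithmetic
## [1] 0
Inf - Inf         # undefined, so the result is NaN
## [1] NaN
is.nan(0 / 0)     # test for NaN
## [1] TRUE
is.na(0 / 0)      # NaN also counts as a missing value
## [1] TRUE
is.finite(1 / 0)  # Inf is not a finite number
## [1] FALSE
```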

## 3.2 Integers

A number entered in `R` is *automatically* stored as `numeric`, so if you explicitly want an `integer` type, you need to add the `L` suffix. For example, entering `1` gives you a `numeric` object, whereas entering `1L` explicitly gives you an `integer`.

Testing whether a variable is numeric or an integer is done with the functions `is.numeric()` and `is.integer()`:

```
x <- 2.3
is.integer(x)
## [1] FALSE
is.numeric(x)
## [1] TRUE
x <- 2L
is.integer(x)
## [1] TRUE
```

Another way of creating `integer` data is to use the `as.integer()` function:

```
y = as.integer(3)
y # print the value of y
## [1] 3
class(y) # print the class name of y
## [1] "integer"
is.integer(y) # is y an integer?
## [1] TRUE
```
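A few further cases may help make the distinction between `integer` and `numeric` concrete. This is a minimal sketch; the variable name `z` is only illustrative.

```r
z <- as.integer(3.9)  # the fractional part is dropped, not rounded
z
## [1] 3
class(5L)             # the L suffix creates an integer directly
## [1] "integer"
is.numeric(5L)        # integers also pass the is.numeric() test
## [1] TRUE
class(5L + 1.5)       # mixing integer and numeric gives numeric
## [1] "numeric"
```

Note that `is.numeric()` returns `TRUE` for both integers and doubles, so it cannot be used to tell the two apart; `is.integer()` can.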

## 3.3 Characters

A character object is used to represent string values in `R`.

```
x <- "hello"
x
## [1] "hello"
```

We convert objects into character values with the `as.character()` function.
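As a short sketch of `as.character()` in use (the variable name `y` is only illustrative):

```r
as.character(1881)      # numeric to character
## [1] "1881"
as.character(TRUE)      # logical to character
## [1] "TRUE"
y <- as.character(2.5)
class(y)                # the result is a character object
## [1] "character"
```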

To find the number of characters in a `character` or `numeric` value, we use the `nchar()` function:

```
x <- "value"
nchar(x)
## [1] 5
x <- 1881
nchar(x)
## [1] 4
```

Another example:

```
vec_char <- c("My", "Great", "Title")
vec_char
## [1] "My" "Great" "Title"
length(vec_char)
## [1] 3
paste(vec_char, collapse = " ")
## [1] "My Great Title"
myGreatTitle <- paste(vec_char, collapse = " ")
myGreatTitle <- c(myGreatTitle, ": yet to come!")
paste(myGreatTitle, collapse = " ")
## [1] "My Great Title : yet to come!"
```

Using the function `paste()` we can join several character vectors that are each of length 1:

```
paste("My", "great", "title", sep = " ")
## [1] "My great title"
length(paste("My", "great", "title", sep = " "))
## [1] 1
```

or we can join two vectors of length greater than 1:

```
paste(c("My", "great", "title"), 1:3, sep = "")
## [1] "My1" "great2" "title3"
```

When the vectors are not of equal length, `R` recycles the shorter vector until it matches the length of the longer one:

```
paste(letters, 1:5, sep = "-")
## [1] "a-1" "b-2" "c-3" "d-4" "e-5" "f-1" "g-2" "h-3" "i-4" "j-5" "k-1"
## [12] "l-2" "m-3" "n-4" "o-5" "p-1" "q-2" "r-3" "s-4" "t-5" "u-1" "v-2"
## [23] "w-3" "x-4" "y-5" "z-1"
```

It is also worth noting that the integer vector `1:5` gets coerced into a character vector by the `paste()` function. We will talk more about coercion later.
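The coercion can be seen by comparing the class of the input with the class of the result. A minimal sketch, with illustrative variable names `nums` and `tags`:

```r
nums <- 1:5
class(nums)                          # the input is an integer vector
## [1] "integer"
tags <- paste("item", nums, sep = "_")
tags
## [1] "item_1" "item_2" "item_3" "item_4" "item_5"
class(tags)                          # the result is a character vector
## [1] "character"
```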

## 3.4 Complex Numbers

As the name suggests, this data type handles complex numbers. In this book, we are not going to encounter complex numbers often (unless you do something wrong!), so I am not going to go into the details of this data type.

```
x <- 2 + 3i
x
## [1] 2+3i
```

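Note that it is not `x <- 2 + 3*i`. The imaginary unit must be attached directly to a number, as in `3i`; otherwise `R` looks for an object named `i`.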
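For completeness, here is a brief sketch of a few base `R` functions that work on complex numbers. These functions are not discussed in the text and are shown only as an illustration.

```r
x <- 2 + 3i
class(x)
## [1] "complex"
Re(x)                  # real part
## [1] 2
Im(x)                  # imaginary part
## [1] 3
Conj(x)                # complex conjugate
## [1] 2-3i
# sqrt(-1) on a numeric value gives NaN with a warning;
# converting to complex first gives the imaginary unit
sqrt(as.complex(-1))
## [1] 0+1i
```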

## 3.5 Logical

Logical data represents data that can be either `TRUE` or `FALSE`.

```
x <- TRUE # logical
class(x)
## [1] "logical"
is.logical(x) # logicals have their own test function
## [1] TRUE
as.numeric(TRUE) # numeric
## [1] 1
```

Numerically, `TRUE` is the same as `1` and `FALSE` is the same as `0`.

```
x <- TRUE * 2
x
## [1] 2
y <- FALSE + 3
y
## [1] 3
```

Note that when `logical` data is mixed with `numeric` data, it is automatically treated as `numeric`.
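One practical consequence of this is that logical vectors can be counted and averaged. The sketch below uses a made-up vector `passed`:

```r
passed <- c(TRUE, FALSE, TRUE, TRUE)  # illustrative data
sum(passed)    # number of TRUE values
## [1] 3
mean(passed)   # proportion of TRUE values
## [1] 0.75
```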

Logicals can result from comparing two numbers, characters or conditions. The main operators that produce logical data are summarized in the table below.

**R’s Comparison and Logical Operators:**

| Operator/Function | R Command | Example | Output |
|---|---|---|---|
| Equality | `==` | `2==3` | FALSE |
| Not equal | `!=` | `2!=3` | TRUE |
| Negation | `!()` | `!(2==3)` | TRUE |
| Greater than | `>` | `3>2` | TRUE |
| Less than | `<` | `3<2` | FALSE |
| Greater than or equal | `>=` | `3>=2` | TRUE |
| Less than or equal | `<=` | `3<=2` | FALSE |
| And | `&` | `(3<=2)&(5>3)` | FALSE |
| Or | `\|` | `(3<=2)\|(5>3)` | TRUE |

Note that the equality operator `==` is different from the usual equality sign `=`, which performs assignment in `R`. As with the mathematical operators, the comparison and logical operators are also vectorized.

Some examples:

```
2 > 3
## [1] FALSE
2 != 3
## [1] TRUE
x <- 1:5
y <- 5:1
x >= y
## [1] FALSE FALSE TRUE TRUE TRUE
"value" == "home"
## [1] FALSE
"value" > "house"
## [1] TRUE
```
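The operators `&`, `|` and `!` work element by element in the same way as the comparison operators. A short sketch:

```r
x <- 1:5
(x > 1) & (x < 5)   # TRUE only where both conditions hold
## [1] FALSE  TRUE  TRUE  TRUE FALSE
(x < 2) | (x > 4)   # TRUE where at least one condition holds
## [1]  TRUE FALSE FALSE FALSE  TRUE
!(x == 3)           # negates each element
## [1]  TRUE  TRUE FALSE  TRUE  TRUE
```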

## 3.6 Dates

Dates are not really one of the atomic objects, but I am going to present them in this section as another type of data in `R`.

There are two main classes for date data: `Date` and `POSIXct`. `Date` stores just a date, while `POSIXct` stores a date and time. Both objects are actually represented as the number of days (`Date`) or seconds (`POSIXct`) since January 1, 1970.

As I mentioned above, a date is not a separate atomic data type. For example, if we write a date as `"2015-07-20"`, it is `character` data, and we use the `as.Date()` function to convert `character` data to a `Date`:

```
x <- as.Date("2015-07-20")
x
## [1] "2015-07-20"
class(x)
## [1] "Date"
# number of days between 1970-01-01 and 2015-07-20
as.numeric(x)
## [1] 16636
```

Compare the result with this:

```
x <- as.POSIXct("2015-07-20 12:00")
x
## [1] "2015-07-20 12:00:00 EDT"
class(x)
## [1] "POSIXct" "POSIXt"
# number of seconds between 1970-01-01 00:00 UTC and 2015-07-20 12:00 local time
# (the printed time zone and this value depend on the system's time zone)
as.numeric(x)
## [1] 1437408000
```

The general form of the `as.Date()` function is

`as.Date(x, "input_format")`

where `x` is the character data and `input_format` gives the appropriate format for reading the date. Date formats are presented in the following table.

| Symbol | Meaning | Example |
|---|---|---|
| `%d` | Day as a number (01-31) | 01-31 |
| `%a` | Abbreviated weekday | Mon |
| `%A` | Unabbreviated weekday | Monday |
| `%m` | Month as a number (01-12) | 01-12 |
| `%b` | Abbreviated month | Jan |
| `%B` | Unabbreviated month | January |
| `%y` | Two-digit year | 19 |
| `%Y` | Four-digit year | 2019 |

The default format for inputting dates is `yyyy-mm-dd`. The statement

```
mydates <- as.Date(c("2007-06-22", "2004-02-13"))
mydates
## [1] "2007-06-22" "2004-02-13"
```

converts the character data to dates using this default format. In contrast,

```
strDates <- c("01/05/1965", "08/16/1975")
mydates <- as.Date(strDates, "%m/%d/%Y")
mydates
## [1] "1965-01-05" "1975-08-16"
```

reads the data using a `mm/dd/yyyy` format.
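Other input formats work the same way. The sketch below sticks to numeric format symbols, because the names matched by `%a`, `%A`, `%b` and `%B` depend on the locale of the machine. The input strings are made up for illustration.

```r
as.Date("20.07.15", "%d.%m.%y")     # two-digit year
## [1] "2015-07-20"
as.Date("2015/07/20", "%Y/%m/%d")   # slashes instead of dashes
## [1] "2015-07-20"
as.Date("20072015", "%d%m%Y")       # no separators at all
## [1] "2015-07-20"
# A format that does not match the input gives NA rather than an error
as.Date("07/20/2015", "%Y-%m-%d")
## [1] NA
```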

Once the variable is in date format, you can analyze and plot the dates using the wide range of analytic techniques covered in later chapters.

Two functions are especially useful for time-stamping data. `Sys.Date()` returns today’s date, and `date()` returns the current date and time as a character string.
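The difference between the two functions shows up in the class of what they return. The sketch below also calls `Sys.time()`, which is not mentioned in the text; it is the base `R` function that returns the current date and time as a `POSIXct` object. The values themselves depend on when the code is run, so only the classes are shown.

```r
today <- Sys.Date()
class(today)     # a Date object
## [1] "Date"
stamp <- date()
class(stamp)     # a plain character string
## [1] "character"
now <- Sys.time()
class(now)       # a date-time object
## [1] "POSIXct" "POSIXt"
```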

You can use the `format(x, format = "output_format")` function to output dates in a specified format and to extract portions of dates:

```
today <- Sys.Date()
format(today, format="%B %d %Y")
## [1] "February 04 2019"
format(today, format="%A")
## [1] "Monday"
```

When `R` stores dates internally, they’re represented as the number of days since January 1, 1970, with negative values for earlier dates. That means you can perform arithmetic operations on them. For example,

```
startdate <- as.Date("2004-02-13")
enddate <- as.Date("2011-01-22")
days <- enddate - startdate
days
## Time difference of 2535 days
```

displays the number of days between February 13, 2004 and January 22, 2011.
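A few more operations follow from the same internal representation. This is a minimal sketch reusing the two dates from the example above; `difftime()` is a base `R` function not covered in the text that reports a difference in a chosen unit.

```r
startdate <- as.Date("2004-02-13")
enddate <- as.Date("2011-01-22")

as.numeric(enddate - startdate)   # the difference as a plain number
## [1] 2535
difftime(enddate, startdate, units = "weeks")
## Time difference of 362.1429 weeks
startdate + 30                    # the date 30 days later
## [1] "2004-03-14"
enddate > startdate               # dates can be compared
## [1] TRUE
```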

**Exercise (Dates)** Discuss what happens when, instead of using `as.Date()`, you accidentally save the date as

`x <- "2015-07-20"`

How would you convert a numeric value to a date?