Variables in R

Variables are Buckets

Variables are buckets that hold information.

A variable is a symbolic name for (or reference to) information. Variables in computer programming are analogous to “buckets”, where information can be maintained and referenced. On the outside of the bucket is a name. When referring to the bucket, we use the name of the bucket, not the data stored in the bucket.

An example:
> x<-3

In the example above, we created a variable, or ‘bucket’, called x and put the value 3 inside it. Let’s create another variable called y and give it a value of 5. When you assign a value to a variable, R does not print anything to the console. To display the value, either wrap the assignment in parentheses or type the variable’s name.

Other examples:
> y<-5
> x+y
[1] 8
> s<-x+y
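As a small sketch of the two ways of displaying a value described above, the script below reuses the names x, y and s from this section. Wrapping an assignment in parentheses assigns and prints in one step, while typing the name alone prints what is stored.

```r
x <- 3
y <- 5

# Wrapping the assignment in parentheses assigns and prints in one step
(s <- x + y)   # prints [1] 8

# Typing the name on its own prints the stored value
s              # prints [1] 8
```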

Basic Data Types

Numeric

Decimal values are called numerics in R. It is the default computational data type. If we assign a decimal value to a variable x as follows, x will be of numeric type.

> x = 10.5 # assign a decimal value
> x # print the value of x
[1] 10.5
> class(x) # print the class name of x
[1] "numeric"

Furthermore, even if we assign a whole number to a variable k, it is still saved as a numeric value.

The fact that k is not an integer can be confirmed with the is.integer function. How to create an integer is covered in the Integer section below.
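A minimal sketch of this check, assuming k is assigned the whole number 1 (the specific value is only an illustration):

```r
k <- 1          # a whole number typed without a decimal point

class(k)        # "numeric"
is.integer(k)   # FALSE
```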

Integer

To create an integer variable in R, we invoke the as.integer function and assign the result to a variable, say y. We can then confirm that y is indeed an integer by applying the is.integer function.
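For instance, the following sketch creates an integer y from the value 3 (the value is chosen only for illustration) and checks its type:

```r
y <- as.integer(3)

y               # 3
class(y)        # "integer"
is.integer(y)   # TRUE
```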

We can coerce a numeric value into an integer with the same as.integer function.
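A short illustration of this coercion, using arbitrary example values. Note that as.integer drops the fractional part rather than rounding:

```r
a <- as.integer(3.14)   # 3 (the fractional part is dropped)
b <- as.integer(-3.9)   # -3 (truncation is toward zero, not rounding)

a
b
```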

And we can parse a string for decimal values in much the same way.
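As a sketch, parsing the string "5.27" (an arbitrary example) reads the number and then truncates it to an integer:

```r
n <- as.integer("5.27")

n               # 5
is.integer(n)   # TRUE
```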

On the other hand, trying to parse a non-decimal string is a mistake: R cannot convert it, so it returns NA and issues a warning.
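In base R, such a call does not stop the script; it returns NA together with a coercion warning. The sketch below uses the made-up string "Joe" and wraps the call in suppressWarnings only to keep the output tidy:

```r
# Without suppressWarnings(), R also emits a coercion warning here
result <- suppressWarnings(as.integer("Joe"))

result          # NA
is.na(result)   # TRUE
```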

Often, it is useful to perform arithmetic on logical values. Like the C language, TRUE has the value 1, while FALSE has value 0.
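The following sketch shows the numeric values of the logical constants and one common use, counting how many elements of a logical vector are TRUE (the vector itself is an arbitrary example):

```r
as.integer(TRUE)    # 1
as.integer(FALSE)   # 0

TRUE + TRUE         # 2

# Summing a logical vector counts its TRUE elements
flags <- c(TRUE, FALSE, TRUE)
n_true <- sum(flags)
n_true              # 2
```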

Character

Logical

Complex

Basic Data Structures

Vectors

  • A vector is a collection of values of a single type, such as numbers or characters; if numbers and characters are mixed, R converts them all to characters

  • Vectors are the most common and basic data structure in R, and they are the workhorse of R

  • The analogy is a bucket with different compartments; each compartment is called an element

  • Each element contains a single value

  • There is no practical limit to the number of elements, other than available memory

  • The vector is assigned to a single variable, because regardless of how many elements it contains it is still a single bucket

  • We create a vector named V shown in the image on the left hand side: V<-c(1,2,3)

Each element of the vector contains a single numeric value, and three values will be combined together using c() (the combine function). All of the values are put within the parentheses and separated with a comma.
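As a sketch, the code below builds the vector V from the text, inspects it, and then shows what happens when types are mixed. The vector named mixed is a hypothetical example added here; in base R, c() coerces mixed input to a common type instead of keeping each element's original type.

```r
V <- c(1, 2, 3)

V            # 1 2 3
length(V)    # 3
V[2]         # 2 (elements are indexed starting at 1)
class(V)     # "numeric"

# Mixing types: every element is coerced to character
mixed <- c(1, "a", TRUE)
mixed          # "1" "a" "TRUE"
class(mixed)   # "character"
```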

Factors

  • Factors are used to represent categorical data

  • Factors can be ordered or unordered

  • Factors are an important class for statistical analysis and for plotting

  • Factors are stored as integers, and have labels associated with these unique integers

  • While factors look (and often behave) like character vectors, they are actually integers under the hood

  • You need to be careful when treating them like strings

  • To create a factor vector we use the factor() function:
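A minimal sketch of the points above, using made-up categories. It creates an unordered and an ordered factor, shows the integer codes behind the labels, and illustrates a common pitfall when converting a factor of number-like labels:

```r
# Unordered factor; levels are sorted alphabetically by default
sizes <- factor(c("small", "large", "medium", "small"))
levels(sizes)       # "large" "medium" "small"
as.integer(sizes)   # 3 1 2 3 (the underlying integer codes)

# Ordered factor with an explicit level order
ordered_sizes <- factor(c("small", "large", "medium", "small"),
                        levels = c("small", "medium", "large"),
                        ordered = TRUE)
ordered_sizes[1] < ordered_sizes[2]   # TRUE (small < large)

# Pitfall: converting directly gives the codes, not the labels
f <- factor(c("10", "20", "10"))
as.integer(f)                  # 1 2 1
as.numeric(as.character(f))    # 10 20 10
```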

Matrix

  • A matrix in R is a collection of vectors of same length and identical datatype

  • Vectors can be combined as columns in the matrix or by row

  • Usually matrices are numeric and used in various computational algorithms to serve as a checkpoint

  • If the input data is not of an identical data type (numeric, character, etc.), the matrix() function does not throw an error; it silently coerces all values to a common type (for example, numbers become character strings)
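The sketch below, with arbitrary example values, builds a matrix directly, combines two vectors by column and by row, and shows that in base R matrix() coerces mixed input to a common type rather than stopping:

```r
# matrix() fills column by column unless byrow = TRUE is given
m <- matrix(1:6, nrow = 2, ncol = 3)
dim(m)     # 2 3
m[1, 2]    # 3 (row 1, column 2)

# Combining vectors of the same length
a <- c(1, 2, 3)
b <- c(4, 5, 6)
by_col <- cbind(a, b)   # 3 rows, 2 columns
by_row <- rbind(a, b)   # 2 rows, 3 columns

# Mixed input is coerced to character; no error is raised
mixed <- matrix(c(1, "a", TRUE, 2), nrow = 2)
typeof(mixed)   # "character"
```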

DataFrames

  • A data.frame is a collection of vectors of identical lengths

  • Each vector can be of a different data type (e.g., characters, integers, factors)

  • data.frame is the de facto data structure for most tabular data, and each vector represents a column

  • data.frame is commonly used for statistics and plotting
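As an illustration, the following sketch builds a small data.frame from three vectors of equal length but different types. The column names and values are invented for this example; stringsAsFactors = FALSE is passed explicitly so the name column stays a character vector on older R versions too.

```r
df <- data.frame(
  name  = c("Ann", "Bob", "Cy"),        # character column
  age   = c(28L, 35L, 42L),             # integer column
  group = factor(c("a", "b", "a")),     # factor column
  stringsAsFactors = FALSE
)

str(df)           # compact summary of each column and its type
nrow(df)          # 3
ncol(df)          # 3
class(df$age)     # "integer"
df$name[2]        # "Bob"
```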

Lists

  • A list is a collection of data structures

  • There is no particular restriction for the components to be of the same mode or type

  • For example, a list could consist of a numeric vector, a logical value, a matrix, a complex vector, a character array, a function, and so on
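A sketch of such a list, with component names and values chosen purely for illustration, along with the usual ways of reaching into it:

```r
lst <- list(
  numbers = c(1, 2, 3),              # numeric vector
  flag    = TRUE,                    # logical value
  mat     = matrix(1:4, nrow = 2),   # matrix
  z       = c(1+2i, 3-1i),           # complex vector
  words   = c("a", "b"),             # character vector
  f       = function(x) x^2          # function
)

length(lst)            # 6
is.matrix(lst$mat)     # TRUE
lst[["numbers"]][2]    # 2
lst$f(4)               # 16
```

Here $ and [[ ]] both extract a single component, which can then be used like any other object of its type.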
