diff Function in R (2 Examples) | How to Calculate the Difference in R
In this article, I’ll explain how to calculate differences of a vector with the diff function in R. Let’s first have a look at the basic R syntax and the definition of diff:
Basic R Syntax of diff():
diff(x)
Definition of diff():
The diff function computes the difference between pairs of consecutive elements of a numeric vector.
In the following, I’ll show you two examples for the application of diff in the R programming language. So without further ado, let’s move on to the examples.
Example 1: diff Function With Default Specifications
The diff function is usually applied to a numeric vector, array, or column of a data frame. So let’s create such a vector first:
x <- c(5, 2, 10, 1, 3) # Create example vector
Our example vector contains five values between 1 and 10. Now let’s use the diff command to compute the difference between each pair of consecutive values of this vector:
diff(x) # Apply diff in R
# -3 8 -9 2
So what happened here? The diff function did four separate calculations:
- 2 - 5 = -3
- 10 - 2 = 8
- 1 - 10 = -9
- 3 - 1 = 2
The R diff function subtracted the first value from the second, the second value from the third, the third value from the fourth, and the fourth value from the fifth. In other words: diff returned the differences at a lag of 1 to the RStudio console. Note that the output is one element shorter than the input vector.
Could we also calculate a bigger lag? Yes, of course, and that’s what I’m going to show you next!
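To make these four calculations explicit, the same result can be reproduced with plain vector indexing. The following is only a minimal sketch; the object names x_diff and x_manual are illustrative and not part of the original example:

```r
x <- c(5, 2, 10, 1, 3)              # Example vector from above

x_diff   <- diff(x)                 # Built-in differences with the default lag of 1
x_manual <- x[-1] - x[-length(x)]   # Vector without first value minus vector without last value

x_manual                            # -3  8 -9  2
all.equal(x_diff, x_manual)         # TRUE
```

Both approaches return four values for a vector of length five, because there are only four pairs of neighbouring elements.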
Example 2: diff Function With Lag Larger Than 1
The diff function provides the argument “lag”. The default value of this argument is 1, as we have seen in Example 1. This argument is especially handy when we are dealing with time series data.
If we want to increase the size of the lag, we can specify the lag option within the diff command as follows:
diff(x, lag = 2) # Apply diff with lag
# 5 -1 -7
In this example, we are using a lag of 2. In the following figure, you can see how this output is computed:
Figure 1: Calculations of diff Function with Lag of Two.
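The lag-2 calculation can also be written out by hand: each value is compared with the value two positions earlier. The sketch below assumes the example vector from Example 1; the names lag_size, later, and earlier are hypothetical helpers introduced only for illustration:

```r
x <- c(5, 2, 10, 1, 3)          # Example vector from above
lag_size <- 2                   # Hypothetical helper holding the lag

later   <- tail(x, -lag_size)   # 10  1  3 (drop the first two values)
earlier <- head(x, -lag_size)   #  5  2 10 (drop the last two values)

later - earlier                 #  5 -1 -7
all.equal(later - earlier, diff(x, lag = lag_size)) # TRUE
```

With a lag of 2, the output has only three values, i.e. the length of the input minus the lag.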
Alternative R Functions for the Calculation of Differences
The diff function is by no means the only R function that computes differences of data objects. It makes a lot of sense to explore other difference functions as well, so that you can decide from situation to situation which function suits your needs best.
To give you some examples: I recommend having a look at functions such as difftime for the calculation of time differences; setdiff for the identification of elements of a data object A that do not exist in a data object B; or sweep, which applies an operation such as minus to a data matrix by row or by column.
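The following short sketch shows one possible call for each of these three base R functions. All input values and object names (d, only_in_a, m, centered) are made up for illustration:

```r
# difftime: time difference between two dates
d <- difftime(as.Date("2024-03-01"), as.Date("2024-01-01"), units = "days")
d                                            # Time difference of 60 days

# setdiff: elements of the first vector that are not in the second
only_in_a <- setdiff(c(1, 2, 3, 4, 5), c(2, 4))
only_in_a                                    # 1 3 5

# sweep: subtract the column means from each column of a matrix
m <- matrix(1:6, nrow = 2)                   # Columns: (1, 2), (3, 4), (5, 6)
centered <- sweep(m, 2, colMeans(m), "-")    # MARGIN = 2 means "by column"
centered                                     # First row all -0.5, second row all 0.5
```

Note that none of these functions computes lagged differences the way diff does; they cover different meanings of the word “difference”.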
If you want to learn more about the computation of differences in R, you could also have a look at the following video tutorial from the YouTube channel Xperimental Learning. In the video, the speaker explains how to use the setdiff function. Have fun with the video and let me know in the comments which difference functions you like the most!