R Programming Language (Analysis Software for Statistics & Data Science)
R is a programming language and software that is becoming increasingly popular in the disciplines of statistics and data science.
R is a dialect of the S programming language and was developed by Ross Ihaka and Robert Gentleman in the early 1990s; it became open source software in 1995. The first stable release (version 1.0.0) followed in the year 2000.
The R software is completely free and is developed collaboratively by its community (open source software) – every R user can publish new add-on packages.
The open source philosophy of R stands in sharp contrast to most traditional statistical software (e.g. SAS, SPSS, or Stata), where development is in the hands of a paid development team.
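As a rough illustration of how add-on packages are used in practice, the following sketch loads a package and installs it from CRAN first if it is missing. The helper name load_or_install is hypothetical (it is not part of R); requireNamespace(), install.packages(), and library() are standard R functions.

```r
# Hypothetical helper (not part of R itself): load a package and
# install it from CRAN first if it is not available yet.
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # needs internet access and a configured CRAN mirror
  }
  library(pkg, character.only = TRUE)
}

# "stats" ships with every R installation, so nothing is downloaded here
load_or_install("stats")
```

For a package that is not installed yet, e.g. load_or_install("ggplot2"), the same call would trigger a download from CRAN before loading it.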
All R Tutorials on statistical-programming.com
Below, you can find a list of R tutorials on statistical-programming.com. In the tutorials, I explain statistical concepts and provide reproducible example code in R.
asp in R Plot (2 Example Codes) | Set Aspect Ratio of Scatterplot & Barplot
attr, attributes & structure Functions in R | 4 Examples (get, remove & set)
cbind R Command | 3 Example Codes (Data Frame, Vector & Multiple Columns)
colSums, rowSums, colMeans & rowMeans in R | 5 Example Codes + Video
Complete Cases in R (3 Programming Examples)
cumsum R Function Explained (Example for Vector, Data Frame, by Group & Graph)
dir R Function | 3 Example Codes
droplevels R Example | How to Drop Factor Levels of Vector & Data Frame
How to Convert a Character to Numeric in R
How to Convert a Factor to Numeric in R
How to Rename a Column Name in R | 3 Examples to Change Colnames of a Data Frame
lowess() R Smoothing Function | 2 Example Codes for Normalization by Lowess Regression
NA Omit in R | 3 Example Codes for na.omit (Data Frame, Vector & by Column)
R Find Missing Values (6 Examples for Data Frame, Column & Vector)
R Functions List (+ Examples) | All Basic Commands of the R Programming Language
R is.na Function Example (remove, replace, count, if else, is not NA)
R max and min Functions | 8 Examples: Remove NA Value, Two Vectors, Column & Row
R NA – What are <Not Available> Values?
R outer Function | 4 Example Codes (Basic Application & User Defined)
R pairs & ggpairs Plot Functions | 5 Example Codes (Color, Labels, Panels & by Group)
R polygon Function | 6 Example Codes (Frequency & Density Plot)
R pretty Function | 3 Example Codes (Interval Sequence & Set Axis Labels of Plot)
R Replace Last Comma of Character with &-Sign (5 Examples)
R Replace NA with 0 (10 Examples for Data Frame, Vector & Column)
R substr & substring Functions | Examples: Remove, Replace, Match in String
R sweep Function | 3 Example Codes (Matrix Operation with MARGIN & STATS)
R union Function | 3 Example Codes (Two Vectors, Data Frames & Lists)
R unlist Function | 3 Example Codes (List of Vectors, Data Frame & String)
rbind in R | 3 Examples (Vector, Data Frame & rbind.fill for Missing Columns)
readLines, n.readLines & readline in R (6 Example Codes)
rev R Function | 3 Examples (Reverse of Vector, Data Frame by Column & by Row)
setNames vs. setnames in R (+ Examples) | stats & data.table Package
strptime & strftime in R | 5 Example Codes (How to Set Year, Day, Hour & Time Zone)
The all & any R Functions | 4 Example Codes
The difftime R Function | 3 Examples (Return Time Difference in Days, Seconds or Weeks)
The dim Function in R (4 Examples)
The is.null Function in R (4 Examples)
The jitter R Function | 3 Example Codes (Basic Application & Boxplot Visualization)
The length Function in R (3 Examples for Vector, List & String)
The nchar R Function | 3 Examples (String, Vector & Error: nchar Requires a Character)
The ncol Function in R (3 Examples)
The nrow Function in R (4 Examples)
The pmax and pmin R Functions | 3 Examples (How to Handle Warnings & NA)
The segments R Function | 3 Example Codes
The setdiff R Function (3 Example Codes)
The Increasing Popularity of R Programming
Since the R programming language provides features for almost all statistical tasks at no cost to the user, its use has grown rapidly since its release. Let’s check some numbers…
Graphic 1: Google Scholar Search Results for R Programming Filtered by Year
Reasons to Learn R
The pros:
+ R is free
+ R’s popularity is growing – More and more people will use it
+ Almost all statistical methods are available in R
+ New methods are implemented in add-on packages quickly
+ Algorithms for packages and functions are publicly available (transparency and reproducibility)
+ R provides a huge variety of graphical outputs
+ R is very flexible – Essentially everything can be modified for your personal needs
+ R runs on all major operating systems (e.g. Windows, macOS, or Linux)
+ R has a huge community that is organized in forums to help each other (e.g. Stack Overflow)
+ R is fun 🙂
The cons:
– Relatively steep learning curve at the beginning (even though it’s worth it)
– No systematic validation of new packages and functions
– No company in the background that takes responsibility for errors in the code (this is especially important for public institutes)
– R is almost exclusively based on programming (no extensive drop-down menus as in SPSS)
– R can have problems with computationally intensive tasks (only important for advanced users)
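To give a feel for what working "by programming" instead of drop-down menus looks like, here is a minimal sketch of a typical analysis step in base R. It uses the built-in mtcars data set purely as an illustration; the choice of data and model is an assumption and is not taken from the tutorials.

```r
# Built-in example data set that ships with R
data(mtcars)

# Descriptive statistics for fuel consumption (miles per gallon)
print(summary(mtcars$mpg))

# Simple linear regression: fuel consumption explained by car weight
fit <- lm(mpg ~ wt, data = mtcars)

# Estimated intercept and slope
print(coef(fit))
```

Every step is a typed command, which is harder to learn than clicking through menus but makes the whole analysis reproducible.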
Not sure yet whether you should learn the R programming language? In that case, I can recommend the following video from the YouTube channel RenegadeThinking. The speaker gives many reasons why it is advisable to learn R.
Appendix
Appendix 1: R code for the creation of Graphic 1
library("ggplot2")                                      # Provides ggplot() and geoms
library("scales")                                       # Provides comma labels for the axis

year <- 2018:2000                                       # Years

r_gs <- c(21600 * 2, 43300, 43100, 38100, 33200, 29800, # Google Scholar searches
          28500, 25500, 22400, 19100, 15900, 12000,
          8270, 5930, 3740, 2600, 1980, 1600, 1360)

data <- data.frame(software = rep("R", 19),             # Combine data
                   year = year,
                   searches = r_gs)

ggplot(data) +                                          # Create plot
  geom_point(aes(x = year, y = searches, color = software, shape = software)) +
  geom_line(aes(x = year, y = searches, color = software)) +
  theme(legend.title = element_blank(),
        legend.position = "none") +
  ggtitle("Google Scholar Search Results") +
  labs(x = "Year", y = "Search Results") +
  scale_y_continuous(labels = comma)
Appendix 2: How to create the header graphic of this page
par(mar = c(0, 0, 0, 0))                                # Remove space around plot
par(bg = "#1b98e0")                                     # Set background color

set.seed(10293847)                                      # Seed
N <- 100000                                             # Sample size
x <- rnorm(N)                                           # X variable
y <- rnorm(N) + x                                       # Correlated Y variable

plot(x, y, col = "#353436", pch = 19, cex = 0.1,        # Create plot
     xlim = c(- 4, 4), ylim = c(- 7, 7))

text(0, 0, "R", col = "#1b98e0", cex = 12)              # Write R

points(0, 0, col = "#1b98e0", cex = 30, lwd = 5)        # Create circles
points(0, 0, col = "#1b98e0", cex = 50, lwd = 5)
points(0, 0, col = "#1b98e0", cex = 70, lwd = 5)
points(0, 0, col = "#1b98e0", cex = 90, lwd = 5)
points(0, 0, col = "#1b98e0", cex = 110, lwd = 5)

box(col = "#1b98e0")                                    # Color of box