readLines, n.readLines & readline in R (6 Example Codes)

 

In this tutorial, I’m going to show you how to read text by line with three different R functions:

  1. readLines (Examples 1-4)
  2. n.readLines (Example 5)
  3. readline (Example 6)

Let’s start with the basic R syntax of these three functions and some definitions:

Basic R Syntax:

readLines("path/filename.txt")
 
n.readLines("path/filename.txt", n = 5, skip = 2)
 
readline("question")

 

The readLines function reads text lines from an input file.

The n.readLines function of the reader package provides additional functionality for reading lines, such as skipping ahead in a file or ignoring comments and headers.

The readline function interactively reads a line from the terminal.

In order to get a bit more concrete, let’s move on to the examples…

 

Example 1: Read Lines of txt File via readLines R Function

When you have to do text mining / text analysis of larger texts, you will typically be provided with relatively unstructured .txt files.

The readLines function is perfect for such text files, since it reads the text line by line and returns a character vector with one element for each of the lines.

For the first example, I’m going to create a simple txt file that we can use for the application of readLines. In case you want to reproduce the example, simply copy and paste the following code.

So, let’s first store the path of the directory in which we want to save and load our example data…

# Store currently used directory
path <- getwd()

…and then let’s create a txt file in this directory:

# Write example text to currently used directory
writeLines(c("this is the first line",
             "this is the second line",
             "this is the third line"),
           con = paste(path, "/my_txt.txt", sep = ""))

If you run this code on your computer, there should be a new txt file in the folder that is currently used by R (check the folder location via getwd()). The txt file looks as follows:

 


Figure 1: Text File for the Application of readLines().

 

Now, we can apply the R readLines command to this text file:

# Apply readLines function to txt file
my_txt <- readLines(paste(path, "/my_txt.txt", sep = ""))
my_txt
# "this is the first line"  "this is the second line" "this is the third line"

The output of the function is a vector that contains 3 character strings, i.e. this is the first line, this is the second line, and this is the third line.
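
To make the structure of this result more explicit, here is a minimal sketch that inspects the returned object. It writes the same three lines to a temporary file, so it can be run on its own; the tempfile() location and the names tmp_file and tmp_txt are assumptions made for this illustration only.

```r
# Create a temporary file with the same three lines (illustrative setup)
tmp_file <- tempfile(fileext = ".txt")
writeLines(c("this is the first line",
             "this is the second line",
             "this is the third line"),
           con = tmp_file)

# Read the file and inspect the returned object
tmp_txt <- readLines(tmp_file)
class(tmp_txt)    # type of the returned object
length(tmp_txt)   # one element per line of the file
tmp_txt[2]        # extract a single line by its position
nchar(tmp_txt)    # number of characters in each line
```

Since the result is an ordinary character vector, all the usual vector tools of R (indexing, length, nchar, grepl and so on) can be applied to the lines directly.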

As you can see, we read the whole txt file into R. Easy! But what if we want to read only certain lines from our text file?

 

Example 2: Read First n Lines Only

Quite often you will be interested in the first n lines of your input file. Fortunately, the readLines R function provides an n argument, which lets you specify the maximum number of lines to read.

We can simply adjust our code as follows…

# Apply readLines function to first two lines
my_txt_ex2 <- readLines(paste(path, "/my_txt.txt", sep = ""),
                        n = 2)
my_txt_ex2
# "this is the first line"  "this is the second line"

…in order to read only the first two lines of our example file.
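
The n argument can also be combined with an open file connection. In that case, each call to readLines continues at the position where the previous call stopped, which allows reading a file in chunks. The following is only a sketch under the assumption of a small temporary example file; the names tmp_file, con, first_line and next_lines are made up for this illustration.

```r
# Create a temporary file with three lines (illustrative setup)
tmp_file <- tempfile(fileext = ".txt")
writeLines(c("this is the first line",
             "this is the second line",
             "this is the third line"),
           con = tmp_file)

# Open a connection in read mode, so that the read position is remembered
con <- file(tmp_file, open = "r")

first_line <- readLines(con, n = 1)   # reads line 1
next_lines <- readLines(con, n = 2)   # continues with lines 2 and 3

# Always close the connection when you are done
close(con)

first_line
next_lines
```

If you pass a file path instead of an open connection, readLines opens and closes the file itself, so every call starts again at the first line.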

Looks good. However, so far we have only used .txt files. What about other file-types?

 

Example 3: readLines from CSV File into R

In this example, I’m going to use the readLines R function to read a data frame that is stored in a .csv file.

Let’s first create an example file in our currently used directory:

# Write example csv to currently used directory
write.csv(iris,
          paste(path, "/iris.csv", sep = ""),
          quote = FALSE)

If you have a look at the currently used folder on your computer, you will find the Iris data set. The first few rows of the data look as follows:

 


Table 1: First 6 Rows of Iris Data Set.

 

We can apply the readLines function to this csv as we did before:

# Apply readLines function to csv file
iris_data <- readLines(paste(path, "/iris.csv", sep = ""),
                       n = 4)
iris_data
# [1] ",Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Species" "1,5.1,3.5,1.4,0.2,setosa"
# [3] "2,4.9,3,1.4,0.2,setosa"                                     "3,4.7,3.2,1.3,0.2,setosa"

readLines returns one character string for each line of the file (the header line plus one line per row of the data frame), whereby the column values within each string are separated by commas.
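
Because each line is just one long string, the individual values have to be separated afterwards if you want to work with them. Here is a small sketch of two possible ways to do that in base R. The vector csv_lines simply repeats the four lines shown above, and the names fields and iris_head are assumptions for this illustration.

```r
# The four lines as returned by readLines in this example
csv_lines <- c(",Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Species",
               "1,5.1,3.5,1.4,0.2,setosa",
               "2,4.9,3,1.4,0.2,setosa",
               "3,4.7,3.2,1.3,0.2,setosa")

# Option 1: Split every line at the commas (returns a list of character vectors)
fields <- strsplit(csv_lines, split = ",", fixed = TRUE)
fields[[2]]   # values of the first data row, still stored as text

# Option 2: Let read.csv parse the lines into a data frame
# (row.names = 1 uses the first, unnamed column as row names)
iris_head <- read.csv(text = csv_lines, row.names = 1)
iris_head
```

Option 1 keeps everything as text, which is typical for text mining tasks. Option 2 is usually more convenient when the lines really contain tabular data, because read.csv converts the numeric columns for you.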

 

Example 4: readLines from xlsx Excel File into R?!

In the previous example, I have shown you how to read a csv file (a plain-text format that Excel can open) with the readLines function. Now you might ask: is it also possible to use readLines for xlsx Excel files?

Answer: Not in a useful way. An xlsx file is a zip-compressed archive of XML files rather than plain text, so readLines cannot return its cell contents as readable lines.

Fortunately, there is an easy workaround in case you want to apply readLines to xlsx files: just convert your xlsx file to csv!

I’m going to show you how:

First, we need to install and load the xlsx R package:

# Install and load xlsx package
install.packages("xlsx")
library("xlsx")

Then, we can use the write.xlsx function to create an xlsx file for our example (we are using the iris data set again):

# Write example xlsx to currently used directory
write.xlsx(iris,
           paste(path, "/iris_xlsx.xlsx", sep = ""),
           row.names = FALSE)

At this point you should have an xlsx file with the name iris_xlsx in your working directory.

Now, we can apply the following R code in order to convert the xlsx file to csv:

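# Convert xlsx to csv
# read.xlsx2 imports the first sheet of the xlsx file as a data frame
iris_xlsx <- read.xlsx2(paste(path, "/iris_xlsx.xlsx", sep = ""),
                        sheetIndex = 1)
# write.csv2 uses semicolons as field separators (write.csv would use commas)
write.csv2(iris_xlsx,
           paste(path, "/iris_converted.csv", sep = ""))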

At this point you should have a csv file with the name iris_converted in your working directory.

After the conversion, you can simply apply readLines, as I have shown you in Example 3.
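
One detail is worth keeping in mind: write.csv2 separates fields with semicolons and writes numeric values with a decimal comma, so the lines look slightly different from the ones in Example 3. The following sketch illustrates this without the xlsx round trip. It writes the first three rows of the built-in iris data directly to a temporary file; the names tmp_csv2, csv2_lines and csv2_fields are assumptions for this illustration.

```r
# Write the first three rows of iris with write.csv2 (illustrative setup)
tmp_csv2 <- tempfile(fileext = ".csv")
write.csv2(head(iris, 3), tmp_csv2)

# Read the file line by line, exactly as in Example 3
csv2_lines <- readLines(tmp_csv2)
csv2_lines

# Fields have to be split at semicolons instead of commas
csv2_fields <- strsplit(csv2_lines, split = ";", fixed = TRUE)
csv2_fields[[2]]   # first data row; numeric values use a decimal comma
```

If you prefer comma-separated lines as in Example 3, write.csv can be used for the conversion instead of write.csv2.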

Easy breezy!

 

Example 5: Skip First Lines via n.readLines [reader Package]

Another quite common scenario is that you are interested in only some of the lines within your text, i.e. you want to skip the first n lines and possibly also the last n lines.

Fortunately, the R package reader provides such options. Let’s first install and load the package:

# Install and load reader R package
install.packages("reader")
library("reader")

We could also use the n.readLines function to produce the same output as we did with readLines of base R in Example 1:

# Apply n.readLines function
n.readLines(paste(path, "/my_txt.txt", sep = ""),
            header = FALSE,
            n = 3)
# "this is the first line"  "this is the second line" "this is the third line"

However, the n.readLines function provides an additional skip-option:

# Apply n.readLines function with skip option
n.readLines(paste(path, "/my_txt.txt", sep = ""),
            header = FALSE,
            n = 2,
            skip = 1)
# "this is the second line" "this is the third line"

We have used n = 2 in order to read 2 lines and we have specified skip = 1 in order to skip the first line.
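
If you do not want to install an additional package, a similar result can be obtained with base R by reading the lines first and subsetting the resulting vector afterwards. This is only a sketch of one possible alternative, not a description of how n.readLines works internally; the names tmp_file, all_lines, selected and middle are made up for this illustration.

```r
# Create a temporary file with three lines (illustrative setup)
tmp_file <- tempfile(fileext = ".txt")
writeLines(c("this is the first line",
             "this is the second line",
             "this is the third line"),
           con = tmp_file)

# Read all lines with base R
all_lines <- readLines(tmp_file)

# Skip the first line and keep the next two lines
skip <- 1
n <- 2
selected <- all_lines[seq(from = skip + 1, length.out = n)]
selected

# Drop the first and the last line
middle <- all_lines[-c(1, length(all_lines))]
middle
```

Note that this approach loads the entire file into memory before subsetting, so for very large files a function that skips lines while reading may be the better choice.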

 

Example 6 (Video): readLines vs. readline – What’s the difference?

Often confusing: Base R provides a function that is called readLines (with upper case L and an s at the end) and a function that is called readline (all in lower case and no s at the end).

Even though both functions are related to each other, they are used for different situations. While readLines is used to read the lines of an input file, readline is used to read the input of the R user interactively (typically by asking questions to the user in the RStudio console).
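
To give you at least a rough idea, here is a minimal sketch of how readline could be used. The helper function ask_user and its default argument are hypothetical names that I made up for this illustration. The check with interactive() is there because readline does not wait for input in non-interactive sessions (e.g. when a script is run via Rscript) and returns an empty string instead.

```r
# Hypothetical helper: ask a question in the console and return the answer
ask_user <- function(question, default = "") {
  if (interactive()) {
    # Shows the question in the console and waits for the user to press Enter
    answer <- readline(prompt = question)
  } else {
    # No user available (e.g. Rscript), so fall back to the default value
    answer <- default
  }
  trimws(answer)   # remove accidental leading and trailing spaces
}

# Ask for a name and use the answer in the output
user_name <- ask_user("What is your name? ", default = "anonymous")
cat("Hello,", user_name, "\n")
```

In the RStudio console, the code stops at the question, waits for your answer, and stores whatever you type as a character string in user_name.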

An interactive function is difficult to demonstrate in a written tutorial. Fortunately, the YouTube channel Docworld Academy has created a simple video on the usage of readline in R.

Have fun with the video, and let me know in the comments if you have any questions.

 

 
