# R is.na Function Example (remove, replace, count, if else, is not NA)

Well, I guess it goes without saying that NA values decrease the quality of our data.

Fortunately, the R programming language provides us with a function that helps us to deal with such missing data: **the is.na function**.

In the following article, I’m going to explain **what the function does** and how the function can be **applied in practice**.

Let’s dive in…

## The is.na Function in R (Basics)

Before we can start, let’s create some **example data** in R (or RStudio).

    set.seed(951)                                # Set seed
    N <- 1000                                    # Sample size

    x_num <- round(rnorm(N, 0, 5))               # Numeric variable
    x_fac <- as.factor(round(runif(N, 0, 3)))    # Factor variable
    x_cha <- sample(letters, N, replace = TRUE)  # Character variable

    # Insert missing values (assumed mechanism; shares as stated in the counting section)
    x_num[rbinom(N, 1, 0.20) == 1] <- NA         # About 20% missings
    x_fac[rbinom(N, 1, 0.30) == 1] <- NA         # About 30% missings
    x_cha[rbinom(N, 1, 0.05) == 1] <- NA         # About 5% missings

    data <- data.frame(x_num, x_fac, x_cha,      # Create data.frame
                       stringsAsFactors = FALSE) # Keep x_cha as character

Our data consists of three columns, each with a different class: numeric, factor, and character. This is what the first six rows of our data look like:

**Table 1: Example Data for the is.na R Function (First 6 Rows)**

Let’s apply the is.na function to our **whole data set**:

    is.na(data)
    #      x_num x_fac x_cha
    # [1,] FALSE FALSE FALSE
    # [2,] FALSE FALSE  TRUE
    # [3,] FALSE FALSE FALSE
    # [4,]  TRUE  TRUE FALSE
    # [5,]  TRUE  TRUE FALSE
    # [6,] FALSE FALSE FALSE
    # ...

The function produces a matrix of **logical values** (i.e. TRUE or FALSE), where TRUE indicates a missing value. Compare the output with the data table above: the TRUE values sit at exactly the positions of the NA elements.

An important feature of is.na is that the function can be **reversed** by simply putting a ! (exclamation mark) in front. In this case, TRUE indicates a value that is not NA in R:

    !is.na(data)
    #      x_num x_fac x_cha
    # [1,]  TRUE  TRUE  TRUE
    # [2,]  TRUE  TRUE FALSE
    # [3,]  TRUE  TRUE  TRUE
    # [4,] FALSE FALSE  TRUE
    # [5,] FALSE FALSE  TRUE
    # [6,]  TRUE  TRUE  TRUE
    # ...

Exactly the opposite output as before!

We are also able to check whether there is or is not an NA value in a **column or vector**:

    is.na(data$x_num)     # Works for numeric ...
    is.na(data$x_fac)     # ... factor ...
    is.na(data$x_cha)     # ... and character

    !is.na(data$x_num)    # The exclamation mark still works
    !is.na(data$x_fac)
    !is.na(data$x_cha)

As you have seen, is.na provides us with logical values that show whether a value is NA or not. We can apply the function to a whole data frame or to a single column (no matter which class the vector has).

That’s nice, but the real power of is.na becomes visible in **combination with other functions**, and that’s exactly what I’m going to show you now.

**On a side note:**

R provides several other is.xxx functions that work in a similar way to is.na (e.g. is.nan, is.null, or is.finite). Much of what you learn here carries over to them, so it can be applied to many different programming scenarios!
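To make the differences between these helpers a bit more concrete, here is a minimal sketch on a made-up vector. The object `v` is purely illustrative and not part of our example data:

```r
# Hypothetical toy vector, used only for this illustration
v <- c(1, NA, NaN, Inf)

is.na(v)       # FALSE  TRUE  TRUE FALSE  (NaN also counts as missing)
is.nan(v)      # FALSE FALSE  TRUE FALSE  (only NaN)
is.finite(v)   #  TRUE FALSE FALSE FALSE  (NA, NaN and Inf are not finite)

is.null(v)     # FALSE: tests the object as a whole, not each element
is.null(NULL)  # TRUE
```

Note that is.null returns a single value, whereas the other three functions return one value per element.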

## is.na in Combination with Other R Functions

In the following, I have prepared examples for the most important R functions that can be combined with is.na.

### Remove NAs of Vector or Column

In a vector or column, NA values can be removed as follows:

    is.na_remove <- data$x_num[!is.na(data$x_num)]

Note: Our new vector is.na_remove is shorter in comparison to the original column data$x_num, since we use a filter that deletes all missing values.

If you want to drop the rows of a data frame that contain missing values in any of several columns, the complete.cases function is preferable.
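As a rough sketch of how complete.cases can be used, consider the following small, made-up data frame. The names `df`, `keep` and `df_complete` are my own choices for this illustration:

```r
# Hypothetical toy data frame, used only for this illustration
df <- data.frame(a = c(1, NA, 3, 4),
                 b = c("x", "y", NA, "z"),
                 stringsAsFactors = FALSE)

keep <- complete.cases(df)   # TRUE for rows without any NA
df_complete <- df[keep, ]    # Keep only the complete rows

df_complete
#   a b
# 1 1 x
# 4 4 z
```

Rows 2 and 3 are dropped, because each of them contains a missing value in one of the two columns.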

### Replace NAs with Other Values

Based on is.na, it is possible to replace NAs with other values such as zero…

    is.na_replace_0 <- data$x_num                    # Duplicate first column
    is.na_replace_0[is.na(is.na_replace_0)] <- 0     # Replace by 0

…or the mean.

    is.na_replace_mean <- data$x_num                                 # Duplicate first column
    x_num_mean <- mean(is.na_replace_mean, na.rm = TRUE)             # Calculate mean
    is.na_replace_mean[is.na(is.na_replace_mean)] <- x_num_mean      # Replace by mean

In case of characters or factors, it is also possible in R to set NA to blank:

    is.na_blank_cha <- data$x_cha                        # Duplicate character column
    is.na_blank_cha[is.na(is.na_blank_cha)] <- ""        # Class character to blank

    is.na_blank_fac <- data$x_fac                        # Duplicate factor column
    is.na_blank_fac <- as.character(is.na_blank_fac)     # Convert temporarily to character
    is.na_blank_fac[is.na(is.na_blank_fac)] <- ""        # Class character to blank
    is.na_blank_fac <- as.factor(is.na_blank_fac)        # Recode back to factor
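The same indexing logic can also be applied to a whole data frame at once, as long as the replacement value fits every column. A minimal sketch, assuming a small made-up data frame `df` that contains numeric columns only:

```r
# Hypothetical toy data frame with numeric columns only
df <- data.frame(a = c(1, NA, 3),
                 b = c(NA, 5, 6))

df[is.na(df)] <- 0   # Replace every NA in the data frame by 0

df
#   a b
# 1 1 0
# 2 0 5
# 3 3 6
```

For data frames with mixed classes (such as our example data), replacing column by column as shown above is usually the safer choice.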

### Count NAs via sum & colSums

Combined with the R function sum, we can count the number of NAs in our columns. Given how the data was generated, the share of missing values should be approximately 20% in x_num, 30% in x_fac, and 5% in x_cha.

    sum(is.na(data$x_num))    # 213 missings in the first column
    sum(is.na(data$x_fac))    # 322 missings in the second column
    sum(is.na(data$x_cha))    # 47 missings in the third column

If we want to count NAs in multiple columns at the same time, we can use the function colSums:

    colSums(is.na(data))
    # x_num x_fac x_cha
    #   213   322    47
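If you are more interested in the share of missing values than in the raw count, colMeans can be used in the same way, since the mean of a logical vector is the proportion of TRUE values. A small sketch on a made-up data frame (`df` and `na_share` are illustrative names):

```r
# Hypothetical toy data frame, used only for this illustration
df <- data.frame(a = c(1, NA, 3, NA),
                 b = c("x", "y", NA, "z"),
                 stringsAsFactors = FALSE)

na_share <- colMeans(is.na(df))   # Proportion of NAs per column

na_share
#    a    b
# 0.50 0.25
```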

### Detect if there are any NAs

We can also test whether there is at least one missing value in a column of our data. As we already know, our columns do contain NAs, so the result is TRUE.

    any(is.na(data$x_num))
    # [1] TRUE
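Base R also ships the shortcut function anyNA, which gives the same answer as any(is.na(...)) for a vector. A quick sketch with two made-up vectors:

```r
# Hypothetical toy vectors, used only for this illustration
x_with_na <- c(3, NA, 7)
x_without_na <- c(3, 5, 7)

any(is.na(x_with_na))   # TRUE
anyNA(x_with_na)        # TRUE, same result in a single call
anyNA(x_without_na)     # FALSE
```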

### Locate NAs via which

In combination with the which function, is.na can be used to identify the positions of NAs:

    which(is.na(data$x_num))
    # [1]  4  5 14 17 22 23...

Our first column has missing values at the positions 4, 5, 14, 17, 22, 23 and so forth.
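For a whole data frame, the arr.ind argument of which returns the row and column index of each missing value instead of a single position. A minimal sketch on a made-up data frame (`df` and `na_pos` are illustrative names):

```r
# Hypothetical toy data frame, used only for this illustration
df <- data.frame(a = c(1, NA, 3),
                 b = c(NA, "y", "z"),
                 stringsAsFactors = FALSE)

na_pos <- which(is.na(df), arr.ind = TRUE)   # Matrix with columns "row" and "col"

na_pos[, "row"]   # Rows of the NAs: 2 and 1
na_pos[, "col"]   # Columns of the NAs: 1 and 2
```

In this toy example, the NAs are located in row 2 of the first column and in row 1 of the second column.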

### if & ifelse

Missing values have to be considered in our programming routines, e.g. within the if statement or within for loops.

In the following example, I’m printing *“Damn, it’s NA”* to the RStudio console whenever a missing value occurs, and *“Wow, that’s awesome”* in case of an observed value.

    for(i in seq_along(data$x_num)) {
      if(is.na(data$x_num[i])) {
        print("Damn, it's NA")
      } else {
        print("Wow, that's awesome")
      }
    }
    # [1] "Wow, that's awesome"
    # [1] "Wow, that's awesome"
    # [1] "Wow, that's awesome"
    # [1] "Damn, it's NA"
    # [1] "Damn, it's NA"
    # [1] "Wow, that's awesome"
    # ...

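Note: Within the if statement we use is.na instead of the *equal to* operator that we would usually use for observed values (e.g. if(x[i] == 5)). A comparison such as x[i] == NA does not return TRUE or FALSE but NA, and an if statement stops with an error when its condition is NA.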
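The following sketch illustrates why the is.na check has to come before any comparison with ==. The objects `x` and `msg` are made up for this illustration:

```r
# Hypothetical single missing value, used only for this illustration
x <- NA

x == 5      # NA, not FALSE
x == NA     # NA as well, so this cannot be used to detect missings
is.na(x)    # TRUE

# if(x == 5) would stop with an error (missing value where TRUE/FALSE needed),
# so the NA check is placed first:
if(is.na(x)) {
  msg <- "value is missing"
} else if(x == 5) {
  msg <- "value is five"
} else {
  msg <- "value is something else"
}

msg         # "value is missing"
```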

Even easier to apply: the ifelse function.

    ifelse(is.na(data$x_num), "Damn, it's NA", "Wow, that's awesome")
    # [1] "Wow, that's awesome" "Wow, that's awesome" "Wow, that's awesome" "Damn, it's NA"
    # [5] "Damn, it's NA" "Wow, that's awesome" ...

## Video Examples for the Handling of NAs in R

Do you want to learn even more ways to deal with NAs in R? Then definitely check out the following video on my YouTube channel.

In the video, I provide **further examples for is.na**. I also speak about other functions for the handling of missing data in R data frames.

## Now It’s Your Turn!

I’ve shown you the most important ways to use the is.na R function.

However, there are **hundreds of different possibilities** to apply is.na in a useful way.

Do you know any other helpful applications? Or do you have a question about the usage of is.na in a specific scenario?

Don’t hesitate to let me know in the comments!

## Appendix

The header graphic of this page illustrates NA values in our data. The graphic can be produced with the following R code:

    N <- 2000                                    # Sample size

    x <- runif(N)                                # Uniformly distributed variables
    y <- runif(N)

    x_NA <- runif(50)                            # Random NAs
    y_NA <- runif(50)

    par(bg = "#1b98e0")                          # Set background color
    par(mar = c(0, 0, 0, 0))                     # Remove space around plot

    pch_numb <- as.character(                    # Specify plotted numbers
      round(runif(N, 0, 9)))

    plot(x, y,                                   # Plot
         cex = 2,
         pch = pch_numb,
         col = "#353436")
    text(x_NA, y_NA, cex = 2,                    # Add NA values to plot
         "NA",
         col = "red")
    points(x[1:500], y[1:500],                   # Overlay NA values with numbers
           cex = 2,
           pch = pch_numb,
           col = "#353436")
