R: Replacing NAs in all factors with 'Missing'
By combining mutate_if with fct_explicit_na, you can replace the NAs in every factor column of a data frame with an explicit level, "(Missing)" by default or a label of your choice:
    library(dplyr)   # gives mutate_if
    library(forcats) # gives fct_explicit_na

    # example dataframe: a and c are factors,
    # b is numeric, d is boolean (TRUE/FALSE)
    # (stringsAsFactors = TRUE is needed since R 4.0.0, where data.frame()
    #  no longer converts strings to factors by default)
    mydata <- data.frame(
      a = c('Yes', 'No', NA),
      b = c(0.5, NA, 0.6),
      c = c('No', NA, 'Yes'),
      d = c(TRUE, NA, FALSE),
      stringsAsFactors = TRUE
    )

    # Making the missing values explicit in the columns that are factors
    # (by default, fct_explicit_na changes NAs to "(Missing)"):
    newdata1 <- mydata %>%
      mutate_if(is.factor, fct_explicit_na)

    # changing the missing label to "Dunno"
    # (note how the syntax is a little bit different
    # than when using fct_explicit_na on a single column:
    # extra arguments are passed after the function name)
    newdata2 <- mydata %>%
      mutate_if(is.factor, fct_explicit_na, na_level = 'Dunno')

    # on a single column it would look like:
    mydata$a %>% fct_explicit_na(na_level = 'Dunno')
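In more recent package versions, mutate_if is superseded by across() and fct_explicit_na is deprecated in favour of fct_na_value_to_level. The following is a minimal sketch of the same idea with those functions, assuming dplyr >= 1.0.0 and forcats >= 1.0.0 are installed; the data frame and the "Missing" label are illustrative only.

```r
library(dplyr)   # gives mutate, across, where
library(forcats) # gives fct_na_value_to_level

# same example data as above; stringsAsFactors = TRUE makes a and c factors
mydata <- data.frame(
  a = c("Yes", "No", NA),
  b = c(0.5, NA, 0.6),
  c = c("No", NA, "Yes"),
  d = c(TRUE, NA, FALSE),
  stringsAsFactors = TRUE
)

# turn NA into an explicit "Missing" level in every factor column;
# numeric and logical columns (b and d) are left untouched
newdata <- mydata %>%
  mutate(across(where(is.factor),
                ~ fct_na_value_to_level(.x, level = "Missing")))

# the factor columns no longer contain NA
sapply(newdata, anyNA)
```

Only the factor columns change, so the NAs in b and d are still reported by the final check.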
dplyr reference: http://dplyr.tidyverse.org/reference
forcats reference: http://forcats.tidyverse.org/reference