R: Replacing NAs in all factors with 'Missing'
With a simple combination of mutate_if and fct_explicit_na, you can replace the NAs in every factor column with an explicit level ("(Missing)" by default, or any label you choose):
library(dplyr)   # gives mutate_if
library(forcats) # gives fct_explicit_na

# example dataframe: a and c are factors,
# b is numeric, d is boolean (TRUE/FALSE)
mydata = data.frame(
  a = c('Yes', 'No', NA),
  b = c(0.5, NA, 0.6),
  c = c('No', NA, 'Yes'),
  d = c(TRUE, NA, FALSE),
  # since R 4.0.0 strings are no longer converted to factors by default,
  # so ask for it explicitly; otherwise a and c stay character columns
  stringsAsFactors = TRUE
)

# Making the missing fields in the columns which are factors explicit
# (by default, fct_explicit_na changes NAs to "(Missing)"):
newdata1 = mydata %>%
  mutate_if(is.factor, fct_explicit_na)

# changing the missing label to "Dunno"
# (note how the syntax is a little bit different
# than when using fct_explicit_na on a single column:
# extra arguments are passed after the function name)
newdata2 = mydata %>%
  mutate_if(is.factor, fct_explicit_na, na_level = 'Dunno')

# on a single column it would look like:
mydata$a %>% fct_explicit_na(na_level = 'Dunno')
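
In more recent releases of the two packages, mutate_if is superseded by across() and fct_explicit_na is deprecated in favour of fct_na_value_to_level. The sketch below shows what the same replacement might look like in that style. It assumes dplyr >= 1.0.0 and forcats >= 1.0.0 are installed; the names mydata and newdata3 are only illustrative.

```r
library(dplyr)   # across() and where() assume dplyr >= 1.0.0
library(forcats) # fct_na_value_to_level() assumes forcats >= 1.0.0

# small example dataframe: a is a factor, b is numeric
mydata <- data.frame(
  a = c("Yes", "No", NA),
  b = c(0.5, NA, 0.6),
  stringsAsFactors = TRUE
)

# turn NA into an explicit "Missing" level, in factor columns only
newdata3 <- mydata %>%
  mutate(across(where(is.factor),
                ~ fct_na_value_to_level(.x, level = "Missing")))

levels(newdata3$a) # now includes "Missing"
newdata3$b         # numeric column is untouched, its NA stays NA
```

Unlike fct_explicit_na, fct_na_value_to_level has no "(Missing)" default, so the label is passed explicitly through level.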
dplyr reference: http://dplyr.tidyverse.org/reference
forcats reference: http://forcats.tidyverse.org/reference