I have this data frame:
df = data.frame(x = c('Orange','orange','Appples','orgne','apple','applees','oranges','Oranges',
'orgens','orgaanes','Apples','ORANGES','apple','APPLE') )
Using str_replace_all, I know I can replace each of these terms with one unified spelling of the two words, orange and apple, but that would take forever with a lot of terms in the data frame. I would like a simple way to unify all the spelling variants into orange and apple.
You can use agrep for approximate string matching. You can change the sensitivity of the distances with max.distance.
Another possibility is the stringdist package, which has a number of different distance metrics.
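As a rough sketch of both suggestions, applied to the data frame from the question: the `targets` vector, the new column names (`agrep_clean`, `sd_clean`) and the thresholds (`max.distance = 2`, `maxDist = 3`) are my own assumptions for illustration, not tuned values, so check the result on your real data. The stringdist part only runs if that package is installed.

```r
# Sample data from the question
df <- data.frame(x = c("Orange", "orange", "Appples", "orgne", "apple", "applees",
                       "oranges", "Oranges", "orgens", "orgaanes", "Apples",
                       "ORANGES", "apple", "APPLE"))

# Canonical spellings to map onto (assumed to be known in advance)
targets <- c("orange", "apple")

# Base R: agrep() returns the indices of the elements that approximately
# contain the pattern. max.distance = 2 is an illustrative threshold;
# raise or lower it to make the matching looser or stricter.
df$agrep_clean <- NA_character_
for (target in targets) {
  hits <- agrep(target, df$x, max.distance = 2, ignore.case = TRUE)
  # If a value matches more than one target, the last target wins
  df$agrep_clean[hits] <- target
}

# stringdist (optional): amatch() returns, for each input, the index of the
# closest entry in `targets` within maxDist, or NA if nothing is close enough.
# method = "osa" is one of several available distance metrics.
if (requireNamespace("stringdist", quietly = TRUE)) {
  idx <- stringdist::amatch(tolower(df$x), targets, method = "osa", maxDist = 3)
  df$sd_clean <- targets[idx]
}

print(df)
```

Any value that stays NA did not fall within the chosen distance of either target, which is a useful signal that the threshold is too strict or that the value is a genuinely different word.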