I have a dataframe with names, and a second dataframe that serves as a dictionary of names and the sex associated with each name. I want to check whether each name is in the dictionary and, if it is, add the sex from the dictionary table to the names dataframe.
My code looks like this:
# Sets everything to -1, 1 for male, 0 for female
train$sex <- "-1"
train$sex[toupper(train$fname) == nam_dict$Name]<-nam_dict$Sex
I am getting the following error:
Error in train$sex[toupper(train$fname) == nam_dict$Name] <- nam_dict$Sex :
NAs are not allowed in subscripted assignments
In addition: Warning message:
In toupper(train$fname) == nam_dict$Name :
longer object length is not a multiple of shorter object length
I have a workaround, I think: I can split the dictionary into male and female and simply replace the <-nam_dict$Sex portion of the code with the character 'F' or 'M', depending on the dictionary.
I just thought there would be a better way.
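Your problem can be solved with a quick and simple application of match. The == in your code compares the two columns position by position, recycling the shorter one (hence the length warning), whereas match looks up where each name occurs in the dictionary. A possible solution is to use the positions returned by match to pull the corresponding Sex values from nam_dict.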
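Here is a minimal sketch of that idea. The sample data is made up for illustration (the real train and nam_dict were not posted), and I am assuming the dictionary stores names in upper case with Sex coded as 'M'/'F'.

```r
# Hypothetical sample data, only here to make the example reproducible
train <- data.frame(fname = c("Alice", "bob", "Zed", "ALICE"),
                    stringsAsFactors = FALSE)
nam_dict <- data.frame(Name = c("BOB", "ALICE", "CAROL"),
                       Sex = c("M", "F", "F"),
                       stringsAsFactors = FALSE)

# For every name in train, find its row position in the dictionary
# (NA when the name is not in the dictionary)
idx <- match(toupper(train$fname), nam_dict$Name)

# Pull the Sex values by position; unmatched names come back as NA
train$sex <- nam_dict$Sex[idx]

# Keep the "-1" placeholder from the question for names without a match
train$sex[is.na(train$sex)] <- "-1"
```

If you would rather keep NA for unknown names, just drop the last line.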
A slightly more advanced solution (if you'll ever work with huge data sets) is to try data.table's binary join, which lets you update your data by reference and choose which columns to update, all at lightning speed.
First we convert both data sets to the data.table class and make both name columns lower case. Then we key them by the column we want to join by. Finally, we do a left join to train while creating the sex column by reference and pulling the data from i (the Sex column in nam_dict).
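The steps above could look roughly like this. Again, the sample data is hypothetical and the 'M'/'F' coding is an assumption. Note that setkey reorders the rows and that names without a match end up as NA rather than "-1".

```r
library(data.table)

# Hypothetical sample data, only here to make the example reproducible
train <- data.frame(fname = c("Alice", "bob", "Zed", "ALICE"),
                    stringsAsFactors = FALSE)
nam_dict <- data.frame(Name = c("BOB", "ALICE", "CAROL"),
                       Sex = c("M", "F", "F"),
                       stringsAsFactors = FALSE)

# Convert both data sets to data.table by reference (no copy)
setDT(train)
setDT(nam_dict)

# Make both name columns lower case so they compare equal
train[, fname := tolower(fname)]
nam_dict[, Name := tolower(Name)]

# Key each table by the column we want to join on (this sorts the rows)
setkey(train, fname)
setkey(nam_dict, Name)

# Join on the keys and create sex in train by reference;
# i.Sex refers to the Sex column of the table in i (nam_dict)
train[nam_dict, sex := i.Sex]
```

Only the rows of train that have a match in nam_dict are updated, so dictionary entries that never appear in train are simply ignored.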