R - Using successive row (cell) values in one table to screen row (cell) values in another

41 Views Asked by ObiWanCodi At 26 January 2024 at 12:18

I have two dataframes: df1 has 52000 rows, and df2 has 24000 rows.

I need to work through each value in df1 column 2, row by row, and check whether it appears anywhere in df2.

If it does, then add the entire row from df2 into a new dataframe.

I have set up two dummy tables with a small amount of example data:

This is df1

Year	Drink
1985	tea
1935	coffee
2015	beer
2012	wine
2017	tea
1958	soda

This is df2

Year	Country
1985	USA
1955	France
2015	China
2011	USA
2017	UK
1958	UK

Step 1 - read col1 row 1 cell from df1 - it reads 1985.

Step 2 - work through the df2 col1 values in turn. If 1985 is there, copy the entire row to a new dataframe. If not, ignore the row and continue.

Repeat Step 1 and Step 2 until the end of all rows in df1.

I have tried:

YearComparison <- df1[df1$year %like% df2, ]

but I get the warning:

Warning message: In grepl(pattern, vector, ignore.case = ignore.case, fixed = fixed) : argument 'pattern' has length > 1 and only the first element will be used

I also tried:

YearComparison <- df1[df1$year %like% df2,1 ]

which returned:

Name	Type	Value
YearComparison	Double [0]

I also tried:

YearComparison <- any(grepl('patientdata$status', countries$year,))

which returned:

Name	Type	Value
YearComparison	Logical[1]	False

I have also tried variations using %in%, but with similar results.

Please remember that my actual data sets have tens of thousands of rows, and the values are complex, non-sequential strings (not dates, which I am using here only to make it easier to perfect the code), so something like:

YearComparison <- df1[df1$year %like% df2, c("1985", "1986","Etc"), ] isn't practical.

Can anyone help? Many thanks.


There is 1 best solution below

MetehanGungor On 26 January 2024 at 12:27

I think you need one of the mutating join functions.

library(dplyr)

# Small example tables that share a Year column
df1 <- data.frame(Year = c(1985, 1935, 2015),
                  Drink = c("tea", "coffee", "beer"))

df2 <- data.frame(Year = c(1985, 1955, 2015),
                  Country = c("USA", "France", "China"))
df1
df2

# Keep only the rows whose Year appears in both tables
df3 <- df1 %>%
  inner_join(df2, by = "Year")

df3
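
The join above returns the columns of both tables. If only the matching rows of df2 are wanted, as the question describes, a membership test may be closer to the goal. The following is a minimal sketch in base R, assuming the key column is called Year in both data frames as in the example data; the names matched_loop and matched are introduced here purely for illustration.

```r
df1 <- data.frame(Year = c(1985, 1935, 2015, 2012, 2017, 1958),
                  Drink = c("tea", "coffee", "beer", "wine", "tea", "soda"))
df2 <- data.frame(Year = c(1985, 1955, 2015, 2011, 2017, 1958),
                  Country = c("USA", "France", "China", "USA", "UK", "UK"))

# Literal version of the steps in the question: take each value of df1$Year
# in turn and collect the rows of df2 whose Year equals it.
matched_loop <- df2[0, ]
for (value in df1$Year) {
  matched_loop <- rbind(matched_loop, df2[df2$Year == value, ])
}

# Vectorised equivalent for this data: keep the rows of df2 whose Year
# occurs anywhere in df1$Year (exact matching, not pattern matching).
matched <- df2[df2$Year %in% df1$Year, ]

matched
```

For the example data both versions select the same four rows of df2. They can differ when df1 contains repeated values: the loop then repeats the matching df2 rows, while %in% returns each df2 row at most once. On tens of thousands of rows the vectorised form is likely the more practical of the two, since the loop grows a data frame one rbind() at a time.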

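Extra information: please be aware of the difference between inner_join() and left_join(), right_join() and full_join(). inner_join() keeps only the rows that have a match in both tables, whereas the others also keep unmatched rows from one or both tables.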
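
To make the difference between these joins concrete, here is a small sketch using the same example tables. It assumes dplyr is installed. The semi_join() call at the end is not part of the original answer; it is added here as one possible way to keep only df2's own columns.

```r
library(dplyr)

df1 <- data.frame(Year = c(1985, 1935, 2015),
                  Drink = c("tea", "coffee", "beer"))
df2 <- data.frame(Year = c(1985, 1955, 2015),
                  Country = c("USA", "France", "China"))

inner <- inner_join(df1, df2, by = "Year")  # only Years found in both tables
left  <- left_join(df1, df2, by = "Year")   # every row of df1; Country is NA where unmatched
right <- right_join(df1, df2, by = "Year")  # every row of df2; Drink is NA where unmatched
full  <- full_join(df1, df2, by = "Year")   # every Year from either table

# Filtering join: rows of df2 that have a match in df1, with df2's columns only
only_df2 <- semi_join(df2, df1, by = "Year")

sapply(list(inner = inner, left = left, right = right, full = full), nrow)
```

With this data, inner has 2 rows, left and right have 3 each, and full has 4. only_df2 holds the two matching rows of df2 with just the Year and Country columns.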
