How to code it more efficiently: delete multiple rows with a complex condition in R

Question

Asked by Enialoj on 12 April 2023 at 19:01

Below is a sample of a large data set from which I want to delete quadrats (Qm) numbered greater than 3 in parcels (PARCELLE) 1, 3, 4, and 8.

FIELD   SECTOR  PARCELLE    Qm  Total
North   A   1   1   2
North   A   1   2   3
North   A   1   3   0.5
North   A   1   4   0.5
North   A   1   5   1
North   A   1   6   0.5
North   B   2   1   10
North   B   2   2   3
North   B   2   3   4
North   B   2   4   2
North   B   2   5   7
North   B   2   6   25
North   C   3   1   0
North   C   3   2   0
North   C   3   3   2
North   C   3   4   5
North   C   3   5   0.5
North   C   3   6   1
North   D   4   1   0
North   D   4   2   0
North   D   4   3   0
North   D   4   4   0
North   D   4   5   0
North   D   4   6   85
North   E   5   1   0
North   E   5   2   5
North   E   5   3   0.5
North   E   5   4   0
North   E   5   5   0
North   E   5   6   0
North   F   6   1   0.5
North   F   6   2   0.5
North   F   6   3   0.5
North   F   6   4   0
North   F   6   5   0
North   F   6   6   0
North   G   7   1   0.5
North   G   7   2   0.5
North   G   7   3   2
North   G   7   4   2
North   G   7   5   0.5
North   G   7   6   0
North   H   8   1   0.5
North   H   8   2   1
North   H   8   3   60
North   H   8   4   0.5
North   H   8   5   0.5
North   H   8   6   1

I have achieved this manipulation with one statement for each parcel.

New_Data <- Data_Frame[!(Data_Frame$PARCELLE == "1" & Data_Frame$Qm > 3), ]
New_Data <- New_Data[!(New_Data$PARCELLE == "3" & New_Data$Qm > 3), ]
New_Data <- New_Data[!(New_Data$PARCELLE == "4" & New_Data$Qm > 3), ]
New_Data <- New_Data[!(New_Data$PARCELLE == "8" & New_Data$Qm > 3), ]

I want to condense my code but I can't figure out how to specify a condition on the parcel number. I would like my code to resemble something like this:

New_Data <- Data_Frame[!(Data_Frame$PARCELLE == "1 & 3 & 4 & 8" & Data_Frame$Qm > 3), ]

There are 2 best solutions below

Juan C On 12 April 2023 at 19:08

This should do:

library(dplyr)  # provides filter() and %>%
Data_Frame %>% filter(!(PARCELLE %in% c(1, 3, 4, 8) & Qm > 3))


# FIELD SECTOR PARCELLE Qm Total
# 1  North      A        1  1   2.0
# 2  North      A        1  2   3.0
# 3  North      A        1  3   0.5
# 4  North      B        2  1  10.0
# 5  North      B        2  2   3.0
# 6  North      B        2  3   4.0
# 7  North      B        2  4   2.0
# 8  North      B        2  5   7.0
# 9  North      B        2  6  25.0
# 10 North      C        3  1   0.0
# 11 North      C        3  2   0.0
# 12 North      C        3  3   2.0
# 13 North      D        4  1   0.0
# 14 North      D        4  2   0.0
# 15 North      D        4  3   0.0
# 16 North      E        5  1   0.0
# 17 North      E        5  2   5.0
# 18 North      E        5  3   0.5
# 19 North      E        5  4   0.0
# 20 North      E        5  5   0.0
# 21 North      E        5  6   0.0
# 22 North      F        6  1   0.5
# 23 North      F        6  2   0.5
# 24 North      F        6  3   0.5
# 25 North      F        6  4   0.0
# 26 North      F        6  5   0.0
# 27 North      F        6  6   0.0
# 28 North      G        7  1   0.5
# 29 North      G        7  2   0.5
# 30 North      G        7  3   2.0
# 31 North      G        7  4   2.0
# 32 North      G        7  5   0.5
# 33 North      G        7  6   0.0
# 34 North      H        8  1   0.5
# 35 North      H        8  2   1.0
# 36 North      H        8  3  60.0
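
To check that the single `%in%` condition removes the same rows as the four statements in the question, here is a minimal base R sketch. It rebuilds only the PARCELLE and Qm columns of the sample (FIELD, SECTOR and Total are left out for brevity), and it writes the question's four statements as a loop; both of those are simplifications for illustration, not part of the original posts.

```r
# Rebuild the two columns the condition depends on: 8 parcels, 6 quadrats each.
Data_Frame <- data.frame(
  PARCELLE = rep(1:8, each = 6),
  Qm       = rep(1:6, times = 8)
)

# The question's approach: one filtering step per parcel.
step <- Data_Frame
for (p in c(1, 3, 4, 8)) {
  step <- step[!(step$PARCELLE == p & step$Qm > 3), ]
}

# The condensed approach: a single condition using %in%.
in_set   <- Data_Frame$PARCELLE %in% c(1, 3, 4, 8)
one_shot <- Data_Frame[!(in_set & Data_Frame$Qm > 3), ]

nrow(one_shot)                        # 36 rows, as in the output above
isTRUE(all.equal(step, one_shot))     # TRUE: both approaches keep the same rows
```

The design point is that `%in%` tests membership in a set, so one condition covers any number of parcels, whereas `==` compares against a single value at a time.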

Onyambu On 12 April 2023 at 19:10 (accepted answer)

Use the %in% operator:

Data_Frame[!(Data_Frame$PARCELLE %in% c(1, 3, 4, 8) & Data_Frame$Qm > 3), ]

You can also use the following:

subset(Data_Frame, !(PARCELLE %in% c(1, 3, 4, 8) & Qm > 3))

The two differ only in how they treat NA: where the condition evaluates to NA (for example, Qm is missing in one of the listed parcels), the bracket version returns a row filled with NA, while subset() drops that row.
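
The NA difference is easiest to see on a small example. The sketch below uses a hypothetical four-row data frame (`toy`, not taken from the question) in which one Qm value is missing; wrapping the condition in `which()` is one common way, shown here as an option, to make bracket indexing drop such rows as well.

```r
# Hypothetical toy data: the second row has a missing Qm in a listed parcel.
toy <- data.frame(
  PARCELLE = c(1, 1, 2, 3),
  Qm       = c(2, NA, 5, 4)
)

# TRUE for rows to keep; NA where PARCELLE is listed and Qm is missing.
keep <- !(toy$PARCELLE %in% c(1, 3, 4, 8) & toy$Qm > 3)

by_bracket <- toy[keep, ]         # the NA in `keep` yields a row filled with NA
by_subset  <- subset(toy, !(PARCELLE %in% c(1, 3, 4, 8) & Qm > 3))  # NA treated as FALSE
by_which   <- toy[which(keep), ]  # which() ignores NA, so this matches subset()

nrow(by_bracket)  # 3: two kept rows plus one all-NA row
nrow(by_subset)   # 2
```

If the real data has no missing values in PARCELLE or Qm, the bracket and subset() versions return the same rows.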
