I mined rules using the apriori algorithm in R.
library(arules)
data("Adult")
rules <- apriori(Adult,
                 parameter = list(supp = 0.05, conf = 0.5, minlen = 2, maxlen = 5),
                 appearance = list(rhs = c("native-country=United-States"), default = "lhs"))
From the generated rules, I extracted those with exactly two items in the lhs, one of which is the item of interest ("race=White").
rules.sub2 <- subset(rules, size(lhs)==2)
item <- "race=White"
rules.sub <- subset(rules.sub2, lhs %in% item & size(lhs)==2)
#inspect(rules.sub)
inspect(sort(rules.sub, by = "lift")[1:10])
The results look like this:
     lhs                                         rhs                            support    confidence coverage   lift     count
[1]  {education=Some-college, race=White}     => {native-country=United-States} 0.18041849 0.9486489  0.19018468 1.057080  8812
[2]  {marital-status=Divorced, race=White}    => {native-country=United-States} 0.11031489 0.9479240  0.11637525 1.056272  5388
[3]  {occupation=Sales, race=White}           => {native-country=United-States} 0.09524589 0.9474542  0.10052823 1.055748  4652
[4]  {occupation=Exec-managerial, race=White} => {native-country=United-States} 0.10595389 0.9453782  0.11207567 1.053435  5175
[5]  {relationship=Own-child, race=White}     => {native-country=United-States} 0.12327505 0.9453603  0.13040007 1.053415  6021
[6]  {race=White, income=large}               => {native-country=United-States} 0.13725892 0.9419699  0.14571475 1.049637  6704
[7]  {education=HS-grad, race=White}          => {native-country=United-States} 0.25783137 0.9406887  0.27408788 1.048210 12593
[8]  {occupation=Adm-clerical, race=White}    => {native-country=United-States} 0.08803898 0.9392748  0.09373081 1.046634  4300
[9]  {race=White, hours-per-week=Over-time}   => {native-country=United-States} 0.22290242 0.9389392  0.23739814 1.046260 10887
[10] {education=Bachelors, race=White}        => {native-country=United-States} 0.13484296 0.9363094  0.14401540 1.043330  6586
From these rules, I want to keep only those whose lhs lists "race=White" as its first element (for example, rules [6] and [9] above). I tried the code below, but it does not work, and I am not sure how to do this. Any help would be much appreciated.
# Extract the first elements of lhs in the rules
first_elements_lhs <- sapply(slot(rules.sub2, "lhs"), function(x) {
  elements <- slot(x, "items")
  if (length(elements) > 0) elements[[1]] else NA
})
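One possible direction, offered as a rough sketch rather than a confirmed fix: the lhs of a rules object is an itemMatrix, so iterating over it with sapply() and slot() does not yield one itemset per rule. Decoding it with LIST() gives a plain list of character vectors, from which the first label of each lhs can be read. The names lhs_items, first_lhs and rules.first are made up for this illustration. This assumes that "first element" means the first label that inspect() prints, which appears to follow the item order of the Adult transactions rather than any ranking of the items.

```r
library(arules)
data("Adult")

# Same mining and filtering steps as in the question (progress output silenced)
rules <- apriori(Adult,
                 parameter = list(supp = 0.05, conf = 0.5, minlen = 2, maxlen = 5),
                 appearance = list(rhs = "native-country=United-States", default = "lhs"),
                 control = list(verbose = FALSE))
item <- "race=White"
rules.sub <- subset(rules, lhs %in% item & size(lhs) == 2)

# Decode each lhs into a character vector of item labels
lhs_items <- LIST(lhs(rules.sub), decode = TRUE)

# Label of the first listed item in each lhs (hypothetical helper vector)
first_lhs <- vapply(lhs_items, function(x) x[1], character(1))

# Keep only the rules whose first listed lhs item is "race=White"
rules.first <- rules.sub[first_lhs == item]
inspect(sort(rules.first, by = "lift"))
```

If the assumption about item order holds, this keeps only rules where the second lhs item comes after race in that order, such as income or hours-per-week in the output above.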