I am working on a problem where the aim is to identify a rule set for the approval of applications. I am using the apriori algorithm in the arules package to find association rules that correspond to those approvals in historic data. What I want to understand is whether the rule set as a whole covers all the approvals in my dataset, rather than the support of each individual rule. As a toy example using the iris data, here I try to find all rules predicting Species = versicolor:
library(arules)
# iris has numeric columns; apriori() discretizes them internally (with a warning)
rules_1 <- apriori(iris,
                   parameter = list(support = 0.02,
                                    confidence = 0.95,
                                    target = "rules"),
                   appearance = list(rhs = "Species=versicolor"))
inspect(rules_1)
The output looks like the following (truncated to 2 rules; there are 21 others):
| lhs | rhs | support | confidence | coverage | lift | count |
|---|---|---|---|---|---|---|
| {Sepal.Length=[4.3,5.4), Petal.Width=[0.867,1.6)} | => {Species=versicolor} | 0.03333333 | 1 | 0.03333333 | 3 | 5 |
| {Sepal.Width=[2.9,3.2), Petal.Width=[0.867,1.6)} | => {Species=versicolor} | 0.11333333 | 1 | 0.11333333 | 3 | 17 |
Given that there is only one rhs, how can I extract the lhs column in a way that lets me filter the data by all of these rules at once, and then count how many rows match (knowing that there are 50 versicolor rows)?
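
To make the kind of check I am after concrete, here is a minimal sketch of one possible approach, not necessarily the idiomatic one. It assumes that `is.subset()` from arules can test each rule's lhs against each row once the data is stored as transactions, and that discretizing up front with `discretizeDF()` gives the same bins `apriori()` would otherwise create internally. The names `iris_disc`, `trans`, `lhs_match` and `covered` are my own.

```r
library(arules)

data(iris)

# Assumption: discretize explicitly so the rules and the transactions
# share identical item labels.
iris_disc <- discretizeDF(iris)
trans <- as(iris_disc, "transactions")

rules_1 <- apriori(trans,
                   parameter = list(support = 0.02,
                                    confidence = 0.95,
                                    target = "rules"),
                   appearance = list(rhs = "Species=versicolor"),
                   control = list(verbose = FALSE))

# Logical matrix with one row per rule and one column per data row:
# TRUE where the rule's lhs is fully contained in that data row.
lhs_match <- is.subset(lhs(rules_1), trans, sparse = FALSE)

# A data row counts as covered if at least one rule's lhs matches it.
covered <- colSums(lhs_match) > 0

# Covered vs. not covered, split by the actual species.
table(covered = covered, species = iris$Species)

# Number of versicolor rows matched by at least one rule (out of 50).
sum(covered & iris$Species == "versicolor")
```

If this is sound, `iris[covered, ]` would be the filtered data, and comparing the last count against 50 would tell me how much of versicolor the rule set covers as a whole.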