I've been experimenting with various tools to understand and generate process mining dependency matrices.
This is my small dataset (smalldataset):
patient  handling  handling_id  registration_type  time                 employee
test1    C1        NULL9        complete           2023-01-31 13:01:28  NULL
test1    C2        NULL12       complete           2023-01-31 12:41:42  NULL
test1    A1        NULL13       complete           2023-01-31 10:01:46  NULL
test1    I1        NULL14       complete           2023-01-31 10:00:33  NULL
test2    M1        NULL16       complete           2023-02-02 21:58:00  NULL
test2    C2        NULL18       complete           2023-01-31 12:57:44  NULL
test2    A1        NULL20       complete           2023-01-31 10:15:00  NULL
test2    I1        NULL21       complete           2023-01-31 10:14:17  NULL
test3    C2        NULL27       complete           2023-01-31 14:23:47  NULL
test3    A1        NULL29       complete           2023-01-31 10:18:00  NULL
test3    I1        NULL30       complete           2023-01-31 10:17:21  NULL
test4    A1        NULL38       complete           2023-01-31 12:49:00  NULL
test4    C2        NULL39       complete           2023-01-31 12:34:08  NULL
test4    A1        NULL40       complete           2023-01-31 12:34:00  NULL
test4    C1        NULL41       complete           2023-01-31 11:04:33  NULL
test4    A1        NULL42       complete           2023-01-31 10:46:00  NULL
test4    I1        NULL43       complete           2023-01-31 10:45:25  NULL
When I build a causal net from the dependency matrix in bupaR:
causal_net(smalldataset, dependency_matrix(smalldataset, all_connected = TRUE))
I get this:
Nodes
# A tibble: 7 x 12
  act   from_id     n n_distinct_cases bindings_input bindings_output label   color_level
  <chr>   <int> <dbl>            <dbl> <list>         <list>          <chr>         <dbl>
1 A1          1     6                4 <list [1]>     <list [2]>      "A1\n6"           6
2 C1          2     2                2 <list [1]>     <list [1]>      "C1\n2"           2
3 C2          3     4                4 <list [1]>     <list [3]>      "C2\n4"           4
4 End         4     4                4 <list [4]>     <list [0]>      "End"             4
5 I1          5     4                4 <list [1]>     <list [1]>      "I1\n4"           4
6 M1          6     1                1 <list [1]>     <list [1]>      "M1\n1"           1
7 Start       7     4                4 <list [0]>     <list [1]>      "Start"           4
Edges
# A tibble: 9 x 8
  antecedent consequent   dep from_id to_id     n label penwidth
  <chr>      <chr>      <dbl>   <int> <int> <dbl> <chr>    <dbl>
1 I1         A1           0.8       5     1     4 4          4.2
2 C2         C1           0.5       3     2     1 1          1.8
3 A1         C2           0.5       1     3     5 5            5
4 A1         End          0.5       1     4     1 1          1.8
5 C1         End          0.5       2     4     2 2          2.6
6 C2         End          0.5       3     4     2 2          2.6
7 M1         End          0.5       6     4     1 1          1.8
8 Start      I1           0.8       7     5     4 4          4.2
9 C2         M1           0.5       3     6     1 1          1.8
I was watching one of the van der Aalst videos on dependency graphs, and I expected the matrix to be calculated with the antecedent/consequent approach, i.e. from how often b directly follows a compared with how often a directly follows b.
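For reference, this is the dependency measure as I understand it from the videos. The notation is my own assumption: |a > b| is the number of times b directly follows a in the log.

```latex
\[
a \Rightarrow b \;=\; \frac{|a > b| - |b > a|}{|a > b| + |b > a| + 1}, \qquad a \neq b
\]
```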
However, the output contains values like 0.5, which should be impossible as I understand it. For example, A1 --> C2 has a count of 5, so I would expect (5 - 0) / (5 + 0 + 1) = 0.8333, not 0.5. I'm not sure why this is occurring.
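A minimal base R sketch for checking the counts by hand, independent of bupaR. It is not bupaR's implementation: the helper name dependency_by_hand is made up for illustration, and it assumes the data is a plain data frame with the patient, handling and time columns shown above, with time parseable by as.POSIXct. It adds artificial Start and End events per case and does not treat self-loops (a followed by a) specially.

```r
# Hypothetical helper, not part of bupaR: recompute directly-follows counts
# and the dependency measure by hand from a plain data frame.
dependency_by_hand <- function(df, case_col = "patient",
                               act_col = "handling", time_col = "time") {
  # Order events within each case by timestamp.
  df <- df[order(df[[case_col]], as.POSIXct(df[[time_col]], tz = "UTC")), ]
  traces <- split(as.character(df[[act_col]]), df[[case_col]])

  # Collect every directly-follows pair, with artificial Start/End events.
  pairs <- do.call(rbind, lapply(traces, function(tr) {
    tr <- c("Start", tr, "End")
    data.frame(antecedent = head(tr, -1), consequent = tail(tr, -1),
               stringsAsFactors = FALSE)
  }))

  # Square count matrix: counts[a, b] is the number of times b directly follows a.
  acts <- sort(unique(c(pairs$antecedent, pairs$consequent)))
  counts <- unclass(table(factor(pairs$antecedent, levels = acts),
                          factor(pairs$consequent, levels = acts)))

  # dep(a, b) = (|a > b| - |b > a|) / (|a > b| + |b > a| + 1), for a != b.
  # The diagonal comes out as 0 here; self-loops would need their own formula.
  dep <- (counts - t(counts)) / (counts + t(counts) + 1)
  list(counts = counts, dependency = dep)
}

# Usage, assuming smalldataset converts cleanly to a data frame:
# res <- dependency_by_hand(as.data.frame(smalldataset))
# res$counts["A1", "C2"]      # directly-follows count A1 -> C2
# res$counts["C2", "A1"]      # count in the opposite direction
# res$dependency["A1", "C2"]  # dependency measure for A1 -> C2
```

Comparing res$counts with the n column of the Edges table should show whether n is really the directly-follows count that the formula expects.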
Thanks