Process mining dependency matrix issue

I've been playing around with various tools to try and understand / generate process mining dependency matrices.

This is my small dataset (smalldataset):

 patient handling handling_id registration_type time                employee
 test1   C1       NULL9       complete          2023-01-31 13:01:28 NULL
 test1   C2       NULL12      complete          2023-01-31 12:41:42 NULL
 test1   A1       NULL13      complete          2023-01-31 10:01:46 NULL
 test1   I1       NULL14      complete          2023-01-31 10:00:33 NULL
 test2   M1       NULL16      complete          2023-02-02 21:58:00 NULL
 test2   C2       NULL18      complete          2023-01-31 12:57:44 NULL
 test2   A1       NULL20      complete          2023-01-31 10:15:00 NULL
 test2   I1       NULL21      complete          2023-01-31 10:14:17 NULL
 test3   C2       NULL27      complete          2023-01-31 14:23:47 NULL
 test3   A1       NULL29      complete          2023-01-31 10:18:00 NULL
 test3   I1       NULL30      complete          2023-01-31 10:17:21 NULL
 test4   A1       NULL38      complete          2023-01-31 12:49:00 NULL
 test4   C2       NULL39      complete          2023-01-31 12:34:08 NULL
 test4   A1       NULL40      complete          2023-01-31 12:34:00 NULL
 test4   C1       NULL41      complete          2023-01-31 11:04:33 NULL
 test4   A1       NULL42      complete          2023-01-31 10:46:00 NULL
 test4   I1       NULL43      complete          2023-01-31 10:45:25 NULL

When I build a causal net from the dependency matrix in bupaR:

 library(heuristicsmineR)  # provides dependency_matrix() and causal_net()
 causal_net(smalldataset, dependency_matrix(smalldataset, all_connected = TRUE))

I get this :

Nodes
# A tibble: 7 x 12
  act   from_id     n n_distinct_cases bindings_input bindings_output label   color_level
  <chr>   <int> <dbl>            <dbl> <list>         <list>          <chr>         <dbl>
1 A1          1     6                4 <list [1]>     <list [2]>      "A1\n6"           6
2 C1          2     2                2 <list [1]>     <list [1]>      "C1\n2"           2
3 C2          3     4                4 <list [1]>     <list [3]>      "C2\n4"           4
4 End         4     4                4 <list [4]>     <list [0]>      "End"             4
5 I1          5     4                4 <list [1]>     <list [1]>      "I1\n4"           4
6 M1          6     1                1 <list [1]>     <list [1]>      "M1\n1"           1
7 Start       7     4                4 <list [0]>     <list [1]>      "Start"           4


Edges
# A tibble: 9 x 8
  antecedent consequent   dep from_id to_id     n label penwidth
  <chr>      <chr>      <dbl>   <int> <int> <dbl> <chr>    <dbl>
1 I1         A1           0.8       5     1     4 4          4.2
2 C2         C1           0.5       3     2     1 1          1.8
3 A1         C2           0.5       1     3     5 5          5  
4 A1         End          0.5       1     4     1 1          1.8
5 C1         End          0.5       2     4     2 2          2.6
6 C2         End          0.5       3     4     2 2          2.6
7 M1         End          0.5       6     4     1 1          1.8
8 Start      I1           0.8       7     5     4 4          4.2
9 C2         M1           0.5       3     6     1 1          1.8

I was watching one of the van der Aalst videos on dependency graphs, and I expected the matrix to follow the antecedent/consequent approach described there: for a pair of activities a and b, compare how often b directly follows a with how often a directly follows b.

Instead I get values like 0.5, which should be impossible as I understand it. For example, A1 --> C2 has a count of 5, so I would expect (5-0) / (5+0+1) = 0.8333, not 0.5. I'm not sure why this is happening.
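To sanity-check the hand calculation, here is a rough sketch that recounts the directly-follows pairs from the rows above and applies the same (a-b) / (a+b+1) formula. It is written in Python purely for illustration and is not bupaR's implementation. It assumes events are ordered by time within each patient, and the helper names directly_follows and dependency are made up for this example. Self-loops (a followed by a) are not handled, since none occur in this data.

```python
from collections import Counter

# Rows copied from the question as (patient, handling, time).
events = [
    ("test1", "C1", "2023-01-31 13:01:28"),
    ("test1", "C2", "2023-01-31 12:41:42"),
    ("test1", "A1", "2023-01-31 10:01:46"),
    ("test1", "I1", "2023-01-31 10:00:33"),
    ("test2", "M1", "2023-02-02 21:58:00"),
    ("test2", "C2", "2023-01-31 12:57:44"),
    ("test2", "A1", "2023-01-31 10:15:00"),
    ("test2", "I1", "2023-01-31 10:14:17"),
    ("test3", "C2", "2023-01-31 14:23:47"),
    ("test3", "A1", "2023-01-31 10:18:00"),
    ("test3", "I1", "2023-01-31 10:17:21"),
    ("test4", "A1", "2023-01-31 12:49:00"),
    ("test4", "C2", "2023-01-31 12:34:08"),
    ("test4", "A1", "2023-01-31 12:34:00"),
    ("test4", "C1", "2023-01-31 11:04:33"),
    ("test4", "A1", "2023-01-31 10:46:00"),
    ("test4", "I1", "2023-01-31 10:45:25"),
]


def directly_follows(events):
    """Count how often activity b directly follows activity a within a case."""
    traces = {}
    # Timestamps in this format sort correctly as plain strings.
    for case, activity, _ in sorted(events, key=lambda e: (e[0], e[2])):
        traces.setdefault(case, []).append(activity)
    counts = Counter()
    for trace in traces.values():
        # Artificial Start/End markers, mirroring the nodes in the output above.
        path = ["Start", *trace, "End"]
        counts.update(zip(path, path[1:]))
    return counts


def dependency(counts, a, b):
    """Dependency measure (|a>b| - |b>a|) / (|a>b| + |b>a| + 1) for a != b."""
    ab, ba = counts[(a, b)], counts[(b, a)]
    return (ab - ba) / (ab + ba + 1)


counts = directly_follows(events)
for a, b in [("I1", "A1"), ("A1", "C2"), ("C2", "A1")]:
    print(a, "-->", b, "count:", counts[(a, b)],
          "dep:", round(dependency(counts, a, b), 4))
```

Counted this way, A1 --> C2 occurs 4 times and C2 --> A1 occurs once (in test4, where C2 is followed by a third A1), so the formula gives (4-1) / (4+1+1) = 0.5 rather than 0.8333. That would suggest the n column in the Edges output is not the raw directly-follows count that feeds the dependency value, but that part is a guess.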

Thanks
