ProM 6 exercises
The exercises in this section are meant to help you become more familiar with ProM 6 and its process mining plug-ins. The plug-ins covered include the Transition System Miner, the Transition System Analyzer, the alpha-algorithm, the Heuristics Miner, the Genetic Miner, the Fuzzy Miner and the Dotted Chart Analysis. Detailed solutions are provided for each exercise; you can consult them when you get stuck.
The first event log we will analyse contains the following three traces:
1x Case1 A B C D
1x Case2 A C B D
1x Case3 A E D
These traces are recorded in exercise1.xes. Please answer the following questions for this event log:
- Import the event log from the file exercise1.xes.
- Inspect the contents of exercise1.xes using the log visualization option in ProM and answer the following questions:
  - In what timeframe do the events in exercise1.xes occur?
  - When did the event “B” occur for “case2.0” and who executed it?
  - What is the relative occurrence and what is the frequency of the event “E”?
- Inspect the contents of exercise1.xes using a text editor. Can you distinguish between the 'header' of the file and the actual traces and events? What does the header include?
- Construct a transition system using the Transition System Miner in ProM 6. Be sure to set the “key classifier collection” type to “list” and the “collection size” to “no limit” in the second wizard screen. Accept all default settings in all the other wizard screens.
- Try to construct a Petri net by hand which can replay the traces recorded in exercise1.xes. Use 5 transitions that represent the actions A through E. Try to allow as little extra behavior as possible (e.g. no 'flower'-nets).
- Create a Petri net from the transition system using the “Transition System to Petrinet” plug-in in ProM. Hint: with the result of the Transition System Miner open, press the “play”-button on the top right.
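To get a feeling for what the “list” abstraction with “no limit” means, the following minimal Python sketch builds a transition system from the three traces listed above. It is not ProM's implementation: the function name `build_transition_system` is hypothetical, and the sketch assumes that each state is simply the complete list of activities seen so far in a case.

```python
# The three traces of exercise1.xes, as listed in the exercise text.
traces = [
    ["A", "B", "C", "D"],
    ["A", "C", "B", "D"],
    ["A", "E", "D"],
]


def build_transition_system(traces):
    """Return (states, transitions).

    Assumption: a state is the full prefix of a trace, kept as an ordered
    tuple (list abstraction, unlimited horizon). A transition is a triple
    (source state, activity, target state).
    """
    states = {()}  # the empty prefix is the initial state
    transitions = set()
    for trace in traces:
        prefix = ()
        for activity in trace:
            target = prefix + (activity,)
            states.add(target)
            transitions.add((prefix, activity, target))
            prefix = target
    return states, transitions


states, transitions = build_transition_system(traces)
print(len(states), "states,", len(transitions), "transitions")
for source, activity, target in sorted(transitions):
    print(list(source), "--" + activity + "->", list(target))
```

Because the order of activities is kept, the prefixes A B C and A C B end up in different states, so the result is a tree. Comparing this with the transition system that ProM shows may help you see which states the “Transition System to Petrinet” plug-in has to work with.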
Exercise 3 (extra)
For this exercise we will use the traces shown below which are stored in exercise4.xes.
1x Case1 a b c d f
1x Case2 a c b d f
1x Case3 a b d c f
1x Case4 a c d b f
1x Case5 a d e f
1x Case6 a e d f
- Since it is always a good idea to inspect the event log at hand, inspect exercise4.xes to get an idea of its contents. What is remarkable about this event log?
- Try to construct a Petri net by hand which can replay the traces recorded in exercise4.xes. Use 6 transitions that represent the actions a through f. Try to allow as little extra behavior as possible (e.g. no 'flower'-nets).
- Now create three Petri nets using the following plug-ins:
  - Run the alpha-algorithm on exercise4.xes;
  - Run the ILP Miner on exercise4.xes.
- Compare the three Petri nets created by the different algorithms. Are there any fundamental differences or are they bisimilar? Why do you think that there is a difference between (some of) the Petri nets?
- Now run the Heuristics Miner on exercise4.xes. Is this a Petri net? What is different?
- Create a new visualization of the HeuristicsNet that shows the join and split semantics. Is this net similar to the Petri nets from the previous steps?
- Now run the Genetic Miner on exercise4.xes. Why is the result with the default settings so poor? Try to improve the settings and get a result with a fitness of at least 0.8.
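When comparing the discovered nets, it can help to look at the ordering relations that the alpha-algorithm derives from the log before it builds any places. The sketch below computes the directly-follows relation and the resulting footprint for the six traces listed above. It covers only this first step of the alpha-algorithm, not the construction of the Petri net, and the function names are hypothetical.

```python
from itertools import product

# The six traces of exercise4.xes, one character per activity.
traces = ["abcdf", "acbdf", "abdcf", "acdbf", "adef", "aedf"]


def directly_follows(traces):
    """All pairs (x, y) where y occurs immediately after x in some trace."""
    return {(x, y) for trace in traces for x, y in zip(trace, trace[1:])}


def footprint(traces):
    """Classify every pair of activities as causal, parallel or unrelated."""
    follows = directly_follows(traces)
    activities = sorted({a for trace in traces for a in trace})
    relation = {}
    for x, y in product(activities, repeat=2):
        if (x, y) in follows and (y, x) in follows:
            relation[x, y] = "||"  # seen in both orders: parallel
        elif (x, y) in follows:
            relation[x, y] = "->"  # x is only ever followed by y: causal
        elif (y, x) in follows:
            relation[x, y] = "<-"
        else:
            relation[x, y] = "#"   # never directly adjacent
    return activities, relation


activities, relation = footprint(traces)
print("   " + "  ".join(activities))
for x in activities:
    print(x + "  " + " ".join(relation[x, y].ljust(2) for y in activities))
```

In the printed matrix, look at the rows for b, c, d and e: the pairs marked as parallel and the pairs marked as unrelated are a good starting point for explaining why the algorithms may produce different nets.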
For this exercise we will use exercise5.xes which is the result of a simulation.
- Start with a log inspection on exercise5.xes.
- Create some process model(s) using two (or more) of the plug-ins from the previous exercises.
- Run the dotted chart analysis plug-in on exercise5.xes and try to answer the following questions:
  - What do you see using the default settings? What does each row, column and dot represent?
  - What does the diagonal line from the top left to the middle of the last row represent?
  - Using the default settings, what do you think is the meaning of the row of dots that is located below and to the left of the diagonal line?
  - Can we see a clear division of tasks that are executed in the beginning and (repeated) near the end of the process?
- Change the settings to “Time option: Relative(Time)” and “Sort By: Actual duration”. What do we see now?
- Change the settings to “Component type: Originator”, “Time option: Actual” and “Sort By: None”. What do we see now?
- Using these settings, can we distinguish groups of users doing similar tasks?
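The effect of the dotted chart settings can be mimicked with a few lines of Python. The sketch below is only an illustration of the idea and not the plug-in's implementation: the events are made up (they do not come from exercise5.xes), the names `Event` and `dotted_chart` are hypothetical, and it assumes that relative time is measured from the first dot in each row.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class Event(NamedTuple):
    case: str
    activity: str
    originator: str
    time: datetime


# Hypothetical events, invented for this illustration only.
events = [
    Event("case1", "A", "Anne", datetime(2010, 1, 1, 9, 0)),
    Event("case1", "B", "Bob", datetime(2010, 1, 1, 11, 0)),
    Event("case2", "A", "Anne", datetime(2010, 1, 1, 10, 0)),
    Event("case2", "B", "Carol", datetime(2010, 1, 1, 10, 30)),
    Event("case3", "A", "Bob", datetime(2010, 1, 1, 12, 0)),
    Event("case3", "B", "Carol", datetime(2010, 1, 1, 16, 0)),
]


def dotted_chart(events, component="case", relative=False, sort_by_duration=False):
    """Return the chart as a list of rows: (row label, [dot position, ...])."""
    rows = {}
    for event in sorted(events, key=lambda e: e.time):
        # One row per value of the chosen component (case, originator, ...).
        rows.setdefault(getattr(event, component), []).append(event.time)
    ordered = list(rows.items())  # default: order of first appearance
    if sort_by_duration:
        ordered.sort(key=lambda row: row[1][-1] - row[1][0])
    if relative:
        # Shift every row so that its first dot sits at time zero.
        ordered = [(key, [t - times[0] for t in times]) for key, times in ordered]
    return ordered


for label, dots in dotted_chart(events, relative=True, sort_by_duration=True):
    print(label, [str(d) for d in dots])
for label, dots in dotted_chart(events, component="originator"):
    print(label, [t.strftime("%H:%M") for t in dots])
```

With relative time and sorting by duration, all rows start at the left edge and the shortest cases come first; with the originator as component, each row shows when one person was active.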
For this exercise we will use exercise6.xes which is the result of a simulation.
- Start with a log inspection on exercise6.xes. What is remarkable about this event log?
- Optional: run the Dotted Chart Analysis on this event log to visualize it in another way.
- Create some process model(s) using two (or more) of the plug-ins from the previous exercises.
- Run the Fuzzy Miner on the event log, using the default settings. Now try to answer the following questions:
  - What is remarkable about the resulting process model compared to the models discovered by the other plug-ins?
  - The Fuzzy Miner tries to aggregate low-frequency activities into clusters and only shows edges that are significant enough. Given this brief explanation, can you explain why the resulting process model looks so different?
- Read the documentation on the Fuzzy Miner.
- Play around with the settings of the Fuzzy Miner (the sliders you see on the right). What do you need to change to get closer to the models discovered by the other plug-ins?
- Why do you think this model is still not exactly the same as the original model?
- On what kind of logs would the Fuzzy Miner be useful?
- Can you now think of some benefits but also some weak points of the Fuzzy Miner?
- From the resulting transition system, press the “Play” button on the top right. Select the “Transition system analyzer” plug-in. Now click the “Log” object in the “input” column and select exercise6.xes. Now run the plug-in (press “Start”) and try to answer the following questions:
  - How long does a case take on average? And what are the minimal and maximal durations recorded? In how many ways can you get this information from this plug-in?
  - What do the red states indicate? And what do the red arrows indicate?
- On the top left, change the “Color By:” setting from “Sojourn” to “Elapsed”. Why is part of the process colored red and why are the last two events yellow?
- If we change the setting to “Remaining”, we see the first events colored red and one event colored yellow. Can you explain this? And why do you think that the outgoing arrow of the yellow event is colored red?
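The three time measures used in the questions above can be made concrete with a small calculation. The sketch below assumes the usual meaning of the terms: elapsed time is the time since the start of the case, remaining time is the time until the end of the case, and sojourn time is the time until the next event. The timestamps are made up and do not come from exercise6.xes, the function name `annotate` is hypothetical, and the plug-in itself aggregates these values per state over all cases rather than per event.

```python
from statistics import mean

# Hypothetical cases: (activity, time in hours since an arbitrary origin).
cases = {
    "case1": [("A", 0.0), ("B", 2.0), ("C", 5.0)],
    "case2": [("A", 1.0), ("B", 2.0), ("C", 9.0)],
}


def annotate(trace):
    """Compute elapsed, remaining and sojourn time for every event."""
    start, end = trace[0][1], trace[-1][1]
    rows = []
    for i, (activity, time) in enumerate(trace):
        # The last event has no successor, so its sojourn time is zero here.
        next_time = trace[i + 1][1] if i + 1 < len(trace) else time
        rows.append({
            "activity": activity,
            "elapsed": time - start,
            "remaining": end - time,
            "sojourn": next_time - time,
        })
    return rows


for name, trace in cases.items():
    for row in annotate(trace):
        print(name, row)

durations = [trace[-1][1] - trace[0][1] for trace in cases.values()]
print("average:", mean(durations), "min:", min(durations), "max:", max(durations))
```

Note how elapsed time can only grow along a case while remaining time can only shrink, which is worth keeping in mind when interpreting the colors for “Elapsed” and “Remaining”.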