Scientific Workflows Within the Process Mining Domain

Within the process mining domain there is currently no support for constructing and executing a workflow that describes all analysis steps and their order. The scientific workflow domain, however, offers dedicated workflow management systems (e.g. KNIME, RapidMiner, Taverna) that are designed to compose and execute a series of computational or data manipulation steps, that is, a workflow, in a scientific application. For example, a very simple scientific workflow may consist of three sequential steps: an Excel file is read, every number in the first column is multiplied by 2, and the results are saved to the same Excel file. In practice, scientific workflows are typically much more complex and time-intensive. Altogether, knowledge from the scientific workflow domain can be used for the design and execution of workflows within the process mining domain.
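The simple three-step workflow described above can be sketched in a few lines of Python. This is only an illustration, not how KNIME, RapidMiner, or Taverna implement it: a CSV file stands in for the Excel file so that only the standard library is needed, the file is assumed to have no header row and a numeric first column, and all function names are hypothetical.

```python
import csv
import os
import tempfile


def read_table(path):
    """Step 1: read all non-empty rows from the file."""
    with open(path, newline="") as handle:
        return [row for row in csv.reader(handle) if row]


def double_first_column(rows):
    """Step 2: multiply every number in the first column by 2."""
    return [[str(float(row[0]) * 2)] + row[1:] for row in rows]


def write_table(path, rows):
    """Step 3: save the results back to the same file."""
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)


def run_workflow(path):
    """Execute the three steps in their fixed order."""
    rows = read_table(path)
    rows = double_first_column(rows)
    write_table(path, rows)


if __name__ == "__main__":
    # Demo on a throwaway file so the sketch does not touch real data.
    with tempfile.TemporaryDirectory() as folder:
        demo_path = os.path.join(folder, "numbers.csv")
        with open(demo_path, "w", newline="") as handle:
            handle.write("1,a\n2.5,b\n")
        run_workflow(demo_path)
        print(read_table(demo_path))
```

Because the whole analysis is captured in `run_workflow`, it can be rerun on any file with a single call, which is the property the workflow systems provide at a much larger scale.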

In particular, applying concepts from the scientific workflow domain within the process mining domain offers the following advantages:

  • Process mining analyses can be easily repeated: For example, consider the following analysis: a log is read, a Petri net is discovered using a discovery algorithm, and finally the fitness between the discovered Petri net and the log is determined. Within the process mining framework ProM 6 these steps require more than 15 mouse clicks, whereas the same analysis within a scientific workflow system can be executed with a single click.
  • An earlier experiment can be easily repeated: Within a scientific workflow management system the settings of each step are saved, so the experiment can be rerun with the same configuration.
  • Experimentation: Within a scientific workflow management system, large-scale scientific experiments can be performed. For example, a process mining algorithm can be tested against different parameter settings.
  • Reuse of existing techniques: Data access, data transformation, analysis, mining, data visualization, and data exploitation techniques are readily available within a scientific workflow management system. Therefore, they can be used in combination with process mining techniques.
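The repeatability and experimentation points above can be illustrated with a small sketch. It is an assumption-laden toy, not a ProM or RapidMiner algorithm: the "discovery" step merely counts directly-follows pairs in a hand-written log and keeps those that occur at least `min_count` times, and all names (`directly_follows`, `run_experiment`, `min_count`) are hypothetical.

```python
import json
from collections import Counter


def directly_follows(log, min_count):
    """Toy discovery step: keep activity pairs (a, b) where b directly
    follows a at least min_count times across all traces."""
    counts = Counter(
        (a, b) for trace in log for a, b in zip(trace, trace[1:])
    )
    return sorted(pair for pair, n in counts.items() if n >= min_count)


def run_experiment(log, settings):
    """Run the analysis using only the saved settings."""
    return directly_follows(log, settings["min_count"])


# A hand-written event log: each trace is a sequence of activity names.
log = [["a", "b", "c"], ["a", "b", "c"], ["a", "c"]]

# Saving the settings makes an earlier experiment repeatable.
saved_settings = json.dumps({"min_count": 2})
first_run = run_experiment(log, json.loads(saved_settings))
second_run = run_experiment(log, json.loads(saved_settings))

# Testing the same step against different parameter settings.
sweep = {m: run_experiment(log, {"min_count": m}) for m in (1, 2, 3)}
```

Here `first_run` and `second_run` are identical because both are driven by the same stored settings, and `sweep` collects one result per parameter value.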

Based on these advantages, it was decided to connect the process mining framework ProM and the RapidMiner data analysis solution. As a result, any discovery, conformance, or extension algorithm of ProM can be used within a RapidMiner analysis process, or a dedicated process mining analysis can be constructed. An example can be seen in the picture below.

In this workflow, four plug-ins of ProM 6 are executed:

  • Read Log File Operator: A log file is read.
  • Analyse using Dotted Chart Operator: A visual overview of the events in the log is provided using the Dotted Chart plug-in.
  • ILP Miner Operator: A Petri net is discovered using the ILP Miner.
  • Replay a Log on Petri Net for Performance / Conformance Analysis: The performance of the process is calculated and projected on the discovered Petri net. A red-colored transition indicates that a high average waiting time exists for the event associated with the activity.
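The order in which such a workflow must execute its operators follows from which operator needs which output. The sketch below is a plain Python illustration of that idea, not RapidMiner's or ProM's actual execution engine. The operator names are hypothetical shorthand, and the wiring is an assumption read off the descriptions above: the dotted chart and the miner both consume the log, and the replay consumes both the log and the discovered Petri net.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each hypothetical operator maps to the operators whose output it needs.
DEPENDENCIES = {
    "read_log": set(),
    "dotted_chart": {"read_log"},
    "ilp_miner": {"read_log"},
    "replay": {"read_log", "ilp_miner"},
}


def execution_order(dependencies):
    """Return one valid order in which the operators can be executed."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(DEPENDENCIES)
print(order)
```

Any order returned this way runs the log reader first and the replay only after the miner has produced a Petri net; the relative order of the dotted chart and the miner does not matter because neither depends on the other.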