Process Mining 101

This 90-minute introductory tutorial offers a tour of the four main process mining capabilities using Apromore Enterprise Edition.

Introduction

Modern enterprise systems maintain detailed records of events that occur during the execution of the business processes they support. For example, a customer relationship management (CRM) system keeps track of virtually every interaction from the moment a customer makes an initial inquiry, until the moment the customer places their first purchase order. Meanwhile, an enterprise resource planning (ERP) system at a manufacturing company keeps track of all purchasing events, inventory movements, invoice approvals, and other transactions that occur within the company’s purchase-to-pay process.

The same can be said of industry-specific enterprise systems such as lending management systems (banking), claims management systems (insurance), and hospital management systems (healthcare). Every one of these systems supports one or more end-to-end processes and in doing so, they collect records of every step in the process.

These records allow us to retrace the execution of every instance of a process from start to end. For example, the events in a CRM system allow us to see how a customer inquiry becomes a quotation, how a quotation becomes a purchase order, and how this purchase order is delivered and invoiced.

Process mining is a collection of methods to extract and consolidate records of the execution of a business process and to analyze these records by means of different types of visualizations. These visualizations allow us to identify issues and opportunities for improvement such as bottlenecks, sources of waste and root causes of service-level violations.

The starting point for process mining is a collection of records representing every step in the execution of an end-to-end business process, such as an order-to-cash process or a claim-to-resolution process. This collection of records is called an event log. Given an event log, process mining tools provide several analytic capabilities to uncover business process improvement opportunities from different perspectives.

Is your business ready for process mining?
Learn more about the benefits of process mining in our introductory guide “What is Process Mining?”.

The four key capabilities of process mining

Process mining tools such as Apromore provide four main analytics capabilities: automated discovery, conformance checking, performance mining and variant analysis.

Automated process discovery

Automated process discovery techniques take as input an event log and produce an “as-is” model of the business process. In Apromore, the discovered model can take the form of a process map or of a BPMN model.

A process map (also called a dependency graph) shows the activities of the process and the transitions between consecutive activities, also known as “directly-follows relations” between activities. The activities (nodes) and the directly-follows relations may be annotated with frequency information.
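As a rough illustration of what a process map is built from, the sketch below counts directly-follows relations in a toy log. The case identifiers and traces are made-up assumptions, and this is not how Apromore computes its maps internally; it only shows the idea of counting consecutive activity pairs.

```python
from collections import Counter

# Hypothetical traces: each case maps to its activities in chronological order.
traces = {
    "case-1": ["Register request", "Examine request", "Check ticket"],
    "case-2": ["Register request", "Check ticket", "Examine request"],
    "case-3": ["Register request", "Examine request", "Check ticket"],
}

def directly_follows(traces):
    """Count how often one activity is immediately followed by another."""
    counts = Counter()
    for activities in traces.values():
        # Pair each activity with its successor within the same case.
        counts.update(zip(activities, activities[1:]))
    return counts

dfg = directly_follows(traces)
for (source, target), frequency in sorted(dfg.items()):
    print(f"{source} -> {target}: {frequency}")
```

Each key of the resulting counter corresponds to an arc of the process map, and each count is the frequency annotation on that arc.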

A BPMN process model shows activities and flows (directly-follows relations) but it also shows different types of gateways, particularly exclusive gateways and parallel gateways. This view allows us to better appreciate the decision points, rework loops and parallel branches of a process.

Automated process discovery techniques also allow us to visualize the social network of a process, in other words, the workers who intervene in the process and the handoffs between these workers.

The automated process discovery capabilities of Apromore are encapsulated in the Process Discoverer plugin, as shown in the following demonstration:

Automated Discovery in Apromore

Conformance checking

Conformance checking is about checking that the executions of a business process recorded in an event log abide by a prescribed or expected process behavior. The input of conformance checking can either be a set of compliance rules (called the rule-based approach) or a prescribed process model (called the model-based approach). The output is a list of violations of the compliance rules or a list of deviations with respect to the process model.

In the rule-based approach, a common compliance rule in a purchase-to-pay process is that an invoice cannot be approved, unless the corresponding purchase order has been previously approved. Another recurrent compliance rule is that if an invoice has been approved by a given employee, the corresponding payment must be triggered by a different employee. This latter rule is called the “four-eyes principle”. In this context, the goal of conformance checking is to determine if every execution of the purchase-to-pay process fulfills these and other compliance rules.
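The two rules above can be sketched as a simple check over one case at a time. The activity names ("Approve Purchase Order", "Approve Invoice", "Trigger Payment") and the trace layout are assumptions made for this example; a real log would use its own labels, and Apromore expresses such rules through its own interface rather than through code like this.

```python
def check_case(trace):
    """Return the compliance violations found in one case.

    trace: list of (activity, resource) pairs in chronological order.
    Activity names are hypothetical placeholders.
    """
    violations = []
    po_approved = False
    invoice_approvers = set()
    for activity, resource in trace:
        if activity == "Approve Purchase Order":
            po_approved = True
        elif activity == "Approve Invoice":
            # Rule 1: the purchase order must have been approved earlier.
            if not po_approved:
                violations.append("invoice approved before purchase order")
            invoice_approvers.add(resource)
        elif activity == "Trigger Payment":
            # Rule 2 (four-eyes principle): payer must differ from approver.
            if resource in invoice_approvers:
                violations.append("four-eyes principle violated")
    return violations

compliant = [
    ("Approve Purchase Order", "Ann"),
    ("Approve Invoice", "Bob"),
    ("Trigger Payment", "Carol"),
]
non_compliant = [
    ("Approve Invoice", "Bob"),
    ("Approve Purchase Order", "Ann"),
    ("Trigger Payment", "Bob"),
]
print(check_case(compliant))
print(check_case(non_compliant))
```

Running the check over every case in the log yields the list of violations that rule-based conformance checking reports.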

The following video demonstrates how to use the rule-based approach for conformance checking in Apromore.

Conformance Checking in Apromore

Performance mining

Performance mining techniques take as input an event log and extract performance analytics of the underlying business process, which help to answer questions such as: “Where are the bottlenecks in the process?”, “Which activities consume the highest amount of effort (processing time)?”, or “How does the process perform when the workload is higher than usual?”

Common performance analytics concern the frequency and duration of the various process activities and handovers between activities, e.g. case frequency or average activity duration.
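To make these two metrics concrete, here is a minimal sketch that computes case frequency and average activity duration from a toy log. It assumes each event carries both a start and an end timestamp; the tuple layout and the sample values are illustrative only.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical events: (case_id, activity, start, end) with ISO 8601 timestamps.
events = [
    ("c1", "Create Purchase Requisition", "2024-01-01T09:00", "2024-01-01T09:30"),
    ("c1", "Approve Purchase Order", "2024-01-01T10:00", "2024-01-01T10:10"),
    ("c2", "Create Purchase Requisition", "2024-01-02T09:00", "2024-01-02T09:10"),
    ("c2", "Create Purchase Requisition", "2024-01-02T11:00", "2024-01-02T11:20"),
]

def activity_stats(events):
    """Map each activity to (case frequency, average duration in minutes)."""
    cases = defaultdict(set)
    durations = defaultdict(list)
    for case_id, activity, start, end in events:
        # Case frequency counts distinct cases, not individual occurrences.
        cases[activity].add(case_id)
        elapsed = datetime.fromisoformat(end) - datetime.fromisoformat(start)
        durations[activity].append(elapsed.total_seconds() / 60)
    return {
        activity: (len(cases[activity]), sum(values) / len(values))
        for activity, values in durations.items()
    }

stats = activity_stats(events)
for activity, (case_frequency, avg_minutes) in stats.items():
    print(activity, case_frequency, avg_minutes)
```

Note that an activity repeated within one case, as in case c2 here, adds to the duration average but counts only once towards the case frequency.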

In Apromore, performance analytics can be shown to the user in the form of charts, e.g. via Apromore’s Performance Dashboard, or by “enhancing” a process model automatically discovered from the log with annotations and color-coding of its elements (e.g. an activity shown in a darker blue has been observed more frequently in the log than the others). Performance analytics can also be exported for analysis via third-party business intelligence tools.

The following video demonstrates how to do performance mining in Apromore:

Performance Mining in Apromore

Performance mining can also be done using Key Performance Indicators (also called KPIs). KPIs help businesses understand how their business process performs with respect to a set target. For instance, a business KPI may be that 90% of the cases in the process must be complete within four weeks.
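The example KPI can be evaluated with a few lines of code. The sketch below assumes that each case is summarized by its start and end time; the function name and the data are hypothetical and do not reflect how Apromore stores or evaluates KPIs.

```python
from datetime import datetime, timedelta

def kpi_met(cases, limit=timedelta(weeks=4), target=0.90):
    """Return (share of cases completed within the limit, whether the target is met).

    cases: dict mapping case_id -> (start, end) datetimes.
    """
    on_time = sum(1 for start, end in cases.values() if end - start <= limit)
    share = on_time / len(cases)
    return share, share >= target

# Hypothetical data: nine cases take 10 days, one case takes 40 days.
start = datetime(2024, 1, 1)
cases = {f"c{i}": (start, start + timedelta(days=10)) for i in range(9)}
cases["c9"] = (start, start + timedelta(days=40))

share, met = kpi_met(cases)
print(share, met)
```

Here nine out of ten cases finish within four weeks, so the 90% target is just met; one more late case would violate the KPI.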

In Apromore, we can create KPIs and determine whether we are meeting our KPI requirements.

If our KPIs are violated, we can also perform a root cause analysis in Apromore to determine the causes of the violations. For instance, the cases that took more than four weeks may have been delayed because a third-party merchant takes longer to deliver their service. Apromore’s Root Cause Analyzer reveals such findings to us.

To define a KPI in Apromore, we need to determine a KPI population, KPI condition and KPI target, after which we can perform the root cause analysis. This video describes how to define KPIs in Apromore and perform root cause analysis.

Performance Mining in Apromore

Variant analysis

Variant analysis techniques take as input two or more event logs (corresponding to different variants of the same business process) and produce as output a list of differences. Typically, one of the event logs contains all the cases that end up in a positive outcome according to some criterion, while the other log contains all the cases that end up in a negative outcome. For example, the first log may contain all cases where the customer was satisfied, while the second one contains all cases that led to a complaint. Or the first log may contain all cases where the process completed on time, while the second one contains the delayed cases. Variant analysis techniques help diagnose the reasons why certain executions of a business process (i.e. certain process cases) do not lead to a desirable outcome.
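One simple kind of difference between two variants is how often each activity occurs in each of them. The sketch below compares the share of cases containing each activity across two toy logs. The log contents and the function names are assumptions for illustration; variant comparison tools typically report many more kinds of differences, for example in durations and in the order of activities.

```python
def case_share(log):
    """Map each activity to the fraction of cases in which it occurs."""
    counts = {}
    for trace in log.values():
        for activity in set(trace):
            counts[activity] = counts.get(activity, 0) + 1
    return {activity: n / len(log) for activity, n in counts.items()}

def compare_variants(log_a, log_b):
    """Return {activity: (share in log_a, share in log_b)} for all activities."""
    share_a, share_b = case_share(log_a), case_share(log_b)
    return {
        activity: (share_a.get(activity, 0.0), share_b.get(activity, 0.0))
        for activity in sorted(set(share_a) | set(share_b))
    }

# Hypothetical logs: case_id -> activities in chronological order.
on_time = {
    "c1": ["Receive order", "Ship order"],
    "c2": ["Receive order", "Ship order"],
}
delayed = {
    "c3": ["Receive order", "Resolve dispute", "Ship order"],
    "c4": ["Receive order", "Ship order"],
}

differences = compare_variants(on_time, delayed)
for activity, (share_on_time, share_delayed) in differences.items():
    print(activity, share_on_time, share_delayed)
```

In this toy data, "Resolve dispute" appears in half of the delayed cases and in none of the on-time cases, which is the kind of difference that points to a possible cause of delay.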

In order to analyze multiple variants of a process, one has to start by extracting the event logs corresponding to each of these variants. In Apromore, this is achieved by means of log filters. Once the logs of the process variants have been extracted, Apromore allows us to compare two or more variants of a business process in the following ways:

By means of side-by-side comparison of the process maps or the BPMN process models of these variants. This comparison can be done using the frequency view or the duration view.
By using the variant comparison tool.

The following video demonstrates how to compare multiple variants of a process in Apromore:

Variant analysis in Apromore

Event logs

In order to analyze a business process using process mining techniques, we need to extract an event log from the information system(s) that support the execution of the process. It is possible to extract event logs from almost any enterprise system, be it an ERP or CRM system such as SAP, Dynamics, Salesforce, or ServiceNow, or a vertically specialized system such as a manufacturing execution system, an insurance management system, or a hospital management system.

An event log is a set of event records. Each event record consists of the following attributes:

An activity: a reference to the process activity being performed, e.g. ‘Register request’, ‘Examine request’, ‘Check ticket’, and so forth.
A case identifier: each event recorded in the log must be linkable to a case, e.g. via the order number for an order-to-cash process, or the claim number for a claims handling process.
A completion timestamp: in order to reconstruct a case, one must be able to order the events recorded for that case according to the time when they occurred. This requires each event record to have a completion timestamp, capturing the date and time when the activity was completed.
Optional additional attributes: the start timestamp of the activity, the resource that performed the activity, the amount of a request, the geographic area where the request originated, etc. These can be used to obtain more fine-grained insights and to identify different process variants for our analysis.
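The attributes above map naturally onto the columns of a CSV file. As a minimal sketch, the code below reads such a file and reconstructs one trace per case by ordering events on their completion timestamp. The column names (case_id, activity, end_time, resource) and the sample rows are assumptions for this example, not a format required by Apromore.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV export; the column names are assumptions for this sketch.
RAW_LOG = "\n".join([
    "case_id,activity,end_time,resource",
    "1,Register request,2024-01-01T09:00,Ann",
    "2,Register request,2024-01-01T09:05,Bob",
    "1,Check ticket,2024-01-01T11:00,Bob",
    "1,Examine request,2024-01-01T10:00,Ann",
])

def load_traces(text):
    """Group events by case identifier, ordered by completion timestamp."""
    rows = list(csv.DictReader(io.StringIO(text)))
    # ISO 8601 timestamps in the same format sort chronologically as strings.
    rows.sort(key=lambda row: row["end_time"])
    traces = defaultdict(list)
    for row in rows:
        traces[row["case_id"]].append(row["activity"])
    return dict(traces)

traces = load_traces(RAW_LOG)
for case_id, activities in traces.items():
    print(case_id, activities)
```

The rows for case 1 are deliberately out of order in the input to show why the completion timestamp is needed to recover the sequence of activities.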

Time to try it yourself

If you wish to replicate the analyses shown in the video demonstrations of this tutorial yourself, you can sign up for a trial of Apromore Enterprise Edition. Once you sign up, you will find a few sample event logs in the Apromore workspace. The event logs that we used in the above videos can be found in the folder “Example Event Logs”. The event log of the manufacturing process is called Production_Data, while one of the Sepsis patient treatment event logs is called Sepsis Cases.

1. Which activity is the waiting time bottleneck?

A. Approve Purchase Order.
B. Confirm Purchase Order.
C. Analyze Request for Quotation.
D. Amend Request for Quotation.
E. Choose best option.

Answer: C. Analyze Request for Quotation.

Feedback: To find the waiting-time bottleneck, go to the duration overlay and look for the activity whose incoming transitions have high average durations. “Analyze Request for Quotation” is the correct answer because it has four incoming transitions with high waiting times. For example, the waiting time from “Request Additional Info from Vendor” to “Analyze Request for Quotation” is 1.6 weeks.

2. Which of the following activities are part of a rework loop?

A. Amend Request for Quotation.
B. Settle PO dispute with supplier.
C. Choose best option.
D. A and B.
E. A and C.

Answer: D. A and B.

Feedback: To visualize rework, go to the BPMN model view. We see a rework loop from “Analyze Request for Quotation” -> “Amend Request for Quotation” -> “Analyze Request for Quotation”. There is another rework loop from “Release Purchase Order” -> “Settle PO dispute with supplier” -> “Release Purchase Order”. But “Choose best option” is not part of any rework loop.

3. Which of the following activities occurs in the largest number of cases?

A. Settle dispute with supplier.
B. Confirm purchase order.
C. Create Purchase Requisition.
D. Request Additional Info from Vendor.
E. Pay invoice.

Answer: C. Create Purchase Requisition.

Feedback: In the case frequency overlay, we see that “Create Purchase Requisition” has the highest case frequency, 387.

The event log has been split into two variants. One log contains all cases where the procure-to-pay process took less than four weeks. This is called the fast variant. The second log contains all cases where the process took more than four weeks. This is called the slow variant.

Please download both event logs here.

Procure-to-Pay asis_fast.csv

Procure-to-Pay asis_slow.csv

Given these two logs, answer the following questions using Apromore’s variant comparison tool:

4. What is the average case duration (also called cycle time) of each of these two variants of the process?

A. Fast variant – 1.36 weeks and slow variant – 2.85 months.
B. Fast variant – 1.22 weeks and slow variant – 2.35 months.
C. Fast variant – 1.12 weeks and slow variant – 2.19 months.
D. Fast variant – 1.05 weeks and slow variant – 2.52 months.
E. None of the above.

Answer: A. Fast variant – 1.36 weeks and slow variant – 2.85 months.

Feedback: Select both event logs and compare the variants. By inspecting the temporal statistics, we see that the fast variant has an average case duration of 1.36 weeks and the slow variant has an average case duration of 2.85 months.

5. Which activity occurred only in the fast variant and not in the slow variant?

A. Amend Request for Quotation.
B. Settle PO dispute with supplier.
C. Request Additional Info from Vendor.
D. Pay invoice.
E. Amend Purchase Requisition.

Answer: E. Amend Purchase Requisition.

Feedback: When the variant comparator is opened on both logs, we see that the activity “Amend Request for Quotation” is color-coded for the fast variant.

6. What was the average duration of this activity?

A. 27.36 minutes.
B. 2 minutes.
C. 05 minutes.
D. 57 minutes.
E. None of the above.

Answer: A. 27.36 minutes.

Feedback: By going to the average duration overlay, we see that the average duration for “Amend Request for Quotation” is 27.36 minutes.

7. What was the difference in the average duration of the transition from 'Create Small Request for Quotation' to 'Analyze Request for Quotation' between the two variants?

A. The waiting time is 1.3 days in the fast variant and 4 days in the slow variant.
B. The waiting time is 2.8 days in the fast variant and 3.03 weeks in the slow variant.
C. The waiting time is 20 days in the fast variant and 7.1 months in the slow variant.
D. There is no difference in the average duration.
E. None of the above.

Answer: B. The waiting time is 2.8 days in the fast variant and 3.03 weeks in the slow variant.

Feedback: In the average duration overlay, we see that the waiting time is 2.8 days in the fast variant and 3.03 weeks in the slow variant. This implies that this transition is a major bottleneck in the slow variant but not in the fast variant.

Thinking about using Apromore in a project?

Get in touch to learn more about how you can get started with Apromore through a Proof-of-Value project.

If you want to learn more about the fascinating world of process mining, you can enroll in a public course on process mining by The University of Melbourne. Alternatively, you can contact us to discuss your corporate training requirements on process mining and Apromore. We have a range of training courses delivered both online and face-to-face.