The objectives of this section are:
to explain the problem of rule generation
to define confidence-based pruning
to show how rule generation works in the Apriori Algorithm
By the time you have completed this section you will be able to:
generate rules from a frequent itemset
use the confidence theorem presented to prune the rule set
explain how rule generation occurs in the Apriori Algorithm
Before we move forward, let’s do a quick recap. Association analysis is a branch of data mining that finds relationships between the items in a given data set by creating association rules. At this point you should be able to calculate the support of any itemset and the confidence of any rule presented, and you should be familiar with the Apriori Algorithm, which is used for frequent itemset generation. Finding the frequent itemsets is only the first part: not every rule that can be formed from a frequent itemset is a strong association rule, so we must continue. We know that frequent itemsets have passed the minimum support threshold, but we do not yet know how reliable the rules derived from them are. In this section we generate candidate association rules from each frequent itemset and then determine which ones are strong association rules based on whether they pass the minimum confidence test.
There are two main steps: the first is extraction and the second is calculation. The video clip below explains both of these steps in detail.
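The two steps can be sketched in a few lines of Python. This is a minimal brute-force illustration, not the procedure from the video: it assumes that "extraction" means listing every way to split a frequent itemset into an antecedent and a consequent, and that "calculation" means computing each rule's confidence as support(whole itemset) / support(antecedent). The transactions, the function names, and the 0.6 threshold are all made up for the example, and the sketch does not apply confidence-based pruning.

```python
from itertools import combinations

# Hypothetical toy transactions, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def support_count(itemset, transactions):
    """Count the transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def extract_candidate_rules(frequent_itemset):
    """Step 1 (extraction, as assumed here): yield every split of the
    itemset into a non-empty antecedent and a non-empty consequent."""
    whole = frozenset(frequent_itemset)
    items = sorted(whole)
    for size in range(1, len(items)):
        for combo in combinations(items, size):
            antecedent = frozenset(combo)
            yield antecedent, whole - antecedent

def strong_rules(frequent_itemset, transactions, min_confidence):
    """Step 2 (calculation): keep the rules whose confidence reaches the
    threshold. Assumes the itemset is frequent, so every antecedent
    occurs in at least one transaction and the division is safe."""
    whole_count = support_count(frozenset(frequent_itemset), transactions)
    kept = []
    for antecedent, consequent in extract_candidate_rules(frequent_itemset):
        confidence = whole_count / support_count(antecedent, transactions)
        if confidence >= min_confidence:
            kept.append((antecedent, consequent, confidence))
    return kept

rules = strong_rules({"milk", "diapers", "beer"}, transactions, min_confidence=0.6)
for antecedent, consequent, confidence in rules:
    print(sorted(antecedent), "->", sorted(consequent), round(confidence, 2))
```

On this toy data the three-item set yields six candidate rules, and four of them reach the 0.6 threshold. A k-item set always produces 2^k - 2 candidates under this brute-force split, which is why pruning the rule set matters for larger itemsets.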