Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using some measures of interestingness.
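A rule has the form R -> L, read as "if R then L", where R is the antecedent and L is the consequent (both are itemsets).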
Metrics
- Support: fraction of transactions in the database that contain the itemset
- Confidence: for the rule R -> L, fraction of the transactions containing R that also contain L, i.e. support(R and L) / support(R)
- Independence (https://en.wikipedia.org/wiki/Independence_(probability_theory))
- Lift (https://en.wikipedia.org/wiki/Lift_(data_mining))
  - = 1: L and R are independent
  - < 1: negative dependence
  - > 1: positive dependence
  - When lift is close to 1, the association rule is normally not useful
- Leverage
  - = 0: L and R are independent
  - < 0: negative dependence
  - > 0: positive dependence
- Odds Ratio
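
To make the metrics concrete, here is a minimal Python sketch that computes support, confidence, lift, and leverage for one rule. It assumes the usual definitions: confidence = support(R and L) / support(R), lift = support(R and L) / (support(R) * support(L)), and leverage = support(R and L) - support(R) * support(L). The toy transactions and the function names `support` and `rule_metrics` are made up for illustration; this is not the arules API.

```python
from fractions import Fraction

# Hypothetical toy database: each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))

def rule_metrics(antecedent, consequent, transactions):
    """Metrics for the rule antecedent -> consequent (R -> L in the notes).

    Assumes the antecedent and consequent each occur at least once;
    otherwise the divisions below raise ZeroDivisionError.
    """
    s_r = support(antecedent, transactions)
    s_l = support(consequent, transactions)
    s_rl = support(set(antecedent) | set(consequent), transactions)
    return {
        "support": s_rl,
        "confidence": s_rl / s_r,        # estimate of P(L | R)
        "lift": s_rl / (s_r * s_l),      # 1 means independent
        "leverage": s_rl - s_r * s_l,    # 0 means independent
    }

metrics = rule_metrics({"bread"}, {"milk"}, transactions)
for name, value in metrics.items():
    print(f"{name}: {value} ({float(value):.3f})")
```

For {bread} -> {milk} on this toy data, lift comes out slightly below 1 and leverage slightly below 0, so both metrics agree on a weak negative dependence, and the rule is of little use even though its confidence is 3/4.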
Apriori Algorithm
- R implementation: the arules package, loaded with library(arules)
- https://en.wikipedia.org/wiki/Apriori_algorithm
- minsup (the minimum support threshold) can be different for each iteration of the algorithm
- a lot of the rules that are returned may not be useful
- we can keep only the more general rules (e.g. prefer {A} => {C} over {A,B} => {C})
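
The points above can be sketched in plain Python: a level-wise Apriori pass that finds frequent itemsets, a rule generator, and a filter that keeps only the more general rules. This is an illustrative sketch under several assumptions, not how arules does it: a single fixed `minsup` is used for every level, rules are limited to single-item consequents, and the data and the names `apriori`, `rules_from`, and `keep_general` are hypothetical.

```python
from itertools import combinations

# Hypothetical toy database.
transactions = [
    {"A", "B", "C"},
    {"A", "B", "C"},
    {"A", "C"},
    {"B", "D"},
    {"A", "D"},
]

def apriori(transactions, minsup):
    """Return {itemset: support} for all itemsets with support >= minsup."""
    n = len(transactions)
    current = [frozenset([i]) for i in {i for t in transactions for i in t}]
    frequent, k = {}, 1
    while current:
        # Count the support of each size-k candidate and keep the frequent ones.
        level = {}
        for c in current:
            s = sum(1 for t in transactions if c <= t) / n
            if s >= minsup:
                level[c] = s
        frequent.update(level)
        # Join step: combine frequent k-itemsets into (k+1)-item candidates.
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent

def rules_from(frequent, minconf):
    """Build (antecedent, consequent, confidence) rules with one-item consequents."""
    rules = []
    for itemset, s in frequent.items():
        if len(itemset) < 2:
            continue
        for item in itemset:
            consequent = frozenset([item])
            antecedent = itemset - consequent
            # The antecedent is frequent too, since it is a subset of a frequent itemset.
            conf = s / frequent[antecedent]
            if conf >= minconf:
                rules.append((antecedent, consequent, conf))
    return rules

def keep_general(rules):
    """Drop a rule when another rule has the same consequent and a strictly smaller antecedent."""
    return [(a, c, conf) for a, c, conf in rules
            if not any(c2 == c and a2 < a for a2, c2, _ in rules)]

frequent = apriori(transactions, minsup=0.4)
rules = rules_from(frequent, minconf=0.6)
general = keep_general(rules)
for a, c, conf in general:
    print(sorted(a), "=>", sorted(c), f"confidence={conf:.2f}")
```

On this toy data the filter removes rules such as {A,B} => {C} because {A} => {C} is already present. Note that this filter ignores whether the longer rule has higher confidence; a stricter variant could drop a specific rule only when it does not improve on the general one.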