links: http://www.cs.carleton.edu/cs_comps/0607/recommend/recommender/association.html
When we go grocery shopping, we often have a standard list of things to buy. Each shopper has a distinctive list, depending on one’s needs and preferences. A housewife might buy healthy ingredients for a family dinner, while a bachelor might buy beer and chips. Understanding these buying patterns can help to increase sales in several ways.
Usage
If there is a pair of items, X and Y, that are frequently bought together:
- Both X and Y can be placed on the same shelf, so that buyers of one item would be prompted to buy the other.
- Promotional discounts could be applied to just one out of the two items.
- Advertisements on X could be targeted at buyers who purchase Y.
- X and Y could be combined into a new product, such as having Y in flavors of X.
While we may know that certain items are frequently bought together, the question is, how do we uncover these associations?
Besides increasing sales, association rules can also be used in other fields. In medical diagnosis, for instance, understanding which symptoms tend to occur together (are comorbid) can help to improve patient care and medicine prescription.
Understand Association Rules
Support
This says how popular an itemset is, measured as the proportion of transactions in which the itemset appears.
For example, if {apple} appears in 4 out of 8 transactions, the support of {apple} is 4/8, or 50%.
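To make the definition concrete, here is a minimal sketch of the support calculation. The eight transactions are made up for illustration (chosen so that apple appears in 4 of them), and the `support` helper is a hypothetical name, not part of any library.

```python
# Hypothetical toy data: 8 made-up transactions, each a set of items.
transactions = [
    {"apple", "beer", "rice", "chicken"},
    {"apple", "beer", "rice"},
    {"apple", "beer"},
    {"apple", "pear"},
    {"milk", "beer", "rice", "chicken"},
    {"milk", "beer", "rice"},
    {"milk", "beer"},
    {"milk", "pear"},
]


def support(itemset, transactions):
    """Proportion of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


print(support({"apple"}, transactions))          # 0.5   (4 out of 8)
print(support({"apple", "beer"}, transactions))  # 0.375 (3 out of 8)
```

Support also applies to itemsets with more than one item: the second call counts only transactions that contain both apple and beer.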
Confidence
This says how likely item Y is to be purchased when item X is purchased, expressed as {X -> Y}. It is measured as the proportion of transactions containing item X in which item Y also appears.
One drawback of the confidence measure is that it might misrepresent the importance of an association.
For a rule such as {apple -> beer}, confidence only accounts for how popular apples are, but not beers. If beers are also very popular in general, there will be a higher chance that a transaction containing apples will also contain beers, thus inflating the confidence measure. To account for the base popularity of both constituent items, we use a third measure called lift.
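The confidence calculation for the apple and beer case can be sketched as follows. The transactions are made-up illustration data and the helper names `support` and `confidence` are hypothetical. Note how the rule looks strong in one direction even though beer is simply popular overall.

```python
# Hypothetical toy data: 8 made-up transactions, each a set of items.
transactions = [
    {"apple", "beer", "rice", "chicken"},
    {"apple", "beer", "rice"},
    {"apple", "beer"},
    {"apple", "pear"},
    {"milk", "beer", "rice", "chicken"},
    {"milk", "beer", "rice"},
    {"milk", "beer"},
    {"milk", "pear"},
]


def support(itemset):
    """Proportion of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t) / len(transactions)


def confidence(x, y):
    """Confidence of {X -> Y}: support of X and Y together, divided by support of X."""
    return support(set(x) | set(y)) / support(x)


print(confidence({"apple"}, {"beer"}))  # 0.75: 3 of the 4 apple baskets contain beer
print(confidence({"beer"}, {"apple"}))  # 0.5:  3 of the 6 beer baskets contain apple
print(support({"beer"}))                # 0.75: beer is in most baskets anyway
```

Confidence is not symmetric, and here the confidence of {apple -> beer} equals the overall popularity of beer, which is exactly the situation where confidence alone is misleading.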
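Lift
This says how likely item Y is to be purchased when item X is purchased, while controlling for how popular item Y is. A lift of 1 means there is no association between the items; a lift above 1 means Y is more likely to be bought when X is bought, and a lift below 1 means it is less likely.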
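A minimal sketch of the lift calculation, again on made-up transactions with hypothetical helper names. Lift is computed here as the support of X and Y together divided by the product of their individual supports, which is the same as dividing the confidence of {X -> Y} by the support of Y.

```python
# Hypothetical toy data: 8 made-up transactions, each a set of items.
transactions = [
    {"apple", "beer", "rice", "chicken"},
    {"apple", "beer", "rice"},
    {"apple", "beer"},
    {"apple", "pear"},
    {"milk", "beer", "rice", "chicken"},
    {"milk", "beer", "rice"},
    {"milk", "beer"},
    {"milk", "pear"},
]


def support(itemset):
    """Proportion of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t) / len(transactions)


def lift(x, y):
    """Lift of {X -> Y}: support(X and Y) / (support(X) * support(Y))."""
    return support(set(x) | set(y)) / (support(x) * support(y))


print(lift({"apple"}, {"beer"}))    # 1.0: no association once beer's popularity is accounted for
print(lift({"rice"}, {"chicken"}))  # 2.0: chicken is twice as likely in baskets that contain rice
print(lift({"apple"}, {"milk"}))    # 0.0: never bought together in this toy data
```

Unlike confidence, lift is symmetric: swapping X and Y gives the same value.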
Python Implementation
from mlxtend.frequent_patterns import apriori
from mlxtend.frequent_patterns import association_rules
First, get the data on all items purchased, grouped by invoice or transaction history.
Prepare the Data
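As a rough sketch of this step, the snippet below groups invoice lines into one basket per invoice and then builds a one-hot table (one row per transaction, one boolean column per item), which is the shape mlxtend's `apriori` works on. The invoice ids, items, and the (invoice id, item) layout are assumptions made up for illustration; real data will need its own loading and cleaning.

```python
from collections import defaultdict

# Hypothetical invoice lines as (invoice id, item) pairs; this layout is an assumption.
invoice_lines = [
    ("INV-1", "apple"), ("INV-1", "beer"),
    ("INV-2", "apple"), ("INV-2", "beer"), ("INV-2", "rice"),
    ("INV-3", "milk"), ("INV-3", "apple"),
    ("INV-3", "apple"),  # a repeated line; the set below keeps each item once per invoice
]

# Group the lines into one basket (set of items) per invoice.
baskets = defaultdict(set)
for invoice_id, item in invoice_lines:
    baskets[invoice_id].add(item)

transactions = list(baskets.values())

# One-hot encode: one row per transaction, one True/False column per item.
items = sorted(set().union(*transactions))
one_hot = [[item in basket for item in items] for basket in transactions]

print(items)
for row in one_hot:
    print(row)
```

The `items` list gives the column names for the rows in `one_hot`, so the two together can be loaded into a table or DataFrame.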
Implement the Algorithm
Apriori produces association rules, each of which has:
Support, confidence, and lift
It also gives Item 1, the purchased item, and Item 2, which can be a group of items bought along with Item 1.
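To illustrate what such rules look like, here is a simplified pure-Python sketch of the Apriori idea, limited to pairs of single items. It is not mlxtend's implementation: the data, the thresholds, the `pair_rules` function, and the `item_1`/`item_2` keys are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy data: 8 made-up transactions, each a set of items.
transactions = [
    {"apple", "beer", "rice", "chicken"},
    {"apple", "beer", "rice"},
    {"apple", "beer"},
    {"apple", "pear"},
    {"milk", "beer", "rice", "chicken"},
    {"milk", "beer", "rice"},
    {"milk", "beer"},
    {"milk", "pear"},
]


def support(itemset):
    """Proportion of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def pair_rules(min_support=0.25, min_confidence=0.7):
    """Return rules {item_1 -> item_2} between single items that pass both thresholds."""
    all_items = sorted(set().union(*transactions))
    # Level 1: keep only items that are frequent on their own.
    frequent_items = [i for i in all_items if support({i}) >= min_support]
    rules = []
    # Level 2: build candidate pairs from frequent items only (the Apriori pruning step).
    for a, b in combinations(frequent_items, 2):
        pair_support = support({a, b})
        if pair_support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for x, y in ((a, b), (b, a)):
            conf = pair_support / support({x})
            if conf >= min_confidence:
                rules.append({
                    "item_1": x,
                    "item_2": y,
                    "support": pair_support,
                    "confidence": conf,
                    "lift": conf / support({y}),
                })
    return rules


rules = pair_rules()
for rule in rules:
    print(rule)
```

A full Apriori run would continue to larger itemsets (triples and beyond), so Item 2 could then be a group of items rather than a single one.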
A few examples of association rules:
We can now convert all of them to a dataset and get the associations.
To get just the pairings:
rule.items_add
rule.items_base