Market Basket Analysis — Apriori Algorithm

Anakin
Dec 5, 2020


links: http://www.cs.carleton.edu/cs_comps/0607/recommend/recommender/association.html

When we go grocery shopping, we often have a standard list of things to buy. Each shopper has a distinctive list, depending on one’s needs and preferences. A housewife might buy healthy ingredients for a family dinner, while a bachelor might buy beer and chips. Understanding these buying patterns can help to increase sales in several ways. If there is a pair of items, X and Y, that are frequently bought together:

Usage

  • Both X and Y can be placed on the same shelf, so that buyers of one item would be prompted to buy the other.
  • Promotional discounts could be applied to just one of the two items.
  • Advertisements for X could be targeted at buyers who purchase Y.
  • X and Y could be combined into a new product, such as having Y in flavors of X.

While we may know that certain items are frequently bought together, the question is, how do we uncover these associations?

Besides increasing sales, association rules can also be used in other fields. In medical diagnosis, for instance, understanding which symptoms tend to co-occur can help to improve patient care and medicine prescription.

Understand Association Rules

Support

This says how popular an itemset is, measured by the proportion of transactions in which the itemset appears.

For example, if {apple} appears in 4 out of 8 transactions, the support of {apple} is 4/8, or 50%.
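To make the definition concrete, here is a minimal sketch of the support calculation in plain Python. The eight transactions are made up for illustration (chosen so that {apple} appears in 4 of them), and the `support` helper is a hypothetical name, not part of any library.

```python
# Hypothetical transactions, invented for illustration only.
transactions = [
    {"apple", "beer", "rice", "chicken"},
    {"apple", "beer", "rice"},
    {"apple", "beer"},
    {"apple", "pear"},
    {"milk", "beer", "rice", "chicken"},
    {"milk", "beer", "rice"},
    {"milk", "beer"},
    {"milk", "pear"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for basket in transactions if itemset <= basket)
    return hits / len(transactions)


print(support({"apple"}, transactions))          # 0.5   (4 out of 8)
print(support({"apple", "beer"}, transactions))  # 0.375 (3 out of 8)
```

The same helper works for itemsets of any size, since it only checks whether the itemset is a subset of each basket.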

Confidence

This says how likely item Y is to be purchased when item X is purchased, expressed as {X -> Y}. It is measured by the proportion of transactions containing item X in which item Y also appears.
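As a rough sketch, confidence can be computed as the support of X and Y together divided by the support of X alone. The transactions below are invented for illustration, and both helper functions are hypothetical names.

```python
# Hypothetical transactions, invented for illustration only.
transactions = [
    {"apple", "beer", "rice", "chicken"},
    {"apple", "beer", "rice"},
    {"apple", "beer"},
    {"apple", "pear"},
    {"milk", "beer", "rice", "chicken"},
    {"milk", "beer", "rice"},
    {"milk", "beer"},
    {"milk", "pear"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for basket in transactions if itemset <= basket) / len(transactions)


def confidence(x, y, transactions):
    """Confidence of the rule {x -> y}: support(x and y) / support(x)."""
    return support(set(x) | set(y), transactions) / support(x, transactions)


# 3 of the 4 apple baskets also contain beer.
print(confidence({"apple"}, {"beer"}, transactions))  # 0.75
```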

One drawback of the confidence measure is that it might misrepresent the importance of an association.

Take a rule {apple -> beer}: confidence only accounts for how popular apples are, but not beers. If beers are also very popular in general, there will be a higher chance that a transaction containing apples will also contain beers, thus inflating the confidence measure. To account for the base popularity of both constituent items, we use a third measure called lift.

Lift

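This says how likely item Y is to be purchased when item X is purchased, while controlling for how popular item Y is. A lift of 1 implies no association between the items; a lift greater than 1 means Y is more likely to be bought when X is bought, and a lift less than 1 means it is less likely.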
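Here is a small sketch of lift, computed as the support of X and Y together divided by the product of their individual supports. The data is made up for illustration and the helper names are hypothetical. In this toy data, beer is so popular that {apple -> beer} ends up with a lift of exactly 1 even though its confidence is 0.75.

```python
# Hypothetical transactions, invented for illustration only.
transactions = [
    {"apple", "beer", "rice", "chicken"},
    {"apple", "beer", "rice"},
    {"apple", "beer"},
    {"apple", "pear"},
    {"milk", "beer", "rice", "chicken"},
    {"milk", "beer", "rice"},
    {"milk", "beer"},
    {"milk", "pear"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for basket in transactions if itemset <= basket) / len(transactions)


def lift(x, y, transactions):
    """Lift of {x -> y}: support(x and y) / (support(x) * support(y))."""
    joint = support(set(x) | set(y), transactions)
    return joint / (support(x, transactions) * support(y, transactions))


print(lift({"apple"}, {"beer"}, transactions))    # 1.0 -> no association
print(lift({"rice"}, {"chicken"}, transactions))  # 2.0 -> bought together more than chance
```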

Python Implementation

from mlxtend.frequent_patterns import apriori
from mlxtend.frequent_patterns import association_rules

Get the data: all the items purchased, grouped by invoice or transaction history.

Prepare the Data
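One way to prepare the data, sketched in plain Python: group the purchased items by invoice, then turn each basket into a row of True/False flags, one per item. mlxtend's apriori expects this kind of one-hot encoded table as a pandas DataFrame. The `rows` data and variable names below are made up for illustration.

```python
from collections import defaultdict

# Hypothetical (invoice_id, item) rows, as they might come from a sales export.
rows = [
    (1, "apple"), (1, "beer"),
    (2, "apple"), (2, "pear"),
    (3, "beer"), (3, "rice"),
]

# Group items by invoice so that each invoice becomes one basket.
baskets = defaultdict(set)
for invoice_id, item in rows:
    baskets[invoice_id].add(item)

# One-hot encode: one True/False flag per known item, per invoice.
items = sorted({item for basket in baskets.values() for item in basket})
one_hot = {
    invoice_id: {item: item in basket for item in items}
    for invoice_id, basket in baskets.items()
}

print(items)       # ['apple', 'beer', 'pear', 'rice']
print(one_hot[1])  # {'apple': True, 'beer': True, 'pear': False, 'rice': False}
```

If pandas is available, a dictionary like `one_hot` can be turned into a DataFrame with one row per invoice before handing it to the library.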

Implement the Algorithm
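To show what happens under the hood, here is a minimal pure-Python sketch of the Apriori idea: keep only itemsets that meet a minimum support, and build larger candidates only from smaller itemsets that are already frequent. This is an illustration with made-up data and a hypothetical function name, not mlxtend's implementation.

```python
from itertools import combinations

# Hypothetical transactions, invented for illustration only.
transactions = [
    {"apple", "beer", "rice", "chicken"},
    {"apple", "beer", "rice"},
    {"apple", "beer"},
    {"apple", "pear"},
    {"milk", "beer", "rice", "chicken"},
    {"milk", "beer", "rice"},
    {"milk", "beer"},
    {"milk", "pear"},
]


def apriori_frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset meeting `min_support`."""
    n = len(transactions)

    def support(itemset):
        return sum(1 for basket in transactions if itemset <= basket) / n

    # Start with every single item as a candidate.
    current = {frozenset([item]) for basket in transactions for item in basket}
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates that meet the support threshold.
        level = {c: support(c) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-item subset of a candidate must be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


frequent = apriori_frequent_itemsets(transactions, min_support=0.375)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

With a threshold of 0.375 (3 out of 8 transactions), this toy data yields four frequent single items and three frequent pairs; no triple survives the prune step.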

Apriori will give association rules, each with a support, a confidence, and a lift value.

Each rule also gives you Item 1, the purchased item, and Item 2, which can be a single item or a group of items bought along with Item 1.
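As a rough sketch of what such rules look like, the snippet below builds single-item rules by hand and attaches the three measures to each. The data and the dictionary keys (`item_1`, `item_2`) are made up for illustration and do not correspond to any library's output format.

```python
from itertools import permutations

# Hypothetical transactions, invented for illustration only.
transactions = [
    {"apple", "beer", "rice", "chicken"},
    {"apple", "beer", "rice"},
    {"apple", "beer"},
    {"apple", "pear"},
    {"milk", "beer", "rice", "chicken"},
    {"milk", "beer", "rice"},
    {"milk", "beer"},
    {"milk", "pear"},
]


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(1 for basket in transactions if itemset <= basket) / len(transactions)


items = sorted({item for basket in transactions for item in basket})

rules = []
for x, y in permutations(items, 2):
    joint = support({x, y})
    if joint == 0:
        continue  # the two items never appear together
    rules.append({
        "item_1": x,  # the purchased item
        "item_2": y,  # the item bought along with it
        "support": joint,
        "confidence": joint / support({x}),
        "lift": joint / (support({x}) * support({y})),
    })

# Keep only rules where the items appear together more often than chance.
strong = sorted((r for r in rules if r["lift"] > 1),
                key=lambda r: r["lift"], reverse=True)
for r in strong:
    print(r["item_1"], "->", r["item_2"], r["support"], r["confidence"], r["lift"])
```

Sorting by lift puts the strongest associations first; in this toy data the top pairing is rice and chicken, with a lift of 2.0.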

Examples of a few association rules:

We can now convert all of them to a dataset and get the associations.

To get just the pairings:

rule.items_add

rule.items_base
