Natassha Selvaraj 24 Apr 2023 7 min read
Market basket analysis is a powerful data science application that improves user experience and encourages purchases, which adds direct business value to companies.
In the past, marketers would often rely on intuition when creating product combinations and building marketing strategies. Now that organizations can collect and store more data than ever before, they use findings from that data to target customers and increase sales, and they hire data scientists and analysts in their marketing teams to inform these decisions.
In this article, I will explain some of the theory behind market basket analysis and show you how to implement it in Python.
Market basket analysis is used by companies to identify items that are frequently purchased together. Notice, when you visit the grocery store, how baby formula and diapers are often sold in the same aisle. Similarly, bread, butter, and jam are placed near each other so that customers can easily purchase them together. The technique applies a set of statistical rules to transaction data to identify product combinations that occur frequently, uncovering associations that are hard to spot by eye.
Apart from market basket analysis, other popular applications of data science in marketing include churn prediction, sentiment analysis, customer segmentation, and recommendation systems.
Market basket analysis is frequently used by restaurants, retail stores, and online shopping platforms to encourage customers to make more purchases in a single visit. It is a use case of data science in marketing that increases company sales and drives business growth, and it is commonly implemented with the Apriori algorithm.
The Apriori algorithm is the most common technique for performing market basket analysis.
It is used for association rule mining, which is a rule-based process used to identify correlations between items purchased by users.
Let’s explore the process through an example of items most frequently bought together in a given store:
Most store customers have purchased popcorn, milk, and cereal together. Therefore, {popcorn, milk, cereal} is a frequent itemset, as it appears in a majority of purchases. So, if a person grabs popcorn and milk, they will also be recommended cereal.
According to the Apriori algorithm, every subset of a frequent itemset is also frequent. Since {popcorn, milk, cereal} is a frequent itemset, this means that {popcorn, milk}, {popcorn, cereal}, and {milk, cereal} are also frequent. Due to this, if a customer only goes for popcorn, they will be recommended both milk and cereal as well.
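The subset property above can be checked with a few lines of plain Python. This is only a minimal sketch: the `transactions` list and the `support` helper are hypothetical and invented for illustration, not data or code from a real store.

```python
from itertools import combinations

# Hypothetical transactions, invented purely for illustration.
transactions = [
    {"popcorn", "milk", "cereal"},
    {"popcorn", "milk", "cereal", "bread"},
    {"popcorn", "milk", "cereal"},
    {"milk", "bread"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


full = ("popcorn", "milk", "cereal")
full_support = support(full, transactions)

# Support of every non-empty proper subset of the full itemset.
subset_supports = {
    subset: support(subset, transactions)
    for size in (1, 2)
    for subset in combinations(full, size)
}

print("full itemset:", full_support)
for subset, value in subset_supports.items():
    # A subset can never be rarer than the itemset that contains it.
    print(subset, value, value >= full_support)
```

Any transaction that contains the full itemset necessarily contains each of its subsets, so a subset's share of transactions can only be equal or higher. Apriori relies on this to discard candidate itemsets early: if a smaller itemset is not frequent, no larger itemset built from it can be.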
The Apriori algorithm has three main components: support, confidence, and lift.
You can think of these as metrics that evaluate the relevance and popularity of each item combination.
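For readers who prefer code to definitions, here is a rough sketch of the three metrics usually used to score association rules: support, confidence, and lift. It uses their standard textbook definitions. The `baskets` data and the helper function names are assumptions made up for this example and are not the article's dataset.

```python
# Hypothetical baskets, invented for illustration (not the article's data).
baskets = [
    {"milk", "bread"},
    {"milk", "cereal"},
    {"milk", "bread", "cereal"},
    {"bread"},
]


def support(items, baskets):
    """Share of baskets that contain all of `items`."""
    items = set(items)
    return sum(items <= b for b in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Of the baskets containing `antecedent`, the share that also contain `consequent`."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)


def lift(antecedent, consequent, baskets):
    """Confidence of the rule relative to how common `consequent` is overall."""
    return confidence(antecedent, consequent, baskets) / support(consequent, baskets)


# Score the hypothetical rule {milk} -> {bread}.
rule_support = support({"milk", "bread"}, baskets)
rule_confidence = confidence({"milk"}, {"bread"}, baskets)
rule_lift = lift({"milk"}, {"bread"}, baskets)

print(rule_support, rule_confidence, rule_lift)
```

A lift above 1 suggests the two items appear together more often than they would if they were bought independently, while a lift below 1 suggests the opposite. In this made-up data, the lift for milk and bread is below 1, so the rule would not be a useful recommendation.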
Let’s illustrate. The baskets below contain items purchased by four customers at a grocery store:
Here is a tabular representation of this purchase data: