RFM Customer Segmentation on the Superstore Dataset

3 min read · Mar 16


Customer segmentation divides customers into groups based on similar characteristics or behaviors. This helps businesses tailor their marketing strategies, improve customer experience, and ultimately increase profitability. One of the most popular segmentation methods is RFM analysis, which stands for Recency, Frequency, and Monetary value. RFM analysis is a data-driven approach that uses these three factors to classify customers into segments based on their value to the business.

In this article, we will use RFM analysis to segment the California customers of a fictional superstore. California is the state that generates the most profit in the dataset. The aim is to understand the purchasing behavior of different customer groups and identify opportunities for targeted marketing campaigns.


We will be using the Superstore dataset, which contains sales data for a fictional superstore, including information about customers, products, sales, and profits.
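As a rough illustration of how the most profitable state might be identified and its rows selected, here is a minimal sketch in plain Python. The row layout `(state, customer_id, profit)` and all values are made up for the example and are not taken from the actual dataset.

```python
from collections import defaultdict

# Hypothetical rows: (state, customer_id, profit). Values are illustrative only.
rows = [
    ("California", "C1", 120.0),
    ("California", "C2", 80.5),
    ("Texas", "C3", -40.0),
    ("New York", "C4", 150.0),
]


def most_profitable_state(rows):
    """Return the state with the highest total profit."""
    profit_by_state = defaultdict(float)
    for state, _customer, profit in rows:
        profit_by_state[state] += profit
    return max(profit_by_state, key=profit_by_state.get)


top_state = most_profitable_state(rows)
# Keep only the rows for that state; the RFM analysis would run on this subset.
state_rows = [row for row in rows if row[0] == top_state]
```

With the sample rows above, `top_state` is "California" and `state_rows` holds its two rows.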

RFM Segmentation:

To start our analysis, we will first calculate the RFM scores for each customer.

  • Recency refers to how recently a customer made a purchase. We will calculate the number of days since each customer’s last purchase and rank customers accordingly.
  • Frequency refers to how often a customer makes a purchase. We will calculate the total number of purchases made by each customer.
  • Monetary refers to how much money a customer spends. We will calculate the total amount spent by each customer.
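The three metrics above can be sketched as follows. This is a minimal example under stated assumptions, not the code used for the article: the record layout `(customer_id, order_id, order_date, sales)`, the sample values, and the snapshot date are all hypothetical, and frequency is assumed to count distinct orders rather than line items.

```python
from collections import defaultdict
from datetime import date

# Hypothetical order records: (customer_id, order_id, order_date, sales).
orders = [
    ("C1", "O1", date(2017, 1, 5), 120.0),
    ("C1", "O2", date(2017, 12, 20), 80.0),
    ("C2", "O3", date(2017, 6, 1), 300.0),
    ("C2", "O3", date(2017, 6, 1), 50.0),  # second line item of the same order
    ("C3", "O4", date(2016, 3, 15), 20.0),
]


def compute_rfm(orders, snapshot_date):
    """Return {customer_id: (recency_days, frequency, monetary)}."""
    last_purchase = {}
    order_ids = defaultdict(set)
    spend = defaultdict(float)
    for customer, order_id, order_date, sales in orders:
        # Track the most recent purchase date per customer.
        if customer not in last_purchase or order_date > last_purchase[customer]:
            last_purchase[customer] = order_date
        order_ids[customer].add(order_id)  # distinct orders only
        spend[customer] += sales
    return {
        c: (
            (snapshot_date - last_purchase[c]).days,  # recency in days
            len(order_ids[c]),                        # frequency
            round(spend[c], 2),                       # monetary
        )
        for c in last_purchase
    }


# The snapshot date is an assumption; a common choice is the day after the last order.
rfm = compute_rfm(orders, snapshot_date=date(2017, 12, 31))
```

Here a lower recency value means a more recent purchase, so recency is ranked in the opposite direction from frequency and monetary value.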

Once we have the RFM scores for each customer, we can group them into segments based on their scores.
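One way to turn raw metrics into scores and segments is sketched below. The 1 to 5 rank-based scoring and the thresholds in `assign_segment` are illustrative assumptions; the article does not state the exact rules used, so only the segment names come from the text.

```python
def quantile_scores(values, bins=5, higher_is_better=True):
    """Score each value from 1 (worst) to `bins` (best) by rank position."""
    # Sort indices from worst to best; ties are broken by input order.
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=not higher_is_better)
    scores = [0] * len(values)
    for rank, i in enumerate(order):
        scores[i] = rank * bins // len(values) + 1
    return scores


def assign_segment(r, f, m):
    """Map R, F, M scores (1-5) to a segment name using hypothetical rules."""
    fm = (f + m) / 2  # combine frequency and monetary into one value score
    if fm >= 4:
        if r >= 4:
            return "Champion"
        return "Loyal" if r == 3 else "Cannot lose them"
    if r >= 4:
        return "Recent Customer" if fm <= 2 else "Potential loyalist"
    if r == 3:
        return "Average"
    return "About to Sleep" if r == 2 else "Lost Customer"


# Hypothetical raw metrics: customer -> (recency_days, frequency, monetary).
raw = {
    "A": (5, 10, 1000.0),
    "B": (400, 1, 50.0),
    "C": (30, 6, 600.0),
    "D": (200, 8, 900.0),
    "E": (90, 3, 200.0),
}

ids = list(raw)
# Fewer days since the last purchase is better, so recency is scored in reverse.
r_scores = quantile_scores([raw[c][0] for c in ids], higher_is_better=False)
f_scores = quantile_scores([raw[c][1] for c in ids])
m_scores = quantile_scores([raw[c][2] for c in ids])

segments = {
    c: assign_segment(r, f, m)
    for c, r, f, m in zip(ids, r_scores, f_scores, m_scores)
}
```

In this toy data, customer A (recent, frequent, high spend) lands in "Champion", while D (high spend but long inactive) lands in "Cannot lose them".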

After segmenting our customers, we can analyze the purchasing behavior of each segment and tailor our marketing strategies accordingly.

California is the state with the most profit in the Superstore dataset. Therefore, it’s crucial to understand how customers are segmented in this state.

The RFM analysis was conducted on the Superstore dataset, and the results are presented below.

The majority of customers fall into the “About to Sleep” and “Recent Customer” segments, which contain 111 and 74 customers, respectively. The “Champion” and “Loyal” segments contain 15 and 12 customers, respectively, indicating that only a small percentage of customers are loyal to the Superstore in California.

When it comes to frequency, the “About to Sleep” segment again ranks highest, with 111, followed by the “Recent Customer” segment, with 74. The “Champion” and “Lost Customer” segments come in at 37 and 33, respectively, indicating that only a small percentage of customers are making frequent purchases.

In terms of recency, the “About to Sleep” and “Lost Customer” segments have the highest figures, at 22,507 and 8,680, respectively. Given the segment sizes reported above, these appear to be recency totals in days rather than customer counts. Either way, they indicate that a large number of customers in California have not made a purchase recently.
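Per-segment figures like those above could be produced by aggregating the scored customers by segment. The following is a minimal sketch; the tuple layout `(segment, recency_days, frequency)` and the sample values are hypothetical and do not reproduce the article's numbers.

```python
from collections import defaultdict

# Hypothetical scored customers: (segment, recency_days, frequency).
customers = [
    ("About to Sleep", 210, 2),
    ("About to Sleep", 190, 1),
    ("Champion", 5, 9),
    ("Lost Customer", 400, 1),
]


def summarize(customers):
    """Return per-segment customer count, total recency, and total frequency."""
    summary = defaultdict(
        lambda: {"customers": 0, "total_recency": 0, "total_frequency": 0}
    )
    for segment, recency, frequency in customers:
        row = summary[segment]
        row["customers"] += 1
        row["total_recency"] += recency
        row["total_frequency"] += frequency
    return dict(summary)


summary = summarize(customers)
```

Keeping the customer count separate from the recency and frequency totals makes it easier to tell which kind of figure a chart is reporting.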

We should focus on turning “Average” customers into “Potential loyalist” and “Loyal” customers, because these segments generate the most profit. Building a good relationship with them and providing personalized offers may help increase their loyalty and retention.

We can also reward the “Champion” customers with exclusive discounts or loyalty programs, as they provide the highest monetary value to the business.

It’s also essential to focus on the “Cannot lose them” segment, as these customers have spent a lot of money but haven’t come back. By offering relevant promotions and personalized offers, we may win them back and increase their engagement and retention.

Overall, by focusing on these specific segments and providing personalized and relevant offers, the business can increase customer loyalty, retention, and profitability.

If you are interested, the modeling code is here: