Customer segmentation is a hot topic for sales, business development and customer success professionals, whether in B2B or B2C.
In a segmentation exercise, customers are divided into subgroups based on similar preferences, characteristics or purchase behaviour.
A clear and realistic customer segmentation can help companies design differentiated products and services, as well as marketing campaigns and sales activities that target specific groups of customers more effectively and capture more of the market. When your customers find the offering or message more relevant to them, they are more likely to be interested in it, which means better outcomes and ROI for your business.
There are many quantitative methods that one can use for effective customer segmentation.
Using a clustering algorithm for customer segmentation and profiling
With more monitoring platforms and tracking services becoming readily available, and cloud infrastructure becoming more affordable, organisations are collecting larger volumes of data about their customers, especially in digital products and services and in online marketplaces. There is a lot companies can learn from these data, and using them to segment customers has naturally become a popular way to better understand and serve those customers.
Clustering is a form of unsupervised Machine Learning, and there are different algorithms to choose from. It is called “unsupervised” because the data doesn’t tell us which group each customer actually belongs to; that is something we want to estimate and generate as an output. K-means is one of the most common clustering methods.
Demographic, firmographic and psychographic data, together with purchase behaviour data, are typically used in the clustering to build customer profiles, though customers with the same demographic or firmographic characteristics may be allocated to different groups because of their purchase behaviour or other factors. Once you have the segments from the clustering algorithm, you should perform further analysis to understand what drives each cluster and what the business implications are (e.g. what strategies can be used to target each cluster). This will most likely require subject matter expertise, and it may take several iterations with different numbers of clusters until you find the one that makes the most sense to the business.
You can read about a simulated segmentation analysis for Starbucks using k-means here.
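As a rough illustration of the clustering workflow described above, here is a minimal sketch using scikit-learn. The customer features (annual spend, orders per year, days since last purchase) and the three synthetic groups are made up for the example, and the silhouette coefficient is just one possible way to compare different numbers of clusters; in practice the business interpretation should have the final say.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for real customer data (all values are made up).
# Columns: annual spend, orders per year, days since last purchase.
rng = np.random.default_rng(42)
group_centres = np.array([
    [200.0, 2.0, 300.0],    # occasional buyers
    [1500.0, 12.0, 30.0],   # regular buyers
    [6000.0, 40.0, 7.0],    # heavy buyers
])
customers = np.vstack([
    rng.normal(loc=centre, scale=0.1 * centre, size=(100, 3))
    for centre in group_centres
])

# K-means is distance-based, so put every feature on a comparable scale first.
scaled = StandardScaler().fit_transform(customers)

# Try several numbers of clusters and score each with the silhouette coefficient.
silhouette_by_k = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(scaled)
    silhouette_by_k[k] = silhouette_score(scaled, labels)

best_k = max(silhouette_by_k, key=silhouette_by_k.get)
segments = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(scaled)

# Profile each segment in the original units to support the business interpretation.
for segment in range(best_k):
    profile = customers[segments == segment].mean(axis=0)
    print(f"segment {segment}: n={np.sum(segments == segment)}, "
          f"spend={profile[0]:.0f}, orders={profile[1]:.1f}, recency={profile[2]:.0f}")
```

The per-segment averages printed at the end are the starting point for the profiling step: they show what distinguishes each group, which a subject matter expert can then translate into targeting strategies.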
Use a quantitative market research method (e.g. Conjoint) to derive customer preferences
My first job after completing my master’s degree was with a decision behaviour consultancy that specialises in a quantitative methodology called Conjoint Analysis. It is a survey-based method that is rooted in econometric modelling. The concept is simple: consumers (or whoever your target sampling audience is) complete a survey in which they choose the options they prefer in close-to-realistic decision-making scenarios. Then, regression techniques (e.g. Hierarchical Bayes) are used to derive their preferences (utilities for the different attributes tested) from their choices and the trade-offs they made. Conjoint is widely used in pricing, portfolio optimisation, product R&D and promotion campaign communication studies.
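To make the idea of deriving utilities from choices more concrete, here is a simplified sketch. It does not use Hierarchical Bayes; it swaps in a plain aggregate multinomial logit model, fitted by maximum likelihood with SciPy, which estimates one set of utilities for the whole sample rather than one per respondent. The attributes (price, premium brand, large size), the "true" utilities and the simulated choices are all hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Hypothetical study: 3,000 choice tasks with 3 product options per task. Each option
# is described by price (currency units), premium brand (0/1) and large size (0/1).
n_tasks, n_options = 3000, 3
price = rng.choice([3.0, 4.0, 5.0], size=(n_tasks, n_options))
premium = rng.integers(0, 2, size=(n_tasks, n_options)).astype(float)
large = rng.integers(0, 2, size=(n_tasks, n_options)).astype(float)
design = np.stack([price, premium, large], axis=2)  # shape: (tasks, options, attributes)

# Made-up "true" utilities, used only to simulate the respondents' choices.
true_utilities = np.array([-0.8, 1.0, 0.5])
noisy_value = design @ true_utilities + rng.gumbel(size=(n_tasks, n_options))
chosen = noisy_value.argmax(axis=1)
task_index = np.arange(n_tasks)


def log_choice_probabilities(utilities):
    """Log of the multinomial logit probability of each option in each task."""
    value = design @ utilities
    value = value - value.max(axis=1, keepdims=True)  # for numerical stability
    return value - np.log(np.exp(value).sum(axis=1, keepdims=True))


def negative_log_likelihood(utilities):
    """Average negative log-probability of the options that were actually chosen."""
    return -log_choice_probabilities(utilities)[task_index, chosen].mean()


def gradient(utilities):
    """Gradient of the objective: expected minus chosen attribute levels."""
    probabilities = np.exp(log_choice_probabilities(utilities))
    expected_attributes = (probabilities[:, :, None] * design).sum(axis=1)
    chosen_attributes = design[task_index, chosen]
    return -(chosen_attributes - expected_attributes).mean(axis=0)


result = minimize(negative_log_likelihood, x0=np.zeros(3), jac=gradient, method="BFGS")
estimated_utilities = result.x
print(dict(zip(["price", "premium_brand", "large_size"], estimated_utilities.round(2))))
```

The estimated utilities express the trade-offs respondents made: a negative value for price and positive values for the other attributes mean that respondents accept a higher price only in exchange for features they value.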
A popular application of the consumer preferences derived from conjoint analysis is market simulation. But you can also carry out market segmentation using Latent Class analysis (which, in essence, groups respondents by their preferences), where the result is a collection of homogeneous subgroups in which customers have similar preferences for a product or service. This helps companies address the heterogeneity in consumer preferences in the market. It can be used either to explore or to validate customer segments, which can then serve as the basis for differentiated product positioning or marketing strategies.
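The sketch below illustrates preference-based segmentation under a strong simplification. A full Latent Class model estimates segment membership and segment utilities jointly from the choice data; here a Gaussian mixture model from scikit-learn is used as a stand-in on respondent-level utilities that are assumed to be already estimated. The utilities, the two preference groups and the use of BIC to pick the number of segments are all assumptions made for the example.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)

# Hypothetical respondent-level utilities for price, premium brand and large size.
# Two made-up preference groups: price-sensitive and brand-driven respondents.
price_sensitive = rng.normal(loc=[-2.0, 0.3, 0.5], scale=0.3, size=(150, 3))
brand_driven = rng.normal(loc=[-0.5, 2.0, 0.4], scale=0.3, size=(150, 3))
utilities = np.vstack([price_sensitive, brand_driven])

# Fit mixtures with 1 to 4 segments and keep the one with the lowest BIC.
models = {
    k: GaussianMixture(n_components=k, covariance_type="diag",
                       n_init=5, random_state=0).fit(utilities)
    for k in range(1, 5)
}
bic_by_k = {k: model.bic(utilities) for k, model in models.items()}
best_k = min(bic_by_k, key=bic_by_k.get)
best_model = models[best_k]

# Soft membership: the probability that each respondent belongs to each segment.
membership = best_model.predict_proba(utilities)
segment = membership.argmax(axis=1)

# The average utilities per segment describe what each group values.
print(best_model.means_.round(2))
```

Unlike k-means, a mixture model returns membership probabilities rather than hard labels, so respondents who sit between two segments can be identified and treated with more caution.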
Use Machine Learning to predict customer behaviour categories
A more advanced, or “free-styled”, way to perform customer segmentation is to use a type of Machine Learning algorithm called a classification algorithm to predict and “allocate” customers into groups.
For instance, if your company sells B2B enterprise software, you might want to know which of your customers are likely to churn when their contracts are about to expire. Customers churn for many different reasons, and it would take a long time and a lot of effort to manually list all the possible reasons and work out the combination of “signals” you need to look for in the data.
Machine Learning algorithms can save you the trouble of manually defining who should belong to which group, even when there are different combinations of characteristics within the same group. What you will need is historical data with customer characteristics (such as their engagement behaviour and product usage data) and the outcome of their contract renewal (churn or renew). When you have a rich historical dataset to feed to the machine (we call this training data), the model can learn from it and extract patterns that capture the relationship between the renewal outcome and the other observed data points. The trained model can then be applied to a larger or more recent dataset to predict unknown (e.g. future) customer behaviour.
In this example of segmentation based on the likely renewal outcome, you are allocating your customer base into two groups: one that is likely to churn and one that is likely to renew. You can then focus most of your team’s effort on retaining those that are at risk of churning. In more complex use cases, you can also predict multiple groups instead of just two. For instance, you can segment customers into four groups (those that are likely to expand, shrink, maintain the same level, or churn) and then define a different intervention strategy for each group.
One thing you need to know about this method is that it is “supervised learning”, which means the data you use to train the model must contain the actual outcome label: the “truth” about which group each customer belonged to. If you don’t know the truth even in the historical data, you can only use “unsupervised learning”, which is the first method described above.
There are many model classes that can be used for classification, such as Logistic Regression, Random Forest, Support Vector Machine, and Neural Network. You can read more about model selection in this blog where we shared the Predictive Flowchart we created at TrueCue.
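As an illustration of the supervised approach described in this section, here is a minimal sketch that trains a Logistic Regression classifier with scikit-learn on historical renewal outcomes and then allocates held-out customers to two segments. The feature names (logins per week, support tickets, seat utilisation), the simulated churn labels and the 0.5 probability cut-off are all hypothetical choices made for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(7)
n_customers = 2000

# Hypothetical historical features (continuous stand-ins; the values are simulated).
logins_per_week = rng.normal(5.0, 2.0, n_customers)
support_tickets = rng.normal(3.0, 1.5, n_customers)
seat_utilisation = rng.normal(0.6, 0.2, n_customers)
features = np.column_stack([logins_per_week, support_tickets, seat_utilisation])

# Simulated ground truth: low usage and many tickets raise the chance of churn.
risk_score = (-1.0 * (logins_per_week - 5.0) / 2.0
              + 1.5 * (support_tickets - 3.0) / 1.5
              - 2.0 * (seat_utilisation - 0.6) / 0.2)
churned = (rng.random(n_customers) < 1.0 / (1.0 + np.exp(-risk_score))).astype(int)

# Hold out part of the history to check how well the model generalises.
X_train, X_test, y_train, y_test = train_test_split(
    features, churned, test_size=0.25, random_state=0, stratify=churned
)

model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)
holdout_accuracy = model.score(X_test, y_test)

# Segment customers by predicted churn probability (0.5 is an arbitrary cut-off).
churn_probability = model.predict_proba(X_test)[:, 1]
segment = np.where(churn_probability >= 0.5, "at risk of churn", "likely to renew")
print(f"hold-out accuracy: {holdout_accuracy:.2f}")
print(dict(zip(*np.unique(segment, return_counts=True))))
```

The same pipeline also works when the outcome label has more than two values, such as expand, shrink, maintain and churn; `predict_proba` then returns one probability column per group. The cut-off can be lowered if missing an at-risk customer is more costly than contacting one who would have renewed anyway.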
Customer segmentation provides a foundational understanding of the heterogeneity in your customer base. There might be other aspects of that heterogeneity that are not captured by the data, such as how customers use your product to solve problems in different use cases, and differences in price sensitivity and willingness to pay. Ultimately, the understanding of your customers needs to be translated into product, sales and marketing strategies, and adopted across the relevant teams, for the business to benefit from the new framework.
Bingqian believes in the power of Analytics and Data Science in uncovering insights and helping to better inform decision making. As a Senior Consultant and Data Science Lead at TrueCue, she enjoys finding solutions for challenges in data consolidation, modelling, visualisation and Advanced Analytics.
She leverages modern technology such as Alteryx, Tableau, DataRobot, and Microsoft Azure Machine Learning, and is one of the 17 Certified Alteryx Experts in the world. Outside of work, she enjoys a wide range of activities, from oil painting, poetry reading, scuba diving, to boxing and krav maga.