Cluster analysis: Insights into target groups, markets & products

Appinio Research · 06.12.2023 · 14min read


Gaining deep insights into target groups and markets, strengthening customer loyalty, driving product development, and evaluating the risks of a product launch: all of this becomes attainable with one tool, cluster analysis. This method uncovers patterns and correlations within large datasets.

In the following article, we delve into the essence of cluster analysis, tracing its origins, highlighting its merits for both market research and companies, delineating the prerequisites for a successful analysis, and uncovering the untapped potential that cluster analysis holds for optimizing your marketing strategies.

What is a cluster analysis?

Cluster analysis is a versatile method in market research.

This statistical method identifies patterns and groups within data, where elements are bound together by shared characteristics or properties.

These homogeneous groups, called 'clusters', contain similar data points or objects.

Through this segmentation, companies can acquire specific insights into their customers, products, or markets, subsequently translating these revelations into strategic marketing initiatives.

When does cluster analysis add value?

Cluster analysis emerges as a valuable asset for companies seeking profound market and target group insights. 

Its utility becomes most pronounced when companies grapple with a substantial volume of customers or products, and categorizing them into distinct groups is a must. 

This segmentation facilitates a more focused approach to diverse customer segments, fostering the development of personalized marketing strategies and target group approaches. Ultimately, cluster analysis serves as a catalyst for enhancing competitiveness, enabling companies to deploy resources more efficiently and make informed decisions in marketing and product development.

A brief history of cluster analysis in market research

The inception of cluster analysis dates back to the 1930s. However, it was during the 1950s and 1960s that diverse approaches to cluster analysis took shape, capturing the imagination of both market research and marketing spheres. 

Recognizing its potential, companies embraced this analytical tool to segment customer data and pinpoint target groups, paving the way for tailored approaches and optimized marketing strategies.

The advent of computers in the 1980s marked a turning point, rendering cluster analysis more accessible and efficient. 

Today, it enjoys unprecedented popularity. 

Technological strides, including big data and advanced statistical software like SPSS, coupled with real-time, app-based data analysis, have elevated cluster analysis to an indispensable status for business success. 

It serves as a vital element for understanding market segments, identifying customer needs, and formulating competitive advantages.

Applications of cluster analysis

Cluster analysis finds its foothold not only in market research but extends its reach to diverse fields, showcasing its versatility in classifying customers or data into homogeneous target groups. 

This method proves invaluable for identifying patterns and correlations, paving the way for the development of personalized marketing strategies.

Beyond market research, cluster analysis finds application in various sectors:

  • The social sciences leverage cluster analysis to segment population groups and unveil behavioral patterns.
  • In healthcare, it aids in crafting personalized treatment plans for patient groups.
  • The finance domain benefits from portfolio optimization and risk minimization.
  • In biology, the tool is instrumental in unraveling genetic patterns and family trees, contributing to the exploration of specific causes of diseases.
  • In the realm of mobility planning and logistics, it facilitates the examination of traffic flows, enabling the planning of more efficient routes.

The advantages and disadvantages of cluster analysis

Advantages

Cluster analysis offers companies a range of opportunities:

  • Recognizing patterns and structures: In large volumes of data, cluster analysis reveals hidden patterns and structures.
  • Target group identification and segmentation: It facilitates the precise identification and segmentation of target groups, laying the groundwork for targeted marketing strategies.
  • Personalizing marketing strategies: Cluster analysis lets companies tailor marketing strategies to each segment, which makes marketing more efficient.
  • Optimizing products and services: Companies can refine their products and services by leveraging the findings of cluster analysis, ensuring alignment with customer needs.
  • Informed decision-making: Cluster analysis provides a sound basis for corporate strategy and marketing plans.

Disadvantages

However, like any method, cluster analysis has its drawbacks:

  • Subjectivity in cluster selection: The selection of clusters and determining their number can be subjective, introducing an element of interpretation.
  • Resource-intensive for large datasets: Dealing with large datasets can be computationally intensive and resource-consuming, potentially slowing down the analysis process.
  • Impact of outliers: Individual data points acting as outliers may adversely affect clustering accuracy.
  • Assumption risks: Analyses are susceptible to inaccuracies if built on incorrect assumptions regarding data classification.
  • Risk of over-clustering: There is a risk of creating an excessive number of clusters, leading to groupings that no longer reflect the underlying data.

Navigating these considerations judiciously allows companies to harness the full potential of cluster analysis while being mindful of its limitations.

Clusters: a robust foundation for further analyses

The outcomes of cluster analysis serve a dual purpose. Firstly, they provide an excellent launchpad for targeted marketing initiatives. Secondly, these clusters form a solid groundwork for further investigations through regression analysis, factor analysis, or TURF analysis.

  • Regression analysis: This method explores relationships between individual variables, shedding light on how effective marketing activities are. It can also reveal connections between distinct segments.
  • Factor analysis: Aimed at simplifying complex datasets, factor analysis reduces many variables to the most important underlying factors. This can reveal additional similarities between objects within a cluster.
  • TURF analysis: TURF (Total Unduplicated Reach and Frequency) analysis uses the available data to determine which product and marketing mix yields the greatest reach among customers.

By leveraging clusters as a robust database, companies can not only fine-tune their marketing strategies but also delve deeper into the intricate dynamics and relationships within their customer segments.
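To make the TURF idea concrete, here is a minimal sketch of an unduplicated-reach calculation. The survey data, the respondent IDs, and the function name `best_reach` are hypothetical and chosen only for illustration; a real TURF study would typically also consider frequency and use dedicated software.

```python
from itertools import combinations


def best_reach(preferences, bundle_size):
    """Return the product bundle that reaches the most distinct respondents.

    A respondent counts as reached if they like at least one product in the
    bundle; each respondent is counted only once (unduplicated reach).
    """
    products = sorted({p for liked in preferences.values() for p in liked})
    best_bundle, best_count = None, -1
    for bundle in combinations(products, bundle_size):
        reached = sum(1 for liked in preferences.values() if liked & set(bundle))
        if reached > best_count:
            best_bundle, best_count = bundle, reached
    return best_bundle, best_count / len(preferences)


# Hypothetical survey: which flavours each respondent would buy.
preferences = {
    "r1": {"vanilla", "chocolate"},
    "r2": {"vanilla"},
    "r3": {"chocolate", "mango"},
    "r4": {"mango"},
    "r5": {"mango", "vanilla"},
    "r6": {"lemon"},
}
bundle, reach = best_reach(preferences, 2)
print(bundle, round(reach, 2))
```

The sketch tries every bundle of the requested size, which is only practical for a small number of products.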

Essential prerequisites for effective cluster analysis

The efficacy of a cluster analysis hinges on the foundation of robust data. 

For this purpose, it is imperative to normalize or scale the data, ensuring comparability and facilitating the classification into clusters. 

The clusters themselves necessitate distinctly recognizable characteristics or variables. Equally pivotal is the selection of an appropriate algorithm and analysis software, such as SPSS, coupled with the critical task of determining the optimal number of clusters.

Key requirements include:

  • Normalized or scaled data: A prerequisite for meaningful comparisons and cluster classification.
  • Clearly defined characteristics or variables: Essential for the identification and differentiation of clusters.
  • Appropriate algorithm: The right algorithm is key in ensuring the accuracy and relevance of the analysis.
  • Suitable analysis software: Leveraging advanced tools like SPSS enhances the efficiency and accuracy of the cluster analysis.
  • Optimal cluster number determination: Striking the right balance in determining the number of clusters is crucial for precision.

Above all, a clear comprehension of the analysis objectives is foundational. This understanding is essential for interpreting clusters meaningfully and extracting strategic insights that can steer informed decision-making.
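As an illustration of the scaling requirement, the following sketch standardizes two variables to z-scores so that neither dominates a distance calculation just because of its unit. The survey values and the function name `standardize` are assumptions made for this example; z-scoring is one common option, not the only one.

```python
from statistics import mean, pstdev


def standardize(column):
    """Return z-scores: (value - mean) / standard deviation."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant column carries no information for clustering.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


# Hypothetical survey data: age in years, yearly spend in euros.
ages = [22, 25, 31, 40, 52]
spend = [300, 1200, 800, 2500, 4000]

z_ages = standardize(ages)
z_spend = standardize(spend)
print([round(z, 2) for z in z_ages])
print([round(z, 2) for z in z_spend])
```

After standardization both variables have mean 0 and standard deviation 1, so a difference of one unit means the same thing for age as for spend.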

Methodologies in Cluster Analysis

Cluster analysis employs various methods, depending on the objectives and the data. Two approaches are the most common: hierarchical cluster analysis and k-means.

  • Hierarchical cluster analysis: This method constructs a tree structure encompassing all data points, ranging from individual data points to larger clusters. Clusters can form at different hierarchical levels, and the tree can be built in two directions: agglomerative (bottom-up) and divisive (top-down). It provides a comprehensive perspective, offering insights into hierarchical relationships among data points.
  • K-means: An iterative technique that assigns data points to a number of clusters, k, that is fixed before the analysis. The objective is to group similar data points within the same cluster, so that each cluster exhibits similar characteristics. It facilitates the identification of patterns and trends by grouping data points based on their similarities.

These methods offer distinct advantages based on the nature of the data and the analytical objectives. Selecting the most appropriate methodology is pivotal to the success and relevance of the cluster analysis.
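The agglomerative (bottom-up) direction can be sketched in a few lines. This is a minimal illustration that assumes single linkage (distance between the closest members of two clusters) and Euclidean distance; the function name `agglomerative` and the sample points are hypothetical, and statistical packages offer several other linkage rules.

```python
from math import dist


def agglomerative(points, n_clusters):
    """Bottom-up clustering with single linkage.

    Starts with every point in its own cluster and repeatedly merges the
    two clusters whose closest members are nearest to each other.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None  # (distance, index_a, index_b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist(p, q) for p in clusters[a] for q in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]
    return clusters


points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (5.0, 5.0), (5.1, 4.8), (9.0, 1.0)]
result = agglomerative(points, 3)
for cluster in result:
    print(sorted(cluster))
```

Stopping the merging at different cluster counts corresponds to cutting the tree at different hierarchical levels.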

Example of different group distributions as a result of a cluster analysis

Real-world application of cluster analysis

Consider a scenario where a company aims to connect with younger target groups, seeking a profound understanding of their needs to tailor individualized marketing initiatives.

To achieve this, the company conducts a survey gathering demographic data, including age, gender, place of residence, interests, and more. 

The collected data undergoes analysis through cluster analysis to categorize customers into distinct groups.

  • Data collection: Demographic information such as age, gender, location, and interests is gathered through a comprehensive survey.
  • Cluster analysis: The collected data is analyzed to identify patterns and commonalities. This results in the segmentation of customers into distinct groups based on shared characteristics.
  • Strategic insights: Armed with the segmented customer groups, the company gains valuable insights into the unique needs and preferences of younger target demographics.
  • Targeted marketing measures: With a nuanced understanding of each customer cluster, the company can craft targeted marketing strategies tailored to the specific characteristics and preferences of each group.
  • Enhanced engagement: By aligning marketing measures with the identified customer clusters, the company maximizes its outreach and engagement with the younger target groups.

In this example, cluster analysis serves as a powerful tool, enabling the company to not only comprehend the diverse needs of younger demographics but also to strategically tailor marketing initiatives, fostering a more personalized and impactful connection with their target audience.

Cluster analysis in nine steps

Cluster analysis stands out as an invaluable tool for extracting coherent patterns and groups from vast datasets, offering profound insights for refining marketing strategies and targeting specific audience segments. 

Here is an overview of the typical nine-step process in cluster analysis:

  • Data preparation: Before diving into the analysis, ensure the data, whether customer information or product characteristics, is carefully collected, complete, and standardized.
  • Variable selection: Identify the relevant variables and characteristics for the analysis. These may include demographic data, purchasing behavior, or product features.
  • Data normalization: Normalize the data so that characteristics measured in different units or value ranges contribute on a comparable scale.
  • Method selection: Choose the appropriate cluster analysis method based on the specific objectives and data characteristics.
  • Number of clusters determination: Decide on the optimal number of clusters to divide the data. This can be achieved through visual inspection or statistical methods like the elbow criterion.
  • Implementation of cluster analysis: Assign each data point to a specific cluster using statistical software such as SPSS.
  • Results interpretation: Analyze the formed clusters to identify distinctive features and differences between groups. Derive marketing strategies or product enhancements from these key characteristics.
  • Validation and implementation of results: Critically review and validate results using internal or external validation methods. Implement targeted marketing strategies or adjust products and services to cater to the unique needs of individual clusters.
  • Monitoring and adaptation: Continuously monitor the effectiveness of implemented measures and adapt strategies as needed. Employ cluster analysis as an ongoing tool to identify market changes and evolving customer behaviors, ensuring flexibility in strategy adjustments.
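The elbow criterion mentioned in the steps above can be sketched as follows. This is a minimal, one-dimensional illustration: the helper `kmeans_1d`, its deterministic choice of starting centers, and the sample values are all assumptions made for this example rather than part of any particular software package.

```python
def kmeans_1d(values, k, iterations=50):
    """Tiny 1-D k-means; returns the within-cluster sum of squares (WCSS)."""
    data = sorted(values)
    # Deterministic start: spread the initial centers evenly over the sorted data.
    centers = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda j: (x - centers[j]) ** 2)
            groups[nearest].append(x)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return sum(min((x - c) ** 2 for c in centers) for x in data)


# Hypothetical data with three visible groups.
values = [1.0, 1.2, 0.9, 5.0, 5.3, 4.8, 9.1, 9.0, 8.7]
wcss = {k: kmeans_1d(values, k) for k in range(1, 6)}
for k, score in wcss.items():
    print(k, round(score, 2))
```

The "elbow" is the value of k after which adding another cluster barely reduces the WCSS; with this sample data the large drops stop at k = 3.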

Why companies should use cluster analysis

Cluster analysis emerges as a great tool, empowering market researchers and companies to distill intricate data into lucid and interpretable patterns. 

By categorizing customers or products into clusters, this method facilitates the precise identification of target groups, paving the way for the development of tailor-made marketing strategies. This strategic approach secures you a competitive edge.

Here's why companies should leverage cluster analysis:

  • Precision in target group identification: Cluster analysis allows for the precise identification of target groups, enabling companies to understand and cater to the unique needs of diverse customer segments.
  • Tailor-made marketing strategies: By grouping customers or products into clusters, companies can develop tailor-made marketing strategies that resonate with the specific characteristics and preferences of each cluster.
  • Enhanced competitive advantages: The insights derived from cluster analysis contribute to the formulation of strategies that enhance competitive advantages. This is especially crucial in navigating today's highly competitive business landscape.
  • Improved market understanding: Companies gain a nuanced understanding of market diversity, empowering them to navigate and respond effectively to the dynamic landscapes of various markets.
  • Data-driven decision-making: Harnessing cluster analysis enables companies to make data-driven decisions, ensuring strategic choices are informed and optimized for success.

In a business world marked by intense competition, the adoption of cluster analysis emerges as an invaluable asset, offering a pathway to better comprehend markets, meet customer needs, and make informed strategic decisions.


Cluster Analysis

Quick definition: Cluster analysis is a form of exploratory data analysis in which observations are divided into groups that share common characteristics. Those groups are compared and contrasted with other groups to derive information about the observations.

Key takeaways:

  • Cluster analysis allows organizations to better understand their customers by identifying individuals with similar traits, which can inform how the organization communicates with those customers.
  • There are five main clustering approaches. The most common are K-means clustering and hierarchical, or hierarchy, clustering. The clustering approach an organization takes depends on what is being analyzed and why.
  • To ensure accurate cluster analysis, choose helpful variables (behavior, geography, demographics, etc.) to evaluate the observations, cluster the observations into the right number of groups, and create clusters with high intra-cluster similarity and low inter-cluster similarity.

The following questions were answered in an interview with John Bates, the director of product management for Predictive Marketing Solutions and Analytics Premium for Adobe Marketing Cloud.

  • What is cluster analysis?
  • What is the purpose of clustering?
  • What are the different types of clustering?
  • What are the characteristics of a good cluster analysis?
  • How do you perform cluster analysis?
  • What do you do with the results of a cluster analysis?
  • Why is cluster analysis important for business strategy?
  • How do you make sure your cluster analysis is accurate?
  • How often do organizations update clusters?

What is a cluster analysis?

Cluster analysis is a type of unsupervised classification, meaning it doesn’t have any predefined classes, definitions, or expectations up front. It’s a statistical data mining technique used to group observations that are similar to each other but different from the observations in other groups.

An individual sorting out the chocolates from a sampler box is a good metaphor for understanding clustering. The person may have preferences for certain types of chocolate.

When they sift through their box, there are lots of ways they can group that chocolate. They can group it by milk chocolate vs. dark chocolate, nuts vs. no nuts, fruit filling, nougat, etc.

The process of separating pieces of candy into piles of similar candy based on those characteristics is clustering. We do it all the time.

What is the purpose of clustering?

The general purpose of cluster analysis in marketing is to construct groups or clusters while ensuring that the observations are as similar as possible within a group.

Ultimately, the purpose depends on the application. In marketing, clustering helps marketers discover distinct groups of customers in their customer base. They then use this knowledge to develop targeted marketing campaigns.

For example, clustering may help an insurance company identify groups of motor insurance policyholders with a high average claim cost.

The purpose behind clustering depends on how a company intends to use it, which is largely informed by the industry, the business unit, and what the company is trying to accomplish.

What are the different types of clustering?

There are five different major clustering approaches:

  • Partitioning algorithms
  • Hierarchy algorithms
  • Density-based algorithms
  • Grid-based algorithms
  • Model-based algorithms

The most common clustering approaches are partitioning and hierarchy algorithms.

The main difference between the two is that partitioning algorithms look to create various partitions and then evaluate them by some criterion, while hierarchy-based algorithms decompose, or split information, based on a criterion.

K-means clustering is probably the most common partitioning algorithm. It’s generally used when the number of classes is fixed in advance. An analyst tells the algorithm how many clusters they want to divide the observations into.

Then each cluster is represented by the center of the cluster, or the mean. It's an efficient option, but it does have some weaknesses. It’s only applicable when the mean is defined and the number of clusters is determined in advance.

It also doesn't deal well with outliers, so if there are observations that are very different from the rest, K-means isn’t the best option.

Another type of algorithm is called expectation maximization (EM). EM is a type of partitioning algorithm, but it's model-based. It works similarly to K-means.

However, instead of assigning each observation to exactly one cluster so as to maximize the difference in means across the variables, EM clustering computes the probability of cluster membership, or the likelihood that a single observation falls into a particular cluster.

It uses probability distributions to calculate that number.

The great thing about EM is that cluster membership is not mutually exclusive. A customer can have a probability of being associated with multiple clusters.

They will typically get assigned to the one with the highest probability, but they may also have a lot of characteristics or traits with another cluster.
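The probability-of-membership idea can be illustrated with the following sketch. It shows only the membership calculation for one observation (the "expectation" step), assumes each cluster is described by a normal distribution with known weight, mean, and standard deviation, and uses hypothetical numbers; a full EM run would also re-estimate those parameters repeatedly.

```python
from math import exp, pi, sqrt


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    return exp(-((x - mean) ** 2) / (2 * std ** 2)) / (std * sqrt(2 * pi))


def membership_probabilities(x, components):
    """Probability that observation x belongs to each cluster.

    components is a list of (weight, mean, std) tuples, one per cluster.
    """
    weighted = [w * gaussian_pdf(x, m, s) for w, m, s in components]
    total = sum(weighted)
    return [v / total for v in weighted]


# Hypothetical clusters of monthly spend: (weight, mean, std).
components = [(0.6, 50.0, 10.0), (0.4, 90.0, 15.0)]

probs = membership_probabilities(68.0, components)
assigned = max(range(len(probs)), key=probs.__getitem__)
print([round(p, 3) for p in probs], "assigned to cluster", assigned)
```

A customer spending 68 is assigned to the first cluster, yet keeps a substantial probability of belonging to the second, which is exactly the overlap described above.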

The purpose of hierarchical clustering is to create a hierarchy of groups. This can either be done with an agglomerative process, which starts with each observation in its own cluster and then pairs up similar observations in multiple levels, or a divisive process.

This starts with all the observations in a single cluster and then breaks them into different groups.

A hierarchy cluster is like a data visualization tree. You can see how people start together and then divide out based on different criteria. Hierarchical clustering is great for the end user to be able to see those relationships.

What are the characteristics of a good cluster analysis?

A good clustering method will produce high-quality clusters, which means there is high similarity between observations in a single cluster and low similarity between observations in different clusters.

The quality of the clustering result depends on both the similarity measure used by the method and its implementation. The quality is also measured by the method’s ability to discover some or all hidden patterns that may exist within the data.

A lot of this is evaluated using what’s called a “distance.” Clustering algorithms use a distance measure or metric to determine how to separate observations in the different groups.

The most common one is called Euclidean distance, which is the straight-line distance between two points, for example between an observation and a cluster center or between the centers of two clusters, but there are many options.

A distance measure often shows how close an observation is to the cluster's mean, or average value, and identifies the cluster's shape.
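As a small illustration, Euclidean distance is available in Python's standard library as `math.dist`. The helper `nearest_center` and the coordinates below are hypothetical and only show how a distance measure decides which cluster an observation is closest to.

```python
from math import dist


def nearest_center(observation, centers):
    """Return the index of the closest cluster center and its distance."""
    distances = [dist(observation, c) for c in centers]
    index = min(range(len(centers)), key=distances.__getitem__)
    return index, distances[index]


centers = [(0.0, 0.0), (3.0, 4.0)]
index, d = nearest_center((3.0, 3.0), centers)
between_centers = dist(centers[0], centers[1])
print(index, d, between_centers)
```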

What are the disadvantages of cluster analysis, and how can companies avoid problems?

Cluster analysis in marketing is an exploratory technique. It's not about making predictions.

In the case of expectation maximization, given the algorithm, it might look at the probability distribution of the data and the probability of assignment to a cluster. That said, it's not making any predictions regarding what those people are likely to do next.

All EM is really doing is helping make sense of data across lots of different variables for a given observation. Companies can only look at a couple of data sets simultaneously and see patterns.

These models are helpful for evaluating lots of data to identify those patterns and then group people who are similar to one another across those traits.

The advantages are that it helps in exploration. It helps inform strategy—how a company might think about their marketing campaigns or make business decisions—but it’s not the end.

Cluster analysis also looks only at known customers. When a new customer begins to interact with a business and the business does not have all the necessary data yet, the customer is an unknown quantity.

They haven't been authenticated, so the company has very little information about them (for instance, where the customer lives). A cluster analysis is a static snapshot: assignments reflect the time of the analysis and only the data that was put into it.

It’s important to regularly re-evaluate clustering and re-apply analysis. If new data comes in, it should be incorporated into the analysis. It’s important never to get too fixated on individual cluster assignments.

Allow clusters to be fluid. And remember to evaluate how customers may move between clusters based on certain interactions they have with the business.

How do you perform cluster analysis?

The first step of cluster analysis is usually to choose the analysis method, which will depend on the size of the data and the types of variables.

Hierarchical clustering, for example, is appropriate for small data sets, while K-means clustering is more appropriate for moderately large data sets and when the number of clusters is known in advance.

Large data sets, or data sets with a mixture of different types of variables, generally require a two-step procedure.

After you decide on what method of analysis to use, start the process by choosing the number of cases to subdivide into homogeneous groups or clusters. Those cases, or observations, can be any subject, person, or thing you want to analyze.

Next, choose the variables to include. There could be 1,000 variables, or even 10,000 or 25,000. The number and types of variables chosen will determine what type of algorithm should be used.

Then decide whether to standardize those variables in some way, so that every variable contributes equally to the distance or similarity between the cases. However, the analysis can be run with both standardized and unstandardized variables.

Each analysis method has a different approach. For K-means clustering, select the number of clusters, then the algorithm iteratively estimates the cluster means and assigns each case to the cluster for which its distance to the cluster mean is the smallest.

For hierarchical clustering, choose a statistic that quantifies how far apart or similar two cases are.

Next, select a method for forming the groups. Finally, determine how many clusters are needed to represent the data by looking at how similar the clusters are at each merge or split.
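The K-means loop described above (estimate the cluster means, assign each case to the nearest mean, repeat) can be sketched like this. The function name `kmeans`, the choice of starting centers, and the sample points are assumptions for illustration; production tools usually pick starting centers more carefully and run several restarts.

```python
from math import dist


def kmeans(points, initial_centers, max_iterations=100):
    """Alternate assignment and mean-update steps until nothing changes."""
    centers = [tuple(c) for c in initial_centers]
    labels = []
    for _ in range(max_iterations):
        # Assignment step: each case goes to the nearest cluster mean.
        labels = [min(range(len(centers)), key=lambda j: dist(p, centers[j]))
                  for p in points]
        # Update step: recompute each mean from its assigned cases.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centers.append(tuple(sum(axis) / len(members)
                                         for axis in zip(*members)))
            else:
                new_centers.append(old)  # keep the center of an empty cluster
        if new_centers == centers:
            break  # converged: the means no longer move
        centers = new_centers
    return labels, centers


points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centers = kmeans(points, initial_centers=[points[0], points[3]])
print(labels)
print(centers)
```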

What do you do with the results of a cluster analysis?

Depending on the clustering method, there's usually an associated visualization, which is a very common way to investigate the results. In the case of K-means, it’s common to use a scatter plot on X and Y axes that shows how far apart the groups of observations are.

By using that type of visualization, those groupings become very clear. In the case of hierarchical clustering, a visualization called a dendrogram is used, which shows the tree of splits and where it is cut into clusters.

Why is cluster analysis important for business strategy?

Cluster analysis can benefit a company in multiple ways, including how they market their products.

It can affect whom they market those products to, what retention and sales strategies might be employed, and how they might evaluate prospective customers.

They can cluster current customers and determine their lifetime value relative to their propensity for attrition, and that can inform how they communicate with different customers and how to identify new high-value customers.

How do you make sure your cluster analysis is accurate?

When looking at the accuracy of a cluster, there are three important factors: cluster tendency, number of clusters, and clustering quality.

Before evaluating cluster performance, make sure the data set you’re working with has clustering tendency, which means that it doesn’t contain uniformly distributed points.

For example, it doesn’t benefit the analysis to choose a variable like “species,” because every observation will be the same. There are statistical methods for assessing clustering tendency.

Number of clusters is a required parameter for K-means clustering, but it’s useful for evaluating accuracy in other methods as well. By identifying how many clusters a team intends to work with, they can group observations in the best way to derive helpful insights.

Too few clusters means putting together observations that aren’t similar enough to take action, while too many clusters will divide your observations up too much to be useful.

Clustering quality looks at the level of similarity within a cluster and among separate clusters.

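There are multiple metrics for assessing clustering quality, including the adjusted Rand index, the Fowlkes-Mallows score, mutual information-based scores, and homogeneity and completeness scores. These metrics compare the clusters against known reference labels, so they apply when such labels are available.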
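As one example of such a metric, here is a minimal sketch of the adjusted Rand index, which scores the agreement between a clustering and a set of reference labels (1.0 means identical groupings, values near 0 mean agreement no better than chance). The label lists are hypothetical, and the function assumes at least two cases.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Chance-corrected agreement between two labelings of the same cases."""
    n = len(labels_true)
    cells = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_true * sum_pred / comb(n, 2)
    maximum = (sum_true + sum_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings are one big cluster
    return (sum_cells - expected) / (maximum - expected)


reference = ["a", "a", "a", "b", "b", "b"]
same_groups = [1, 1, 1, 0, 0, 0]  # same partition, different cluster names
one_mistake = [1, 1, 0, 0, 0, 0]  # one case placed in the wrong group

print(adjusted_rand_index(reference, same_groups))
print(round(adjusted_rand_index(reference, one_mistake), 3))
```

The index ignores how clusters are named and looks only at which cases end up together, so relabeled but identical groupings still score 1.0.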

How often do organizations update clusters?

It often depends on the use case. A high-tech retailer like Best Buy might use clusters at the highest level to align the entire enterprise on personas.

Every employee, from those in the call centers to the individuals in the stores themselves, can look at every customer and classify them into the cluster or persona they most align with.

The company won’t change those clusters very often because they inform a higher-level strategy across the entire business.

But then, within certain departments, you might have micro clusters. Given one of those higher-level clusters, companies may want to cluster individuals more often because they are moving through different life cycle stages of the sales process.

Once customers have been clustered, the assignments become stale over time, so companies might re-cluster those individuals at intervals that depend on how long the sales cycle is.


Cluster Analysis for Marketers: The Ultimate Guide


When you want to analyze your marketing data, it is simply not realistic to look at each customer separately. True, it is beneficial to collect and store rich data for each customer; however, it is impossible to organize and communicate analyses that look at thousands or millions of individual customer records at the same time. Making decisions at a strategic level would be impractical.

Our brains simply cannot process information at such a granular level. At the same time, we know that we don't want to oversimplify it down to a one-size-fits-all approach. There has to be a middle ground where the customer’s voice is adequately heard, even if some segmentation of the user base is required.

In fact, there is a way to elegantly approach the challenge of segmenting customers. It is called cluster analysis, and it is one of the most accessible and explainable ways to apply machine learning on marketing data.

Why cluster analysis?

Let's take a step back before diving into this technique. It’s important to understand how cluster analysis differs from other approaches. If the goal is to segment customers, why can't you do this segmentation manually?

Well, you can. In fact, if you work with web analysis tools like Google Analytics, you are probably used to manually defining traffic and user segments of interest in order to keep the analysis focused on the right places.

This approach is very common, and for good reason, but it has its limitations.

While it can be effective when working with a small number of user dimensions, it does not scale easily to a high number of user attributes. Luckily, when the human brain reaches its limit, advanced analytics and machine learning can provide solutions.

Prepare your data first

Cluster analysis is a fascinating technique and one of the top advanced analytics methods used in marketing.

❗To work effectively with clustering, your organization first needs to prepare its data carefully.

You'll want to make sure your basic digital marketing reporting needs are well taken care of. Having a solid automated data and reporting pipeline in place will free up resources, reduce human errors, and, most importantly, improve data quality.

The quantity and diversity of data also play a key role, because most advanced marketing analytics techniques, such as clustering, perform significantly better with larger volumes of granular data collected from a variety of sources.

The way you handle, process, and utilize your data affects your company's position on the analytics maturity curve. The higher you climb the curve, the more advanced your analysis methods are and the more insights you get from raw data.

Improvado's analytics maturity model

Improvado can help with all of these aspects of your preparation before you dive into advanced marketing analytics, from automating your marketing reports to collecting and storing granular level data.

Use Cases for Marketing

Customer clustering is one of the best-known applications of cluster analysis. It helps marketers group similar customers together. Once you become familiar with the technique, there is no shortage of other marketing-related fields where you can meaningfully apply it.

Customer use case

You can cluster customers based on the many types of characteristics available about them and their behavior. For example, clustering can be based on:

  • Customer browsing activity
  • Customer demographics
  • Recency, frequency, and monetary value of a customer
  • Items bought by a customer
  • Offline customer behavior
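As a rough sketch of how one of these inputs could be prepared, the snippet below derives recency, frequency, and monetary value per customer from a transaction log. The `transactions` list, the customer IDs, and the `as_of` reference date are all invented for illustration; a real pipeline would read them from your own data store.

```python
from datetime import date

# Hypothetical transaction log: (customer_id, purchase_date, amount).
transactions = [
    ("c1", date(2023, 1, 5), 40.0),
    ("c1", date(2023, 3, 20), 60.0),
    ("c2", date(2022, 11, 2), 250.0),
    ("c3", date(2023, 3, 28), 15.0),
    ("c3", date(2023, 2, 14), 20.0),
    ("c3", date(2023, 1, 30), 25.0),
]
as_of = date(2023, 4, 1)  # assumed reference date for measuring recency

# Map each customer to (days since last purchase, purchase count, total spend).
rfm = {}
for customer_id, day, amount in transactions:
    recency, frequency, monetary = rfm.get(customer_id, (None, 0, 0.0))
    days_ago = (as_of - day).days
    recency = days_ago if recency is None else min(recency, days_ago)
    rfm[customer_id] = (recency, frequency + 1, monetary + amount)

for customer_id, features in sorted(rfm.items()):
    print(customer_id, features)
```

Each customer ends up as one row of three numbers, which is the granular, numeric shape a clustering algorithm expects.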

Product use case

Another interesting use case is product clustering, which can be based on attributes of products such as:

  • When the product was purchased
  • Who purchased the product
  • In which store the product was purchased 

SEO use case

Likewise, you can apply cluster analysis to SEO keywords if you have data about:

  • Keyword rankings
  • Difficulty score
  • Authority score   

By the way, we've built an SEO dashboard template that can help you better track your content marketing metrics, including sessions, visits, bounce rate, and more.

How Clustering works

The basic concept

Now that you have seen how useful clustering is in a marketing context, it's time to gain some intuition on how it works. Incidentally, if you have been wondering how a machine learning technique can work in practice for marketing, this will give you a great sense. In fact, clustering is among the most widely used unsupervised machine learning techniques.

Why unsupervised? Because there isn't any ground truth that we want the machine to learn or predict. Instead, we want the data itself to reveal the natural structures within it. Sound confusing? It's not. To make the concept clearer, let's look at a simple example.

A simple example

Imagine you are in charge of a T-shirt company that wants to customize the fit of T-shirts for its customers. You have sample data regarding the height and weight of your customers. Plotted in two dimensions, each customer becomes a single point on a chart of height against weight.

What the clustering algorithm does is label each customer—represented by a point in the graph—according to the optimal cluster that it can be matched to. The key is to make clusters as homogeneous as possible. 

Key definitions

How are the clusters determined? The idea is to form clusters in a way that maximizes the similarity between the points of each group. “How is similarity defined?” you might ask. It's expressed as the distance between each possible pair of points: the smaller the distance, the more similar the two points.

How can you measure that distance? This is where the Pythagorean theorem comes in (you might have heard of it in geometry class). If you have the x and y values of two points (in our example, the weight and height measurements of two customers), you can calculate the straight-line distance between them. This simple calculation, based on a classic theorem, is the foundation of the clustering algorithm.
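Here is a minimal sketch of that distance calculation in Python. The two customers and their measurements are made up for illustration.

```python
import math

# Hypothetical customers as (weight_kg, height_cm); values are made up.
customer_a = (70.0, 175.0)
customer_b = (73.0, 179.0)

# Pythagorean theorem: distance = sqrt(dx^2 + dy^2)
dx = customer_a[0] - customer_b[0]
dy = customer_a[1] - customer_b[1]
distance = math.sqrt(dx ** 2 + dy ** 2)

# math.dist applies the same formula to any number of attributes.
assert math.isclose(distance, math.dist(customer_a, customer_b))
print(distance)  # 5.0
```

The same formula extends to any number of attributes by adding one squared difference per attribute.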

A marketing example

Hopefully by now, the information in this article has helped you to start connecting the dots.

Next step: forget about heights and weights and think about some more realistic scenarios. While cluster analysis might seem easy and intuitive with two variables, this is no longer the case when you start adding customer attributes. Once you move beyond three attributes, it's no longer possible to visualize the data directly.

Instead of measurements like height and weight, you now have variables such as customer income, age, purchase value, and so on. You can calculate the distances in the same way as in the simple example above until you find the optimal clusters.

K-means algorithm

This last step, however, cannot happen in one go. It has to happen iteratively by following one of the several clustering algorithms available. The most common one is called k-means, which, as we'll see, comes with some favorable properties.
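To make the iteration concrete, here is a minimal from-scratch sketch of the procedure usually described for k-means (often called Lloyd's algorithm): assign every point to its nearest center, move each center to the mean of its points, and repeat until nothing changes. The function name `k_means`, the `customers` data, and the fixed random seed are assumptions made for illustration; production work would normally rely on an established library rather than this sketch.

```python
import math
import random

def k_means(points, k, max_iter=100, seed=0):
    """Basic k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Start from k data points chosen at random as initial centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing: converged
        labels = new_labels
        # Update step: move each centroid to the mean of its points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

# Hypothetical (age, purchase value) pairs forming two obvious groups.
customers = [(25, 40), (27, 45), (24, 38), (58, 210), (61, 190), (60, 205)]
labels, centroids = k_means(customers, k=2)
print(labels)
print(centroids)
```

Because the starting centroids are picked at random, a different seed can number the clusters differently or, on less clear-cut data, produce a different grouping.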

Once the algorithm determines the optimal clusters, the ball is back in your court.

The marketer's role

As a marketer, you need to use your domain knowledge, intuition, and experience to give descriptive names to the clusters produced by the algorithm and, of course, ensure that the outcomes make sense from a practical and business standpoint. You might want, for example, to experiment with adding or removing one or more of the initial attributes and then rerun the algorithm to check if it produces more meaningful clusters.

Applying the clustering technique

  • To prepare for clustering, you'll need to have granular level data for each customer, each product, etc. This technique simply doesn't work with aggregate data.
  • Ideally, if your data lives in different places you’ll want to collect them and store them in a data warehouse such as BigQuery, Redshift, or Snowflake for easy access. Remember that Improvado is here to help you with this.
  • Before applying the technique, you'll need to make sure that the data is numeric or converted into a numeric form so that the mathematical distances can be calculated.
  • You might also need to normalize the data of the various attributes if they are expressed in different scales. One way to do this is to rescale the values of each attribute so that they range between zero and one while keeping their relative order and spacing.
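The zero-to-one rescaling mentioned in the last point is commonly called min-max normalization. A minimal sketch, assuming hypothetical income and age columns (the helper name `min_max_scale` is also just illustrative):

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: no spread to preserve
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical attributes on very different scales.
incomes = [30_000, 45_000, 60_000, 90_000]
ages = [20, 35, 50, 60]

print(min_max_scale(incomes))  # [0.0, 0.25, 0.5, 1.0]
print(min_max_scale(ages))     # [0.0, 0.375, 0.75, 1.0]
```

Without this step, an attribute measured in tens of thousands (income) would dominate the distance calculation over one measured in tens (age).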

Once the data is ready from a preprocessing point of view, there are a few options as to how to apply the algorithm:

  • If you have data scientists on your team, they can use open source tools such as the programming languages R or Python for such tasks.
  • SaaS and other analytics tools like Tableau have integrated functionality to allow users to perform clustering in a drag and drop fashion.
  • With the right add-on packages, it is also possible to carry out clustering in Excel.
  • These days, another very convenient way to do this is via BigQuery, especially if you are familiar with SQL syntax. Implementation of clustering can be accomplished within a few lines of SQL code with the option to immediately visualize results.

Cluster analysis in practice

The image below shows what the outcome of a cluster analysis might look like in practice. This particular example is from Tableau, which provides a built-in function for clustering. A large number of products have been grouped into three distinct clusters, based on their sales value and profit ratio.

The clustering algorithm could have included many more variables. But even with just these two, the result of the analysis can be really informative. For instance, if you are in charge of marketing and product strategy, you now have a data-driven way to prioritize the products based on which “performance” cluster they belong to. Notice also the presence of some outliers that might require your special attention!

Clustering, despite its merits, is not the perfect solution for all segmentation use cases. Here are some pros and cons of clustering to keep in mind:

Pros

  • It is a very interpretable technique and is easy to visualize.
  • It is efficient to implement and can easily scale to large data sets with millions of records.
  • It is dynamic. The definitions of clusters evolve as data changes.
  • It can be used as a data exploration technique to better understand data before making decisions.

Cons

  • The result is not always deterministic: with algorithms such as k-means, which start from randomly chosen centers, different executions might return different clusters.
  • With k-means clustering, the marketer must predefine the number of clusters, which is not always an easy, straightforward decision.
  • Some data preprocessing needs to be done before applying the technique, as discussed in the section on applying the clustering technique.

Great, now that all the steps have been followed and some interesting clusters have been produced, what’s next?

Well, there are many options depending on the exact use case.

For clustering of customers and prospects, you can use the clusters to:

  • Customize your re-targeting and re-marketing strategies
  • Better adjust promotional and other types of marketing messages
  • Customize the product for the various personas to better fit their needs
  • Personalize the website design and UI

When clustering is used on the product level, it is possible to better capture cross- and up-selling opportunities between the different product clusters.

Clustering is a natural fit for marketing. It reveals the natural structure in marketing data. It is a great tool for data exploration, and it is relatively easy to explain and visualize. It is also one of the most accessible machine learning techniques for marketing. It is very effective in clustering customers, products, keywords, ad groups, you name it!

Cluster Analysis: What it is & How to Use It

Bucket research data into groups to make statistical inferences with cluster analysis. Learn how to use the method with its different types.

Brands and organizations rely on data to draw inferences and conclusions about what is on their customers' minds. Cluster analysis is a critical component of data analysis in market research that helps brands derive trends and identify groups among various customer demographics, purchase behaviors, likes and dislikes, and more.

In the market research process, this analysis method buckets information into smaller groups that help you understand how different groups of individuals behave under similar circumstances. Organizations and researchers can define clusters differently depending on pre-defined criteria for what makes a sensible cluster, but the underlying data analysis theme is similar.

Content Index

  • What is cluster analysis?
  • Hierarchical clustering
  • Centroid-based clustering
  • Distribution-based clustering
  • Density-based clustering
  • Cluster analysis examples
  • Cluster Analysis with QuestionPro

What is cluster analysis?

Cluster analysis is a statistical method in research that allows researchers to group a set of objects into small, distinct clusters whose members share characteristics that set them apart from other clusters. As a form of exploratory data analysis, it helps brands, organizations, and researchers derive insights from visualized data, spot trends, and validate hypotheses and explicit assumptions.

This method is a common statistical data analysis technique used in varied fields, including pattern recognition, machine learning, insights management in market research, data scrubbing, bioinformatics, and more.

The objective of cluster analysis is to find groups of objects that resemble one another in their underlying characteristics while behaving differently from the objects in other groups. An excellent example of this research method is banks using qualitative and quantitative data to plot trends in claims processing among clients. Using cluster analysis helps them flag potentially fraudulent claims and better understand consumer behavior.

Cluster Analysis Methods 

Cluster analysis helps researchers and statisticians make better sense of data and make better decisions. While the data can come from qualitative research or quantitative research, the analysis is still conducted in a research platform where the data is plotted on a graph. Various cluster analysis methods are used to suit different research needs.

It is essential to note that the clustering method needs to be chosen experimentally unless there is mathematical reasoning to prefer a specific one. Let us look at the most commonly used cluster analysis methods.

Hierarchical clustering or connectivity-based clustering analysis

Hierarchical clustering, also called connectivity-based clustering, is one of the most commonly used methods in cluster analysis. In the bottom-up (agglomerative) approach, data points that are similar to one another are grouped to form a cluster.

These clusters are then merged with other clusters that show similar properties to form larger clusters. The central premise of this method is that objects closer together are more related than objects farther apart.

The other approach in hierarchical clustering is the divisive method, where you start with the full data set and divide it into progressively smaller clusters of similar data. In both approaches, a linkage criterion defines how the distance between two clusters is measured. It is important to note that this method does not produce a single partitioning of the data; it produces a hierarchy of clusters from which a partition can be chosen.
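The bottom-up approach can be sketched in a few lines of Python. This is an illustrative, unoptimized version that assumes single linkage (the distance between two clusters is the distance between their two closest members), which is only one of several possible linkage criteria. The function names and sample points are hypothetical.

```python
import math

def single_linkage(cluster_a, cluster_b):
    """Linkage criterion: distance between the two closest members."""
    return min(math.dist(p, q) for p in cluster_a for q in cluster_b)

def agglomerative(points, n_clusters):
    """Bottom-up clustering: start with singletons, merge the closest pair."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        i, j = min(
            ((a, b) for a in range(len(clusters)) for b in range(a + 1, len(clusters))),
            key=lambda pair: single_linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]  # merge cluster j into cluster i
        del clusters[j]
    return clusters

points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
merged = agglomerative(points, n_clusters=2)
print(merged)
```

Stopping at a different `n_clusters` cuts the same hierarchy at a different level, which is why the method yields a family of partitions, not just one.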

Centroid-based clustering

In this clustering method, each cluster is represented by a single central vector, or centroid. With the k-means algorithm, k centroids are placed in the data space and each data point is assigned to its nearest centroid, so that the distance between the points and their cluster's centroid is minimized.

A drawback of this cluster analysis technique is that the number of clusters, k, has to be defined at the outset, which limits data analysis and representation.

Distribution-based clustering

The distribution-based clustering method groups objects that most likely belong to the same statistical distribution. It is the clustering approach most closely tied to statistics: each cluster is modeled as a distribution, and the data are treated as a random sample drawn from it.

This model works best when there is a need to capture correlation between attributes. Its drawback is that objects are grouped according to predefined distribution assumptions, so the clustering can be biased when the data do not actually follow the assumed distribution.

Density-based clustering

The density-based clustering method is the fourth commonly used cluster analysis technique. Here, clusters are defined as areas of higher density than the rest of the data set. The objects in the sparse areas that separate the clusters are treated as noise and border points.

DBSCAN is the most commonly used density-based clustering method. However, a drawback of this method is that it needs a drop in density to detect the border between two clusters, which does not always occur in real data.
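As an illustration, here is a compact, unoptimized sketch of the DBSCAN idea. The parameter names `eps` (neighborhood radius) and `min_pts` (minimum neighbors for a dense region) follow common usage, and the sample points are invented; it is meant to show the logic, not to replace a library implementation.

```python
import math

NOISE = -1

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)  # None = not visited yet
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: reachable but not core
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
    return labels

points = [(1, 1), (1.2, 1.1), (0.9, 1.0), (1.1, 0.8),
          (5, 5), (5.1, 5.2), (4.9, 5.1), (5.2, 4.9),
          (9, 1)]
labels = dbscan(points, eps=0.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The isolated point at (9, 1) sits in a sparse area and is labeled as noise, while the number of clusters is discovered from the data instead of being fixed in advance.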

Cluster analysis examples

Cluster analysis is beneficial and widely used across industries, functions, and research fields. To better illustrate its usefulness in research, let us look at the two examples below.

Cluster analysis in retail marketing

Brands traditionally use cluster analysis to make sense of purchase behavior research and trends by applying demographic segmentation to their customer base. A few factors usually considered are geographical location, sex, age, annual family income, etc.

These parameters highlight how different consumer groups make different purchase decisions; hence, retail giants use this data to decide how to market to each audience. This also helps in maximizing the ROI on spending while reducing customer churn.

Cluster analysis in sports sciences

Another everyday use case of cluster analysis is in the field of sports. Data scientists, researchers, doctors, team management, scouts, etc., look at how similar players fare in different scenarios and how effective they are in their sport. Players are bucketed by body type, age, position, and similar criteria to check their effectiveness.

Cluster Analysis with QuestionPro 

Looking at the correct data and analyzing it is highly beneficial for researchers and brands. Using a mature research platform like QuestionPro allows you to collect research data and helps you run advanced analysis within the tool to give you the insights that matter. 

Leveraging QuestionPro, it is possible to understand your customers and other research objects better and quickly make decisions that matter. Leverage the power of the enterprise-grade research suite today!

Cluster Analysis Guide with Examples

Explore the power of cluster analysis with our comprehensive guide. Learn the definition, types, and examples of this statistical method to gain insights into complex relationships in your data.

Table of contents

  • What is Cluster Analysis?
  • Types of Cluster Analysis with Examples
  • Benefits of cluster analysis
  • Drawbacks of using cluster analysis
  • How to use Cluster Analysis?

As a widely used statistical method, cluster analysis helps to identify groups of similar objects within a dataset, making it a valuable tool in fields such as market research, biology, and psychology. In this guide, we cover the definition of cluster analysis, explore its different types, and provide practical examples of its applications. By the end of this guide, you will have a thorough understanding of cluster analysis and its benefits, enabling you to make informed decisions when it comes to analyzing your own data.

What is a Cluster Analysis?

Cluster analysis is a statistical method used to group items into clusters based on how closely associated they are. It is an exploratory analysis that identifies structures within data sets and tries to identify homogenous groups of cases. Cluster analysis can handle binary, nominal, ordinal, and scale data, and it is often used in conjunction with other analyses such as discriminant analysis. The purpose of cluster analysis is to find similar groups of subjects based on a global measure over the whole set of characteristics.

Cluster analysis has many real-world applications, such as in unsupervised machine learning, data mining, statistics, Graph Analytics, image processing, and numerous physical and social science applications. In marketing, cluster analysis is used to segment customers into groups based on their purchasing behavior or preferences. In healthcare, it is used to identify patient subgroups with similar characteristics or treatment outcomes. In investor trading, cluster analysis is used to develop a diversified portfolio by grouping stocks that exhibit high correlations in returns into one basket, those slightly less correlated in another, and so on.

Types of Cluster Analysis with Examples

  • Hierarchical clustering
  • K-means clustering
  • DBSCAN (Density-Based Spatial Clustering of Applications with Noise)
  • Fuzzy C-means clustering
  • Mean shift clustering
  • Affinity propagation
  • Spectral clustering

Benefits of cluster analysis

  • Identify Hidden Patterns and Trends: One of the primary benefits of cluster analysis is its ability to reveal hidden patterns and trends within datasets. By grouping similar data points together, cluster analysis can help users uncover relationships and structures that may not be immediately apparent. This can lead to valuable insights, driving innovation and improving decision-making processes.
  • Enhance Decision-Making: Cluster analysis enables organizations to make more informed decisions by providing a clear understanding of the relationships and patterns within their data. By identifying clusters, decision-makers can better target resources, develop tailored marketing strategies, and optimize product offerings to meet the needs of different customer segments or market niches.
  • Improve Data Organization and Visualization: Cluster analysis can help simplify complex datasets by organizing data points into meaningful groups. This organization makes it easier to visualize and analyze large amounts of data, enabling users to quickly identify trends, outliers, and potential areas of interest. Additionally, clustering can be used to create more effective data visualizations, such as heatmaps or dendrograms, which can enhance communication and understanding of data-driven insights.
  • Enhance Customer Segmentation: By applying cluster analysis to customer data, businesses can segment their customer base into distinct groups based on various attributes, such as demographics, purchasing behavior, and product preferences. This segmentation enables companies to tailor their marketing strategies and product offerings to better meet the needs of specific customer segments, ultimately leading to increased customer satisfaction and loyalty.
  • Streamline Anomaly Detection: Cluster analysis can be used to identify outliers or anomalies in datasets, which can be crucial for detecting fraud, network intrusions, or equipment failures. By grouping data points based on their similarities, cluster analysis can effectively separate normal data from anomalous events, allowing organizations to quickly identify and address potential issues.
  • Optimize Resource Allocation: In industries such as logistics, manufacturing, or urban planning, cluster analysis can help optimize resource allocation by identifying patterns in spatial or temporal data. For instance, by clustering delivery addresses or manufacturing facilities based on their geographic proximity, organizations can reduce transportation costs and improve overall efficiency.
  • Facilitate Machine Learning and Predictive Analytics: Cluster analysis plays a critical role in machine learning and predictive analytics by serving as a preprocessing step for other techniques. For instance, clustering can be used to reduce the dimensionality of data before applying classification or regression algorithms, improving the performance and accuracy of predictive models. Additionally, cluster analysis can help identify subgroups within datasets, which can be used to develop more targeted machine learning models or generate more nuanced predictions.

Drawbacks of using cluster analysis

  • Choice of Distance Metric and Clustering Algorithm: The effectiveness of cluster analysis depends on the choice of distance metric and clustering algorithm. Different distance metrics, such as Euclidean, Manhattan, or cosine similarity, can produce varying results. Choosing the most appropriate metric for your dataset is crucial, as an unsuitable metric may lead to poor clustering results or misinterpretation of the data.
  • Sensitivity to Initial Conditions and Outliers: Some clustering algorithms, such as K-means, are sensitive to initial conditions, meaning that different initializations can lead to different clustering results. This sensitivity can result in inconsistent outcomes, making it challenging to determine the optimal solution. Outliers can also significantly impact the performance of clustering algorithms. In some cases, the presence of outliers may cause clusters to become skewed or distorted, leading to inaccurate or misleading results. Robust algorithms that can handle outliers, such as DBSCAN, may be more suitable for such situations.
  • Determining the Optimal Number of Clusters: Deciding on the optimal number of clusters is often a challenging task. In some algorithms, such as K-means, the number of clusters must be predefined, which can be problematic if the true number of clusters is unknown. Users must rely on heuristics or validation measures, such as the silhouette score or elbow method, to estimate the best number of clusters. These methods, however, may not always provide a definitive answer and are subject to interpretation.
  • Scalability and Computational Complexity: Cluster analysis can become computationally expensive and time-consuming, particularly for large datasets. Some algorithms, such as hierarchical clustering, have high computational complexity, making them unsuitable for handling large amounts of data. In such cases, users may need to consider more efficient algorithms or implement techniques such as dimensionality reduction or data sampling to improve performance.
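To see why the choice of distance metric matters, the sketch below compares the three metrics named in the list above on a small made-up example. The three "buyers" and their purchase counts are hypothetical; the point is only that the same customers can look close under one metric and far apart under another.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_distance(a, b):
    # 1 - cosine similarity; ignores vector length, compares direction only.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Hypothetical purchase counts in two product categories.
light_buyer = (1, 2)
heavy_buyer = (10, 20)   # same mix as light_buyer, ten times the volume
other_buyer = (2, 1)     # similar volume to light_buyer, different mix

for name, metric in [("euclidean", euclidean),
                     ("manhattan", manhattan),
                     ("cosine", cosine_distance)]:
    print(name, metric(light_buyer, heavy_buyer), metric(light_buyer, other_buyer))
```

With Euclidean or Manhattan distance, the light buyer is closest to the other low-volume buyer; with cosine distance, the light buyer is closest to the heavy buyer who purchases the same mix. Which behavior is right depends on whether volume or mix should define a segment.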

How to use Cluster Analysis

Analyzing clustering data is a crucial step in uncovering hidden patterns and structures within your dataset. By following a systematic approach, you can effectively identify meaningful groups and gain valuable insights from your data.

  • Preparing Your Data – Before diving into cluster analysis, it’s essential to prepare your data by cleaning and preprocessing it. This process may involve removing outliers, handling missing values, and scaling or normalizing features. Proper data preparation ensures that your clustering analysis produces accurate and meaningful results.
  • Choosing the Right Clustering Algorithm – There are various clustering algorithms available, each with its strengths and weaknesses. Consider the sample size, distribution, and shape of your dataset when selecting the most appropriate algorithm. Remember that no single algorithm is universally applicable, so it’s essential to choose the one that best suits your specific data characteristics.
  • Determining the Optimal Number of Clusters – Several methods can help guide this decision:
      • Elbow Method: Plot the within-cluster sum of squares (or the variance explained) against the number of clusters. Look for the “elbow” point, where adding more clusters results in only marginal improvements.
      • Silhouette Score: Calculate the silhouette score for different numbers of clusters and choose the one with the highest score.
      • Gap Statistic: Compare the within-cluster dispersion to a reference distribution and choose the number of clusters where the gap is the largest.
  • Applying the Clustering Algorithm – Once you have chosen the appropriate algorithm and determined the optimal number of clusters, apply the algorithm to your dataset. Most programming languages and data analysis tools, such as Python, R, or Excel, offer built-in functions or libraries for performing cluster analysis. Be sure to fine-tune any algorithm-specific parameters to ensure the best results.
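The two simplest quantities behind the steps above, the within-cluster sum of squares used by the elbow method and the silhouette score, can be sketched as follows. To keep the example short, the two candidate labelings are written by hand for a toy dataset instead of being produced by a clustering algorithm; the function and variable names are illustrative assumptions.

```python
import math

def wcss(points, labels):
    """Within-cluster sum of squared distances to each cluster's mean."""
    total = 0.0
    for lab in set(labels):
        members = [p for p, cl in zip(points, labels) if cl == lab]
        center = tuple(sum(col) / len(members) for col in zip(*members))
        total += sum(math.dist(p, center) ** 2 for p in members)
    return total

def silhouette_score(points, labels):
    """Mean silhouette over all points; needs at least two clusters."""
    scores = []
    for i, p in enumerate(points):
        # Mean distance from p to the members of each cluster (excluding p).
        mean_dist = {}
        for lab in set(labels):
            others = [q for j, q in enumerate(points) if labels[j] == lab and j != i]
            if others:
                mean_dist[lab] = sum(math.dist(p, q) for q in others) / len(others)
        if labels[i] not in mean_dist:
            scores.append(0.0)  # singleton cluster: silhouette defined as 0
            continue
        a = mean_dist[labels[i]]  # cohesion: own cluster
        b = min(d for lab, d in mean_dist.items() if lab != labels[i])  # nearest other cluster
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
two_clusters = [0, 0, 0, 1, 1, 1]     # hand-written candidate with k = 2
three_clusters = [0, 0, 1, 2, 2, 2]   # hand-written candidate with k = 3

for name, labels in [("k=2", two_clusters), ("k=3", three_clusters)]:
    print(name, round(wcss(points, labels), 3), round(silhouette_score(points, labels), 3))
```

The within-cluster sum of squares keeps shrinking as clusters are added, which is why the elbow method looks for the point of diminishing returns instead of the minimum. The silhouette score, by contrast, drops when a natural group is split, so here it favors two clusters.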

In conclusion, cluster analysis is a powerful data mining technique that uncovers hidden patterns and structures within large datasets by grouping similar data points together. This guide has explored the fundamental concepts and techniques of cluster analysis, providing a strong foundation for leveraging this valuable tool in research and organizations.

One essential takeaway is the significance of understanding various clustering algorithms, each with its unique strengths and weaknesses. Selecting the most suitable algorithm for your dataset, such as K-means, hierarchical clustering, DBSCAN, or spectral clustering, is critical for obtaining accurate and meaningful results. Additionally, determining the optimal number of clusters, preparing data, and evaluating clustering results are crucial steps in the process.

FAQ on Cluster Analysis

What is cluster analysis, and why is it important?

Cluster analysis is a data mining technique that groups similar data points together based on their attributes, uncovering hidden patterns and structures within large datasets. It is important because it enables researchers and organizations to gain valuable insights, make informed decisions, and drive innovation.

How do I choose the right clustering algorithm for my data?

Selecting the right clustering algorithm depends on factors such as dataset size, distribution, and shape. Some popular algorithms include K-means, hierarchical clustering, DBSCAN, and spectral clustering. It's essential to understand the strengths and weaknesses of each algorithm and choose the one that best suits your data's unique characteristics.

How can I determine the optimal number of clusters for my dataset?

Several methods can help guide your decision, such as the Elbow Method, Silhouette Score, and Gap Statistic. Each method aims to identify the number of clusters that maximizes within-cluster cohesion and between-cluster separation, leading to meaningful and interpretable results.

How do I analyze cluster data?

Evaluating clustering results can be done through visual inspection, by plotting data points and color-coding them based on cluster assignments, or using metrics such as Silhouette Score and Adjusted Rand Index (ARI) to measure clustering performance. Visualizations like scatter plots, heatmaps, or dendrograms can also provide insights into the relationships between data points and overall data structure.
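As an illustration of the Adjusted Rand Index mentioned above, here is a minimal sketch using the standard pair-counting formula. It compares a clustering against a reference labeling, so it only applies when such a reference exists; the `reference`, `run_1`, and `run_2` labelings are hypothetical.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """ARI between two labelings of the same items (1.0 = identical grouping)."""
    n = len(labels_a)
    pair_counts = Counter(zip(labels_a, labels_b))  # contingency table cells
    a_counts = Counter(labels_a)
    b_counts = Counter(labels_b)
    index = sum(comb(c, 2) for c in pair_counts.values())
    sum_a = sum(comb(c, 2) for c in a_counts.values())
    sum_b = sum(comb(c, 2) for c in b_counts.values())
    expected = sum_a * sum_b / comb(n, 2)  # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings form a single cluster
    return (index - expected) / (max_index - expected)

reference = [0, 0, 0, 1, 1, 1]  # hypothetical known segments
run_1 = [1, 1, 1, 0, 0, 0]      # same grouping, different label names
run_2 = [0, 0, 1, 1, 1, 1]      # one item placed in the other group

print(adjusted_rand_index(reference, run_1))  # 1.0
print(adjusted_rand_index(reference, run_2))
```

Because the index only looks at which items are grouped together, renaming the clusters does not change the score, while moving a single item lowers it.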

What is cluster analysis? A complete guide

Ever heard of cluster analysis? If not, you’re in for a treat.

As a powerful data-mining tool, cluster analysis can help your organisation to identify different customer groups, and their typical behaviours. But why is that helpful? And what can you use cluster analysis for in the context of market research?  

In this complete guide, we’ll help you to understand what cluster analysis is, when to use it, and how it can help your business. We’ll also talk you through the cluster analysis process, cover different types of cluster analysis, and clear up how cluster analysis and factor analysis are different.

Let’s dive straight in.

What is cluster analysis?

Cluster analysis (otherwise known as clustering, segmentation analysis, or taxonomy analysis) is a statistical approach to grouping items – or people – into clusters, or categories.

The objective of cluster analysis is to sort subjects into groups based on similarities: if there’s a high degree of association, subjects would be placed into the same group. Conversely, a low degree of association would see subjects placed in different groups.

Cluster analysis is unique as a statistical method, in that it’s conducted without the foundation of an assumed principle or fact; instead, this type of analysis is primarily concerned with data matrices where variables haven’t yet been split into criterion vs. predictor subsets.

Wow, that was technical.

Still with us? Great.

Because cluster analysis is what’s known as an ‘unsupervised learning’ technique, you usually won’t know how many different clusters you’re dealing with beforehand. In fact, this approach is specifically employed when no assumptions have previously been made about expected relationships within the data you’re studying.

Cluster analysis will give you insight into where patterns and associations may be present within specific data, but it won’t interpret what those associations are, or what they may mean.

How can cluster analysis be used?

Cluster analysis can be used to great effect in market research. Most commonly, cluster analysis is concerned with classification: in other words, arranging subjects into different groups based on certain similarities. The goal of classification is that subjects in the same group are more like one another than they are like subjects in a different group.

In the context of market research, this is particularly helpful for splitting people into any number of useful categories – such as location, age bracket, earning potential, education level, and even buying behaviours.

For marketers, cluster analysis is invaluable for audience segmentation as it makes it possible to target specific groups of customers with relevant, tailored messaging – increasing the chances of creating a connection and eliciting a response from your intended audience.

From a public health perspective – and most notably seen throughout the Covid-19 crisis – cluster analysis can even be used by healthcare researchers to identify locations with particularly high (or low) levels of illness.

No matter what use cluster analysis is put to, it is invaluable for market research – where knowledge truly is power. In fact, we’re such big believers in the need to map out your market that we’ve developed industry-leading software to help you on your way.

Forsta’s market research survey software allows you to investigate your market (from target audience to top competitors), analyse the data, and act on the findings.

Cluster Analysis Process

So, what’s the cluster analysis process all about?

Three simple steps.

When you’re using cluster analysis to find out more about your target audience ahead of a big product launch or design iteration, it’s all about getting down to the nitty gritty of how they’re behaving, and what’s making them tick.

And this is how you do it.

Step one: create your survey

First things first, you need to build – and distribute – a survey. But what sort of survey, we hear you ask? Well, it should be designed to incorporate different measures of customer preference for your product, how likely they are to buy it, and the factors that could influence their decision. Send your survey to a large enough sample of your target customers: if the sample is too small, you won’t collect enough data to make statistically informed decisions.

Step two: analyse your findings

The next step is to reduce your data with a factor analysis of your survey. This cuts down the number of variables being clustered by identifying multiple questions that measure the same thing, allowing you to combine them before cluster analysis takes place. Once that’s done, you’ll be able to carry out a cluster analysis, figure out the right number of clusters, and get grouping!

Step three: act on your findings

When you have your clusters, you can view data across these different groupings (and name them according to their differences). It’s these essential differences that will allow you to tailor and target your marketing and advertising efforts according to specific groups of customers.

Types of cluster analysis

When it comes to choosing which type of cluster analysis to perform, you have three key methods to pick from: hierarchical cluster, K-means cluster, and the two-step cluster (which sounds a little like a dance, right?)

Let’s look at what each method brings to the table.

Hierarchical Cluster

Hierarchical clustering (the most common approach, if you’re asking) builds up groups step by step. It begins by treating every observation as a separate cluster, before repeatedly identifying the two clusters that are most similar, and then merging them. This continues until all clusters are merged into one, creating a hierarchy of nested clusters that you can cut at the level that gives you a useful number of distinct groups. Hierarchical cluster analysis can work with nominal, ordinal, and scale data, so long as you don’t mix different levels of measurement.
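The merge-the-closest-pair loop described above can be sketched in a few lines of plain Python. This is only an illustration: the function name `agglomerative`, the single-linkage rule for "most similar", and the spend-and-visits figures are all assumptions chosen for the example.

```python
from math import dist

def agglomerative(points, n_clusters):
    """Single-linkage agglomerative clustering; returns lists of point indices."""
    clusters = [[i] for i in range(len(points))]  # every observation starts alone
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # single linkage: distance between the closest pair of members
                d = min(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the two most similar clusters
        del clusters[b]
    return clusters

# Hypothetical customers: (average spend, visits per year)
spend_and_visits = [(10, 1), (12, 1), (11, 2), (50, 8), (52, 9), (95, 20)]
groups = agglomerative(spend_and_visits, 3)
print(groups)
```

Stopping at three clusters here groups the three low spenders together, pairs the two mid-range customers, and leaves the single heavy spender on their own. Other linkage rules (complete, average, Ward) would change how "most similar" is judged.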

K-Means Cluster

The K-means cluster comes into its own when you need to quickly cluster large sets of data. With this method, researchers define how many clusters there’ll be before carrying out the study. In fact, the K in ‘K-means’ stands for the number of clusters you’re trying to identify.

Two-Step Cluster

This best-of-both-worlds approach combines ideas from hierarchical and K-means clustering, and can select the number of clusters automatically. It first sorts the observations into many small pre-clusters, then applies hierarchical clustering to those pre-clusters to identify the final groups. This method is great for large datasets that hierarchical clustering would take too long to process.

When to use cluster analysis

Now that you understand a little more about the nature of cluster analysis, let’s look at when you ought to use it.

Cluster analysis is most often carried out during the initial, exploratory phase of research to uncover different structures in data. It doesn’t provide an explanation or interpretation of that data, but instead identifies specific groups within a population – without explaining why those structures exist the way they do. But that’s okay!

Cluster analysis is an important part of market research, as it presents different groupings for analysis. Once these groups have been defined, you can use the data surrounding each cluster to understand certain things about your target market: are they likely to buy your product? What sort of messaging will they respond to? What form of comms do they favour?

For that reason, cluster analysis is especially useful when you’re planning to launch a new product, update an old one, or roll out a marketing campaign. The ability to target potential customers in a focused, informed way truly is invaluable here. You can even use cluster analysis to create specific offers or products that are tailored to one particular group. Clever stuff!

Cluster analysis vs factor analysis?

Right then, what’s the big difference between cluster analysis and factor analysis?

We’ve already touched on this above, but factor analysis is basically a way of reducing large numbers of variables by combining overlapping questions that relate to the same concept, leaving you with a smaller, more refined set of variables to cluster on.

You’d use factor analysis when tackling a particularly complex survey or fighting your way through an inordinate number of variables. But rather than using factor analysis in place of cluster analysis, it’s best to use it beforehand, simplifying your data so that it’s easier to process and find true patterns.

Ultimately, the objectives of cluster analysis and factor analysis are different: cluster analysis is intended to divide observations into distinct and homogeneous groups, while factor analysis is intended to explain why certain variables move together by tracing them back to a smaller number of underlying factors. Different, see?

Ready to get your cluster on?

Cluster analysis is a fantastic way of understanding the different groups of your target audience – and for determining if they’re your target audience at all.

At Forsta, we firmly believe that the more you know about your customers, the more powerful your product will be. So, let’s get clustering.

Introduction

Cluster analysis (or clustering) is the classification of objects into different groups, or more precisely, the partitioning of a data set into subsets (clusters or classes), so that the data in each subset (ideally) share some common trait - often proximity according to some defined distance measure. Data clustering is a common technique for statistical data analysis, which is used in many fields, including machine learning, data mining, pattern recognition, image analysis and bioinformatics. The computational task of classifying the data set into k clusters is often referred to as k-clustering.

Types of clustering

Data clustering algorithms can be hierarchical or partitional. Hierarchical algorithms find successive clusters using previously established clusters. Hierarchical algorithms can be agglomerative or divisive. Agglomerative algorithms begin with each element as a separate cluster and merge them into successively larger clusters. Divisive algorithms begin with the whole set and proceed to divide it into successively smaller clusters. Partitional algorithms typically determine all clusters at once, but can also be used as divisive algorithms in hierarchical clustering.

Distance measures

An important step in any clustering is to select a distance measure, which will determine how the similarity of two elements is calculated. This will influence the shape of the clusters, as some elements may be close to one another according to one distance and further away according to another.

Particularly important distance measures are the Euclidean distance, which leads to a spherical shape of the clusters, and the Mahalanobis distance, which leads to arbitrary elliptic shapes, reflecting the different scales and correlations in the data.
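To make the difference between the two measures concrete, here is a small sketch for two-dimensional data. The helper `mahalanobis_2d` and the covariance matrix are hypothetical, chosen only to show how a direction with more spread counts for less under the Mahalanobis distance.

```python
from math import dist, sqrt

def mahalanobis_2d(p, q, cov):
    """Mahalanobis distance between 2-D points for a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    # inverse of the 2x2 covariance matrix
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx, dy = p[0] - q[0], p[1] - q[1]
    return sqrt(dx * (inv[0][0] * dx + inv[0][1] * dy)
                + dy * (inv[1][0] * dx + inv[1][1] * dy))

origin = (0.0, 0.0)
east, north = (2.0, 0.0), (0.0, 2.0)
cov = ((4.0, 0.0), (0.0, 1.0))  # assumed: x has four times the variance of y

print(dist(origin, east), dist(origin, north))  # Euclidean: both the same
print(mahalanobis_2d(origin, east, cov), mahalanobis_2d(origin, north, cov))
```

Both points are equally far from the origin in Euclidean terms, but the point lying along the high-variance x axis is closer in Mahalanobis terms, which is why clusters found with it can be elongated rather than round.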

Further information

A good starting point for obtaining further information on cluster analysis terms and algorithms is the Wikipedia entry on data clustering.

What is cluster analysis? Overview and examples

Cluster analysis can be a powerful data-mining tool for any organisation or research project. Here we break down what it is, when it’s useful and why, with plenty of examples along the way.

What is cluster analysis?

Cluster analysis is a statistical method for processing data. It works by organising items into groups – or clusters – based on how closely associated they are.

Cluster analysis, like dimension reduction analysis (factor analysis), is concerned with data sets in which the variables have not been partitioned beforehand into criterion vs. predictor subsets.

The objective of cluster analysis is to find similar groups of subjects, where the “similarity” between each pair of subjects represents a unique characteristic of the group vs. the larger population/sample. Strong differentiation between groups is indicated through separate clusters; a single cluster indicates extremely homogeneous data.

Cluster analysis is an unsupervised learning technique, meaning that there are no predefined group labels to learn from and you often don’t know how many clusters exist in the data before running the model. Unlike many other statistical methods, cluster analysis is typically used when there is no assumption made about the likely relationships within the data. It provides information about where associations and patterns in data exist, but not what those might be or what they mean.

When should cluster analysis be used?

Cluster analysis is for when you’re looking to segment or categorise a dataset into groups based on similarities, but aren’t sure what those groups should be.

While it’s tempting to use cluster analysis in many different research projects, it’s important to know when it’s genuinely the right fit. Here are three of the most common scenarios where cluster analysis proves its worth.

Exploratory data analysis

When you have a new dataset and are in the early stages of understanding it, cluster analysis can provide a much-needed guide.

By forming clusters, you can get a read on potential patterns or trends that could warrant deeper investigation.

Market segmentation

This is a golden application for cluster analysis, especially in the business world. Because when you aim to target your products or services more effectively, understanding your customer base becomes paramount.

Cluster analysis can carve out specific customer segments based on buying habits, preferences or demographics, allowing for tailored marketing strategies that resonate more deeply.

Resource allocation

Be it in healthcare, manufacturing, logistics or many other sectors, resource allocation is often one of the biggest challenges. Cluster analysis can be used to identify which groups or areas require the most attention or resources, enabling more efficient and targeted deployment.

How is cluster analysis used?

The most common use of cluster analysis is classification. Subjects are separated into groups so that each subject is more similar to other subjects in its group than to subjects outside the group.

In a market research context, cluster analysis might be used to identify categories like age groups, earnings brackets, urban, rural or suburban location.

In marketing, cluster analysis can be used for audience segmentation, so that different customer groups can be targeted with the most relevant messages.

Healthcare researchers might use cluster analysis to find out whether different geographical areas are linked with high or low levels of certain illnesses, so they can investigate possible local factors contributing to health problems.

Employers, on the other hand, could use cluster analysis to identify groups of employees who have similar feelings about workplace culture, job satisfaction or career development. With this data, HR departments can tailor their initiatives to better suit the needs of specific clusters, like offering targeted training programs or improving office amenities.

Whatever the application, data cleaning is an essential preparatory step for successful cluster analysis. Clustering works at a data-set level where every point is assessed relative to the others, so the data must be as complete as possible.

Cluster analysis in action: A step-by-step example

Here’s how an online bookstore used cluster analysis to transform its raw data into actionable insights.

Step one: Creating the objective

The bookstore’s aim is to provide more personalized book recommendations to its customers. The belief is that by curating book selections that will be more appealing to subgroups of its customers, the bookstore will see an increase in sales.

Step two: Using the right data

The bookstore has its own historical sales data, including two key variables: ‘favorite genre’, which includes categories like sci-fi, romance and mystery; and ‘average spend per visit’.

The bookstore opts to home in on these two factors as they are likely to provide the most actionable insights for personalized marketing strategies.

Step three: Choosing the best approach

After settling on the variables, the next decision is determining the right analytical approach.

The bookstore opts for K-means clustering for the ‘average spend per visit’ variable because it’s numerical – and therefore scalar data. For ‘favorite genre’, which is categorical – and therefore non-scalar data – they choose K-medoids.

Step four: Running the algorithm

With everything set, it’s time to crunch the numbers. The bookstore runs the K-means and K-medoids clustering algorithms to identify clusters within their customer base.

The aim is to create three distinct clusters, each encapsulating a specific customer profile based on their genre preferences and spending habits.

Step five: Validating the clusters

Once the algorithms have done their work, it’s important to check the quality of the clusters. For this, the bookstore looks at intracluster and intercluster distances.

A low intracluster distance means customers within the same group are similar, while a high intercluster distance means the groups are distinct from each other.

Step six: Interpreting the results

Now that the clusters are validated, it’s time to dig into what they actually mean. Each cluster should represent a specific customer profile based solely on ‘favorite genre’ and ‘average spend per visit’.

For example, one cluster might consist of customers who are keen on sci-fi and tend to spend less than $20, while another cluster could be those who prefer romance novels and are in the $20-40 spending range.

Step seven: Applying the findings

The final step is all about action. Armed with this new understanding of their customer base, the bookstore can now tailor its marketing strategies.

Knowing what specific subgroups like to read and how much they’re willing to spend, the store can send out personalised book recommendations or offer special discounts to those specific clusters – aiming to increase sales and customer satisfaction.

Cluster analysis algorithms

Your choice of cluster analysis algorithm is important, particularly when you have mixed data. In major statistics packages you’ll find a range of preset algorithms ready to number-crunch your matrices.

K-means and K-medoids are two widely used clustering methods. In both cases, K is the number of clusters.

The K-means algorithm establishes the presence of clusters by finding their centroid points. A centroid point is the average of all the data points in the cluster. By iteratively assessing the Euclidean distance between each point in the dataset and each centroid, every point can be assigned to the cluster whose centroid is nearest.

The centroid points are random to begin with and are recalculated on each pass until the assignments settle. K-means is commonly used in cluster analysis, but it has a limitation in being mainly useful for scalar data.
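The assign-then-update loop described above can be sketched in plain Python as follows. This is a bare-bones illustration rather than a production implementation: the function name `k_means`, the fixed random seed and the toy customer points are assumptions made for the example.

```python
import random
from math import dist

def k_means(points, k, iterations=100, seed=0):
    """Basic K-means (Lloyd's algorithm); returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random starting centroids
    labels = None
    for _ in range(iterations):
        # assignment step: each point joins its nearest centroid
        new_labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so the loop has converged
        labels = new_labels
        # update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(axis) / len(members)
                                     for axis in zip(*members))
    return centroids, labels

# Hypothetical toy data with two well-separated groups
customers = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centroids, labels = k_means(customers, k=2)
print(centroids, labels)
```

Because the starting centroids are random, real analyses usually run K-means several times with different seeds and keep the best result; on data this cleanly separated, the two groups are recovered whichever points are drawn first.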

K-medoids works in a similar way to K-means, but rather than using mean centroid points which don’t equate to any real points from the dataset, it establishes medoids, which are real interpretable data-points.

The K-medoids clustering algorithm offers an advantage for survey data analysis as it is suitable for both categorical and scalar data. This is because it isn’t tied to Euclidean distance: the distance between a medoid and its neighbours can be calculated with any dissimilarity measure, including ones defined across several categorical variables.

K-medoids is less common than K-means in clustering analysis, but is often used when a more robust method that’s less sensitive to outliers is needed.
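As a rough sketch of how K-medoids can handle categorical survey answers, the example below uses a simple matching distance (a count of the attributes on which two records differ) and a basic alternating update. It is not the full PAM algorithm, and the names `mismatch` and `k_medoids`, the starting medoids and the customer records are all assumptions made for illustration.

```python
def mismatch(a, b):
    """Simple matching distance: number of attributes on which two records differ."""
    return sum(x != y for x, y in zip(a, b))

def k_medoids(records, medoids, distance, iterations=100):
    """Alternating K-medoids; `medoids` holds starting indices into `records`.

    Assumes the starting medoids are distinct records, so no cluster is empty.
    """
    medoids = list(medoids)
    for _ in range(iterations):
        # assign every record to its nearest medoid
        labels = [min(range(len(medoids)),
                      key=lambda m: distance(r, records[medoids[m]]))
                  for r in records]
        new_medoids = []
        for m in range(len(medoids)):
            members = [i for i, lab in enumerate(labels) if lab == m]
            # new medoid: the member with the smallest total distance to the rest
            new_medoids.append(min(
                members,
                key=lambda i: sum(distance(records[i], records[j]) for j in members)))
        if new_medoids == medoids:
            break  # medoids stopped moving
        medoids = new_medoids
    return medoids, labels

# Hypothetical customers: (favorite genre, preferred format, spend band)
records = [
    ("sci-fi", "ebook", "low"),
    ("sci-fi", "ebook", "mid"),
    ("sci-fi", "print", "low"),
    ("romance", "print", "high"),
    ("romance", "print", "mid"),
    ("romance", "audio", "high"),
]
medoids, labels = k_medoids(records, medoids=[1, 4], distance=mismatch)
print([records[m] for m in medoids], labels)
```

Each medoid returned is an actual customer record, which is what makes the resulting clusters easy to describe: every group can be summarised by pointing at its most typical member.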

Measuring clusters using intracluster and intercluster distances

Evaluating the quality of clustering involves a two-pronged approach: assessing intracluster and intercluster distances.

Intracluster distance is the distance between the data points inside the cluster. If there is a strong clustering effect present, this should be small (more homogeneous).

Intercluster distance is the distance between data points in different clusters. Where strong clustering exists, these should be large (more heterogeneous).

In an ideal clustering scenario, you’d use both measures to gauge how good your clusters are. Low intracluster distances – known as high intra-cluster similarity – mean items in the same cluster are similar, which is good; high intercluster distances – known as low inter-cluster similarity – mean different clusters are well-separated, which is also good.

Using both measures gives you a fuller picture of how effective your clustering is.
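Both measures can be computed directly from the points in each cluster. The sketch below uses mean pairwise Euclidean distance as one reasonable definition among several; the function names and the two example clusters are assumptions for illustration.

```python
from itertools import combinations, product
from math import dist
from statistics import mean

def intracluster_distance(cluster):
    """Mean pairwise distance between points in the same cluster (needs 2+ points)."""
    return mean(dist(p, q) for p, q in combinations(cluster, 2))

def intercluster_distance(cluster_a, cluster_b):
    """Mean distance between points drawn from two different clusters."""
    return mean(dist(p, q) for p, q in product(cluster_a, cluster_b))

# Hypothetical clusters: (average spend, visits per year)
low_spenders = [(10, 1), (12, 1), (11, 2)]
high_spenders = [(50, 8), (52, 9), (51, 7)]

intra = intracluster_distance(low_spenders)
inter = intercluster_distance(low_spenders, high_spenders)
print(round(intra, 2), round(inter, 2))
```

Here the intracluster distance is a small fraction of the intercluster distance, which is the pattern you would hope to see when the clusters are tight and well separated.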

Key considerations in cluster analysis

When getting started with cluster analysis, it makes sense to start with methods that assign each data point to a single, distinct cluster. It’s commonly accepted that within each cluster, the data points share similarities.

The assumption here is that your data set is composed of different, unordered classes, and that none of these classes are inherently more important than the others. In some cases, however, we may also view these classes as hierarchical in nature, with sub-classes within them. Here we could apply hierarchical clustering.

Cluster analysis is often a “preliminary” step. That means before you even start, you’re not applying any previous judgments to split up your data; you’re working on the notion that natural clusters should exist within the data.

This initial approach differs from techniques like discriminant analysis, where you have a dependent variable guiding the classification. In cluster analysis, however, the focus is purely on inherent similarities within the data collection itself.

So, the key questions for cluster analysis would be:

  • What metrics will you use to measure the similarity between data points, and how will each variable be weighted when calculating this measure?
  • Once you’ve determined the similarities, what methods will you use to form the clusters?
  • After forming clusters, what descriptive metrics will help define the nature of each cluster?
  • Assuming you’ve adequately described your clusters, what can you infer about their statistical significance?

This should offer a clearer yet still approachable overview of the essential questions in cluster analysis.

Non-scalar data in cluster analysis

So far, we’ve mainly talked about scalar data – things that differ from each other by degrees along a scale, such as numerical quantity or degree. But what about items that are non-scalar and can only be sorted into categories?

When you’re dealing with categories such as color, species and shape, you can’t easily measure the “distance” between data points like you can with scalar data. Various techniques, like using dummy variables or specialised distance measures, can be employed to include non-scalar data in your cluster analysis.

Dummy variables are a way to convert categories into a format that can be provided to a mathematical model. For example, if you have a color category with options like red, blue and green, you could create separate “dummy” columns for each colour, marking them as 1 if they apply and 0 if they don’t.
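The dummy-variable idea can be sketched in a few lines of plain Python. The helper name `one_hot` and the sample colours are assumptions for this example; statistics packages typically offer an equivalent built-in.

```python
def one_hot(values):
    """Turn a list of category labels into 0/1 dummy columns."""
    categories = sorted(set(values))  # one column per distinct category
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

colours = ["red", "blue", "green", "blue"]
columns, encoded = one_hot(colours)
print(columns)
for row in encoded:
    print(row)
```

Each row now contains exactly one 1, in the column for that item's colour, so the category can be fed into a distance calculation alongside numeric variables.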

Specialised distance measures, on the other hand, are custom calculations designed to figure out how “far apart” different categories are from each other. For example, if you’re clustering based on movie genres, a specialised measure might decide that “action” and “adventure” are closer to each other than “action” and “romance”.

Ideally, the data for cluster analysis is categorical, interval or ordinal data. Using a mix of these types can complicate the analysis, as you’ll need to figure out how to meaningfully compare different kinds of data. It’s doable, but it adds an extra layer of complexity you’ll need to account for.

Cluster analysis and factor analysis

When you’re dealing with a large number of variables – for example a lengthy or complex survey – it can be useful to simplify your data before performing cluster analysis so that it’s easier to work with. Using factors reduces the number of dimensions that you’re clustering on, and can result in clusters that are more reflective of the true patterns in the data.

Factor analysis is a technique for taking large numbers of variables and combining those that relate to the same underlying factor or concept, so that you end up with a smaller number of dimensions. For example, factor analysis might help you replace questions – like “Did you receive good service?”, “How confident were you in the agent you spoke to?” and “Did we resolve your query?” – with a single factor: customer satisfaction.

This way you can reduce messiness and complexity in your data and arrive more quickly at a manageable number of clusters.


