There are two primary keyword clustering methods: semantic clustering and clustering based on search engine results pages (SERP) analysis.
Both approaches aim to group related keywords together to improve your SEO strategy, but they differ in their underlying principles, and each has its own advantages and drawbacks.
Semantic Keyword Clustering
Semantic clustering focuses on understanding the intent behind the keywords and identifying common themes, topics, or ideas that the keywords represent.
This approach uses natural language processing (NLP) and machine learning algorithms to turn words into numerical representations (embeddings), which can then be mathematically compared to each other.
Typically, the representation is a real-valued vector that encodes the meaning of the word in such a way that words that are closer in the vector space are expected to be similar in meaning.
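To make "closer in the vector space" concrete, here is a minimal sketch that compares keywords with cosine similarity, one common way to measure closeness. The three-dimensional vectors are made-up values for illustration only; real embeddings come from a model and usually have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy "embeddings" (assumption: hand-picked values, not model output).
toy_embeddings = {
    "running shoes": [0.9, 0.1, 0.0],
    "jogging sneakers": [0.8, 0.2, 0.1],
    "mortgage rates": [0.0, 0.1, 0.9],
}

shoes_vs_sneakers = cosine_similarity(
    toy_embeddings["running shoes"], toy_embeddings["jogging sneakers"]
)
shoes_vs_mortgage = cosine_similarity(
    toy_embeddings["running shoes"], toy_embeddings["mortgage rates"]
)

# Related keywords score close to 1, unrelated ones close to 0.
print(round(shoes_vs_sneakers, 2), round(shoes_vs_mortgage, 2))
```

A semantic clustering tool would then group keywords whose similarity is high enough, where "high enough" is a tuning choice rather than a fixed rule.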
Advantages
- It’s very fast — semantic clustering generally has less data to process, which leads to faster results.
- It’s cheap — semantic clustering does not need any input other than the keywords themselves, so no expensive API calls or web crawling is required.
Drawbacks
- Not very actionable for SEO — different NLP algorithms and AIs will generate different results. Google’s own algorithm is not public so there’s no way to tell how well these results match with Google’s calculations.
SERP Keyword Clustering
Clustering based on SERP analysis means grouping keywords by examining the search results for each keyword.
If two keywords produce overlapping results in a search engine, they can be considered similar.
This approach takes into consideration the actual search engine results and therefore better represents the user intent behind the search queries.
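The overlap check described above can be sketched in a few lines. The URLs below are hypothetical placeholders, and both the `top_n` cutoff and the `min_shared` threshold are assumptions chosen for illustration, not values prescribed by any particular tool.

```python
def serp_overlap(urls_a, urls_b, top_n=10):
    """Count URLs that appear in the top_n results of both keywords."""
    return len(set(urls_a[:top_n]) & set(urls_b[:top_n]))

def is_similar(urls_a, urls_b, min_shared=3):
    """Treat two keywords as similar if enough of their results overlap."""
    return serp_overlap(urls_a, urls_b) >= min_shared

# Hypothetical ranked results for three keywords (assumption: made-up data).
serp_clustering = [
    "https://example.com/a",
    "https://example.com/b",
    "https://example.com/c",
    "https://example.com/d",
]
serp_grouping = [
    "https://example.com/b",
    "https://example.com/a",
    "https://example.com/c",
    "https://example.com/e",
]
serp_unrelated = [
    "https://example.org/x",
    "https://example.org/y",
    "https://example.com/a",
    "https://example.org/z",
]

print(serp_overlap(serp_clustering, serp_grouping))  # three shared URLs: a, b, c
print(is_similar(serp_clustering, serp_unrelated))   # only one shared URL
```

A stricter threshold yields smaller, tighter groups, while a looser one merges more keywords together.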
Advantages
- Extremely actionable for SEO — because this method clusters based on real-time SERP results, it is essentially reverse engineering the search engine. The clustering output is an exact snapshot of the search engine’s algorithm output at any given time. Replicating this logic on a domain has inherent SEO value.
Drawbacks
- Slower — before clustering can start, SERP data needs to be gathered for each keyword.
- More expensive — crawling SERP data is expensive.
Tools for Semantic Keyword Clustering
The quality of semantic clustering depends largely on the technology behind it, so each of these tools can give very different results. As the methods mature, the output should improve, with the tools repeatedly overtaking one another.
Zenbrief has created a free and easy-to-use semantic clustering tool.
It’s also relatively easy to build a semantic clustering tool for personal use with Google Colab and Python (here’s an example by Lee Foot).
Some more popular ways to do that:
- OpenAI: OpenAI has an embeddings endpoint that can be combined with clustering algorithms such as K-means, DBSCAN, or hierarchical clustering to group semantically related keywords or documents together. This approach leverages the natural language understanding of OpenAI models, which can produce more meaningful clusters than simple word matching.
- TensorFlow and Keras: TensorFlow is an open-source machine learning framework developed by Google, and Keras is a high-level neural network API that ships with TensorFlow. These libraries can be used to build and train custom embedding models, such as Word2Vec-style or Doc2Vec-style models, whose vectors can then be clustered.
- Scikit-learn: Scikit-learn is a popular Python library for machine learning and data mining. It provides various clustering algorithms like K-means, DBSCAN, and hierarchical clustering, which can be used for semantic clustering when combined with other NLP techniques or word embeddings.
- BERT (Bidirectional Encoder Representations from Transformers): BERT is a pre-trained transformer model developed by Google that has shown impressive results in various NLP tasks. You can fine-tune BERT for semantic clustering or use pre-trained embeddings to represent your text data before applying clustering algorithms.
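Whichever of these options supplies the embeddings, the clustering step looks broadly similar. The sketch below uses a simple greedy threshold clustering, swapped in for K-means, DBSCAN, or hierarchical clustering so that the example stays dependency-free. The toy vectors and the 0.8 threshold are assumptions for illustration; in practice the vectors would come from one of the models above.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(component) / len(vectors) for component in zip(*vectors)]

def greedy_cluster(embeddings, threshold=0.8):
    """Put each keyword in the first cluster whose centroid is similar
    enough; otherwise start a new cluster."""
    clusters = []  # each cluster is a list of keywords
    for keyword, vector in embeddings.items():
        for members in clusters:
            center = centroid([embeddings[m] for m in members])
            if cosine_similarity(vector, center) >= threshold:
                members.append(keyword)
                break
        else:
            # No existing cluster was close enough.
            clusters.append([keyword])
    return clusters

# Hypothetical toy embeddings (assumption: hand-picked, not model output).
toy_embeddings = {
    "running shoes": [0.9, 0.1, 0.0],
    "jogging sneakers": [0.8, 0.2, 0.1],
    "mortgage rates": [0.0, 0.1, 0.9],
    "home loan interest": [0.1, 0.0, 0.8],
}

clusters = greedy_cluster(toy_embeddings)
for members in clusters:
    print(members)
```

The greedy approach depends on input order and on the chosen threshold, which is one more reason why different semantic tools can return different clusters for the same keyword list.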
Tools for SERP Keyword Clustering
SERP clustering tools, however, will all more or less produce the same results, because the methodology is (or at least should be) exactly the same. The clustering output depends only on the SERP at any given time.
Popular tools include KeywordInsights, KeywordCupid and Thruuu, but they all require paid subscriptions.
The same results can be achieved with ContentGecko’s free SERP clustering tool.
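The shared methodology behind SERP clustering can be sketched as follows: link any two keywords whose results share enough URLs, then treat each connected group of keywords as one cluster. This is an illustrative sketch, not the implementation of any tool named above. The `min_shared` threshold, the transitive linking rule, and the example URLs are all assumptions.

```python
from itertools import combinations

def cluster_by_serp(serps, min_shared=3):
    """Group keywords whose result lists share at least min_shared URLs.

    Keywords are linked transitively, so each cluster is a connected
    component of the "shares enough URLs" graph.
    """
    parent = {keyword: keyword for keyword in serps}

    def find(keyword):
        # Follow parent links to the cluster root, shortening the path.
        while parent[keyword] != keyword:
            parent[keyword] = parent[parent[keyword]]
            keyword = parent[keyword]
        return keyword

    for a, b in combinations(serps, 2):
        if len(set(serps[a]) & set(serps[b])) >= min_shared:
            parent[find(a)] = find(b)  # merge the two clusters

    groups = {}
    for keyword in serps:
        groups.setdefault(find(keyword), []).append(keyword)
    return list(groups.values())

# Hypothetical SERP data (assumption: made-up keywords and URLs).
serps = {
    "keyword clustering": ["u1", "u2", "u3", "u4"],
    "keyword grouping": ["u1", "u2", "u3", "u5"],
    "cluster keywords tool": ["u2", "u3", "u5", "u6"],
    "python tutorial": ["p1", "p2", "p3", "p4"],
}

clusters = cluster_by_serp(serps)
for members in clusters:
    print(members)
```

Because the input is only the SERP data and a threshold, two tools that use the same rule and fetch the same results at the same time should arrive at the same clusters.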
For SEOs, SERP-based Keyword Clustering is the way to go
SERP-based clustering offers many advantages over semantic clustering for SEOs specifically. It provides a more actionable approach, as it directly analyzes the search engine results pages (SERPs) to understand user intent and search engine ranking factors.
By clustering keywords based on real-time SERP data, SEOs can effectively reverse-engineer the search engine's algorithm, yielding valuable insights into how to optimize their content and website structure.
Additionally, SERP-based clustering is more closely aligned with the actual search engine results, providing a better representation of the keywords and their relevance to users' search queries.
SERP-based clustering does have some drawbacks, such as slower processing times and higher costs, because SERP data must be gathered for every keyword before clustering can start.
Despite these challenges, the benefits of SERP-based clustering for SEOs far outweigh the drawbacks, making it the preferred choice for optimizing keyword strategies and improving search rankings.
ContentGecko’s free SERP clustering tool should be your go-to webpage for this.