
Vector Search: The New Frontier in Personalized Recommendations

Afshan Khan

Artificial Intelligence / Machine Learning

Introduction

Imagine you are a modern-day treasure hunter, not in search of hidden gold, but rather the wealth of knowledge and entertainment hidden within the vast digital ocean of content. In this realm, where every conceivable topic has its own sea of content, discovering what will truly captivate you is like finding a needle in an expansive haystack.

This challenge leads us to the marvels of recommendation services, acting as your compass in this digital expanse. These services are the unsung heroes behind the scenes of your favorite platforms, from e-commerce sites that suggest enticing products to streaming services that understand your movie preferences better than you might yourself. They sift through immense datasets of user interactions and content features, striving to tailor your online experience to be more personalized, engaging, and enriching.

But what if I told you that there is a cutting-edge technology that can take personalized recommendations to the next level? Today, I will take you through a journey to build a blog recommendation service that understands the contextual similarities between different pieces of content, transcending beyond basic keyword matching. We'll harness the power of vector search, a technology that's revolutionizing personalized recommendations. We'll explore how recommendation services are traditionally implemented, and then briefly discuss how vector search enhances them.

Finally, we'll put this knowledge to work, using OpenAI's embedding API and Elasticsearch to create a recommendation service that not only finds content but also understands and aligns it with your unique interests.

Exploring the Landscape: Traditional Recommendation Systems and Their Limits

Traditionally, these digital compasses, or recommendation systems, employ methods like collaborative and content-based filtering. Imagine sitting in a café where the barista suggests a coffee based on what others with similar tastes enjoyed (collaborative filtering) or based on your past coffee choices (content-based filtering). While these methods have been effective in many scenarios, they come with some limitations. They often stumble when faced with the vast and unstructured wilderness of web data, struggling to make sense of the diverse and ever-expanding content landscape. Additionally, when user preferences are ambiguous or when you want to recommend content by truly understanding it on a semantic level, traditional methods may fall short.

Enhancing Recommendation with Vector Search and Vector Databases

Our journey now takes an exciting turn with vector search and vector databases, the modern tools that help us navigate this unstructured data. These technologies transform our café into a futuristic spot where your coffee preference is understood on a deeper, more nuanced level.

Vector Search: The Art of Finding Similarities

Vector search operates like a seasoned traveler who understands the essence of every place visited. Text, images, or sounds can be transformed into numerical vectors, like unique coordinates on a map. The magic happens when these vectors are compared, revealing hidden similarities and connections, much like discovering that two seemingly different cities share a similar vibe.

Vector Databases: Navigating Complex Data Landscapes

Imagine a vast library of books where each book captures different aspects of a place along with its coordinates. Vector databases are akin to this library, designed to store and navigate these complex data points. They easily handle intricate queries over large datasets, making them perfect for our recommendation service, ensuring no blog worth reading remains undiscovered.

Embeddings: Semantic Representation

In our journey, embeddings are akin to a skilled artist who captures not just the visuals but the soul of a landscape. They map items like words or entire documents into real-number vectors, encapsulating their deeper meaning. This helps in understanding and comparing different pieces of content on a semantic level, letting the recommendation service show you things that really match your interests.

Sample Project: Blog Recommendation Service

Project Overview

Now, let’s craft a simple blog recommendation service using OpenAI's embedding APIs and Elasticsearch as a vector database. The goal is to recommend blogs similar to the one the user is currently reading, which can be shown in a “read more” or recommendations section.

Our blogs service will be responsible for indexing the blogs, finding similar ones, and interacting with the UI Service.

Tools and Setup

We will need the following tools to build our service:

  • OpenAI Account: We will be using OpenAI’s embedding API to generate the embeddings for our blog content. You will need an OpenAI account to use the APIs. Once you have created an account, please create an API key and store it in a secure location.
  • Elasticsearch: A popular search engine renowned for its full-text search capabilities, which can also be used as a vector database, adept at storing and querying complex embeddings with its dense_vector field type.
  • Docker: A tool that allows developers to package their applications and all the necessary dependencies into containers, ensuring that the application runs smoothly and consistently across different computing environments.
  • Python: A versatile programming language for developers across diverse fields, from web development to data science.

The APIs will be created using the FastAPI framework, but you can choose any framework.

Steps

First, we'll create a BlogItem class to represent each blog. It has only three fields, which will be enough for this demonstration, but real-world entities would have more details to accommodate a wider range of properties and functionalities.

CODE: https://gist.github.com/velotiotech/b50e70f80381f8e3b994ad4ee854c5ae.js
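In rough outline, and assuming the three fields are an ID, a title, and the content body, the class might look something like this (the `embedding_input` helper is a hypothetical convenience for combining the text we embed, not necessarily part of the gist):

```python
from dataclasses import dataclass

@dataclass
class BlogItem:
    blog_id: int
    title: str
    content: str

    def embedding_input(self) -> str:
        # Single string sent to the embedding API: title plus content.
        return f"{self.title}\n{self.content}"
```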

Elasticsearch Setup:

  • To store the blog data along with its embedding in Elasticsearch, we need to set up a local Elasticsearch cluster and then create an index for our blogs. You can also use a cloud-based version if you have already procured one for personal use.
  • Install Docker or Docker Desktop on your machine and create the Elasticsearch and Kibana containers using the docker compose file below. Run the following command to create and start the services in the background: docker compose -f /path/to/your/docker-compose/file up -d. You can omit the -f flag and file path if you are in the same directory as your docker-compose.yml file.
  • One advantage of using docker compose is that it lets you tear these resources down again with a single command: docker compose -f /path/to/your/docker-compose/file down.

CODE: https://gist.github.com/velotiotech/31fac5f0e78ed297da88daa8447d6622.js

  • Connect to the local ES instance and create an index. Our “blogs” index will have a unique blog ID, blog title, blog content, and an embedding field to store the vector representation of the blog content. The text-embedding-ada-002 model we use here produces vectors with 1536 dimensions, so the embedding field in the blogs index must be configured with the same dimensionality.

CODE: https://gist.github.com/velotiotech/2248079c895cb621512b0a151fa94b97.js
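As a sketch of the mapping involved (the field names here are assumptions; the gist holds the actual code), the key detail is that the dense_vector field's dims must match the 1536 dimensions text-embedding-ada-002 produces:

```python
EMBEDDING_DIMS = 1536  # output dimensionality of text-embedding-ada-002

def blogs_index_mapping(dims: int = EMBEDDING_DIMS) -> dict:
    # Mapping for the "blogs" index; dims must match the embedding
    # model, or indexing the vectors will fail.
    return {
        "properties": {
            "blog_id": {"type": "integer"},
            "title": {"type": "text"},
            "content": {"type": "text"},
            "embedding": {"type": "dense_vector", "dims": dims},
        }
    }

# With an Elasticsearch client, this would be applied roughly as:
# es.indices.create(index="blogs", mappings=blogs_index_mapping())
```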

Create Embeddings AND Index Blogs:

  • We use OpenAI's Embedding API to get a vector representation of our blog title and content. I am using the text-embedding-ada-002 model here, which OpenAI recommends for most use cases. The input to text-embedding-ada-002 must not exceed 8,191 tokens (1,000 tokens is roughly 750 words) and cannot be empty.

CODE: https://gist.github.com/velotiotech/1a836a20eb8b8277f84c45bd678058a5.js
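A minimal version of such a helper might look like the following, assuming the openai v1.x Python client; the function name and the fail-fast validation are illustrative, not the gist's exact code:

```python
EMBEDDING_MODEL = "text-embedding-ada-002"

def create_embedding(client, text: str) -> list[float]:
    # The embedding API rejects empty input, so fail fast
    # before spending an API call (and tokens).
    if not text or not text.strip():
        raise ValueError("Embedding input must not be empty.")
    # client is an openai.OpenAI() instance (openai>=1.0).
    response = client.embeddings.create(model=EMBEDDING_MODEL, input=text)
    return response.data[0].embedding
```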

  • When a blog is created or its content is updated, we will call the create_embeddings function to get the text embedding and store it in our blogs index.

CODE: https://gist.github.com/velotiotech/e33b688678157db170624979bd32f05e.js
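The document stored per blog simply keeps the embedding alongside the text fields, so later similarity queries need no extra OpenAI call. A hypothetical builder (field names assumed) could look like:

```python
def blog_document(blog_id: int, title: str, content: str,
                  embedding: list[float]) -> dict:
    # Body indexed into the "blogs" index; the embedding travels
    # with the text fields it was generated from.
    return {
        "blog_id": blog_id,
        "title": title,
        "content": content,
        "embedding": embedding,
    }

# With an Elasticsearch client, roughly:
# es.index(index="blogs", id=str(blog_id),
#          document=blog_document(blog_id, title, content, embedding))
```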

  • Create a Pydantic model for the request body:

CODE: https://gist.github.com/velotiotech/a17f9d11d0c7b57fd7596e986ed36f35.js

  • Create an API to save blogs to Elasticsearch. The UI Service would call this API when a new blog post gets created.

CODE: https://gist.github.com/velotiotech/3645558a664d3e2c019ef0eebdf56d8e.js

Finding Relevant Blogs:

  • To find blogs that are similar to the current one, we will compare the current blog’s vector representation with other blogs present in the Elasticsearch index using the cosine similarity function.
  • Cosine similarity measures the cosine of the angle between two vectors in a multi-dimensional space and is often used to assess the similarity between two documents or data points.
  • The cosine similarity score ranges from -1 to 1, with higher values indicating greater similarity between the vectors.
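To make the measure concrete, here is cosine similarity computed by hand on small vectors. Elasticsearch does this internally over the dense_vector field; this snippet is purely illustrative:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # cos(theta) = (a . b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Vectors pointing the same way score 1.0; orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))  # → 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 3.0]))  # → 0.0
```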
  • Create a custom exception to handle a scenario when a blog for a given ID is not present in Elasticsearch.

CODE: https://gist.github.com/velotiotech/dc2f7ddb83085ef926d818aad805c174.js

  • First, we will check if the current blog is present in the blogs index and fetch its embedding. This prevents unnecessary calls to the OpenAI APIs, which consume tokens. Then, we construct an Elasticsearch DSL query to find the nearest neighbors and return their blog content.

CODE: https://gist.github.com/velotiotech/25d0ad2726e1688e6af57f672b19db89.js
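One common way to express this in Elasticsearch is a script_score query over the dense_vector field. The sketch below builds such a query body; the field names and the exclusion of the current blog are assumptions about the gist's approach, not a copy of it:

```python
def similar_blogs_query(query_vector: list[float],
                        current_blog_id: int, k: int = 5) -> dict:
    # cosineSimilarity ranges over [-1, 1]; adding 1.0 keeps scores
    # non-negative, which Elasticsearch requires for script_score.
    return {
        "size": k,
        "query": {
            "script_score": {
                # Exclude the blog the user is already reading.
                "query": {"bool": {"must_not": {"term": {"blog_id": current_blog_id}}}},
                "script": {
                    "source": "cosineSimilarity(params.query_vector, 'embedding') + 1.0",
                    "params": {"query_vector": query_vector},
                },
            }
        },
        "_source": ["blog_id", "title", "content"],
    }

# With a client: es.search(index="blogs", body=similar_blogs_query(vec, 1))
```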

  • Define a Pydantic model for the response:

CODE: https://gist.github.com/velotiotech/28dc7074b446c432a47f8b88cae81529.js

  • Create an API that the UI Service would use to find blogs similar to the one the user is currently reading:

CODE: https://gist.github.com/velotiotech/1d2cb865193484a872528840c0c834d8.js

  • The flow diagram below summarizes all the steps we have discussed so far:

Testing the Recommendation Service

  • Ideally, we would receive the blog ID from the UI Service and pass the recommendations back, but for illustration purposes, we'll call the recommend-blogs API with some test inputs from my test dataset. The blogs in this sample dataset have concise titles and content, which is sufficient for testing, but real-world blogs will be far more detailed and carry a significant amount of data. The test dataset has around 1,000 blogs across categories like healthcare, tech, travel, and entertainment.
  • A sample from the test dataset:


  • Test Result 1: Medical Research Blog

    Input Blog: Blog_Id: 1, Title: Breakthrough in Heart Disease Treatment, Content: Researchers have developed a new treatment for heart disease that promises to be more effective and less invasive. This breakthrough could save millions of lives every year.


  • Test Result 2: Travel Blog

    Input Blog: Blog_Id: 4, Title: Travel Tips for Sustainable Tourism, Content: How to travel responsibly and sustainably.

I manually tested multiple blogs from the test dataset of 1,000 blogs, representing distinct topics and content, and assessed the quality and relevance of the recommendations. The recommended blogs had scores in the range of 87% to 95%, and upon examination, the blogs often appeared very similar in content and style.

Based on the test results, it's evident that utilizing vector search enables us to effectively recommend blogs to users that are semantically similar. This approach ensures that the recommendations are contextually relevant, even when the blogs don't share identical keywords, enhancing the user's experience by connecting them with content that aligns more closely with their interests and search intent.

Limitations

This approach for finding similar blogs is good enough for our simple recommendation service, but it might have certain limitations in real-world applications.

  • Our similarity search returns the nearest k neighbors as recommendations, but there might be scenarios where no truly similar blog exists or where the neighbors' scores differ significantly. To deal with this, you can set a threshold to filter out recommendations below a certain score. Experiment with different threshold values and observe their impact on recommendation quality.
  • If your use case involves a small dataset and the relationships between user preferences and item features are straightforward and well-defined, traditional methods like content-based or collaborative filtering might be more efficient and effective than vector search.
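As a sketch of the first mitigation, assuming each hit carries a normalized similarity score, a simple cutoff filter might look like this (the threshold value itself is something to tune empirically):

```python
def filter_by_threshold(hits: list[dict], min_score: float) -> list[dict]:
    # Drop recommendations whose similarity score falls below the cutoff,
    # so weakly related blogs never reach the user.
    return [h for h in hits if h["score"] >= min_score]

hits = [{"blog_id": 2, "score": 0.93}, {"blog_id": 7, "score": 0.61}]
print(filter_by_threshold(hits, 0.85))  # keeps only blog 2
```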

Further Improvements

  • Using LLM for Content Validation: Implement a verification step using large language models (LLMs) to assess the relevance and validity of recommended content. This approach can ensure that the suggestions are not only similar in context but also meaningful and appropriate for your audience.
  • Metadata-based Embeddings: Instead of generating embeddings from the entire blog content, utilize LLMs to extract key metadata such as themes, intent, tone, or key points. Create embeddings based on this extracted metadata, which can lead to more efficient and targeted recommendations, focusing on the core essence of the content rather than its entirety.

Conclusion

Our journey concludes here, but yours is just beginning. Armed with the knowledge of vector search, vector databases, and embeddings, you're now ready to build a recommendation service that doesn't just guide users to content but connects them to the stories, insights, and experiences they seek. It's not just about building a service; it's about enriching the digital exploration experience, one recommendation at a time.


Did you like the blog? If yes, we're sure you'll also like to work with the people who write them - our best-in-class engineering team.

We're looking for talented developers who are passionate about new emerging technologies. If that's you, get in touch with us.

Explore current openings
