Large language models such as GPT-4 can easily be combined with your own data using Vector Search and embeddings.
I've recently released a course that teaches you how to use Vector Search on three distinct projects on the freeCodeCamp.org YouTube channel.
You will first be introduced to the concepts, and I will work with you to develop three projects.
In the first project, we develop a semantic search function that uses natural language queries to locate movies. We use Atlas Vector Search, Python, and machine learning models for this.
Next, we develop a basic question-answering application that leverages your own data to provide answers by utilizing the RAG architecture and Atlas Vector Search.
In the final project, we modify a ChatGPT clone so that it answers questions about contributing to the freeCodeCamp.org curriculum by using the official documentation. You can easily adapt this project to use your own documentation or data.
This course was made possible through a grant from MongoDB. Their Atlas Vector Search lets you perform semantic similarity searches on your data, which you can combine with LLMs to build AI-powered applications.
What are Vector Embeddings?
Consider that you wish to arrange a variety of objects, such as fruits, to highlight their similarities and differences. You may arrange them according to color, size, or preference in the real world. Vector embeddings are useful for doing similar tasks with data in the digital realm.
Vector embeddings can be thought of as a digital categorization or description system. Every element—a word, picture, or anything else you can think of—becomes a sequence of numbers. We refer to this list as a "vector". The neat thing is that vectors for related items will look similar.
Converting items into vectors, that is, lists of numbers, lets us interpret and manipulate them mathematically. For instance, to determine how similar two items are, we can calculate the distance between their vectors.
Words can be represented as vectors, and words with related meanings have nearby vectors. This is useful for tasks like information retrieval, language translation, and even AI conversation.
These embeddings typically require a significant amount of data and intricate math to create. The computer learns the optimal way to convert these examples—such as word usage in sentences—into vectors by examining a large number of them.
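To make the idea concrete, here is a toy illustration using tiny hand-crafted vectors. These three-dimensional vectors and their values are invented for this example; a real embedding model would produce learned vectors with hundreds or thousands of dimensions.

```python
import math

# Hand-crafted 3-dimensional "embeddings" for illustration only.
# A real model would learn these values from large amounts of data.
embeddings = {
    "apple":  [0.9, 0.1, 0.0],
    "banana": [0.8, 0.2, 0.1],
    "car":    [0.0, 0.9, 0.8],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: closer to 1.0 means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Related items (two fruits) end up much closer than unrelated ones.
print(cosine_similarity(embeddings["apple"], embeddings["banana"]))
print(cosine_similarity(embeddings["apple"], embeddings["car"]))
```

Running this shows the apple–banana similarity is far higher than apple–car, which is exactly the property that makes embeddings useful for search.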
What is Vector Search?
Vector search is the process of locating and retrieving the data most relevant to a given query. Unlike conventional search engines, however, vector search tries to understand the meaning or context of the query rather than just finding exact matches. Semantic search, which uses word meaning to find relevant results, can be implemented using vector search.
By converting the search query and database items (such as documents, photos, or products) into vectors and comparing them to identify the best matches, vector search makes use of vector embeddings.
Here's how the process works in detail:
- Turning Data into Vectors: First, everything needs to be converted into vectors. This is done using models that are trained to understand different types of data. For example, a text document is analyzed and turned into a vector that represents its content and meaning.
- Query Processing: When you make a search query, the same process is applied to turn your query into a vector. This vector represents what you're looking for.
- Calculating Similarity: The vector of your search query is then compared with the vectors of items in the database. This is typically done by calculating the distance between vectors. The most common method is using something called "cosine similarity," which measures the cosine of the angle between two vectors. If two vectors are very similar, the angle will be small, and the cosine similarity will be high.
- Ranking Results: Based on these similarity measurements, the system ranks the items in the database. The ones with vectors closest to the query vector are considered the most relevant and are presented as the top search results.
- Retrieving the Best Matches: Finally, the system retrieves and displays the items that best match the query, according to the similarity of their vectors.
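The five steps above can be sketched end-to-end in a few lines. The bag-of-words `embed` function here is a simplistic stand-in for a real embedding model, just to make the ranking mechanics concrete:

```python
import math
from collections import Counter

def embed(text, vocab):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocab]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

documents = [
    "a movie about space travel and astronauts",
    "a romantic comedy set in paris",
    "astronauts stranded on mars",
]

# 1. Turn the database items into vectors (done once, ahead of time).
vocab = sorted({w for doc in documents for w in doc.lower().split()})
doc_vectors = [embed(doc, vocab) for doc in documents]

# 2. Turn the query into a vector with the same model.
query = "films about astronauts in space"
query_vector = embed(query, vocab)

# 3-4. Score every document against the query and rank by similarity.
ranked = sorted(
    range(len(documents)),
    key=lambda i: cosine_similarity(query_vector, doc_vectors[i]),
    reverse=True,
)

# 5. Retrieve the best match.
print(documents[ranked[0]])
```

Even with this crude "embedding", the space-travel document ranks first for the query, because it shares the most meaning-bearing words with it. A real system simply swaps in a trained model and an indexed database.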
Essentially, vector search uses vector embeddings to comprehend the context and content of both the query and the database items. By comparing these vectors, it efficiently locates and ranks the most relevant results, offering a powerful tool for sifting through large, complex datasets.
Retrieval-augmented generation (RAG)
The retrieval-augmented generation (RAG) architecture uses vector search to retrieve documents relevant to the input query. These retrieved documents are then given to the LLM as context, so the model generates a more informed and accurate response rather than relying solely on patterns learned during training. This helps address several LLM limitations. In particular:
- RAG minimizes hallucinations by grounding the model’s responses in factual information.
- By retrieving information from up-to-date sources, RAG ensures that the model’s responses reflect the most current and accurate information available.
- While RAG does not directly give LLMs access to a user’s local data, it does allow them to utilize external databases or knowledge bases, which can be updated with user-specific information.
- Also, while RAG does not increase an LLM’s token limit, it does make the model’s use of tokens more efficient by retrieving only the most relevant documents for generating a response.
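Putting the pieces together, the RAG flow can be sketched as below. This is a minimal sketch: `retrieve` uses naive word overlap as a stand-in for vector search, and `call_llm` is a hypothetical placeholder for whichever LLM API you actually use.

```python
knowledge_base = [
    "Atlas Vector Search supports approximate nearest neighbor queries.",
    "freeCodeCamp.org publishes free programming courses on YouTube.",
]

def retrieve(query, docs, k=1):
    """Rank documents by naive word overlap with the query (vector-search stand-in)."""
    def overlap(doc):
        return len(set(query.lower().split()) & set(doc.lower().split()))
    return sorted(docs, key=overlap, reverse=True)[:k]

def build_prompt(query, context_docs):
    """Ground the model's answer in the retrieved context."""
    context = "\n".join(context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

def call_llm(prompt):
    # Hypothetical placeholder: a real application would call an LLM API here.
    return f"(model response to a {len(prompt)}-character prompt)"

query = "What kind of queries does Atlas Vector Search support?"
prompt = build_prompt(query, retrieve(query, knowledge_base))
answer = call_llm(prompt)
```

Note that only the most relevant document ends up in the prompt, which is what keeps token usage efficient: the model sees the facts it needs and nothing else.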
Atlas Vector Search
Semantic similarity searches can be carried out on your data using MongoDB Atlas Vector Search, and these searches can be combined with LLMs to create AI-powered applications. Vector embeddings are a numerical representation of data that can come from different sources and in different formats.
By utilizing the document model, Atlas Vector Search enables you to store vector embeddings with your source data and metadata. Afterwards, these vector embeddings can be queried with an approximate nearest neighbors algorithm through an aggregation pipeline to quickly search for semantic similarity in the data.
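As a sketch of what such an aggregation pipeline looks like, here is a `$vectorSearch` stage. The index name (`vector_index`), field path (`plot_embedding`), projected fields, and collection in the comment are assumptions you would adjust to match your own cluster and data:

```python
# In practice this would be the embedding of the user's query, produced
# by the same model used to embed the stored documents.
query_vector = [0.1, 0.2, 0.3]

pipeline = [
    {
        "$vectorSearch": {
            "index": "vector_index",    # name of your Atlas Vector Search index
            "path": "plot_embedding",   # document field holding the embedding
            "queryVector": query_vector,
            "numCandidates": 100,       # candidates considered by the ANN search
            "limit": 5,                 # number of results to return
        }
    },
    {"$project": {"title": 1, "plot": 1, "_id": 0}},
]

# With pymongo, you would then run something like:
#   results = client["your_db"]["your_collection"].aggregate(pipeline)
```

The `numCandidates` value controls the trade-off between speed and recall for the approximate nearest neighbors search: more candidates means more accurate results at the cost of a slower query.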
In this course, you will learn how to use Atlas Vector Search in your own applications.
Watch the full course on the freeCodeCamp.org YouTube channel (1 hour watch).