Aerospike Vector Search Frequently Asked Questions
This page contains answers to some of the most frequently asked questions about Aerospike Vector Search (AVS).
What is a vector (embedding)?
A vector embedding is a numerical representation of data, such as words or images, that captures essential features and relationships in a high-dimensional space. This allows algorithms to process, compare, and analyze the data more effectively for tasks like classification, clustering, and similarity search. See our guide to generating vector embeddings for details about how to create vector embeddings.
What does a vector database do?
A vector database provides two basic functions: storage and search. Storage is straightforward and does not require a specialized database (you could store your vectors in files, for example). Search is where a vector database adds value: because you can calculate the distance between any two vectors, it can perform a proximity search across the vectors loaded in the vector space and return those closest to a query vector.
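To make "distance between two vectors" concrete, here is a minimal sketch of two widely used distance measures in plain Python. The function names are illustrative only and are not part of any AVS API, and this FAQ does not state which distance metrics AVS supports.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 minus cosine similarity: 0 means same direction, 1 means orthogonal."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


# Toy two-dimensional vectors; real embeddings usually have hundreds of dimensions.
print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))  # 5.0
print(cosine_distance([1.0, 0.0], [2.0, 0.0]))     # close to 0: same direction
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))     # close to 1: orthogonal
```

Euclidean distance is sensitive to vector length, while cosine distance compares direction only, so the right choice generally depends on how the embeddings were produced.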
What do KNN and ANN stand for?
KNN stands for k-nearest neighbors, a family of search algorithms that find the k vectors closest to a query vector. ANN stands for approximate nearest neighbor, a type of nearest-neighbor search that trades some accuracy for speed.
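As a point of comparison, exact KNN can be sketched as a brute-force scan that measures the distance from the query to every stored vector. This is a simplified illustration, not how AVS searches: the function name `knn_search` and the toy data are assumptions made for the example.

```python
import heapq
import math


def knn_search(query, vectors, k):
    """Exact k-nearest-neighbor search by scanning every stored vector.

    vectors maps a record key to its vector. Returns up to k
    (distance, key) pairs, nearest first.
    """
    scored = ((math.dist(query, vec), key) for key, vec in vectors.items())
    return heapq.nsmallest(k, scored)


# Hypothetical records keyed by name, each with a two-dimensional vector.
vectors = {
    "a": [0.0, 0.0],
    "b": [1.0, 1.0],
    "c": [5.0, 5.0],
}

for distance, key in knn_search([0.9, 0.9], vectors, k=2):
    print(key, round(distance, 3))
```

The scan touches every vector, so its cost grows linearly with the size of the data set. ANN methods avoid the full scan by using an index, at the cost of occasionally missing a true nearest neighbor.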
What is HNSW?
HNSW stands for Hierarchical Navigable Small World, which is the algorithm used by AVS to perform an ANN search.
What is RAG?
RAG stands for Retrieval Augmented Generation, which is a technique for using a knowledge repository to provide context to generative language models. RAG is a common application pattern used with vector databases.
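The retrieval half of the pattern can be sketched as follows: rank stored passages by similarity to the question's vector, then place the best matches in the prompt. Everything here is hypothetical, including the function names, the hand-written three-dimensional vectors, and the prompt wording. A real application would obtain vectors from an embedding model, retrieve them with a vector search, and send the prompt to a language model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vector, documents, k):
    """Return the text of the k documents most similar to the query vector."""
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vector, doc["vector"]),
        reverse=True,
    )
    return [doc["text"] for doc in ranked[:k]]


def build_prompt(question, passages):
    """Combine retrieved passages and the question into one prompt string."""
    context = "\n".join(f"- {passage}" for passage in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Toy knowledge repository; the vectors are made up for illustration.
documents = [
    {"text": "AVS uses HNSW to perform an ANN search.", "vector": [0.9, 0.1, 0.0]},
    {"text": "Aerospike tools can perform backups.", "vector": [0.0, 0.2, 0.9]},
    {"text": "ANN search prioritizes speed.", "vector": [0.8, 0.3, 0.1]},
]

question = "Which algorithm does AVS use for search?"
question_vector = [1.0, 0.2, 0.0]  # stand-in for an embedding of the question

passages = retrieve(question_vector, documents, k=2)
print(build_prompt(question, passages))
```

The generation step is deliberately left out: the resulting prompt is what would be passed to the generative model.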
What version of Aerospike is required for AVS?
AVS has been tested and developed using Aerospike Database version 7.x. While AVS does not rely on any specific features in 7.x, earlier versions have not been tested and are not supported.
Can I add AVS to data already in Aerospike?
Yes. To make existing data searchable with AVS, you need to add a vector to your data. You must also configure a metadata namespace and a namespace for storing your index. See Configure Aerospike Database for AVS for more information.
How are vectors stored in Aerospike?
Vectors are stored in Aerospike as an additional bin on the record. This bin is encoded with details about the dimensions of your vector and should not be edited directly. See Vector data in Aerospike for more information.
Why do my vectors not look like vector data?
Vector data is encoded with details about the dimensions of your vector and should not be edited directly. See more details about the vector data model.
How should I configure Aerospike for AVS?
Your goals will affect how you scale Aerospike for AVS, but there are some general recommendations.
- Strong consistency is not needed in the namespace you are using for search.
- You can configure Hybrid Memory Architecture (HMA) to optimize for cost with minimal impact on performance. See Planning a deployment for more information about configuring storage.
Can I use Aerospike tools like asbackup, asadm, and others?
Yes, Aerospike administrators can use Aerospike tools to configure an Aerospike cluster, perform basic monitoring, and perform backups of data and indexes.
Does AVS use GPU acceleration?
AVS does not use GPU acceleration. Index construction takes advantage of Advanced Vector Extensions (AVX), which allow for single instruction, multiple data (SIMD) parallel processing. You can also speed up indexing by setting up standalone indexing or by scaling up for distributed index construction.
What happens if standalone indexing exceeds the memory of the node?
If standalone indexing needs more memory than the node has, the process can run out of memory and trigger an Out of Memory (OOM) exception. For this reason, we recommend that you do not share the standalone indexer node role. You can fix an OOM failure by changing your index to distributed mode. See more details in our index troubleshooting guide.
How do I rebuild an index?
The best way to rebuild an index with a large number of changes is to use standalone indexing to create a new index and then switch over to it. See Index Management for more details about managing the index mode.