**Exact nearest neighbor (ENN) search:** finds the exact closest data points to a query by computing distances to every vector in the dataset.
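A minimal sketch of brute-force exact search in NumPy (the `exact_nn` helper and the toy 2-D dataset are illustrative, not from any particular library):

```python
import numpy as np

def exact_nn(query, data, k=1):
    """Brute-force exact nearest neighbors: compute the distance from
    the query to every vector, then sort. Cost is O(n*d) per query,
    which is why ENN scales poorly but is always 100% accurate."""
    dists = np.linalg.norm(data - query, axis=1)  # Euclidean distance to all n vectors
    idx = np.argsort(dists)[:k]                   # indices of the k closest
    return idx, dists[idx]

# Tiny demo: 4 points in 2-D, query at the origin
data = np.array([[0.0, 0.1], [5.0, 5.0], [0.2, 0.0], [9.0, 1.0]])
idx, d = exact_nn(np.array([0.0, 0.0]), data, k=2)
```

Because every vector is examined, the result is guaranteed correct; the rest of this comparison is about what you give up (or keep) when you avoid that full scan.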
**Approximate nearest neighbor (ANN) search:** trades perfect accuracy for speed, using efficient index structures to return near-optimal neighbors in sub-linear time.
**Semantic search:** matches on the meaning of content rather than keywords by searching dense embedding vectors that capture semantic relationships.
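The core ranking step is usually cosine similarity over dense embeddings, sketched below. The three document vectors are hand-written toy stand-ins; in a real system they would come from an embedding model, not be authored by hand:

```python
import numpy as np

# Toy stand-ins for model-produced dense embeddings (assumption: real
# systems obtain these from an embedding model, typically 768-1536 dims).
doc_vecs = np.array([
    [0.9, 0.1, 0.0],   # e.g. "dog care tips"
    [0.5, 0.5, 0.0],   # e.g. "puppy training"
    [0.0, 0.1, 0.9],   # e.g. "tax filing guide"
])

def cosine_top_k(query_vec, doc_vecs, k=2):
    """Rank documents by cosine similarity of their dense embeddings:
    normalize everything, then a matrix-vector product gives all scores."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sims = d @ q                       # cosine similarity per document
    return np.argsort(-sims)[:k]       # best-scoring documents first

top = cosine_top_k(np.array([0.85, 0.15, 0.05]), doc_vecs)
```

At scale this exact scoring pass is itself replaced by an ANN index over the embeddings, which is why the table below lists semantic search as "Good with ANN" for scalability.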
**Sparse vector search:** uses high-dimensional sparse vectors in which most elements are zero, optimized for keyword and token matching.
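A minimal sketch of the idea using TF-IDF weights stored as dictionaries (the `{term: weight}` representation and helper names are illustrative, not the internals of Lucene or Elasticsearch): each vector has one dimension per vocabulary term, but only the terms actually present are stored, and scoring touches only the overlap:

```python
from collections import Counter
import math

def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors as {term: weight} dicts. Absent terms
    are implicitly zero -- that sparsity is what makes vocabulary-sized
    dimensions practical."""
    n = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter(t for toks in tokenized for t in set(toks))  # document frequency
    vecs = []
    for toks in tokenized:
        tf = Counter(toks)
        vecs.append({t: c * math.log(n / df[t]) for t, c in tf.items()})
    return vecs

def sparse_dot(a, b):
    """Dot product that touches only terms present in both vectors."""
    if len(a) > len(b):
        a, b = b, a                     # iterate over the smaller dict
    return sum(w * b.get(t, 0.0) for t, w in a.items())

vecs = tfidf_vectors(["red apple pie", "green apple tart", "blue sky"])
```

Real engines score with BM25 rather than raw TF-IDF, but the structural point is the same: documents sharing no terms with the query cost almost nothing to rule out.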
| Feature | Exact NN (ENN) | Approximate NN (ANN) | Semantic Search | Sparse Vector Search |
|---|---|---|---|---|
| Accuracy | 100% exact | High (95-99%) | Context dependent | High for exact matches |
| Speed | Slow (O(n)) | Fast (sub-linear) | Moderate to fast | Very fast for keywords |
| Scalability | Poor | Good | Good with ANN | Excellent |
| Vector Type | Dense or sparse | Usually dense | Dense | Sparse |
| Use Cases | Small datasets, high precision required | Large-scale vector search, recommenders | NLP, content discovery, similar-item search | Search engines, document retrieval |
| Common Metrics | Euclidean, Manhattan, cosine | Euclidean, inner product, cosine | Cosine, dot product | Jaccard, BM25, TF-IDF |
| Dimensions | Any | Moderate to high | High (768-1536 typical) | Very high (vocabulary size) |
| Example Tools | SciPy, NumPy | FAISS, Annoy, hnswlib | Pinecone, Weaviate, Milvus | Elasticsearch, Lucene |
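The "Common Metrics" row is worth a concrete look, since the metrics disagree about what "close" means. A small illustration with two vectors pointing the same direction but differing in length:

```python
import numpy as np

a = np.array([1.0, 2.0, 2.0])
b = np.array([2.0, 4.0, 4.0])   # same direction as a, twice the magnitude

euclidean = float(np.linalg.norm(a - b))                 # magnitude-sensitive
dot = float(a @ b)                                       # also magnitude-sensitive
cosine = dot / float(np.linalg.norm(a) * np.linalg.norm(b))  # direction only
```

Here Euclidean distance and inner product both report the vectors as far apart or strongly scaled, while cosine similarity is exactly 1.0. This is why embedding-based semantic search typically favors cosine or normalized dot product, while exact geometric search often uses Euclidean distance.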