Vector Databases

Misc

Vector databases store embeddings and provide fast similarity searches
Packages
- {rchroma} - A clean interface to ChromaDB, a modern vector database for storing and querying embeddings
Superlinked Comparison (link): Prices, features, etc.
Comparison (link)
- Open-Source and hosted cloud: If you lean towards open-source solutions, Weviate, Milvus, and Chroma emerge as top contenders. Pinecone, although not open-source, shines with its developer experience and a robust fully hosted solution.
- Performance: When it comes to raw performance in queries per second, Milvus takes the lead, closely followed by Weviate and Qdrant. However, in terms of latency, Pinecone and Milvus both offer impressive sub-2ms results. If nmultiple pods are added for pinecone, then much higher QPS can be reached.
- Community Strength: Milvus boasts the largest community presence, followed by Weviate and Elasticsearch. A strong community often translates to better support, enhancements, and bug fixes.
- Scalability, advanced features and security: Role-based access control, a feature crucial for many enterprise applications, is found in Pinecone, Milvus, and Elasticsearch. On the scaling front, dynamic segment placement is offered by Milvus and Chroma, making them suitable for ever-evolving datasets. If you’re in need of a database with a wide array of index types, Milvus’ support for 11 different types is unmatched. While hybrid search is well-supported across the board, Elasticsearch does fall short in terms of disk index support.
- Pricing: For startups or projects on a budget, Qdrant’s estimated $9 pricing for 50k vectors is hard to beat. On the other end of the spectrum, for larger projects requiring high performance, Pinecone and Milvus offer competitive pricing tiers.

Brands

Qdrant - open source, free, and easy to use (example)
Chroma - Can be used as a local in-memory (example)
Pinecone - Data Elixir is using this store for their chatbot; has a free tier
Postgres with pgvector: Supports exact and approximate nearest neighbor search; L2 distance, inner product, and cosine distance; any language with a Postgres client
- Also see Databases, PostgreSQL >> Extensions >> pgvector and pg_sparse for sparse embeddings (e.g. SPLADE)