Vector Databases

Misc

  • Vector databases store embeddings and provide fast similarity searches

  • Superlinked Comparison (link): Prices, features, etc.

  • Comparison (link)

    • Open-Source and hosted cloud: If you lean towards open-source solutions, Weviate, Milvus, and Chroma emerge as top contenders. Pinecone, although not open-source, shines with its developer experience and a robust fully hosted solution.
    • Performance: When it comes to raw performance in queries per second, Milvus takes the lead, closely followed by Weviate and Qdrant. However, in terms of latency, Pinecone and Milvus both offer impressive sub-2ms results. If nmultiple pods are added for pinecone, then much higher QPS can be reached.
    • Community Strength: Milvus boasts the largest community presence, followed by Weviate and Elasticsearch. A strong community often translates to better support, enhancements, and bug fixes.
    • Scalability, advanced features and security: Role-based access control, a feature crucial for many enterprise applications, is found in Pinecone, Milvus, and Elasticsearch. On the scaling front, dynamic segment placement is offered by Milvus and Chroma, making them suitable for ever-evolving datasets. If you’re in need of a database with a wide array of index types, Milvus’ support for 11 different types is unmatched. While hybrid search is well-supported across the board, Elasticsearch does fall short in terms of disk index support.
    • Pricing: For startups or projects on a budget, Qdrant’s estimated $9 pricing for 50k vectors is hard to beat. On the other end of the spectrum, for larger projects requiring high performance, Pinecone and Milvus offer competitive pricing tiers.

Brands

  • Qdrant - open source, free, and easy to use (example)
  • Chroma - can be used as a local in-memory (example)
  • Pinecone - Data Elixir is using this store for their chatbot; has a free tier
  • Postgres with pgvector: Supports exact and approximate nearest neighbor search; L2 distance, inner product, and cosine distance; any language with a Postgres client