Redis vector database query fails with HuggingFace embedding models in LangChain

I’m building a chatbot that uses LangChain with Redis as the vector database. Everything works fine when I use OpenAI embeddings, but I run into problems when I switch to HuggingFace models.

from langchain_community.embeddings import HuggingFaceEmbeddings, OpenAIEmbeddings
from langchain_community.vectorstores.redis import Redis

# This works fine
embedding_model = OpenAIEmbeddings()

# Sample data
user_data = [
    {
        "name": "alice",
        "years": 25,
        "role": "developer",
        "rating": "excellent",
    },
    {
        "name": "bob",
        "years": 52,
        "role": "teacher",
        "rating": "poor",
    },
    {
        "name": "carol",
        "years": 67,
        "role": "teacher",
        "rating": "excellent",
    }
]
documents = ["hello", "world", "test"]

vector_db = Redis.from_texts(
    documents,
    embedding_model,
    metadatas=user_data,
    redis_url="redis://localhost:6379",
    index_name="my_index",
)

# This query works
query_results = vector_db.similarity_search("hello")

But when I change to HuggingFace embeddings like this:

embedding_model = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    model_kwargs={'device': 'cpu'}
)

I get this error when trying to search:

ResponseError: Error parsing vector similarity query: query vector blob size (1536) does not match index's expected size (384).

It seems like there’s a mismatch in vector dimensions between what was stored and what’s being queried. Has anyone encountered this before? What’s the best way to handle this?

Been there, done that. The dimension mismatch is annoying but totally fixable.

Manually managing Redis indexes and switching between models gets messy fast though. You’ll hit this every time you test different embedding models.

I automated the whole pipeline to fix this. Built a workflow that detects embedding model dimensions, drops the old index if needed, and recreates everything with the right schema. No more manual Redis commands or tracking different index names.

It handles dimension checking, index recreation, and batch processing your documents. Works with OpenAI, HuggingFace, or any embedding provider. You can switch between models for A/B testing without breaking anything.

Saved me hours of debugging and made chatbot development way smoother. Everything runs in the background so you just focus on building features.
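For reference, the dimension-detection step such a workflow needs is easy to sketch. These are hypothetical helpers, not the actual tool described above: any LangChain embedding object exposes `embed_query`, and the length of the returned vector is the dimension Redis will index.

```python
# Hypothetical sketch of the dimension-detection step; `detect_dim` and
# `needs_rebuild` are illustrative names, not part of any library.

def detect_dim(embedding_model) -> int:
    """Embed a short probe string and measure the vector length."""
    return len(embedding_model.embed_query("dimension probe"))

def needs_rebuild(stored_dim: int, embedding_model) -> bool:
    """True when the index must be dropped and recreated."""
    return detect_dim(embedding_model) != stored_dim
```

Once you know the model's dimension, the drop/recreate decision is a single comparison against whatever the index was built with.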

Indeed, you are facing a vector dimension mismatch. I hit the same problem while transitioning from OpenAI to HuggingFace embeddings. Redis builds the index schema from the first embeddings it receives, and every subsequent vector must have the same dimensionality. The simplest fix is to clear the existing index with FT.DROPINDEX my_index in the Redis CLI before switching models (append DD if you also want the stored documents deleted, not just the index definition). Alternatively, you can catch the ResponseError in your code and recreate the vector store. Running separate indexes for different models also works, letting you compare performance without data loss. Keep in mind that switching embedding models breaks semantic consistency with the older vectors anyway, so starting from scratch is usually the right call.
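The catch-and-recreate route mentioned above could look roughly like this. It's a sketch assuming redis-py and langchain_community are installed; `Redis.drop_index` is the vector store's static helper, but double-check its signature against your installed version.

```python
# Sketch: retry a similarity search after dropping a stale index and
# rebuilding it with the current embedding model.
try:
    from redis.exceptions import ResponseError  # raised on dimension mismatch
except ImportError:
    ResponseError = Exception  # lets this sketch load without redis-py installed

def search_or_rebuild(vector_db, query, texts, embedding, index_name, redis_url):
    try:
        return vector_db.similarity_search(query)
    except ResponseError:
        # Dimension mismatch: drop the old index (and its documents),
        # re-embed the texts with the current model, then retry.
        from langchain_community.vectorstores.redis import Redis
        Redis.drop_index(index_name, delete_documents=True, redis_url=redis_url)
        rebuilt = Redis.from_texts(
            texts, embedding, redis_url=redis_url, index_name=index_name
        )
        return rebuilt.similarity_search(query)
```

Note that re-embedding everything is unavoidable here; the old 1536-dimensional vectors cannot be converted to 384 dimensions.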

You’re encountering a dimension mismatch: OpenAI’s default embedding model (text-embedding-ada-002) produces 1536-dimensional vectors, while all-MiniLM-L6-v2 outputs 384. Redis locked your index to the 1536 dimensions it saw during the initial OpenAI setup. To resolve this, you need to delete the existing index and recreate it for the HuggingFace model. I faced the same issue switching between embedding models recently. Two quick workarounds: flush your Redis database (FLUSHDB wipes everything in that logical database, so only do this if nothing else lives there), or simply use a different index name per model. Also check the model card on HuggingFace before indexing, since it lists the output dimension and different sentence-transformers models have varying vector sizes.
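A cheap way to implement the "different index name per model" suggestion is to bake the model name and its dimension into the index name, so two models can never collide. This is a sketch; the dimension table only covers the two models from this thread.

```python
# Known output dimensions for the models discussed here; extend as needed.
KNOWN_DIMS = {
    "text-embedding-ada-002": 1536,                 # OpenAIEmbeddings default
    "sentence-transformers/all-MiniLM-L6-v2": 384,
}

def index_name_for(base: str, model_name: str) -> str:
    """Build a per-model index name like 'my_index__all-MiniLM-L6-v2__384'."""
    dim = KNOWN_DIMS[model_name]  # fail loudly for unmapped models
    short = model_name.rsplit("/", 1)[-1]
    return f"{base}__{short}__{dim}"
```

Pass the result as index_name to Redis.from_texts and each model quietly gets its own index.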

Hit this exact problem last month migrating our document search system. The vector dimension issue is just the start.

What really caught me off guard was performance. OpenAI embeddings offload the compute to an API, while HuggingFace models run locally and can bottleneck a CPU-bound system under heavy query load.

For the immediate fix, yes, drop the index or use a new name. But here’s what I learned the hard way: different embedding models occupy completely different semantic spaces, so your similarity scores and result rankings will differ even for identical queries.

I kept both models running in parallel for a few weeks to compare results before fully switching. Used different Redis databases (not just index names) to keep things clean.

Also check your memory usage. That sentence transformer model loads into RAM - can be 400MB+ depending on the variant. We had to bump our container limits.

One more thing - if you’re planning to switch models regularly for testing, consider Pinecone or Weaviate. Redis works great but wasn’t built specifically for this.
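The "separate Redis databases" setup mentioned above is just a database number in the connection URL. A sketch, with arbitrary database assignments:

```python
# Map each embedding provider to its own Redis logical database so the
# parallel-comparison setup never mixes vectors. The numbers are illustrative.
MODEL_DB = {
    "openai": 0,
    "huggingface": 1,
}

def redis_url_for(model_key: str, host: str = "localhost", port: int = 6379) -> str:
    """Build a redis:// URL that selects the model's logical database."""
    return f"redis://{host}:{port}/{MODEL_DB[model_key]}"
```

Feed the resulting URL to Redis.from_texts and each model's vectors land in a fully separate keyspace, not just a separate index.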

Totally get what you’re saying! Had the same issue. Redis locks the index to your first model’s dimensions, so switching from OpenAI’s 1536 to Hugging Face’s 384 breaks things. Just use a new index name like ‘my_index_hf’ - works perfectly and keeps your old data safe!

Had this exact problem building my recommendation engine. Redis locks the vector field schema to your first embedding model, so once it’s set to 1536 dimensions for OpenAI, it won’t take 384-dimensional vectors from HuggingFace.

I fixed it by adding a dimension check before creating the vector store: query the existing index schema with Redis commands, compare it to your current model’s dimensions, and recreate the index if they don’t match.

Watch out for HuggingFace model configs though. Different sentence-transformer models have different dimensions, and even the same model changes if you tweak the pooling method. Always test with a sample embedding before processing large document sets. Switching embedding models kills your existing semantic relationships too, so treat it like starting over instead of trying to migrate vectors.
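The dimension check described above can be done against FT.INFO output. The exact attribute layout varies across Redis versions, so the parser below is best-effort; helper names are illustrative, and fetching the info requires redis-py plus a running Redis Stack.

```python
from typing import Optional

def parse_vector_dim(attribute_fields) -> Optional[int]:
    """Scan a flat FT.INFO attribute list for a 'dim' token and return
    the value that follows it, or None if no vector field is present."""
    fields = [f.decode() if isinstance(f, bytes) else str(f) for f in attribute_fields]
    for i, token in enumerate(fields[:-1]):
        if token.lower() == "dim":
            return int(fields[i + 1])
    return None

def index_dim(redis_url: str, index_name: str) -> Optional[int]:
    """Fetch FT.INFO for the index and extract its vector dimension."""
    import redis  # imported lazily so the parser above stays dependency-free
    client = redis.Redis.from_url(redis_url)
    info = client.ft(index_name).info()
    for attr in info.get("attributes", []):
        dim = parse_vector_dim(attr)
        if dim is not None:
            return dim
    return None
```

Compare the result of index_dim against the length of a sample embedding from your current model before loading any documents.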