I'm working with the Qdrant vector database through n8n and I'm having trouble with duplicate entries. The Qdrant docs say that when you insert data using the same ID, it should replace the existing point instead of creating a duplicate. However, I can't figure out how to properly assign point IDs in the n8n interface. I tried putting an id field in the metadata section, but that doesn't seem to be the actual point identifier that Qdrant uses for deduplication. Every time I run my workflow with the same document, it creates a new entry instead of updating the existing one. How can I configure the point ID properly in n8n to prevent these duplicates?
totally! that part tripped me up too. you need to set the point ID in the Qdrant node itself, not in the metadata. i usually just hash the document content so the ID comes out the same every run (a timestamp won't help here since it changes on every execution). keeping the ID consistent is what does the trick!
The duplicates happen because no consistent point ID is being sent to Qdrant from n8n. When no ID is specified, a random UUID is generated on every workflow execution, so each run creates new points. To fix this, add a Function node before the Qdrant node that derives an ID from your document's content, for example by hashing the title plus the content, or by reusing a unique field from your source database. The key is that the ID must be deterministic, so the same document always produces the same ID. I prefer MD5 hashing for its speed and consistency; just keep in mind that Qdrant only accepts unsigned integers or UUIDs as point IDs, so format the hash as a UUID string. Once Qdrant sees an ID that already exists, it replaces the point instead of duplicating it.
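Here's a minimal sketch of that hashing step for an n8n Function node (the newer Code node works the same way, using $input.all() instead of items). The title, content, and point_id field names are just examples, not n8n defaults, and require('crypto') may need to be permitted via NODE_FUNCTION_ALLOW_BUILTIN depending on your n8n configuration:

```javascript
// Deterministic point ID: MD5 of title + content, formatted as a UUID
// (Qdrant only accepts unsigned integers or UUIDs as point IDs).
const crypto = require('crypto');

return items.map((item) => {
  const title = item.json.title ?? '';
  const content = item.json.content ?? '';

  // The same input text always yields the same 128-bit digest.
  const hex = crypto
    .createHash('md5')
    .update(`${title}|${content}`)
    .digest('hex');

  // Re-shape the 32 hex chars as 8-4-4-4-12 so Qdrant accepts it as a UUID.
  item.json.point_id = [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join('-');

  return item;
});
```

Then reference that field from the Qdrant node (e.g. {{ $json.point_id }}), and every run of the same document will upsert the same point instead of adding a new one.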
Had this exact problem a few months ago - drove me nuts for days! The n8n Qdrant node interface is misleading about how IDs work. Don’t put the point ID in the metadata section. Set it at the root level of your data structure instead. In the Qdrant node config, there’s a separate “Point ID” field - it’s not with the payload/metadata stuff. You can use a static value or grab a field from your input data with {{ $json.your_id_field }}. The trick is keeping this ID consistent across runs for the same document. I usually hash the document content or pull a unique identifier from my source data. Once I got this right, deduplication worked perfectly and subsequent runs updated existing points instead of creating duplicates.
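For anyone wondering why an id key inside the metadata does nothing: in Qdrant's own upsert API the point id is a sibling of the payload, not part of it. Here's a rough standalone Node.js sketch of that request, just to show the shape; the URL, collection name, vector, and payload fields are placeholders, not anything pulled from the n8n node's internals:

```javascript
// Standalone sketch of a Qdrant upsert (Node 18+ for the global fetch).
// Shows where the point ID lives relative to the payload.
const QDRANT_URL = 'http://localhost:6333';

async function upsertPoint(pointId, vector, payload) {
  // PUT /collections/{name}/points is Qdrant's upsert endpoint: if a point
  // with this ID already exists, it is overwritten rather than duplicated.
  const res = await fetch(`${QDRANT_URL}/collections/docs/points?wait=true`, {
    method: 'PUT',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      points: [
        {
          id: pointId,   // root-level ID: this is what Qdrant dedupes on
          vector,        // the embedding
          payload,       // metadata; an "id" key in here is just another field
        },
      ],
    }),
  });
  return res.json();
}

// Same deterministic ID + same call = update, not a new point.
upsertPoint('3f2c9a1e-0b7d-4c55-9e21-8d4f6a7b1c2d', [0.1, 0.2, 0.3], {
  title: 'Example doc',
}).then(console.log);
```

Anything under payload, including an id key, is just searchable metadata; only the top-level id drives the overwrite-on-upsert behavior.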