Displaying markdown images in OpenAI-powered chat application

I’m building a chat application with React and TypeScript that uses OpenAI’s API, with Chroma as the vector database. I want to add support for images referenced in markdown files. My plan is to add image links in the documents like this:

![Picture info](https://example-image-url.com "Search terms")

When someone asks about these images, the bot should:

  • Give an answer based on what’s in the markdown file
  • If the question is about a particular image, get the image URL and show it in the response

What I need to know:

  • Should I parse the markdown every time someone asks something, or should I extract the image info up front and store it in Chroma alongside its related content?
  • What’s the best way to set this up so it works fast and gives good results?

I’d love to hear suggestions, especially from anyone who has done something similar before. I’m looking for the most effective approach to handle this feature.

Been working on a similar setup recently and learned this the hard way. You want to extract image metadata during your initial document ingestion phase, not at query time.

What I did was create a preprocessing pipeline that walks through all markdown files, extracts image references using regex patterns, and stores them as structured metadata alongside the document chunks in Chroma. The key insight is treating images as first-class searchable entities rather than just embedded content. When indexing, I concatenate the image alt text and title with the surrounding paragraph text to create richer context for the vector embeddings.

At runtime, your OpenAI responses can then reference specific images by URL when the semantic search returns chunks containing image metadata. This approach scales much better than real-time parsing and gives you more control over how images are discovered and presented.
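A minimal TypeScript sketch of that ingestion step, assuming the chromadb JS client talking to a running Chroma server with an embedding function configured for the collection; the regex, the buildImageChunks/ingest helpers, and the collection name are illustrative, not part of the original setup:

    import { ChromaClient } from "chromadb";

    // Matches ![alt](url "optional title"); the title doubles as extra search terms.
    const IMAGE_RE = /!\[([^\]]*)\]\(([^)\s]+)(?:\s+"([^"]*)")?\)/g;

    interface ImageChunk {
      id: string;
      document: string; // the text that actually gets embedded
      metadata: { imageUrl: string; alt: string; title: string };
    }

    // One chunk per image reference: alt text + title + the surrounding paragraph
    // (with the image syntax stripped) gives the embedding richer context.
    function buildImageChunks(paragraph: string, chunkId: string): ImageChunk[] {
      const plainText = paragraph.replace(IMAGE_RE, "").trim();
      return [...paragraph.matchAll(IMAGE_RE)].map((m, i) => {
        const [, alt, url, title = ""] = m;
        return {
          id: `${chunkId}-img-${i}`,
          document: `${alt} ${title} ${plainText}`.trim(),
          metadata: { imageUrl: url, alt, title },
        };
      });
    }

    // Run once at ingestion time, not per request.
    async function ingest(paragraphs: string[], docId: string): Promise<void> {
      const client = new ChromaClient();
      const collection = await client.getOrCreateCollection({ name: "markdown-images" });
      const chunks = paragraphs.flatMap((p, n) => buildImageChunks(p, `${docId}-${n}`));
      if (chunks.length === 0) return;
      await collection.add({
        ids: chunks.map((c) => c.id),
        documents: chunks.map((c) => c.document),
        metadatas: chunks.map((c) => c.metadata),
      });
    }

Keeping the image URL in metadata rather than in the embedded text means it never influences similarity search, but it always comes back attached to the matched chunk.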

for sure, keep the img metadata in chroma, it’ll save you a lot of time. went through this too, parsing every time just slows stuff down. extract those urls and alt texts from the markdown, put em with your content ahead of time. way better for quick responses!

I handled something similar in my document search system last year. Definitely preprocess and store the image metadata in Chroma rather than parsing on each request. What worked well for me was creating a separate embedding for each image reference that includes the alt text, title attribute, and surrounding context paragraphs. This way your vector search can actually match user queries to relevant images directly. For the response generation, I pass both the matched text content and any associated image URLs to the OpenAI API as context. The performance difference is significant when you have hundreds of markdown files with embedded images.
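For the retrieval side, a rough sketch of passing the matched text and image URLs to OpenAI as context, assuming the collection from the ingestion sketch above and the official openai Node SDK; the model name and the answerQuestion helper are placeholders:

    import OpenAI from "openai";
    import { ChromaClient } from "chromadb";

    const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

    // Query Chroma for relevant chunks, then hand the matched text plus any
    // associated image URLs to the chat model as context.
    async function answerQuestion(question: string): Promise<string> {
      const client = new ChromaClient();
      const collection = await client.getCollection({ name: "markdown-images" });
      const results = await collection.query({ queryTexts: [question], nResults: 3 });

      const docs = results.documents[0] ?? [];
      const metas = results.metadatas[0] ?? [];
      const context = docs
        .map((doc, i) => {
          const meta = metas[i] as { imageUrl?: string } | null;
          return meta?.imageUrl ? `${doc}\nImage URL: ${meta.imageUrl}` : `${doc}`;
        })
        .join("\n---\n");

      const completion = await openai.chat.completions.create({
        model: "gpt-4o-mini", // any chat-capable model works here
        messages: [
          {
            role: "system",
            content:
              "Answer using only the provided context. If an Image URL is relevant " +
              "to the question, include it as markdown: ![description](url).",
          },
          { role: "user", content: `Context:\n${context}\n\nQuestion: ${question}` },
        ],
      });

      return completion.choices[0].message.content ?? "";
    }

Since the model is asked to return relevant images as plain markdown, the React side only needs a markdown renderer that allows images to display them in the chat.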