I’m working with Google’s DiscoveryEngine API for website search functionality. After upgrading my data store to advanced search mode to improve keyword accuracy, I noticed that the API response is now missing important image metadata that was previously available.
Current Response (Advanced Mode):
The response I receive now lacks the pagemap object that contains crucial image URLs and metadata.
Previous Response (Basic Mode):
Previous responses included a complete pagemap structure with important details such as metatags, cse_image, cse_thumbnail, and thumbnail, which held the image URLs.
Can anyone guide me on how to configure the advanced search to return these missing image metadata items? I’m particularly looking for access to the thumbnail and image URLs that were part of the basic search mode.
Google changed the document fields between basic and advanced modes. Make sure to add summarySpec with includeCitations set to true in contentSearchSpec. The real key, though, is reconfiguring extractiveAnswerSpec in your data store settings, not just in the search request. I had to redo my whole indexing setup too.
Had the same issue when I switched to advanced search last year. Advanced mode uses different indexing that doesn’t keep the pagemap metadata from your old Custom Search Engine setup. I had to manually set up schema mapping during data store setup so image fields would index properly. Check that your site’s structured data has the right image metadata tags (like og:image) and make sure your data connector pulls these fields. Re-indexing after I updated the schema config brought back most of my missing image metadata. Also double-check your serving config - the default advanced setup sometimes doesn’t grab all the field mappings from your old basic config.
Honestly though, the real fix is your data ingestion setup. I hit this same problem and had to go back and configure document metadata extraction properly. You need to tell DiscoveryEngine which fields to extract as structured metadata during indexing.
If you’re using website connector, make sure your pages have proper Open Graph tags and schema markup for images. Then map these fields to searchable attributes in your data store config.
The pagemap stuff from CSE doesn’t carry over automatically. You’ll probably need to re-index your content after fixing the metadata mapping. Pain in the neck but that’s Google’s migration path.
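Before re-indexing, it can help to confirm that a page actually exposes image markup. Here is a minimal sketch using only Python's standard library. The names ImageMarkupAudit and audit_page are made up for illustration, and it only looks at og:image meta tags and a top-level "image" string in JSON-LD, so treat it as a quick check rather than a full structured-data validator.

```python
from html.parser import HTMLParser
import json


class ImageMarkupAudit(HTMLParser):
    """Collects image hints from Open Graph meta tags and JSON-LD blocks."""

    def __init__(self):
        super().__init__()
        self.og_images = []
        self.jsonld_images = []
        self._in_jsonld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("property") == "og:image" and attrs.get("content"):
            self.og_images.append(attrs["content"])
        elif tag == "script" and attrs.get("type") == "application/ld+json":
            # Start buffering the JSON-LD payload until the closing tag.
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                doc = json.loads("".join(self._buffer))
            except ValueError:
                return  # Malformed JSON-LD: skip it rather than fail the audit.
            # Assumption: only a top-level string "image" is checked here.
            if isinstance(doc, dict) and isinstance(doc.get("image"), str):
                self.jsonld_images.append(doc["image"])


def audit_page(html):
    """Return the image URLs found in og:image tags and JSON-LD."""
    parser = ImageMarkupAudit()
    parser.feed(html)
    parser.close()
    return {"og_image": parser.og_images, "jsonld_image": parser.jsonld_images}
```

If both lists come back empty for a page, there is probably nothing for the indexer to pick up from it, whatever the data store config says.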
Hit this exact issue 6 months ago. Advanced search treats metadata extraction separately from search indexing - basic mode bundles pagemap data automatically, but advanced doesn’t. Your payload won’t pull image metadata because you’re missing document metadata specs. Add a userInfo section and configure boostSpec to prioritize docs with image data. More importantly, check your data source config includes image field mapping.
Here’s the annoying part - Google doesn’t auto-migrate your pagemap structure when you upgrade. I had to manually set up custom attributes in data store settings for image URLs, then update my site’s metadata with consistent property names. After that, triggered a full re-crawl through console.
First though, check if your indexed docs actually have the image metadata - look at sample results in DiscoveryEngine console. If metadata’s missing at document level, no API config will fix it. Usually it’s your data connector settings, not the search request.
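The "check your indexed docs first" step can also be done against a saved search response. This is a rough sketch that assumes the JSON response nests pagemap under results[].document.derivedStructData. That path is an assumption on my part, so adjust it to whatever your actual response looks like. The helper name find_results_missing_images is hypothetical.

```python
# Keys named in the question as carrying image URLs in basic mode.
IMAGE_KEYS = ("cse_image", "cse_thumbnail", "thumbnail")


def find_results_missing_images(response):
    """Return the ids of results whose pagemap has no image entries.

    Assumes `response` is the search response parsed into a dict and that
    pagemap lives at results[].document.derivedStructData.pagemap.
    """
    missing = []
    for result in response.get("results", []):
        document = result.get("document", {})
        pagemap = document.get("derivedStructData", {}).get("pagemap", {})
        if not any(pagemap.get(key) for key in IMAGE_KEYS):
            missing.append(document.get("id", "<unknown>"))
    return missing
```

If every document shows up in the returned list, the metadata is missing at the document level and the fix belongs in the data store or connector config, not in the search request.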
Been dealing with this exact headache for months across multiple projects. Google’s advanced mode migration completely wipes your image metadata pipeline.
Everyone’s talking about schema mapping and re-indexing, but that’s just patching Google’s messy transition. You’re rebuilding your entire metadata extraction setup from scratch.
Hit this wall with three client projects and ended up automating the whole process with Latenode. Instead of fighting Google’s inconsistent API responses, I built a workflow that monitors search results and pulls missing image data from the actual pages.
The workflow grabs DiscoveryEngine results, finds missing image metadata, then scrapes the source URLs for Open Graph images, meta tags, and structured data. It normalizes everything into a consistent format and caches it so your frontend always gets complete image data.
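For anyone who wants the same fallback without a workflow tool, the idea can be sketched in plain Python. This is not the Latenode workflow itself, just an illustration of the pattern. It assumes each result has already been flattened into a dict with a "url" key and an optional "image" key, and that you pass in your own fetch_html function. All of these names are hypothetical.

```python
from html.parser import HTMLParser


class _OgImageParser(HTMLParser):
    """Remembers the first og:image meta tag seen in a page."""

    def __init__(self):
        super().__init__()
        self.image = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.image is None and tag == "meta" and attrs.get("property") == "og:image":
            self.image = attrs.get("content")


def extract_og_image(html):
    """Return the first og:image URL in the HTML, or None if absent."""
    parser = _OgImageParser()
    parser.feed(html)
    return parser.image


def enrich_results(results, fetch_html, cache):
    """Fill in a missing 'image' on each result from the page's og:image.

    `fetch_html` is any callable taking a URL and returning HTML text;
    `cache` is a dict keyed by URL so each page is fetched at most once.
    """
    enriched = []
    for result in results:
        item = dict(result)  # Copy so the caller's data is not mutated.
        if not item.get("image"):
            url = item["url"]
            if url not in cache:
                cache[url] = extract_og_image(fetch_html(url))
            item["image"] = cache[url]
        enriched.append(item)
    return enriched
```

Passing fetch_html in as a parameter keeps the network code separate, so you can swap in whatever HTTP client you already use and test the logic with canned HTML.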
Works way better than hoping Google’s indexing catches your image metadata. Plus you can customize which image fields you want and how they’re formatted.
Runs in the background so users never see incomplete results. Takes about 30 minutes to set up.