Creating XPath Expression for Google Sheets ImportXML Function

ClimbingLion · July 22, 2025, 8:26am

I need help extracting data from web pages using Google Sheets ImportXML function. I’m working with CRM system pages and want to pull specific text content automatically instead of doing it by hand.

I have the page URLs in my spreadsheet and I’m using ImportXML to grab certain data, but my XPath expressions aren’t working properly.

Here’s what I tried first:

//section[@class="mainContent"]//td[@class="dataCell"][3]//text()

This gives me a parsing error every time.

I also used Chrome DevTools to copy the XPath directly by right-clicking on the element:

//*[@id="content"]/div[1]/div[3]/table/tbody/tr[3]/td[1]

Even when I add //text() at the end, it doesn’t work reliably. This method seems too fragile for what I need.

The content I’m trying to extract looks like this:

remote access code 67890
Location: Business Name, CA
User: admin
Pass: mypassword123

What am I doing wrong with these XPath queries? Is there a better approach for this kind of data extraction?

HappyDancer99 · July 29, 2025, 7:13am

The problem is likely that ImportXML can’t see the elements you’re targeting. I’ve hit this same issue with CRM systems - they often require login or block automated requests, so ImportXML gets different content than what you see in your browser. Test your XPath on a simple public site first to make sure the syntax works. Also try a basic XPath like //title to see if you get any response at all. If that fails, the site’s blocking you. For CRM data, you’re probably better off using their API instead of scraping - most modern CRMs have APIs built for exactly this.

Zack_45Gaming · July 28, 2025, 2:02pm

try wrapping your xpath with normalize-space() - like normalize-space(//section[@class="mainContent"]//td[@class="dataCell"][3]). importXML gets weird about whitespace sometimes. also check if the site uses javascript to load content dynamically - importXML can’t handle that.

John_Fast · July 28, 2025, 9:24am

That parsing error usually means there’s something wrong with your XPath syntax. With ImportXML, I’ve learned it’s better to start simple instead of jumping into complex nested paths. Don’t try to target multiple levels at once - break it down first. Test something basic like //td[@class="dataCell"] to see if you get any results, then add more specificity from there. The XPath you copied from DevTools probably won’t work because it depends on the exact DOM structure, which changes all the time. For CRM pages, I stick with more stable attributes like data-* attributes or unique class names. Avoid positional selectors like [3] or tr[3] - they’re unreliable. Also check if the content loads after the page renders. ImportXML only sees the initial HTML, not anything generated by JavaScript.