Hey everyone, I’m trying to figure out how to use IMPORTXML in Google Sheets to grab info from a sports results website. I want to get the dates, team names, and detailed scores for basketball games. The tricky part is that some of the data is on the main page, but the full scores are on separate match pages.
I’ve been playing around with this formula:
=IMPORTXML("https://example.com/results", "//div[@class='match-info']")
But it’s not getting everything I need. Can anyone help me create a formula that can:
- Scrape the basic info from the main results page
- Follow links to individual match pages
- Extract the extended scores (like quarter-by-quarter results)
I’m aiming to have all this data in one sheet, with each row showing a complete match record. Any tips or sample formulas would be super helpful! Thanks in advance.
IMPORTXML has limitations for complex scraping tasks like this. You’re better off using Google Apps Script to create a custom function. It can handle multiple pages and parse nested data more efficiently.
Here’s a basic approach:
- Write a script to fetch the main page and extract match URLs.
- Loop through each URL, retrieving detailed scores.
- Compile all data into a 2D array.
- Return this array to populate your sheet.
This method allows for more control and can handle pagination if needed. It’s more robust for large datasets and can be scheduled to run automatically.
Consider using libraries like Cheerio for easier HTML parsing within your script.
hey sophialee92, i’ve messed with IMPORTXML before. it’s tricky for what u want. You’ll prob need multiple formulas + some scripting. try this for main page:
=IMPORTXML(“url”, “//div[@class=‘match-info’]//text()”)
for individual pages, you’ll need separate formulas or custom functions. good luck!
I’ve tackled similar challenges with IMPORTXML before. While it’s possible to get basic info from the main page, extracting detailed scores from individual match pages is trickier. Here’s what I’d suggest:
For the main page data:
=IMPORTXML(“https://example.com/results”, “//div[@class=‘match-info’]//text()”)
This should grab dates and team names. For the detailed scores, you’ll need to use Google Apps Script. Create a custom function that:
- Fetches the main page
- Extracts links to individual match pages
- Visits each link and scrapes the detailed scores
- Compiles everything into a single array
It’s more complex, but it’ll give you the flexibility to grab all the data you need. Plus, you can schedule it to run automatically, keeping your sheet up-to-date without manual intervention.
If you’re new to Apps Script, there are plenty of tutorials online to get you started. Good luck with your project!